ZenBrowserMirrors/pdf.js - Unitcore

mirror of https://github.com/zen-browser/pdf.js.git synced 2025-07-08 01:10:08 +02:00

Author	SHA1	Message	Date
Jonas Jenwald	9b41bfc374	Introduce helper functions for parsing /Matrix and /BBox arrays	2024-05-03 22:37:50 +02:00
Jonas Jenwald	52f7ff155d	Validate even more dictionary properties This checks primarily Arrays, but also some other properties, that we'll end up sending (sometimes indirectly) to the main-thread.	2024-05-03 22:37:14 +02:00
Jonas Jenwald	6c05f8b381	Add even more validation of width-data (PR 18017 follow-up) I missed this case in PR 18017, sorry about that.	2024-05-02 11:24:15 +02:00
Jonas Jenwald	d411a072a4	Add more validation of width-data The current `PartialEvaluator.extractWidths` implementation only contains partial validation of the width-data.	2024-04-29 10:51:16 +02:00
Jonas Jenwald	08eb0566f7	Validate additional font-dictionary properties	2024-04-29 08:21:28 +02:00
Calixte Denizet	551e63901c	Simplify the way to pass the glyph drawing instructions from the worker to the main thread and remove the use of eval in the font loader.	2024-04-27 21:28:31 +02:00
Jonas Jenwald	91898e5923	Extend the globally cached image main-thread copying to "complex" images as well (PR 17428 follow-up) In PR 17428 this functionality was limited to "larger" images, to not affect performance negatively. However it turns out that it's also beneficial to consider more "complex" images, regardless of their size, that contain /SMask or /Mask data; see issue 11518.	2024-04-20 11:10:09 +02:00
Calixte Denizet	52ea2333b3	Remove the tag for missing font subset when trying to find a substitution Fixes #17929.	2024-04-11 20:34:28 +02:00
Tim van der Meij	2e5282928f	Merge pull request #17854 from Snuffleupagus/rm-PromiseCapability [api-minor] Replace the `PromiseCapability` with `Promise.withResolvers()`	2024-04-02 15:21:43 +02:00
Jonas Jenwald	e4d0e84802	[api-minor] Replace the `PromiseCapability` with `Promise.withResolvers()` This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does almost the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular: - In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state. - In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used. - In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however not a problem since the value of a Promise cannot be changed once fulfilled or rejected. - In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them: - For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is removed on its first (valid) invocation. - For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks. - In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and test-only code. 
--- [1] In browsers/environments that lack native support, note [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).	2024-04-01 11:42:37 +02:00
Jonas Jenwald	07a8836ab2	Ensure that Mesh /Shadings have non-zero width/height (issue 17848)	2024-03-29 22:58:25 +01:00
Calixte Denizet	9c3471dd01	Don't render corrupted inlined images Fixes #17794.	2024-03-15 15:33:18 +01:00
Calixte Denizet	a6eadf8150	Avoid to access to a missing cidSystemInfo property Fixes #17689.	2024-02-19 09:55:23 +01:00
Jonas Jenwald	a7bcc81eb1	Add a dummy `beginMarkedContentProps` operator when optional content parsing fails (issue 17679)	2024-02-17 13:45:16 +01:00
Jonas Jenwald	363dce6744	Use a limit, in more places, when splitting strings This should be a tiny bit more efficient, since it avoids parsing substrings that we don't care about. Please note: I cannot find an ESLint rule to enforce this automatically.	2024-02-02 13:10:52 +01:00
Calixte Denizet	7f2428a77e	Reduce memory use and improve perfs when computing the bounding box of a bezier curve (bug 1875547) It isn't really a fix for the mentioned bug but it slightly improves things. By reducing the memory use, the time spent in the GC is reduced as well. The algorithm to compute the bounding box is the same as before but it has just been rewritten to be more efficient.	2024-01-24 23:41:14 +01:00
Jonas Jenwald	fa583427ef	Always export the "raw" /ToUnicode-data from `PartialEvaluator.preEvaluateFont` (PR 13354 follow-up) This, ever so slightly, simplifies the implementation in the `PartialEvaluator.extractDataStructures`-method.	2024-01-22 13:06:32 +01:00
Jonas Jenwald	f21a30dfb4	Convert the `PartialEvaluator.readToUnicode` method to be async	2024-01-22 12:47:06 +01:00
Jonas Jenwald	f5c01188dc	Convert the `PartialEvaluator.extractDataStructures` method to be async	2024-01-22 12:47:06 +01:00
Jonas Jenwald	cf0797dfbd	Use `await` consistently in the `PartialEvaluator.setGState` method	2024-01-22 12:47:06 +01:00
Jonas Jenwald	1cc83c4fdc	Use `await` consistently in the `PartialEvaluator.buildFormXObject` method	2024-01-22 12:47:06 +01:00
Tim van der Meij	49b2d9b5af	Merge pull request #17556 from Snuffleupagus/issue-17554 Ensure that `EvaluatorPreprocessor.opMap` has a null-prototype (issue 17554)	2024-01-21 20:58:09 +01:00
Jonas Jenwald	d7e41d4cb6	Ensure that `EvaluatorPreprocessor.opMap` has a null-prototype (issue 17554) This accidentally regressed in PR 16956, sorry about that!	2024-01-21 19:59:13 +01:00
Jonas Jenwald	3c2c0ecd88	Use the ESLint `arrow-body-style` rule in more spots in `src/core/evaluator.js`	2024-01-21 17:42:33 +01:00
Jonas Jenwald	d1bef8cb86	Use `await` consistently in the `PartialEvaluator.translateFont` method	2024-01-21 17:36:50 +01:00
Jonas Jenwald	fc62eec901	Convert the `handleSetFont` methods, in `src/core/evaluator.js`, to be async	2024-01-21 17:32:05 +01:00
Jonas Jenwald	f9a384d711	Enable the `arrow-body-style` ESLint rule This manually ignores some cases where the resulting auto-formatting would not, as far as I'm concerned, constitute a readability improvement or where we'd just end up with more overall indentation. Please see https://eslint.org/docs/latest/rules/arrow-body-style	2024-01-21 16:20:55 +01:00
Calixte Denizet	405f573d70	Take into account empty lines when extracting text content from the appearance Fixes #17492.	2024-01-14 20:23:29 +01:00
Calixte Denizet	7839e7b495	Preserve the whitespaces when getting text from FreeText annotations (bug 1871353) When the text of an annotation is extracted using getTextContent, consecutive white spaces are just replaced by one space. So this patch adds an option to make sure that white spaces are preserved when the appearance is parsed. For the case where there's no appearance, we can have a fast path to get the correct string from the Content entry. When an existing FreeText is edited, spaces (0x20) are replaced by non-breakable (0xa0) ones in order to see all of them on screen.	2024-01-05 10:20:32 +01:00
Jonas Jenwald	9f02cc36d4	Attempt to further reduce re-parsing for globally cached images (PR 11912, 16108 follow-up) In PR 11912 we started caching images that occur on multiple pages globally, which improved performance a lot in many PDF documents. However, one slightly annoying limitation of the implementation is the need to re-parse the image once the global-caching threshold has been reached. Previously this was difficult to avoid, since large image-resources will cause cleanup to run on the main-thread after rendering has finished. In PR 16108 we started delaying this cleanup a little bit, to improve performance if a user e.g. zooms and/or rotates the document immediately after rendering completes. Taking those two PRs together, we now have a situation where it's much more likely that the main-thread has "globally used" images cached at the page-level. Hence we can instead attempt to copy a locally cached image into the global object-cache on the main-thread and thus reduce unnecessary re-parsing of large/complex global images, which significantly reduces the rendering time in many cases. For the PDF document in issue 11878, the rendering time of the second page changes as follows (on my computer): - With the `master`-branch it takes >600 ms to render. - With this patch that goes down to ~50 ms, which is one order of magnitude faster. (Note that all other pages are, as expected, completely unaffected by these changes.) This new main-thread copying is limited to "large" global images, since: - Re-parsing of small images, on the worker-thread, is usually fast enough to not be an issue. - With the delayed cleanup after rendering, it's still not guaranteed that an image is available in a page-level cache on the main-thread. - This forces the worker-thread to wait for the main-thread, which is a pattern that you always want to avoid unless absolutely necessary.	2023-12-21 21:26:21 +01:00
Jonas Jenwald	e547b198a3	Compute the length of the final image-bitmap/data on the worker-thread Currently this is done in the API, but moving it into the worker-thread will simplify upcoming changes.	2023-12-21 21:26:21 +01:00
Jonas Jenwald	709d89420e	Re-factor how the `GenericL10n` class fetches localization-data - Re-factor the existing `fetchData` helper function such that it can fetch more types of data, and it now supports "arraybuffer", "json", and "text". This only needed minor adjustments in the `DOMCMapReaderFactory` and `DOMStandardFontDataFactory` classes.[1] - Expose the `fetchData` helper function in the API, such that the viewer is able to access it. - Use the `fetchData` helper function in the `GenericL10n` class, since this should allow fetching of localization-data even if the default viewer is run in an environment without support for the Fetch API. --- [1] While testing this I also noticed a minor inconsistency when handling standard font-data on the worker-thread.	2023-11-14 13:45:14 +01:00
Calixte Denizet	7851c0da8d	[Debugger] Add some info about substitution font When pdfBug is true, the substitution font is used in the text layer in order to be able to know what is the font really used thanks to the devtools. And to be sure that fonts are loaded, the font cache isn't cleaned up when the debugger is active.	2023-10-09 12:06:33 +02:00
Jonas Jenwald	0ac8f33e13	Ignore optional content with missing /Type-entries In the rare situation that an optional content dictionary lacks a /Type-entry we currently throw, which may prevent e.g. Form XObjects from rendering completely. Fixes https://bugs.ghostscript.com/show_bug.cgi?id=707147	2023-09-19 14:11:03 +02:00
Jonas Jenwald	316d1ec5ef	Simplify the `EvaluatorPreprocessor.opMap` getter a little bit Given that this is a shadowed getter, the `opMap` is already lazily initialized and it shouldn't be necessary to also use the `getLookupTableFactory` helper function here. Looking at the history of the code, it seems that this is simply a leftover from before JavaScript classes existed.	2023-09-16 12:26:38 +02:00
Jonas Jenwald	c0fe96b8fe	Additional manual `unicorn/prefer-ternary` changes Not all cases could be automatically fixed, and the changes also triggered a number of `prefer-const` errors that needed to be handled manually.	2023-07-27 09:48:24 +02:00
Jonas Jenwald	674e7ee381	Enable the `unicorn/prefer-ternary` ESLint plugin rule To limit the readability impact of these changes, the `only-single-line` option was used; please find additional details at https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-ternary.md These changes were done automatically, using the `gulp lint --fix` command.	2023-07-27 09:18:26 +02:00
Jonas Jenwald	fee850737b	Enable the `unicorn/prefer-optional-catch-binding` ESLint plugin rule According to MDN this format is available in all browsers/environments that we currently support, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/try...catch#browser_compatibility Please also see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-optional-catch-binding.md	2023-06-12 11:46:11 +02:00
Jonas Jenwald	1f42aaf21b	Improve SMask/Mask lookup when parsing inline images - Don't attempt to lookup an "SM" entry, since we're only using "SMask" in the `PDFImage` code and I also cannot find any mention in the PDF specification about that being a valid abbreviation for a Soft Mask entry. (There's only a `SM = Smoothness Tolerance` Graphics State parameter, which is obviously something completely different.) - Don't lookup the /SMask and /Mask entries unless it's actually an inline image, since it's pointless otherwise. - Last, but most importantly, only check for the existence of /SMask and /Mask entries but don't actually fetch the data. Note that if either one exists it'll contain a Stream, and those cannot be cached on the `XRef`-instance, which leads to unnecessary parsing/allocations and in this case we're not using the actual data for anything.	2023-06-10 13:19:43 +02:00
Jonas Jenwald	459d26edec	Improve handling of mismatching /BaseFont and /FontName entries for non-embedded fonts (issue 7454) This patch is the result of me going through some old issues regarding non-embedded Wingdings support. There's a few different things wrong in the referenced PDF document: - The /BaseFont and /FontName entries don't agree on the name of the fonts, with one font using `/BaseFont /Wingdings-Regular` and `/FontName /wg09np` which obviously makes no sense. To address this we'll compare the font-names against our lists of known ones and ignore /FontName entries that don't make sense iff the /BaseFont entry is a known font-name. - The non-embedded Wingdings font also set an incorrect /Encoding, in this case /MacRomanEncoding, which should have been fixed by PR 16465. However this doesn't work since the font has bogus font-flags, that fail to categorize the font as Symbolic. To address this we'll also compare the font-name against the list of known symbol fonts.	2023-06-02 17:10:25 +02:00
Jonas Jenwald	5a7beb9f30	Attempt to improve non-embedded Wingdings font support (bug 1652224) Now that font-substitution has been implemented, we should be able to do a much better job at supporting non-embedded Wingdings fonts. Given that this is a Windows-specific font, see https://en.wikipedia.org/wiki/Wingdings, this is however not guaranteed to work (well) on other platforms.	2023-05-24 14:59:13 +02:00
Jonas Jenwald	aeed6f2b67	Ignore named encoding for non-embedded symbol fonts (issue 16464) The affected font is non-embedded ZapfDingbats, however the PDF document for some inexplicable reason specifies the encoding as "WinAnsiEncoding" (which is obviously wrong). To work-around this bug in the PDF generator, we'll simply ignore any explicitly specified named encoding for non-embedded symbol fonts.	2023-05-24 10:48:47 +02:00
Calixte Denizet	a76a69e1ed	Take into account the final space if any in the TJ command The final space was just ignored and that led to wrongly position the next chunk of text.	2023-05-23 17:09:32 +02:00
Jonas Jenwald	8c4821ceda	[api-minor] Slightly shorten the marked-content ids used in the textLayer Generally we try to keep the ids that we create short, hence we can slightly shorten the "static" parts of them.	2023-05-18 22:32:10 +02:00
Calixte Denizet	3091e70aad	Flush the current chunk when the font changed because of a restore op (issue #14755 )	2023-05-18 19:37:16 +02:00
Jonas Jenwald	4355e76c60	Simplify the `fontID` handling in `PartialEvaluator.loadFont` The `fontID` handling is quite old and predates the use of the `idFactory` to generate a unique id for each font, hence we can simplify this code a little bit.	2023-05-18 13:09:08 +02:00
Tim van der Meij	ac8032628b	Merge pull request #16424 from Snuffleupagus/core-optional-chaining Introduce more optional chaining in the `src/core/` folder	2023-05-18 12:40:08 +02:00
Jonas Jenwald	bfb374dbf6	Attempt to fallback to a default font, for non-available ones, in more cases (issue 16432) This essentially extends PR 11218 to also apply when looking up the final font-reference, via the XRef-table, fails because the font isn't available. This patch also changes `PartialEvaluator.fallbackFontDict` to simply use "Helvetica" as the default font-name, since that seems generally reasonable given the now existing font-substitution code.	2023-05-17 11:41:08 +02:00
Jonas Jenwald	1b4a7c5965	Introduce more optional chaining in the `src/core/` folder After PR 12563 we're now free to use optional chaining in the worker-thread as well. (This patch also fixes one previously "missed" case in the `web/` folder.) For the MOZCENTRAL build-target this patch reduces the total bundle-size by `1.6` kilobytes.	2023-05-15 12:38:28 +02:00
Calixte Denizet	d4b70ec306	For missing font, use a local font if it exists even if there's no standard substitution If the font foo is missing we just try to load local(foo) and maybe we'll be lucky.	2023-05-13 21:54:27 +02:00
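
Commit d7e41d4cb6 above restores a null-prototype on `EvaluatorPreprocessor.opMap` (issue 17554). The sketch below illustrates why that matters for a lookup table keyed by untrusted strings. The table contents and the `lookup` helper are hypothetical and are not taken from pdf.js; only `Object.create(null)` and `Object.assign` are standard JavaScript.

```javascript
// Hypothetical operator table; the entry is illustrative, not the real opMap.
const entries = { w: { id: 1, numArgs: 1 } };

// A plain object literal inherits from Object.prototype.
const plainMap = { ...entries };

// A null-prototype object has no inherited keys at all.
const nullProtoMap = Object.assign(Object.create(null), entries);

// Hypothetical helper: look up an operator name read from a content stream.
function lookup(map, cmd) {
  return map[cmd];
}

// "constructor" is not an operator, yet the plain map finds an inherited value.
const fromPlain = lookup(plainMap, "constructor");
// The null-prototype map correctly reports that the key is absent.
const fromNullProto = lookup(nullProtoMap, "constructor");

console.log(typeof fromPlain, fromNullProto);
```

With a null-prototype table, an unknown command such as `constructor` or `toString` in a malformed document cannot be mistaken for a valid entry.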

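
Commit 363dce6744 above mentions passing a limit when splitting strings, so that substrings nobody needs are not produced. A minimal sketch of the idea follows, using the standard second argument of `String.prototype.split`. The sample string and variable names are made up for illustration and do not come from pdf.js.

```javascript
// Made-up input: only the part before the first comma is of interest.
const fontName = "Helvetica-Bold,Italic,extra,data";

// Without a limit, every piece is produced even though only one is used.
const allParts = fontName.split(",");

// With a limit of 1, splitting stops after the first piece.
const [baseName] = fontName.split(",", 1);

console.log(allParts.length, baseName);
```

Note that in JavaScript the limit truncates the result array; the remainder of the string is dropped rather than returned as a final element, so this only fits call-sites that need just the leading pieces.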