ZenBrowserMirrors/pdf.js - Unitcore

mirror of https://github.com/zen-browser/pdf.js.git synced 2025-07-09 01:35:43 +02:00

Author	SHA1	Message	Date
Nicolò Ribaudo	5b29e935e1	Override the minimum font size when rendering the text layer Browsers have an accessibility option that allows users to enforce a minimum font size for all text rendered in the page, regardless of what the font-size CSS property says. For example, it can be found in Firefox under `font.minimum-size.x-western`. When rendering the <span>s in the text layer, this causes the text layer to not be aligned anymore with the underlying canvas. While normally accessibility features should not be worked around, in this case it is not improving accessibility: - the text is transparent, so making it bigger doesn't make it more readable - the selection UX for users with that accessibility option enabled is worse than for other users (it's basically unusable). While there is technically no way to ignore that minimum font size, this commit does it by multiplying all the `font-size`s in the text layer by minFontSize, and then scaling all the `<span>`s down by 1/minFontSize.	2024-06-25 14:58:08 +02:00
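The arithmetic behind that workaround can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `compensateMinFontSize` and the fallback to a factor of 1 when no minimum is set are assumptions. It also only avoids clamping when the desired size is at least 1px, since the inflated size then cannot fall below the minimum.

```javascript
// Hypothetical helper (not part of pdf.js): inflate the CSS font size by the
// browser-enforced minimum, and return the scale that undoes the inflation.
function compensateMinFontSize(desiredPx, minFontSize) {
  // Assumption: a missing or zero minimum means no compensation is needed.
  const factor = minFontSize > 0 ? minFontSize : 1;
  return {
    fontSizePx: desiredPx * factor, // value to use for `font-size`
    scale: 1 / factor, // value to use when scaling the span back down
  };
}

// A 4px span with a 12px enforced minimum is declared at 48px and scaled by 1/12.
const compensated = compensateMinFontSize(4, 12);
console.log(compensated.fontSizePx, compensated.scale);
```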
Calixte Denizet	ff6180a4c9	Add an option to enable/disable hardware acceleration (bug 1902012)	2024-06-12 18:41:07 +02:00
Jonas Jenwald	f2e7eee00e	Don't register a pending `TextLayer` until `render` is invoked (PR 18104 follow-up) After the re-factoring in PR 18104 there's now a theoretical risk that a pending `TextLayer` is never removed, which we can avoid by not registering it until `render` is invoked. Note that this doesn't affect the viewer or tests, but if a third-party user calls `new TextLayer(...)` without a following call of either the `render`- or `cancel`-method we'd block global clean-up without this patch.	2024-05-26 18:38:40 +02:00
Aditi	9edca0a5ed	Add `lang` attribute to canvas element Fixes issue #16843. In certain cases, the text layer was misaligned due to a difference between the `lang` attribute of the viewer and the canvas. This commit addresses the problem by adding the `lang` attribute to the canvas. The issue was caused because PDF.js uses serif/sans-serif fonts to generate the text layer and relies on system fonts. The difference in the `lang` attribute led to different fonts being picked, causing the misalignment.	2024-05-21 19:41:24 +05:30
Jonas Jenwald	15b5808eee	[api-minor] Re-factor the basic textLayer-functionality This is very old code, and predates e.g. the introduction of JavaScript classes, which creates unnecessarily unwieldy code in the viewer. By introducing a new `TextLayer` class in the API, similar to how e.g. the `AnnotationLayer` looks, we're able to keep most parameters on the class-instance itself. This removes the need to manually track them in the viewer, and simplifies the call-sites. This also removes the `numTextDivs` parameter from the "textlayerrendered" event, since that's only added to support default-viewer functionality that no longer exists. Finally we try, as far as possible, to polyfill the old `renderTextLayer` and `updateTextLayer` functions since they are exposed in the library API. For simple invocations of `renderTextLayer` the behaviour should thus be the same, with only a warning printed in the console.	2024-05-17 14:20:20 +02:00
Jonas Jenwald	d8e0fca609	Don't invoke `cleanupTextLayer` when there are pending textLayers Please note: This doesn't really affect the viewer, but may affect the library API if multiple PDF documents are opened in parallel. Since we clean-up "global" textLayer-data when destroying a PDF document, this means that other active PDFs could potentially break by invoking `cleanupTextLayer` unconditionally. Note that textLayer rendering is an asynchronous task, and we thus need to ensure those are all finished before running clean-up.	2024-05-17 08:52:10 +02:00
Jonas Jenwald	d5f3829f91	Actually disable `TextLayerRenderTask.prototype.#processItems` when `MAX_TEXT_DIVS_TO_RENDER` is reached (PR 18089 follow-up) I broke this accidentally in PR 18089, sorry about that! Note that since `#processItems` is private we can no longer just "replace" the method as was done in PR 18052.	2024-05-16 11:48:11 +02:00
Jonas Jenwald	036fd11ad7	Improve the `TextLayerRenderTask` implementation - Change all possible semi-private methods into properly private ones. Note that this code is old enough to predate standard classes. - Move the `appendText` helper function into `TextLayerRenderTask`, as a private method, to avoid having to manually pass in the scope. - Simplify `#layoutText` by directly passing in all necessary data. This is possible after the changes in PR 18052.	2024-05-14 14:10:17 +02:00
Jonas Jenwald	6d523c316c	[api-minor] Include the document /Lang attribute in the textContent-data - These changes will allow a simpler way of implementing PR 17770. - The /Lang attribute is fetched lazily, with the first `getTextContent` invocation. Given the existing worker-thread caching, this will thus only need to be done once per PDF document (and most PDFs don't include this data). - This makes the /Lang attribute directly available in the `textLayer`, which has the following advantages: - We don't need to block, and thus delay, overall viewer initialization on fetching it (nor pass it around throughout the viewer). - Third-party users of the `textLayer` will automatically benefit from this, once we start actually using the /Lang attribute in PR 17770. Please note: This also, importantly, means that the `text` reference-tests will then cover this code (which wouldn't otherwise have been the case).	2024-05-14 12:44:41 +02:00
Jonas Jenwald	8d86e18a32	Restore the `MAX_TEXT_DIVS_TO_RENDER` limit in the textLayer This limit is currently completely non-functional, since the check happens after the entire textLayer has been parsed and appended to the DOM. It seems that this has been accidentally broken ever since the introduction of `ReadableStream` support. The reason that this hasn't caused noticeable textLayer-related performance issues in practice is probably because we nowadays manage to coalesce the textLayer into fewer overall DOM elements, whereas years ago many PDF documents ended up with one DOM element per glyph. By moving this check, and thus restoring the functionality, we're also able to remove the `render` helper function and simplify the code.	2024-05-07 13:04:00 +02:00
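The point of that fix, checking the limit while items are being processed rather than after everything has been appended, can be sketched like this. The function name, the limit value, and the return shape are hypothetical; the log does not state the real `MAX_TEXT_DIVS_TO_RENDER` value or the actual control flow.

```javascript
// Hypothetical limit, chosen small for illustration only.
const MAX_ITEMS_TO_RENDER = 3;

// Consume chunks (standing in for data read from a stream) and stop appending
// as soon as the limit is reached, instead of checking after the fact.
function renderStreamedItems(chunks, limit = MAX_ITEMS_TO_RENDER) {
  const rendered = [];
  for (const chunk of chunks) {
    for (const item of chunk) {
      if (rendered.length >= limit) {
        // The check happens before the append, so the limit is effective.
        return { rendered, truncated: true };
      }
      rendered.push(item);
    }
  }
  return { rendered, truncated: false };
}

const result = renderStreamedItems([["a", "b"], ["c", "d"]]);
console.log(result.rendered.length, result.truncated);
```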
Jonas Jenwald	30840e411e	Ensure that the textLayer `styleCache` is always cleared, even on failure By also moving it to the `TextLayerRenderTask`-instance, we can avoid a bit of manual parameter passing.	2024-05-07 13:04:00 +02:00
Jonas Jenwald	049848ba00	Unify the `ReadableStream` and `TextContent` code-paths in `src/display/text_layer.js` The only reason that this code still accepts `TextContent` is for backward-compatibility purposes, so we can simplify the implementation by always using a `ReadableStream` internally.	2024-05-07 13:03:57 +02:00
Jonas Jenwald	e4d0e84802	[api-minor] Replace the `PromiseCapability` with `Promise.withResolvers()` This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does almost the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular: - In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state. - In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used. - In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however not a problem since the value of a Promise cannot be changed once fulfilled or rejected. - In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them: - For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is removed on its first (valid) invocation. - For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks. - In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and test-only code. 
--- [1] In browsers/environments that lack native support, note [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).	2024-04-01 11:42:37 +02:00
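For call-sites that still want a `settled` flag, tracking it manually on top of `Promise.withResolvers()` might look like the sketch below. The names `withResolvers` and `createTrackedCapability` are hypothetical, and the hand-written fallback is an assumption for environments without native support; it is not the `core-js` polyfill mentioned above.

```javascript
// Use the native method when available, otherwise a small equivalent.
function withResolvers() {
  if (typeof Promise.withResolvers === "function") {
    return Promise.withResolvers();
  }
  let resolve, reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Hypothetical wrapper that records whether resolve/reject has been called.
function createTrackedCapability() {
  const { promise, resolve, reject } = withResolvers();
  const capability = {
    promise,
    settled: false,
    resolve(value) {
      capability.settled = true;
      resolve(value);
    },
    reject(reason) {
      capability.settled = true;
      reject(reason);
    },
  };
  return capability;
}

const capability = createTrackedCapability();
capability.resolve("first");
capability.resolve("second"); // No effect: the promise is already fulfilled.
capability.promise.then(value => console.log(value, capability.settled));
```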
Calixte Denizet	f84f48b5d0	Avoid having the text layer mismatch the rendered text with mismatching locales (bug 1869001) The system locale (used in OffscreenCanvas) can be different from the one guessed by Fluent; consequently, in order to avoid any mismatch, we just use an attached canvas element. The original issue can easily be reproduced locally by adding a lang="ja" in viewer.html (or with another language for Japanese users).	2024-01-04 19:20:20 +01:00
Calixte Denizet	7851c0da8d	[Debugger] Add some info about substitution font When pdfBug is true, the substitution font is used in the text layer in order to be able to know what is the font really used thanks to the devtools. And to be sure that fonts are loaded, the font cache isn't cleaned up when the debugger is active.	2023-10-09 12:06:33 +02:00
Jonas Jenwald	f87ec67ab1	[api-major] Remove various deprecated functionality and options	2023-09-23 17:44:09 +02:00
Jonas Jenwald	317abd6d07	Change the `createPromiseCapability` helper function into a `PromiseCapability` class This is not only slightly more compact, but it also simplifies the handling of the `settled` getter.	2023-04-29 13:43:24 +02:00
Jonas Jenwald	4bf8e5c13d	Tweak the `--scale-factor` CSS-variable warning threshold (issue 16254) This is apparently needed to account for the rounding used in Chromium-browsers, such that the warning message isn't displayed unnecessarily.	2023-04-06 13:11:12 +02:00
Jonas Jenwald	8bf5e96af9	Only warn about missing `--scale-factor` CSS-variable for visible textLayers (PR 16162 follow-up) This is something that I completely overlooked in PR 16162, which in some cases causes the default viewer to incorrectly print warnings. This can be reproduced with the PAGE scrolling-mode, and/or the PresentationMode, and this patch simply works around it by checking the visibility as well (since the warning is a best-effort solution anyway).	2023-03-20 12:51:26 +01:00
Jonas Jenwald	0e54a3c37a	Warn about missing/incorrect `--scale-factor` CSS-variable in `renderTextLayer` (issue 16139) Unfortunately I don't believe that we can simply add a default `--scale-factor` CSS-variable to the `container`-element, since that might not be entirely appropriate/correct in all cases.[1] However, we can at least print a console-error to hopefully make this situation more apparent to users. (This is purposely not using the `warn` helper-function, since those messages can be disabled.) --- [1] One example is in our reference-tests, where we don't need to add it to the `container`-element itself.	2023-03-16 11:53:12 +01:00
Jonas Jenwald	5075d0495b	Use `OffscreenCanvas` as intended for all code-paths in `src/display/text_layer.js` (PR 15722 follow-up) Currently some `getCtx` calls will have `isOffscreenCanvasSupported === undefined` set, meaning that `OffscreenCanvas` isn't being used as intended, since no `TextLayerRenderTask._isOffscreenCanvasSupported` property exists. Please note: This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it works correctly.	2023-02-24 11:29:58 +01:00
Jonas Jenwald	cafdc48147	[api-minor] Add a new `PageViewport`-getter to access the original, un-scaled, viewport dimensions While reviewing recent patches, I couldn't help but notice that we now have a lot of call-sites that manually access the `PageViewport.viewBox`-property. Rather than repeating that verbatim all over the code-base, this patch adds a lazily computed and cached getter for this data instead.	2022-12-11 18:37:35 +01:00
Calixte Denizet	a989b5a879	Set the dimensions of the various layers at their creation - Use a unique helper function in display/display_utils.js; - Move those dimensions in css' side.	2022-12-10 14:35:06 +01:00
Jonas Jenwald	0274245e90	Remove the unused `TextLayerRenderTask._renderingDone` property (PR 15259 follow-up) This is yet another property that I forgot to remove in PR 15259.	2022-12-05 11:49:14 +01:00
Jonas Jenwald	fe8fded23b	[api-minor] Combine the `textContent`/`textContentStream` parameters Rather than handling these parameters separately, which is a left-over from back when streaming of textContent was originally added, we can simply pass either data directly to the `TextLayer` and let it handle things accordingly. Also, improves a few JSDoc comments and `typedef`-imports.	2022-12-04 21:22:14 +01:00
Calixte Denizet	eed9bf71c5	Refactor the text layer code in order to avoid recomputing it on each draw The idea is just to reuse what we got on the first draw. Now, we only update the scaleX of the different spans and the other values are dependent on --scale-factor. Move some properties into the CSS in order to avoid any updates in JS.	2022-12-01 18:42:43 +01:00
Jonas Jenwald	7c25b1b455	[api-minor] Remove the TextLayer `timeout` parameter (PR 15742 follow-up) The deprecation is included in the current release, i.e. version `3.1.81`, and given the edge-case nature of this option I really don't think that we need to keep it deprecated for multiple releases.	2022-11-29 19:57:38 +01:00
Jonas Jenwald	b3e161c328	[api-minor] Deprecate the TextLayer `timeout` parameter This has never really been used anywhere within the PDF.js library[1], and when streaming of textContent was introduced this parameter was effectively made redundant. Note that when streaming of textContent is used, all text-layout has already happened by the time that this `timeout`-functionality is actually invoked (thus making it pointless). While the `timeout`-functionality may still "work" when the textContent is provided upfront, although it's never been used/tested, streaming will generally perform better (in e.g. a viewer setting). Please note: While unrelated here, also removes a now unused property that I forgot in PR 15259. --- [1] At least not since the code was moved into its current file, which happened in PR 6619 and landed seven years ago.	2022-11-24 23:08:39 +01:00
Jonas Jenwald	1e7274e9c6	[api-minor] Move the handling of unbalanced markedContent to the worker-thread (PR 15630 follow-up)	2022-10-27 11:14:54 +02:00
Jonas Jenwald	980acddbfa	Prevent textLayer errors in documents with unbalanced beginMarkedContent/endMarkedContent operators (issue 15629)	2022-10-26 18:35:48 +02:00
Jonas Jenwald	60f6272ed9	Use more `for...of` loops in the code-base Most, if not all, of this code is old enough to predate the general availability of `for...of` iteration.	2022-10-03 13:08:38 +02:00
Jonas Jenwald	571ce13dd6	[api-major] Remove the `enhanceTextSelection` functionality (PR 15145 follow-up) For the `gulp mozcentral` command, this reduces the size of the built `pdf.js` file by `> 10` kB.	2022-08-28 15:04:47 +02:00
Calixte Denizet	51c8e2f3ab	Fix text selection with hdpi screens (#15229)	2022-07-28 19:44:13 +02:00
Jonas Jenwald	815c28da0e	[api-minor] Deprecate the `enhanceTextSelection` functionality	2022-07-07 16:15:31 +02:00
Jonas Jenwald	c21f4faaf8	Reduce unnecessary usage of `Array.prototype.concat()` There are obviously cases where using `concat` makes perfect sense, since that method doesn't change any of the existing Arrays; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/concat However, in a few cases throughout the code-base that's not an issue and using `concat` only leads to unnecessary intermediate allocations. With modern JavaScript we can thus replace those with a combination of `push` and spread-syntax, which wasn't originally possible when the code was written.	2022-06-19 13:40:52 +02:00
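The trade-off described in that commit can be seen in a small sketch. The helper name `appendAll` is hypothetical. Note that spreading a very large array into `push` can exceed the engine's argument limit, so this pattern suits modestly sized inputs.

```javascript
// Append in place: no intermediate array is allocated, but `target` is mutated.
function appendAll(target, extra) {
  target.push(...extra);
  return target;
}

const base = [1, 2];
const copy = base.concat([3, 4]); // New array; `base` is left untouched.
const grown = appendAll([1, 2], [3, 4]); // Same array instance, extended.
console.log(base.length, copy.length, grown.length);
```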
Jonas Jenwald	8129815538	Enable the `unicorn/prefer-dom-node-append` ESLint plugin rule This rule will help enforce slightly shorter code, especially since you can insert multiple elements at once, and according to MDN `Element.append()` is available in all browsers that we currently support. Please find additional information here: - https://developer.mozilla.org/en-US/docs/Web/API/Element/append - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-dom-node-append.md	2022-06-12 13:07:03 +02:00
Tim van der Meij	a57a4bc6c2	Merge pull request #15018 from Snuffleupagus/issue-15016 Expose `TextLayerRenderTask` in the TypeScript definitions (issue 15016, PR 14013 follow-up)	2022-06-10 22:18:35 +02:00
Tim van der Meij	f0b5aee6b8	Merge pull request #15014 from Snuffleupagus/prefer-at Enable the `unicorn/prefer-at` ESLint plugin rule (PR 15008 follow-up)	2022-06-10 22:12:35 +02:00
Jonas Jenwald	e046b811b7	Expose `TextLayerRenderTask` in the TypeScript definitions (issue 15016, PR 14013 follow-up) While `TextLayerRenderTask` apparently makes sense in TypeScript environments, given that it's being returned by the `renderTextLayer`-function in the API, we really don't want to extend the public API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually. Hence we follow the same pattern as in PR 14013, and add some very basic unit-tests to ensure that `renderTextLayer` always returns a `TextLayerRenderTask`-instance as expected.	2022-06-10 22:12:32 +02:00
jerry1100	b716e82d18	Extend TextLayerRenderParameters.container type to include HTMLElement. In PR #14717, the type was changed from a HTMLElement to a DocumentFragment. This broke TypeScript projects that use a HTMLElement container. To remedy this, we extend the type of container to also include HTMLElement.	2022-06-10 06:50:47 -07:00
Jonas Jenwald	9ac4536693	Enable the `unicorn/prefer-at` ESLint plugin rule (PR 15008 follow-up) Please find additional information here: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/at - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-at.md	2022-06-09 21:21:19 +02:00
Jonas Jenwald	af5789125f	Try to remove the `mozOpaque` canvas-property (PR 6551 follow-up) According to MDN, see https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/mozOpaque, the `mozOpaque` canvas-property is not only non-standard (obviously) but it's also been deprecated. Instead it's recommended to use `alpha = false` when getting the canvas-context, see https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/getContext#contextattributes, which all of our affected code is already doing.	2022-05-09 13:03:08 +02:00
Jonas Jenwald	7f0589c74a	Change the type of the `container` property, in the `TextLayerRenderParameters` typedef (issue 14716) Given that the textLayer-code has been using a `DocumentFragment` ever since PR 3356 (back in 2013), simply updating the type of the `container` property should be fine. This patch also tries to, ever so slightly, improve the grammar of a couple of other properties in the typedef.	2022-03-24 22:42:37 +01:00
Calixte Denizet	61d1063276	Fix issues in text selection - PR #13257 fixed a lot of issues but not all and this patch aims to fix almost all remaining issues. - the idea in this new patch is to compare the position of a new glyph with the last position where a glyph has been drawn; - no spaces are "drawn": they just move the cursor but aren't added to the chunk; - so this way a space followed by a cursor move can be treated as only one space: it helps to merge all spaces into one. - to distinguish between real spaces and tracking ones, we used a factor of the space width (from the font) - it was a pretty good idea in general but it fails with some fonts where the space was too big: - in Poppler, they're using a factor of the font size: this is an excellent idea (<= 0.1 * fontSize implies tracking space).	2021-10-17 16:27:05 +02:00
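The font-size-based threshold quoted at the end of that message can be written as a one-line predicate. This is only a sketch of the stated rule: the function name, the constant name, and the assumption that the gap is a non-negative advance in the same units as the font size are all illustrative.

```javascript
// Factor taken from the commit message's description of Poppler's heuristic.
const TRACKING_SPACE_FACTOR = 0.1;

// Hypothetical predicate: a small horizontal gap is treated as tracking
// (letter-spacing) rather than as a real space between words.
function isTrackingSpace(gap, fontSize) {
  return gap <= TRACKING_SPACE_FACTOR * fontSize;
}

console.log(isTrackingSpace(0.5, 10), isTrackingSpace(3, 10));
```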
Jonas Jenwald	4c1b586dd2	Reduce the size of `TextLayerRenderTask._textDivProperties` in "regular" text-selection mode While these changes will obviously not have a significant effect on overall memory usage, it cannot hurt as far as I'm concerned. This patch makes the following changes: - Clear out `_textDivProperties` once rendering is done, since those properties are only necessary to keep alive when enhanced text-selection is being used. - Reduce the size of the `_textDivProperties`-entries by default, since a majority of the properties are only relevant when enhanced text-selection is being used.	2021-09-05 12:12:34 +02:00
Jonas Jenwald	1df9da949e	Prevent "Uncaught promise" messages in the console when cancelling (some) `ReadableStream`s While fixing issue 13794, I noticed that cancelling the `ReadableStream` returned by the `PDFPageProxy.streamTextContent`-method could lead to "Uncaught promise" messages in the console.[1] Generally speaking, we don't really care about errors when cancelling a `ReadableStream` and it thus seems reasonable to simply suppress any output in those cases. --- [1] Although, after that issue was fixed you'd now need to set the API-option `stopAtErrors = true` to actually trigger this.	2021-07-30 14:27:38 +02:00
Jonas Jenwald	8943bcd3c3	Account for formatting changes in Prettier version `2.3.0` With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`. Please find additional information at: - https://github.com/prettier/prettier/releases/tag/2.3.0 - https://prettier.io/blog/2021/05/09/2.3.0.html	2021-05-16 11:44:05 +02:00
Jonas Jenwald	9a1758c6b8	Remove unnecessary closure in `src/display/text_layer.js`, and use standard classes With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap all of the code in a closure.[1] This patch also tries to clean-up/improve a couple of the existing JSDoc-comments. --- [1] This reduces the size, even of the built `pdf.js` file, since there's now a lot less unnecessary whitespace.	2021-05-05 18:44:56 +02:00
calixteman	af4dc55019	[api-minor] Fix the way to chunk the strings (#13257) - Improve chunking in order to fix some bugs where the spaces aren't here: * track the last position where a glyph has been drawn; * when a new glyph (first glyph in a chunk) is added then compare its position with the last saved one and add a space or break: - there are multiple ways to move the glyphs and to avoid having to deal with all the different possibilities it's way easier to just compare positions; - and so there is now one function (i.e. "compareWithLastPosition") where all the job is done. - Add some breaks in order to get lines; - Remove the multiple white spaces: * some spaces were filled with several white spaces and so it makes it harder to find some sequences of words using the search tool; * other pdf readers replace spaces by one white space. Update src/core/evaluator.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-04-30 14:41:13 +02:00
Jonas Jenwald	da22146b95	Replace a bunch of `Array.prototype.forEach()` cases with `for...of` loops instead Using `for...of` is a modern and generally much nicer pattern, since it gets rid of unnecessary callback-functions. (In a couple of spots, a "regular" `for` loop had to be used.)	2021-04-24 13:00:19 +02:00
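A before/after pair shows the kind of rewrite that commit describes. The function names and the `width` property are hypothetical; both versions compute the same sum, the second without a callback.

```javascript
// Callback-based iteration, as in the older style.
function sumWidthsForEach(items) {
  let total = 0;
  items.forEach(function (item) {
    total += item.width;
  });
  return total;
}

// Equivalent `for...of` loop: no callback, and `break`/`return` work directly.
function sumWidthsForOf(items) {
  let total = 0;
  for (const item of items) {
    total += item.width;
  }
  return total;
}

const spans = [{ width: 10 }, { width: 5 }, { width: 7 }];
console.log(sumWidthsForEach(spans), sumWidthsForOf(spans));
```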