Attempt to further reduce re-parsing for globally cached images (PR 11912, 16108 follow-up)

In PR 11912 we started caching images that occur on multiple pages globally, which improved performance a lot in many PDF documents.
However, one slightly annoying limitation of the implementation is the need to re-parse the image once the global-caching threshold has been reached. Previously this was difficult to avoid, since large image-resources will cause cleanup to run on the main-thread after rendering has finished. In PR 16108 we started delaying this cleanup a little bit, to improve performance if a user e.g. zooms and/or rotates the document immediately after rendering completes.

Taking those two PRs together, we now have a situation where it's much more likely that the main-thread has "globally used" images cached at the page-level. Hence we can instead attempt to *copy* a locally cached image into the global object-cache on the main-thread and thus reduce unnecessary re-parsing of large/complex global images, which significantly reduces the rendering time in many cases.
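
As a rough illustration of the copying idea described above, the following is a minimal sketch, not the actual implementation. `copyLocalImage`, `pageCaches`, `globalCache`, and the example ids/refs are hypothetical stand-ins; only the overall shape (match on the image reference, bail out if the size is unknown, clone into the global cache, report the data length) follows this commit message.

```javascript
// Hypothetical stand-ins: `pageCaches` is an iterable of per-page Maps and
// `globalCache` is a Map playing the role of the global object-cache.
function copyLocalImage(pageCaches, globalCache, id, imageRef) {
  for (const pageObjs of pageCaches) {
    for (const data of pageObjs.values()) {
      if (data.ref !== imageRef) {
        continue; // Not the image we are looking for.
      }
      if (!data.dataLen) {
        return null; // Size unknown, let the caller fall back to re-parsing.
      }
      // Clone the data, so the global entry is independent of the page-level one.
      globalCache.set(id, structuredClone(data));
      return data.dataLen;
    }
  }
  return null; // The image was not found in any page-level cache.
}

// Example usage with made-up ids and refs.
const pageCaches = [
  new Map([["page-image-1", { ref: "12R", dataLen: 4096, width: 32, height: 32 }]]),
];
const globalCache = new Map();
const copiedLen = copyLocalImage(pageCaches, globalCache, "global-image-1", "12R");
```

A `null` result here simply means "nothing was copied", in which case the caller would re-parse the image as before.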

For the PDF document in issue 11878, the rendering time of *the second page* changes as follows (on my computer):
 - With the `master`-branch it takes >600 ms to render.
 - With this patch that goes down to ~50 ms, which is roughly an order of magnitude faster.

(Note that all other pages are, as expected, completely unaffected by these changes.)

This new main-thread copying is limited to "large" global images, since:
 - Re-parsing of small images, on the worker-thread, is usually fast enough to not be an issue.
 - Even with the delayed cleanup after rendering, it's still not guaranteed that an image is available in a page-level cache on the main-thread.
 - The copying forces the worker-thread to wait for the main-thread, which is a pattern that you always want to avoid unless absolutely necessary.
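
The trade-off in the list above can be sketched from the worker's point of view as follows. This is only an illustration under stated assumptions: `LARGE_IMAGE_BYTES`, `resolveGlobalImage`, `requestCopy`, and `reparse` are hypothetical names, the threshold value is made up, and the sketch assumes the worker already knows an approximate byte size for the image.

```javascript
// Hypothetical cutoff for what counts as a "large" image; the value is made up.
const LARGE_IMAGE_BYTES = 1_000_000;

// `requestCopy` stands in for asking the main-thread to copy a locally cached
// image (resolving with the data length, or null), and `reparse` stands in for
// re-parsing the image on the worker-thread. Both are assumed callbacks.
async function resolveGlobalImage({ id, imageRef, byteSize }, requestCopy, reparse) {
  if (byteSize >= LARGE_IMAGE_BYTES) {
    // Only large images justify making the worker wait for the main-thread.
    const dataLen = await requestCopy(id, imageRef);
    if (dataLen) {
      return { source: "copied", dataLen };
    }
    // The image was not (or no longer) cached at the page-level, fall through.
  }
  // Small images, and failed copies, are simply re-parsed on the worker.
  return { source: "reparsed", dataLen: await reparse(imageRef) };
}
```

Falling back to re-parsing keeps the previous behaviour as the default, so a missing page-level entry only costs time rather than correctness.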
Jonas Jenwald 2023-12-14 21:57:48 +01:00
parent e547b198a3
commit 9f02cc36d4
3 changed files with 95 additions and 11 deletions

@@ -2704,11 +2704,11 @@ class WorkerTransport {
messageHandler.on("commonobj", ([id, type, exportedData]) => {
if (this.destroyed) {
- return; // Ignore any pending requests if the worker was terminated.
+ return null; // Ignore any pending requests if the worker was terminated.
}
if (this.commonObjs.has(id)) {
- return;
+ return null;
}
switch (type) {
@@ -2750,6 +2750,23 @@ class WorkerTransport {
this.commonObjs.resolve(id, font);
});
break;
case "CopyLocalImage":
const { imageRef } = exportedData;
assert(imageRef, "The imageRef must be defined.");
for (const pageProxy of this.#pageCache.values()) {
for (const [, data] of pageProxy.objs) {
if (data.ref !== imageRef) {
continue;
}
if (!data.dataLen) {
return null;
}
this.commonObjs.resolve(id, structuredClone(data));
return data.dataLen;
}
}
break;
case "FontPath":
case "Image":
case "Pattern":
@@ -2758,6 +2775,8 @@ class WorkerTransport {
default:
throw new Error(`Got unknown common object type ${type}`);
}
return null;
});
messageHandler.on("obj", ([id, pageIndex, type, imageData]) => {
@@ -3166,6 +3185,15 @@ class RenderTask {
* @type {function}
*/
this.onContinue = null;
if (typeof PDFJSDev === "undefined" || PDFJSDev.test("TESTING")) {
// For testing purposes.
Object.defineProperty(this, "getOperatorList", {
value: () => {
return this.#internalRenderTask.operatorList;
},
});
}
}
/**