Improve cache rate with request normalization #434

Open
dmichon-msft opened this issue Mar 29, 2025 · 1 comment

Comments

@dmichon-msft

I wrote up this issue over on enhanced-resolve. The same approach would probably help oxc-resolver somewhat, at least by reducing cache size. Since the overhead of talking to the file system is a bit smaller from Rust, the performance impact of reducing file system calls may be less relevant:
webpack/enhanced-resolve#449

TL;DR: in general, if you store requests in the cache relative to the nearest parent package.json (which you are already paying to find), the cache hit rate improves for any package with multiple subfolders.

@Boshen Boshen self-assigned this Mar 29, 2025
@Boshen
Member

Boshen commented Apr 29, 2025

Reposting it here:

Within the boundary of an npm package (defined as being in a subpath of a folder containing a package.json), barring exotic circumstances, the following types of module resolution are invariant to the requesting file's relativePath from the descriptionFileRoot:

Absolute paths. These may only be altered by alias or other plugins, and are generally unaffected by the requesting package at all.
Bare specifiers, e.g. react, @nodelib/fs.scandir. These are only impacted by the presence of node_modules folders, which no normal package manager will create in subfolders of a package without there being a nested package.json. As such, resolution can ignore the relativePath component of the requesting file.
Imports that map through the package.json imports field, such as #some/request. These do not provide any means for the relativePath component of the requesting file to impact resolution.
Relative paths. A relative import ../foo from a/bar is exactly the same as a relative import ./foo from a, so the request can be rewritten relative to the descriptionFileRoot.
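To make the four cases above concrete, here is a minimal sketch of how a request string might be sorted into them. The `classifyRequest` function and the `RequestKind` type are hypothetical names used only for illustration; neither enhanced-resolve nor oxc-resolver is assumed to expose anything like them, and aliases or plugins are ignored.

```typescript
// Hypothetical helper for illustration; not part of any resolver's API.
type RequestKind = "absolute" | "bare" | "imports-field" | "relative";

function classifyRequest(request: string): RequestKind {
  // POSIX absolute paths, plus Windows drive-letter paths such as C:\foo.
  if (request.startsWith("/") || /^[A-Za-z]:[\\/]/.test(request)) {
    return "absolute";
  }
  // Relative requests are the only kind that depends on relativePath.
  if (
    request === "." ||
    request === ".." ||
    request.startsWith("./") ||
    request.startsWith("../")
  ) {
    return "relative";
  }
  // Requests that go through the package.json "imports" field start with "#".
  if (request.startsWith("#")) {
    return "imports-field";
  }
  // Everything else is treated as a bare specifier, e.g. react.
  return "bare";
}
```

Only the `"relative"` case needs the requesting file's position inside the package; the other three can be cached against the descriptionFileRoot as they are.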
Based on the above, we can optimize the caching layer for resolution requests as follows:

Assume that a cache key contains internal, path, request, query, fragment (there may be other fields that the cache still splits on, but these generally account for most of the differentiation).
If the request is a relative path, update request = joinRelativePreservingLeadingDot(relativePath, request).
Set path = descriptionFileRoot in the cache key in all cases.
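The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of either resolver: the `CacheKey` shape mirrors the fields listed above, `normalizeCacheKey` is a hypothetical name, and the body of `joinRelativePreservingLeadingDot` is one guess at what such a helper could do (POSIX-style separators only, with `relativePath` taken to be the requesting directory relative to the descriptionFileRoot).

```typescript
// Assumed cache key shape, based on the fields named in the text.
interface CacheKey {
  internal: boolean;
  path: string;
  request: string;
  query: string;
  fragment: string;
}

// Joins a relative request onto relativePath and collapses "." and ".."
// segments, always returning something that starts with "./" or "../".
function joinRelativePreservingLeadingDot(
  relativePath: string,
  request: string
): string {
  const segments: string[] = [];
  let escapes = 0; // ".." segments that climb above the descriptionFileRoot
  for (const part of `${relativePath}/${request}`.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (segments.length > 0) segments.pop();
      else escapes++;
    } else {
      segments.push(part);
    }
  }
  const prefix = escapes > 0 ? Array<string>(escapes).fill("..") : ["."];
  // Keep a trailing slash, since "./dir/" and "./dir" may resolve differently.
  const trailing = request.endsWith("/") ? "/" : "";
  return [...prefix, ...segments].join("/") + trailing;
}

// Rewrites a key so that it is expressed relative to the package root.
function normalizeCacheKey(
  key: CacheKey,
  descriptionFileRoot: string,
  relativePath: string
): CacheKey {
  const isRelative =
    key.request === "." ||
    key.request === ".." ||
    key.request.startsWith("./") ||
    key.request.startsWith("../");
  return {
    ...key,
    path: descriptionFileRoot,
    request: isRelative
      ? joinRelativePreservingLeadingDot(relativePath, key.request)
      : key.request,
  };
}

// Two requests from different folders of the same (made-up) package.
const base = { internal: false, query: "", fragment: "" };
const fromNested = normalizeCacheKey(
  { ...base, path: "/repo/pkg/a/bar", request: "../foo" },
  "/repo/pkg",
  "a/bar"
);
const fromParent = normalizeCacheKey(
  { ...base, path: "/repo/pkg/a", request: "./foo" },
  "/repo/pkg",
  "a"
);
// Both keys now have path "/repo/pkg" and request "./a/foo".
```

With this normalization, the two example requests collapse into a single cache entry, and a bare specifier such as react requested from any folder of the package shares one entry as well.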
Testing in a large local project, this mapping reduced the number of unique requests seen by the resolver by about 33%, which both reduces the size of the cache and improves the cache hit rate.

This optimization is particularly relevant for packages that contain a lot of subfolders.

One can further optimize the cache hit rate by having a cache layer and performing request normalization any time the descriptionFileRoot property of the request object changes, for example upon resolving the request to a target package.
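As a rough illustration of such a cache layer, the sketch below keys entries by (descriptionFileRoot, request) and would be consulted again whenever the descriptionFileRoot of the in-flight request changes. The `StageCache` class, its counters, and the paths in the usage lines are all hypothetical; a real resolver would include further key fields such as query and fragment.

```typescript
// Hypothetical memoizing layer; not an API of enhanced-resolve or oxc-resolver.
class StageCache<T> {
  private readonly entries = new Map<string, T>();
  hits = 0;
  misses = 0;

  // Looks up a request that has already been normalized against the given root.
  get(descriptionFileRoot: string, request: string, compute: () => T): T {
    const key = JSON.stringify([descriptionFileRoot, request]);
    if (this.entries.has(key)) {
      this.hits++;
      return this.entries.get(key) as T;
    }
    this.misses++;
    const value = compute();
    this.entries.set(key, value);
    return value;
  }
}

// Two different packages import "react". Once each request reaches the target
// package, it is re-keyed against that package's root, so the second lookup
// reuses the first result. All paths here are made up for the example.
const cache = new StageCache<string>();
const reactRoot = "/repo/node_modules/react";
const first = cache.get(reactRoot, ".", () => `${reactRoot}/index.js`);
const second = cache.get(reactRoot, ".", () => `${reactRoot}/index.js`);
```

In this example the second lookup is a hit regardless of which package issued the original import, because the key no longer contains the requester's location.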

@Boshen Boshen removed their assignment May 16, 2025