match new hash pattern, log when replacing URI paths #595

Open
wants to merge 5 commits into
base: master

Conversation

TomConner
Contributor

Paths look like this now: /assets/index-4au49BA-.js

So run an extra regex to handle paths like that.

Also add a log message whenever a path is replaced via regex.
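As a rough illustration of the two changes described above, here is a minimal sketch. It is not the code in `zap/src/scan.py`: the pattern `INDEX_HASH_REGEX`, the function `strip_index_hash`, and the log wording are all hypothetical, and the pattern assumes the hash sits between `index-` and a trailing `-.js`, as in the example path.

```python
import logging
import re
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

# Hypothetical pattern (not taken from the PR): "index-", a run of hash
# characters, then "-.js" at the end of the path.
INDEX_HASH_REGEX = re.compile(r"index-[A-Za-z0-9_]+-\.js$")


def strip_index_hash(url: str) -> str:
    """Return the URL with a hashed index file name reduced to index.js."""
    parts = urlparse(url)
    new_path = INDEX_HASH_REGEX.sub("index.js", parts.path)
    if new_path != parts.path:
        # Log only when the regex actually changed the path.
        logger.info("Replaced URI path %s with %s", parts.path, new_path)
    return parts._replace(path=new_path).geturl()
```

Comparing the path before and after the substitution is one simple way to decide when to log; `re.subn`, which also returns the number of replacements, would work as well.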

@TomConner TomConner marked this pull request as draft January 30, 2025 22:03
@TomConner TomConner marked this pull request as ready for review February 5, 2025 21:20
@TomConner TomConner requested a review from sarahgibs February 5, 2025 21:20
zap/src/scan.py Outdated
r=r._replace(path=URI_HASH_REGEX.sub('', r.path))
r_prev = r
r=r._replace(path=URI_HASH_REGEX1.sub('', r.path))
r=r._replace(path=URI_HASH_REGEX2.sub('index-hash-.js', r.path))
Contributor

is the only problem index.js?

Contributor Author

The only one I saw.

Contributor Author

I'm sorry to say I have pushed a change here to remove the hash as discussed in the meeting.

Contributor Author

Sorry about the timing but not the change.

I put the replacement strings together with the regexes. Because URI_HASH_REGEX2 always matches "index-XXXXXXX-.js" it gets replaced with "index.js", so it should be consistent with how the previous regex works -- just removing the hash.

@TomConner
Contributor Author

I'm sorry to have combined a reformatting of this code with the real change, which was around URI_HASH_REGEX.

@TomConner
Contributor Author

Now it only removes the hash.

I put the replacement strings together with the regexes so it's easier to follow what happens: Because URI_HASH_REGEX2 always matches "index-XXXXXXX-.js" it gets replaced with "index.js", so it should be consistent with how the previous regex works -- just removing the hash.
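To make the "replacement strings together with the regexes" idea concrete, here is a small sketch of that layout. The list `HASH_RULES`, the function `normalize_path`, and both patterns are assumptions for illustration only; the real `URI_HASH_REGEX1` and `URI_HASH_REGEX2` patterns are not shown in this conversation.

```python
import re

# Hypothetical (pattern, replacement) pairs. Keeping each replacement next to
# its regex makes it easy to see what every rule does to a path.
HASH_RULES = [
    # Assumed older style: a "-<hex hash>" suffix before the extension,
    # removed outright.
    (re.compile(r"-[0-9a-f]{8,}(?=\.\w+$)"), ""),
    # Style described in this PR: "index-XXXXXXX-.js" becomes "index.js".
    (re.compile(r"index-[A-Za-z0-9_]+-\.js$"), "index.js"),
]


def normalize_path(path: str) -> str:
    """Apply each rule in order so hashed and unhashed paths compare equal."""
    for pattern, replacement in HASH_RULES:
        path = pattern.sub(replacement, path)
    return path
```

Under these assumed patterns both rules have the same effect, removing only the hash: `/assets/index-4au49BA-.js` becomes `/assets/index.js`, and a path with no hash is returned unchanged.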


sonarqubecloud bot commented Feb 5, 2025

Contributor

@sarahgibs sarahgibs left a comment

This looks good. There are definitely some future changes we can make to the regex to make it a bit more precise, but I think this is a good addition.


2 participants