fix(hiding_ci): race condition in write syscall #5338

kalyazin
Contributor

@kalyazin kalyazin commented Jul 31, 2025

Changes

Fix a race condition in the write syscall by implementing write_iter instead, which takes care of folio locking and potential prefaulting of the user buffer.

Reason

The original write implementation dropped the folio lock before copying the data in. This raced with the fault handler, which could start clearing the page while the write was already writing to it.
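To make the locking rule concrete, here is a minimal userspace sketch, not the kernel patch itself. It assumes a hypothetical `model_folio` with a pthread mutex standing in for the folio lock and an `uptodate` flag standing in for the folio's uptodate state. All names (`model_folio`, `model_fault`, `model_write`, `MODEL_PAGE_SIZE`) are invented for illustration. The point it tries to show is that the copy and the uptodate marking happen under the same lock the fault path takes, so a fault arriving afterwards has no reason to clear the page.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a folio: a lock, an uptodate flag and the data. */
struct model_folio {
	pthread_mutex_t lock;
	bool uptodate;
	unsigned char data[MODEL_PAGE_SIZE];
};

static void model_folio_init(struct model_folio *f)
{
	int rc = pthread_mutex_init(&f->lock, NULL);

	assert(rc == 0);
	(void)rc;
	f->uptodate = false;
	memset(f->data, 0xAA, sizeof(f->data)); /* stale contents */
}

/* Fault path: clear the page only if nobody has populated it yet. */
static void model_fault(struct model_folio *f)
{
	pthread_mutex_lock(&f->lock);
	if (!f->uptodate) {
		memset(f->data, 0, sizeof(f->data));
		f->uptodate = true;
	}
	pthread_mutex_unlock(&f->lock);
}

/*
 * Write path: the copy and the uptodate marking both happen while the
 * lock is held, so the fault path can never clear a page mid-copy.
 */
static size_t model_write(struct model_folio *f, const void *buf, size_t len)
{
	if (len > MODEL_PAGE_SIZE)
		len = MODEL_PAGE_SIZE;

	pthread_mutex_lock(&f->lock);
	memcpy(f->data, buf, len);
	f->uptodate = true;
	pthread_mutex_unlock(&f->lock);

	return len;
}
```

If `model_write` released the lock before the `memcpy`, `model_fault` could observe `uptodate == false` and zero the page concurrently with the copy, which is the shape of the race described above.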

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • [ ] I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • [ ] I have mentioned all user-facing changes in CHANGELOG.md.
  • [ ] If a specific issue led to this PR, this PR closes the issue.
  • [ ] When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • [ ] I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • [ ] I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

codecov bot commented Jul 31, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.94%. Comparing base (ca6a13b) to head (874a10f).
⚠️ Report is 2 commits behind head on feature/secret-hiding.

Additional details and impacted files
@@                  Coverage Diff                   @@
##           feature/secret-hiding    #5338   +/-   ##
======================================================
  Coverage                  81.94%   81.94%           
======================================================
  Files                        250      250           
  Lines                      27612    27612           
======================================================
  Hits                       22626    22626           
  Misses                      4986     4986           
Flag Coverage Δ
5.10-c5n.metal 82.09% <ø> (ø)
5.10-m5n.metal 82.08% <ø> (-0.01%) ⬇️
5.10-m6a.metal 81.26% <ø> (+<0.01%) ⬆️
5.10-m6g.metal 77.91% <ø> (ø)
5.10-m6i.metal 82.08% <ø> (ø)
5.10-m7a.metal-48xl 81.24% <ø> (+<0.01%) ⬆️
5.10-m7g.metal 77.91% <ø> (ø)
5.10-m7i.metal-24xl 82.05% <ø> (+<0.01%) ⬆️
5.10-m7i.metal-48xl 82.09% <ø> (+0.04%) ⬆️
5.10-m8g.metal-24xl 77.91% <ø> (+<0.01%) ⬆️
5.10-m8g.metal-48xl 77.91% <ø> (ø)
6.1-c5n.metal 82.13% <ø> (ø)
6.1-m5n.metal 82.13% <ø> (-0.05%) ⬇️
6.1-m6a.metal 81.30% <ø> (ø)
6.1-m6g.metal 77.91% <ø> (ø)
6.1-m6i.metal 82.13% <ø> (ø)
6.1-m7a.metal-48xl 81.28% <ø> (ø)
6.1-m7g.metal 77.91% <ø> (ø)
6.1-m7i.metal-24xl 82.14% <ø> (ø)
6.1-m7i.metal-48xl 82.14% <ø> (+<0.01%) ⬆️
6.1-m8g.metal-24xl 77.90% <ø> (-0.01%) ⬇️
6.1-m8g.metal-48xl 77.91% <ø> (-0.01%) ⬇️
6.16-c5n.metal 82.17% <ø> (-0.01%) ⬇️
6.16-m5n.metal 82.16% <ø> (ø)
6.16-m6a.metal 81.34% <ø> (ø)
6.16-m6g.metal 77.95% <ø> (-0.01%) ⬇️
6.16-m6i.metal 82.16% <ø> (ø)
6.16-m7a.metal-48xl 81.32% <ø> (ø)
6.16-m7g.metal 77.95% <ø> (ø)
6.16-m7i.metal-24xl 82.18% <ø> (+<0.01%) ⬆️
6.16-m7i.metal-48xl 82.18% <ø> (-0.01%) ⬇️
6.16-m8g.metal-24xl 77.94% <ø> (-0.01%) ⬇️
6.16-m8g.metal-48xl 77.95% <ø> (ø)

+		*offset += PAGE_SIZE;
+	}
+	else
+		filemap_remove_folio(folio);
Contributor

In this scenario, I think we need to return 0 instead of copied, because we're throwing away the data written.

Contributor Author

good point, fixed, thanks!
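For readers following the thread, the agreed behaviour can be sketched in plain userspace C. This is an illustrative model, not the code from the patch. The `model_page` struct, `model_write_end` and `MODEL_PAGE_SIZE` are hypothetical names, and the condition `copied == MODEL_PAGE_SIZE` is an assumption, since the quoted diff does not show the `if` that guards the `else` branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a page-cache entry. */
struct model_page {
	bool present;   /* still in the (modelled) page cache */
	bool uptodate;  /* contents are valid */
};

/* Returns the number of bytes the caller may report as written. */
static size_t model_write_end(struct model_page *pg, long long *offset,
			      size_t copied)
{
	assert(copied <= MODEL_PAGE_SIZE);

	if (copied == MODEL_PAGE_SIZE) {
		/* Full page copied: keep it and advance the file offset. */
		pg->uptodate = true;
		*offset += MODEL_PAGE_SIZE;
		return copied;
	}

	/*
	 * Short copy: the page is dropped, so the copied bytes are thrown
	 * away and must not be reported as written.
	 */
	pg->present = false;
	return 0;
}
```

Returning `copied` on the short-copy path would tell the caller that bytes were written even though the page holding them no longer exists. Returning 0 keeps the reported count consistent with what is actually stored.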

@kalyazin kalyazin marked this pull request as ready for review July 31, 2025 11:18
Do this by implementing write_iter instead, which takes care of folio
locking and potential prefaulting of the user buffer.

Signed-off-by: Nikita Kalyazin <[email protected]>
@roypat
Contributor

roypat commented Jul 31, 2025

picked up as part of #5340

@roypat roypat closed this Jul 31, 2025