feat(device/cpu): aggregate multi-socket zones into single zone #2183
Codecov Report

Additional details and impacted files

Coverage diff, reboot vs. #2183:

| | reboot | #2183 | +/- |
|---|---|---|---|
| Coverage | 92.41% | 92.39% | -0.02% |
| Files | 37 | 38 | +1 |
| Lines | 3809 | 3907 | +98 |
| Hits | 3520 | 3610 | +90 |
| Misses | 230 | 236 | +6 |
| Partials | 59 | 61 | +2 |
This commit implements `AggregatedZone` to consolidate multiple EnergyZones with the same name (e.g., package zones in multi-socket systems) into a single zone. The aggregation also handles counter wrapping and is transparent to the caller.

This enables power attribution across multi-socket systems while maintaining compatibility with single-socket deployments. It also fixes the zone label, which previously carried an index suffix as a hack.

Key changes:

- New `AggregatedZone` type that sums energy from multiple zones of the same type
- Proper handling of counter wrapping at MaxEnergy
- Thread-safe energy aggregation with overflow protection

Signed-off-by: Sunil Thaha <[email protected]>
Tested on a multi-socket machine. Available RAPL domains on the machine:
Kepler creating aggregated zone:
Comparing against node exporter:
@sthaha Right now, if I want to filter only the package-0 zone, it's not possible?
That's right ... but if you run a version before this you should be able to see the values.
}

// Multiple zones with same name - create AggregatedZone
Is there an option to not aggregate in some rare cases?
Good question. I think we will need that when/if Kepler supports measuring individual socket consumption and can attribute it to the processes running on those sockets.
Or are you talking about some other requirement that we may have at the moment?
Some reason we can't think of now. The default can be to aggregate, but if needed, a config change (perhaps unexposed) will stop the aggregation.
az.mu.Lock()
defer az.mu.Unlock()
Why is a mutex needed? sysfsRaplZone didn't need this.