
changing rocm or alpaka-rocm tools does not cause the recompilation of alpaka modules #48054


Open
fwyzard opened this issue May 11, 2025 · 15 comments


@fwyzard
Contributor

fwyzard commented May 11, 2025

I would expect that modifying the rocm and alpaka-rocm tools and then running scram b checkdeps would cause all alpaka-based modules to be checked out.

Instead, only the modules that use rocm explicitly are checked out:

$ scram b checkdeps
>> Local Products Rules ..... started
>> Local Products Rules ..... done
>> Local Products Rules ..... started
>> Local Products Rules ..... done
Tool changed: alpaka-rocm
Tool changed: rocm

Checking out packages
HeterogeneousCore/AlpakaServices
HeterogeneousCore/AlpakaTest
HeterogeneousCore/ROCmServices
HeterogeneousCore/ROCmUtilities
HeterogeneousTest/ROCmDevice
HeterogeneousTest/ROCmKernel
HeterogeneousTest/ROCmOpaque
HeterogeneousTest/ROCmWrapper
Checking out these packages: 0
@fwyzard
Contributor Author

fwyzard commented May 11, 2025

assign core

@cmsbuild
Contributor

New categories assigned: core

@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Contributor

cms-bot internal usage

@cmsbuild
Contributor

A new Issue was created by @fwyzard.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@smuzaffar
Contributor

Thanks @fwyzard for reporting this. This happens because the extra alpaka backend products are not registered directly in the SCRAM caches (which are used by the script tracking the external tool dependencies) but are instead dumped directly into the Makefile fragments. I will update the build rules to also add the alpaka-<backend> dependency if a cmssw package has enabled ALPAKA_BACKENDS.

@makortel
Contributor

@smuzaffar With cms-sw/cmsdist#9861 can we close the issue?

@smuzaffar
Contributor

@fwyzard, with the new build rules (in the latest 15.1.X IB), changing the rocm/cuda tools now also looks for cmssw packages that have an alpaka dependency and properly checks out all of those packages. For some reason we have alpaka dependency propagated in over 650 packages, and all of these will be checked out when the rocm and/or cuda tools are changed (version update or tool file definition update).

Can you please try this with the CMSSW_15_1_X_2025-05-14-2300 IB?

[a]

> scram b checkdeps
>> Local Products Rules ..... started
>> Local Products Rules ..... done
Tool changed: rocm
Tool changed: alpaka
....
....

@fwyzard
Contributor Author

fwyzard commented May 15, 2025

Mhm, I just tried but I got a weird error:

.../CMSSW_15_1_X_2025-05-14-2300$ scram b disable-rocm
>> Local Products Rules ..... started
>> Local Products Rules ..... done
Alpaka backend rocm is disable.

.../CMSSW_15_1_X_2025-05-14-2300$ scram b checkdeps
>> Local Products Rules ..... started
>> Local Products Rules ..... done
error: unknown switch `r'
usage: git diff --no-index [<options>] <path> <path>

@fwyzard
Contributor Author

fwyzard commented May 15, 2025

It does work for a smaller change, like changing the CUDA architecture to build for.
The scram b checkdeps checks out over 600 packages :-)

@smuzaffar
Contributor

Since scram b disable-rocm does not change the rocm toolfile, checkdeps does not work for it. I am looking into what can be done in this case.

@makortel
Contributor

> For some reason we have alpaka dependency propagated in over 650 packages

That seems to be the cost of placing Alpaka-based data formats into existing packages (such as DataFormats/BeamSpot, DataFormats/{Ecal,Hcal}{Digi,RecHit}, CondFormats/{Ecal,Hcal}Objects).

@smuzaffar
Contributor

@fwyzard, hopefully cms-sw/cmssw-config@ab46158 should fix this. Can you please try downloading https://raw.githubusercontent.com/cms-sw/cmssw-config/refs/heads/scramv3/SCRAM/GMake/Makefile.checkdeps, copying it to your config/SCRAM/GMake/Makefile.checkdeps file, and then running scram b checkdeps again?
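
The request above can be sketched as a small shell helper. This is only a sketch, not an official procedure: the URL and the destination path are the ones quoted in the comment, while the function name `update_checkdeps`, the variable names, and the backup step are assumptions added for illustration. It is assumed to be run from the top of the CMSSW work area, where the config directory lives.

```shell
# URL and destination path as quoted in the comment above.
CHECKDEPS_URL="https://raw.githubusercontent.com/cms-sw/cmssw-config/refs/heads/scramv3/SCRAM/GMake/Makefile.checkdeps"
CHECKDEPS_DEST="config/SCRAM/GMake/Makefile.checkdeps"

# Hypothetical helper (not part of SCRAM): keep a copy of the current rules,
# download the updated Makefile fragment, then re-run the dependency check.
update_checkdeps() {
    # Assumption: saving a backup makes it easy to restore the original rules.
    cp "$CHECKDEPS_DEST" "$CHECKDEPS_DEST.orig" &&
    # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects.
    curl -fsSL -o "$CHECKDEPS_DEST" "$CHECKDEPS_URL" &&
    scram b checkdeps
}
```

Calling `update_checkdeps` after changing a tool (or after `scram b disable-rocm`) runs the three steps in order and stops at the first failure, so a failed download does not trigger a check against a half-written file.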

@fwyzard
Contributor Author

fwyzard commented May 15, 2025

Yes, this also works for scram b disable-rocm.

@fwyzard
Contributor Author

fwyzard commented May 15, 2025

> That seems to be the cost of placing Alpaka-based data formats into existing packages (such as DataFormats/BeamSpot, DataFormats/{Ecal,Hcal}{Digi,RecHit}, CondFormats/{Ecal,Hcal}Objects).

Should we consider moving them to separate directories?

@makortel
Contributor

> > That seems to be the cost of placing Alpaka-based data formats into existing packages (such as DataFormats/BeamSpot, DataFormats/{Ecal,Hcal}{Digi,RecHit}, CondFormats/{Ecal,Hcal}Objects).
>
> Should we consider moving them to separate directories?

From a dependency management point of view, that would be a good move.
