Performance impact of disabling MKLDNN in Tensorflow #47991

Open
makortel opened this issue Apr 30, 2025 · 17 comments

@makortel
Contributor

This issue is a followup to @gartung's talk in Core Software meeting 2025-04-29 https://indico.cern.ch/event/1543134/#17-profiling-and-tensorflow to record the performance impact and discuss how to proceed.

@makortel
Contributor Author

assign core, ml

@cmsbuild
Contributor

New categories assigned: core,ml

@Dr15Jones,@makortel,@smuzaffar,@valsdav,@y19y19 you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Contributor

cmsbuild commented Apr 30, 2025

cms-bot internal usage

@cmsbuild
Contributor

A new Issue was created by @makortel.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@gartung
Member

gartung commented Apr 30, 2025

Workflow 29834.21, 80 events, single-threaded

CMSSW_15_1_X_2025-04-21-2300
-- Tensorflow with MKLDNN

CMSSW_15_1_MKLDNN0_X_2025-04-21-2300
-- Tensorflow without MKLDNN

RECO Step

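| | Without MKLDNN | With MKLDNN | (Without − With) / With |
|---|---|---|---|
| | 0.0271875 | 0.0270157 | |
| | 0.0274473 | 0.027512 | |
| | 0.0274606 | 0.0275002 | |
| | 0.0275526 | 0.0274082 | |
| | 0.0274737 | 0.0276156 | |
| | 0.027548 | 0.0274871 | |
| | 0.0275574 | 0.0276823 | |
| | 0.0274444 | 0.0269425 | |
| | 0.0273316 | 0.0266684 | |
| | 0.0266689 | 0.0272566 | |
| Mean | 0.0273672 | 0.0273089 | 0.2136303 % |
| Std. dev. (sample) | 0.0002702 | 0.0003310 | |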
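For anyone wanting to recheck the two summary rows of the RECO numbers, here is a minimal sketch. It assumes the last two rows are the mean and the sample standard deviation, and that the percentage is (without − with) / with computed on the means; the variable and function names are made up for illustration.

```python
from statistics import mean, stdev

# RECO step measurements copied from the comment above.
reco_without = [0.0271875, 0.0274473, 0.0274606, 0.0275526, 0.0274737,
                0.027548, 0.0275574, 0.0274444, 0.0273316, 0.0266689]
reco_with = [0.0270157, 0.027512, 0.0275002, 0.0274082, 0.0276156,
             0.0274871, 0.0276823, 0.0269425, 0.0266684, 0.0272566]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of measurements."""
    return mean(values), stdev(values)


def relative_difference_percent(a, b):
    """Relative difference of a with respect to b, in percent."""
    return 100.0 * (a - b) / b


reco_mean_without, reco_std_without = summarize(reco_without)
reco_mean_with, reco_std_with = summarize(reco_with)
reco_rel = relative_difference_percent(reco_mean_without, reco_mean_with)

print(f"without MKLDNN: {reco_mean_without:.7f} +- {reco_std_without:.7f}")
print(f"with MKLDNN:    {reco_mean_with:.7f} +- {reco_std_with:.7f}")
print(f"relative difference: {reco_rel:.4f} %")
```

The relative difference is smaller than the spread of either sample, so the RECO numbers by themselves do not show a measurable effect.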

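PAT Step

| | Without MKLDNN | With MKLDNN | (Without − With) / With |
|---|---|---|---|
| | 0.293232 | 0.28897 | |
| | 0.284163 | 0.290577 | |
| | 0.288738 | 0.293765 | |
| | 0.262822 | 0.289624 | |
| | 0.291147 | 0.291533 | |
| | 0.290975 | 0.29029 | |
| | 0.292249 | 0.288425 | |
| | 0.292383 | 0.285111 | |
| | 0.29239 | 0.28795 | |
| | 0.291838 | 0.290082 | |
| Mean | 0.2879937 | 0.2896327 | -0.5658891 % |
| Std. dev. (sample) | 0.0092291 | 0.0022946 | |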
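To judge whether the PAT difference is more than noise, one option is Welch's t statistic, which allows the two samples to have different variances (the "without" sample has one low value, 0.262822, that inflates its spread). This is only a sketch of one possible check, not something that was done in the measurement above; the names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# PAT step measurements copied from the comment above.
pat_without = [0.293232, 0.284163, 0.288738, 0.262822, 0.291147,
               0.290975, 0.292249, 0.292383, 0.29239, 0.291838]
pat_with = [0.28897, 0.290577, 0.293765, 0.289624, 0.291533,
            0.29029, 0.288425, 0.285111, 0.28795, 0.290082]


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


t_pat = welch_t(pat_without, pat_with)
print(f"Welch t for the PAT step: {t_pat:.3f}")
```

With these numbers |t| comes out at roughly 0.5, i.e. the difference of the means is about half of its own standard error, so the PAT step also shows no significant difference.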

@makortel
Contributor Author

@gartung What CPU did the node have?

@gartung
Member

gartung commented Apr 30, 2025

AMD EPYC-Genoa Processor
cmsdev31

@makortel
Contributor Author

I'm a bit surprised that the JIT for AVX512 ("With MKLDNN") does not result in a bigger difference with respect to the AVX2 binaries ("Without MKLDNN").

@gartung
Member

gartung commented Apr 30, 2025

The Genoa cores have AVX-512 instructions.
The Eigen matrix operations use the available instruction set.
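A quick way to confirm what a given node advertises is to look at the `flags` line of `/proc/cpuinfo` on Linux. The sketch below does that for `avx2`, `avx512f` and `avx512_vnni` (the flag names the Linux kernel uses); the helper names are made up for illustration, and it says nothing about which instructions Eigen or the JIT actually end up using.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_summary(flags):
    """Report which of the instruction sets discussed here are advertised."""
    return {name: name in flags for name in ("avx2", "avx512f", "avx512_vnni")}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            text = cpuinfo.read()
    except OSError:
        # Not on Linux, or /proc is not available.
        text = ""
    print(simd_summary(parse_cpu_flags(text)))
```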

@gartung
Member

gartung commented Apr 30, 2025

On an Intel CPU with deep learning instructions, OneDNN might be faster.

@gartung
Member

gartung commented Apr 30, 2025

The VNNI optimization is only included in the OneDNN version that ships with Tensorflow 2.17.0.

@makortel
Contributor Author

> The Genoa cores have AVX-512 instructions. The Eigen matrix operations use the available instruction set.

Right, but only when JITted in the "with MKLDNN" setup, right? The "without MKLDNN" build should use only the x86-64-v3 instructions.

@gartung
Member

gartung commented Apr 30, 2025

If I recall correctly, yes.
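One way this could be checked experimentally is to cap the instruction set the oneDNN JIT is allowed to use and rerun the same workflow. oneDNN documents the `ONEDNN_MAX_CPU_ISA` and `ONEDNN_VERBOSE` environment variables for this; whether the oneDNN bundled with the Tensorflow build in CMSSW honors them is an assumption that would need to be verified. The helper below is a hypothetical sketch that only prepares the environment for a child process.

```python
import os


def onednn_env(max_isa=None, verbose=False, base_env=None):
    """Return a copy of the environment with oneDNN dispatch controls set.

    max_isa: value for ONEDNN_MAX_CPU_ISA, e.g. "AVX2" or "AVX512_CORE".
    verbose: if True, set ONEDNN_VERBOSE=1 so oneDNN logs the kernels it runs.
    base_env: environment to start from (defaults to the current one).
    """
    env = dict(os.environ if base_env is None else base_env)
    if max_isa is not None:
        env["ONEDNN_MAX_CPU_ISA"] = max_isa
    if verbose:
        env["ONEDNN_VERBOSE"] = "1"
    return env


# Hypothetical usage (the config file name is a placeholder):
#   import subprocess
#   subprocess.run(["cmsRun", "step3_cfg.py"], env=onednn_env("AVX2", verbose=True))
```

If the timings with the cap at `AVX2` match the uncapped ones, that would suggest the AVX-512 JIT kernels are not where the time goes in this workflow.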

@gartung
Member

gartung commented May 16, 2025

@makortel Since compiling the stack with frame pointers seems to resolve the segfaults in libunwind when profiling workflows with Tensorflow/OneDNN JITing, should this issue be closed?

@makortel
Contributor Author

Yes, let's close

@gartung
Member

gartung commented May 20, 2025

After changing the profiling script so that it does NOT configure the build with the Tensorflow from the MKLDNN0 IBs, the profiling jobs are segfaulting again. I was mistaken in assuming that enabling frame pointers was enough to prevent the segfaults.
@makortel you would need to reopen this issue.
