A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters
Updated Apr 20, 2024 - C++
An Efficient Pipelined Data Parallel Approach for Training Large Models
Hydrodynamic Cytoskeleton Simulator
Multiple Sequence Aligner using hybrid parallel computing
High-performance n-body solver that achieves a 128× speedup through checkerboard domain decomposition, toroidal neighbor communication, and sweep-and-prune, scaling across uniform and Gaussian workloads.