
Add a reference to the nativelink benchmarks repository #1744

Open: wants to merge 1 commit into main

Conversation


skyrpex commented May 4, 2025

Description

Adds a reference to the nativelink benchmarks repository, currently https://github.com/skyrpex/nativelink-benchmarks.

The repository contains some GitHub Actions and a website that gathers the data and shows it using plots, pretty much in the same fashion as https://benchmarks.mikemccandless.com/. You can find some documentation in the README.md file, but basically the project is prepared to read any number of .jsonl files containing the benchmark data and present them as plots, organized by category. Each plot supports a limited amount of customization, but we could make it more versatile if needed.
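
For a sense of the data format, a hypothetical `.jsonl` entry could look like this (the field names below are illustrative; the actual schema is described in the repository's README.md):

```json
{"category": "build", "name": "tensorflow-cold-cache", "commit": "abc1234", "date": "2025-05-04T02:00:00Z", "value": 5423.7, "unit": "seconds"}
```

Each line of the file is one self-contained JSON object, which makes appending new results trivial.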

The benchmark.yml GitHub workflow that runs the benchmarks runs nightly, but it can also be triggered manually. After running the benchmarks and updating the .jsonl files with the new data, it pushes the changes to its main branch. For now, I'd suggest just running the nightly benchmarks and calling it a day, since that doesn't require any additional setup on the nativelink side. The benchmarking keeps track of the last nativelink commit that was benchmarked and skips the run if there are no new commits to benchmark. Once new benchmarking data is committed to the main branch, the website should be built and deployed. This can be done by connecting the repository to Vercel, for example; since the project is just a simple static website, Vercel handles the rest.
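
As a rough sketch (the real benchmark.yml in the repository is the source of truth; the script invocation and data paths below are assumptions), the trigger and publish logic could look like:

```yaml
name: benchmark
on:
  schedule:
    - cron: "0 2 * * *" # nightly run
  workflow_dispatch: {}  # allow manual triggering

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Run the benchmarks and append the results to the .jsonl files.
      - run: npx tsx scripts/benchmark.ts
      # Push the updated data back to main so the site can be rebuilt.
      - run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add data/*.jsonl
          git commit -m "chore: update benchmark data" || echo "No new data"
          git push origin main
```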

Now, I'm ashamed to say that I'm like a fish out of water when trying to compile anything with nativelink, let alone configuring a local nativelink instance and running benchmarks against it. That's why I left out the actual benchmarking, which anybody can implement in the scripts/benchmark.ts file. The repository comes with a bunch of sample data that can be removed or adapted to the real benchmark data.
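
For whoever picks that part up, here is a minimal sketch of what scripts/benchmark.ts could do, reusing the hypothetical record shape from above (runBuildWithNativelink and the bazel target are placeholders for the real build invocation):

```typescript
import { appendFileSync } from "node:fs";
import { execSync } from "node:child_process";

interface BenchmarkRecord {
  category: string; // plot category on the website
  name: string;     // individual benchmark name
  commit: string;   // nativelink commit being benchmarked
  date: string;     // ISO timestamp of the run
  value: number;    // measured result
  unit: string;
}

// Placeholder: replace with a real build (e.g. TensorFlow) routed through
// a locally running nativelink instance.
function runBuildWithNativelink(): number {
  const start = Date.now();
  execSync("bazel build //some:target", { stdio: "inherit" });
  return (Date.now() - start) / 1000;
}

const record: BenchmarkRecord = {
  category: "build",
  name: "tensorflow-cold-cache",
  commit: process.env.NATIVELINK_COMMIT ?? "unknown",
  date: new Date().toISOString(),
  value: runBuildWithNativelink(),
  unit: "seconds",
};

// JSONL: one JSON object per line, appended to the existing data file.
appendFileSync("data/build.jsonl", JSON.stringify(record) + "\n");
```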

I understand my attempt doesn't completely fulfill the bounty, so I'm happy to take a cut, split it, discuss it, or, worst case, deliver this partial contribution for free.

/claim #1700

Closes #1700.




CLAassistant commented May 4, 2025

CLA assistant check
All committers have signed the CLA.

aaronmondal (Member) left a comment

Gotta say, I think this looks pretty great!

The one table we should probably add here as well is a "no-cache" table for some rough intuition on the general speedup. That would probably need to run as a cronjob because building tensorflow is a pretty big workload.

@Evaan2001 or @jaroeichler could one of you help out setting up the actual benchmark data in the third-party repo? Especially for @jaroeichler it could be interesting to add Chromium to those benchmarks as well (but let's focus on TensorFlow first here).

@skyrpex Setting up NL for remote building and/or execution is actually not that hard as long as a toolchain image is already present. Fortunately, TensorFlow already publishes their toolchain image under the devel tag: https://www.tensorflow.org/install/source#docker_linux_builds. I think you could more or less just use the very recently published template at https://nativelink.com/docs/rbe/nix-templates as a reference point: remove the LRE logic, add an http_archive_override to the MODULE.bazel to pull in tensorflow, and set the --default_remote_exec_property (or @jaroeichler however this was called) to the devel container image (just make sure to pin the tensorflow commit hash and the devel sha256).
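
To make that concrete, a rough sketch of the two pieces (in stock Bazel the override is spelled archive_override and the flag --remote_default_exec_properties; the pinned commit, digest, and the container-image property key are placeholders that depend on the actual worker setup):

```starlark
# MODULE.bazel (sketch): pin tensorflow to an exact commit.
bazel_dep(name = "tensorflow", version = "0.0.0")
archive_override(
    module_name = "tensorflow",
    urls = ["https://github.com/tensorflow/tensorflow/archive/<pinned-commit>.tar.gz"],
    strip_prefix = "tensorflow-<pinned-commit>",
)
```

```
# .bazelrc (sketch): run remote actions inside the pinned devel image.
build --remote_default_exec_properties=container-image=docker://tensorflow/tensorflow:devel@sha256:<pinned-digest>
```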

cc @kubevalet IIRC we already have a TODO for building tensorflow in our cloud. We could potentially reuse that configuration for that as well.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: 0 of 2 LGTMs obtained, and all files reviewed, and pending CI: Web Platform Deployment / macos-15, Web Platform Deployment / ubuntu-24.04, pre-commit-checks, and 2 discussions need to be resolved


-- commits line 4 at r2:
Squash commits


CONTRIBUTING.md line 484 at r1 (raw file):

## Independent benchmarks

The [nativelink-benchmarks](https://github.com/skyrpex/nativelink-benchmarks) repository gathers benchmarking data for every commit in the `nativelink` repository.

Let's clarify that this is a best-effort third-party repository (just to be on the safe side for legal reasons and to not confuse contributors if they see that it's not a repo in the trace org).

skyrpex (Author) commented May 4, 2025

> Gotta say, I think this looks pretty great!

Awesome, thanks :)

> The one table we should probably add here as well is a "no-cache" table for some rough intuition on the general speedup. That would probably need to run as a cronjob because building tensorflow is a pretty big workload.

Sounds good to me! In that case, I'd suggest running the benchmarks nightly rather than per commit.

Another big task related to the benchmarking is adding the ability to run different benchmarks in parallel (using different GitHub workflows) that generate different .jsonl files, and then having a final workflow that aggregates them all and commits the result at the end (see the sketch below). I could work on that if it sounds good to you.
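
For illustration, the fan-out/fan-in could look roughly like this in GitHub Actions (suite names, script flags, and data paths are hypothetical):

```yaml
jobs:
  bench:
    strategy:
      matrix:
        suite: [tensorflow-cold, tensorflow-warm]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Each matrix job produces its own .jsonl file.
      - run: npx tsx scripts/benchmark.ts --suite ${{ matrix.suite }}
      - uses: actions/upload-artifact@v4
        with:
          name: results-${{ matrix.suite }}
          path: data/${{ matrix.suite }}.jsonl

  aggregate:
    needs: bench # fan-in: runs once all matrix jobs finish
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          path: data/
          merge-multiple: true
      - run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add data/*.jsonl
          git commit -m "chore: aggregate benchmark data"
          git push origin main
```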

> @Evaan2001 or @jaroeichler could one of you help out setting up the actual benchmark data in the third-party repo? Especially for @jaroeichler it could be interesting to add Chromium to those benchmarks as well (but let's focus on TensorFlow first here).

🙏🏻

> @skyrpex Setting up NL for remote building and/or execution is actually not that hard as long as a toolchain image is already present. Fortunately, TensorFlow already publishes their toolchain image under the devel tag: https://www.tensorflow.org/install/source#docker_linux_builds. I think you could more or less just use the very recently published template at https://nativelink.com/docs/rbe/nix-templates as a reference point: remove the LRE logic, add an http_archive_override to the MODULE.bazel to pull in tensorflow, and set the --default_remote_exec_property (or @jaroeichler however this was called) to the devel container image (just make sure to pin the tensorflow commit hash and the devel sha256).

> cc @kubevalet IIRC we already have a TODO for building tensorflow in our cloud. We could potentially reuse that configuration for that as well.

I'll have a look!

> CONTRIBUTING.md line 484 at r1 (raw file):
>
> ## Independent benchmarks
>
> The [nativelink-benchmarks](https://github.com/skyrpex/nativelink-benchmarks) repository gathers benchmarking data for every commit in the `nativelink` repository.
>
> Let's clarify that this is a best-effort third-party repository (just to be on the safe side for legal reasons and to not confuse contributors if they see that it's not a repo in the trace org).

I've just thrown some more words at it. IMO you should clone the repository into TraceMachina's org after it works as you expect.

MarcusSorealheis (Collaborator) commented May 4, 2025

This is pretty good. We also need to check that there was a warm cache here. I think that if there is no warm cache, it could prove difficult for some users to understand the delta. Or maybe we should have both.

I also think it would make your life easier if you invited us as users to the benchmark account you have on NativeLink.

@aaronmondal what do you think about cold vs. warm cache? We need to see both, I feel. There will obviously be some confounding variables related to how TensorFlow might've changed with respect to how NativeLink has changed.

skyrpex (Author) commented May 4, 2025

I have one doubt, though. As far as I understand, we want to clone NativeLink in a GitHub workflow, build it, spin it up, and compile other projects with it, right? Or are you thinking of using the NativeLink that runs in production for the compilations? In that case, we'd need to be able to retrieve the commit hash of the NativeLink deployment in production, so we can tie the benchmark data to the actual NL commit.

MarcusSorealheis (Collaborator) commented:

So, @skyrpex, I think we need a cold-cache run and then a second run after the cache is warmed up.

Successfully merging this pull request may close these issues.

Implement Benchmarking on Per-Commit Basis (#1700)