[RLlib] Add docs for Implicit Q-Learning. #55422

simonsays1980 · 2025-08-08T15:22:02Z

Why are these changes needed?

This adds IQL to the RLlib docs.

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

Signed-off-by: simonsays1980 <[email protected]>

gemini-code-assist

Summary of Changes

Hello @simonsays1980, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've added comprehensive documentation for the Implicit Q-Learning (IQL) algorithm to the RLlib section of the Ray documentation. This update ensures that users can easily find information about IQL, understand its core principles, and access relevant resources like its research paper, implementation, and configuration options. The goal is to make IQL more accessible and understandable for developers working with RLlib.

Highlights

Algorithm Table Update: I've updated the main RLlib algorithms overview table to include Implicit Q-Learning (IQL), ensuring it's discoverable alongside other algorithms.
New IQL Documentation Section: I've added a comprehensive new section dedicated to Implicit Q-Learning (IQL), detailing its architecture, linking to its paper and implementation, and providing configuration details.


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request adds documentation for the Implicit Q-Learning (IQL) algorithm to the RLlib documentation. The changes include adding IQL to the algorithm table and creating a new section with a description, links to the paper and implementation, a tuned example, and the configuration class. My review identifies a couple of minor formatting issues in the reStructuredText that could affect rendering.

gemini-code-assist · 2025-08-08T15:23:14Z

doc/source/rllib/rllib-algorithms.rst

+Implicit Q-Learning (IQL)
+-------------------------
+`[paper] <https://arxiv.org/abs/2110.06169>`__
+`[implementation] <https://github.com/ray-project/ray/blob/master/rllib/algorithms/iql/iql.py`__


The closing > is missing before the final backtick and __ at the end of the implementation link, which will likely cause a rendering issue. Please add it.

`[implementation] <https://github.com/ray-project/ray/blob/master/rllib/algorithms/iql/iql.py>`__
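
A malformed link like the one flagged here can also be caught mechanically. The following is a minimal sketch, not part of the Ray docs tooling: the pattern and the helper name `find_unclosed_links` are assumptions made for illustration, and the check only covers inline links of the form `` `text <url>`__ ``.

```python
import re

# Hypothetical check: an inline reST link whose URL runs straight into the
# closing backtick, i.e. the ">" that should end the target is missing.
UNCLOSED_LINK = re.compile(r"`[^`<]*<(https?://[^`>\s]*)`__?")


def find_unclosed_links(rst_text):
    """Return the URLs of inline links that lack the closing '>'."""
    return [match.group(1) for match in UNCLOSED_LINK.finditer(rst_text)]


broken = (
    "`[implementation] <https://github.com/ray-project/ray/blob/master/"
    "rllib/algorithms/iql/iql.py`__"
)
fixed = broken.replace("iql.py`__", "iql.py>`__")

print(find_unclosed_links(broken))  # one URL reported
print(find_unclosed_links(fixed))   # nothing reported
```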

gemini-code-assist · 2025-08-08T15:23:14Z

doc/source/rllib/rllib-algorithms.rst

+    which yields a conservative estimate of returns. This allows policy improvement through advantage-weighted behavior cloning,
+    ensuring safer generalization without explicit exploration.
+
+    The `IQLLearner`` replaces the usual TD-based value loss with an expectile regression loss, and trains the policy to imitate


There's an extra backtick after IQLLearner, which will likely cause a rendering issue. Please remove it.

The `IQLLearner` replaces the usual TD-based value loss with an expectile regression loss, and trains the policy to imitate
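
To make the loss named in this snippet concrete, here is a minimal pure-Python sketch of an expectile regression loss and an advantage-based imitation weight, following the formulation in the linked IQL paper. It is not RLlib's `IQLLearner` code: the function names and the values of `tau`, `beta`, and `max_weight` are illustrative assumptions, and the cap on the weight is only a common numerical safeguard.

```python
import math


def expectile_loss(q_values, v_values, tau=0.8):
    """Mean asymmetric squared error between Q(s, a) and V(s).

    Positive differences (Q above V) are weighted by tau, negative ones by
    1 - tau, so tau > 0.5 pushes V toward an upper expectile of Q.
    """
    total = 0.0
    for q, v in zip(q_values, v_values):
        diff = q - v
        weight = tau if diff > 0 else 1.0 - tau
        total += weight * diff * diff
    return total / len(q_values)


def advantage_weight(q, v, beta=3.0, max_weight=100.0):
    """Weight for advantage-weighted behavior cloning: exp(beta * (Q - V)), capped."""
    return min(math.exp(beta * (q - v)), max_weight)


# With tau = 0.5 the loss is a plain (halved) mean squared error; with
# tau = 0.8 the positive difference dominates.
print(expectile_loss([2.0, -1.0], [0.0, 0.0], tau=0.5))
print(expectile_loss([2.0, -1.0], [0.0, 0.0], tau=0.8))
```

The policy would then be trained to imitate dataset actions with each log-likelihood term scaled by `advantage_weight`, so that actions with a higher estimated advantage count more.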

Signed-off-by: simonsays1980 <[email protected]>

sven1977 · 2025-08-14T13:49:00Z

doc/source/rllib/rllib-algorithms.rst

@@ -39,6 +39,10 @@ as well as multi-GPU training on multi-node (GPU) clusters when using the `Anysc
 +-----------------------------------------------------------------------------+------------------------------+------------------------------------+--------------------------------+
 | :ref:`BC (Behavior Cloning) <bc>`                                           | |single_agent|               | |multi_gpu| |multi_node_multi_gpu| | |cont_actions| |discr_actions| |
 +-----------------------------------------------------------------------------+------------------------------+------------------------------------+--------------------------------+
+| :ref:`CQL (Conservative Q-Learning) <cql>`                                  | |single_agent|               | |multi_gpu| |multi_node_multi_gpu| | |cont_actions|                 |


Huh! Great catch! :)

sven1977

LGTM! Thanks for the PR @simonsays1980 !

commit a86bb60df41987bfee65b227fcce69a7eee44b9e
Author: Justin Yu <[email protected]>
Date: Tue Aug 19 08:58:10 2025 -0700

[core] Fix actor import error message for async actors (#55722)

When the Ray actor class fails to import upon actor creation, we create a TemporaryActor in its place to emit an error message. However, for async actors, the TemporaryActor creation fails to initialize due to having no async methods. This PR adds a dummy async method to handle this case.

```python
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 35, in <module>
  File "/Users/justin/Developer/ray/python/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/justin/Developer/ray/python/ray/_private/client_mode_hook.py", line 104, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/justin/Developer/ray/python/ray/_private/worker.py", line 2896, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/justin/Developer/ray/python/ray/_private/worker.py", line 970, in get_objects
    raise value
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::Foo.__init__() (pid=42078, ip=127.0.0.1, actor_id=7000b00899a3a8b1d05bbdc601000000, repr=<__main__.FunctionActorManager._create_fake_actor_class.<locals>.TemporaryActor object at 0x10732dc10>)
ray.exceptions.ActorDiedError: The actor died unexpectedly before finishing this task.
    class_name: TemporaryActor
    actor_id: 7000b00899a3a8b1d05bbdc601000000
    Failed to create actor. You set the async flag, but the actor does not have any coroutine functions.
(TemporaryActor pid=42078) The original cause of the RayTaskError (<class 'ray.exceptions.ActorDiedError'>) isn't serializable: cannot pickle 'google._upb._message.Descriptor' object. Overwriting the cause to a RayError.
```

---------

Signed-off-by: Justin Yu <[email protected]>

commit f53e38b119ab19c27db6f32d76710a3dd8c6e9c1
Author: tannerdwood <[email protected]>
Date: Tue Aug 19 08:44:44 2025 -0700

[Core] Update DLAMI Information in aws.md (#55702)

Signed-off-by: Tanner Wood <[email protected]>
Co-authored-by: Tanner Wood <[email protected]>

commit c4482d2fc6d7956104c5b0208a7cc14120737652
Author: Ibrahim Rabbani <[email protected]>
Date: Tue Aug 19 07:47:57 2025 -0700

[core] Remove job submission code for using JobAgent on a random worker node. (#55718)

When a Job is submitted through the SDK/JobClient, the request goes to the dashboard's JobHead. The JobHead submits a request to a JobAgent which has a JobManager. The JobManager creates a JobSupervisor actor which manages the lifecycle of the job.

In #47147, the `RAY_JOB_AGENT_USE_HEAD_NODE_ONLY` feature flag was added to force the head node's JobAgent to be used for job submission. The flag was intended to be a temporary kill switch if head_node only scheduling had issues.

Now that #47147 has been merged for over a year, I'm cleaning up the flag in this PR and making it the default (and only) behavior.
---------

Signed-off-by: irabbani <[email protected]>

commit f797480b014262ffdf7b33a431fcbc34c0d95b2f
Author: Dhyey Shah <[email protected]>
Date: Tue Aug 19 00:10:43 2025 -0700

[core] Correct bytes in flight when objects <5mb (#54349)

Signed-off-by: dayshah <[email protected]>

commit be33b6fb411b21d2bb2cadfc8755a66c195d2272
Author: avigyabb <[email protected]>
Date: Mon Aug 18 21:41:43 2025 -0700

[Core] Bind runtime env agent and dashboard agent http server to specified ip instead of 0.0.0.0 (#55431)

Signed-off-by: avigyabb <[email protected]>
Signed-off-by: avibasnet31 <[email protected]>
Co-authored-by: avibasnet31 <[email protected]>
Co-authored-by: Jiajun Yao <[email protected]>

commit 69bc6c1e8394ef5846bf3d3d36a7fd384441c5a1
Author: Ibrahim Rabbani <[email protected]>
Date: Mon Aug 18 21:38:58 2025 -0700

[core] ray.put returns an ObjectRef without an owner_address. (#55636)

Signed-off-by: irabbani <[email protected]>
Signed-off-by: Ibrahim Rabbani <[email protected]>
Co-authored-by: Edward Oakes <[email protected]>

commit 28d1dc9fbdc57b3c33dcc244924e520fa158104b
Author: Rui Qiao <[email protected]>
Date: Mon Aug 18 21:36:12 2025 -0700

[Serve.llm] Support colocating local DP ranks in DPRankAssigner (#55720)

Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>

commit 30c8122962dcb1285fd4324313770a53693ce863
Author: Lonnie Liu <[email protected]>
Date: Mon Aug 18 16:46:06 2025 -0700

[image] refactor apt package installation (#55701)

avoid reinstalling packages that are already installed in the base image

also rename the saved requirements file to `extra-test-requirements.txt`

Signed-off-by: Lonnie Liu <[email protected]>

commit 6993ba79da529a44fb23b1717acac3d83aa5dcef
Author: Jeffrey Wang <[email protected]>
Date: Mon Aug 18 16:19:27 2025 -0700

[data.llm] Adjust LLM engine timing logic (#55595)

Signed-off-by: jeffreyjeffreywang <[email protected]>
Co-authored-by: jeffreyjeffreywang <[email protected]>

commit 7424ffbdc7b5df15141c66c23c1adafa36cd431b
Author: vincenthhan <[email protected]>
Date: Tue Aug 19 07:18:28 2025 +0800

[llm] support custom s3 endpoint when downloading models from remote (#55458)

Signed-off-by: vincenthhan <[email protected]>
Co-authored-by: vincenthhan <[email protected]>

commit e9160b72338c4d682af2eb0249f442bd1ff4992d
Author: Qiaolin Yu <[email protected]>
Date: Mon Aug 18 15:39:46 2025 -0700

[core] Not overriding accelerator id env vars when num_accelerators is 0 or not set (#54928)

commit fd3f23593de38fec41c8321da7c169b08eb768cc
Author: Edward Oakes <[email protected]>
Date: Mon Aug 18 17:32:25 2025 -0500

[core] Remove unnecessary dependency of raylet->gcs (#55710)

The raylet binary was depending on all of the `gcs/` directory for absolutely no reason :(

---------

Signed-off-by: Edward Oakes <[email protected]>

commit 3ea021227eaeb0404c42cf09015bc685eb097cfb
Author: Edward Oakes <[email protected]>
Date: Mon Aug 18 17:28:48 2025 -0500

[core] Separate targets for pubsub interfaces (#55681)

Move publisher & subscriber interfaces into their own header files & build targets. Update relevant callsites to use them.

Unfortunately, `reference_count_test` reaches into internal implementation details of the publisher and this dependency was a little tricky to break, so not touching it here.

---------

Signed-off-by: Edward Oakes <[email protected]>

commit 1cb4c2c212e5a153e74d86f1e0d2e48942a19502
Author: Cuong Nguyen <[email protected]>
Date: Mon Aug 18 15:12:37 2025 -0700

[core] rename ray/telemetry to ray/observability (#55703)

As title. According to @edoakes, ray telemetry has a different meaning in the ray eco-system. The observability directory will consist of metrics, events, and log-related infra.

Test:
- CI

Signed-off-by: Cuong Nguyen <[email protected]>

commit 9128c40da7a8166bb7a9ca7025b01d8a7a5e38db
Author: Sven Mika <[email protected]>
Date: Mon Aug 18 22:40:55 2025 +0200

[RLlib] Fix MetricsLogger/Stats throughput bugs. (#55696)

commit 01b9e5b1a6b913041b299d7cd262254cfc99503a
Author: Lonnie Liu <[email protected]>
Date: Mon Aug 18 12:33:07 2025 -0700

[ci] release test: use rayci build id for image tags (#55619)

rather than using commit based tags. this avoids crosstalk between different runs on the same commit.

---------

Signed-off-by: Lonnie Liu <[email protected]>

commit 6326b2c539d4337019dc5107b569e272fc8a8fcf
Author: Sagar Sumit <[email protected]>
Date: Tue Aug 19 00:34:29 2025 +0530

[core] Call `__ray_shutdown__` method during actor graceful shutdown (#54584)

This PR introduces a new `__ray_shutdown__` method mechanism for Ray actors to perform deterministic resource cleanup before actor termination. This addresses issue #53169 by providing a reliable alternative to `__del__` methods for critical cleanup operations.

The new `__ray_shutdown__` method can be explicitly overridden and provides:
- Deterministic execution: Called explicitly by Ray during actor shutdown.
- Reliable timing: Executes at the exact right moment before process termination.
- Optionality: Actors without the method continue to work normally.

Main changes:
1. `core_worker.cc` - Add cleanup call in Shutdown()
2. `_raylet.pyx` - Add callback registration
3. `worker.py` - Register callback when actor is created

Closes #53169

---------

Signed-off-by: Sagar Sumit <[email protected]>

commit ec4056ea67e4226fea2f11abaf4e16bf5a3aba14
Author: Edward Oakes <[email protected]>
Date: Mon Aug 18 13:53:47 2025 -0500

[ci] Add ability for users to include `.user.bazelrc` file (#55698)

I wanted a way to turn on `--incompatible_strict_action_env` by default without having an untracked change in my `.bazelrc` constantly and without needing to pass the `--config` flag all the time.

This PR allows users to define a `.user.bazelrc` file for such changes. For example, to turn on `--incompatible_strict_action_env` by default, I've added this file:

```
build --config=strict
test --config=strict
```

Signed-off-by: Edward Oakes <[email protected]>

commit 5afa2abcb65980e2ab558076b39e9a44bd2e3566
Author: Potato <[email protected]>
Date: Tue Aug 19 02:46:17 2025 +0800

[Data]Fix sort_benchmark url not found error (#55692)

The url is invalid as we changed the name for `sort.py` in https://github.com/ray-project/ray/pull/49017

---------

Signed-off-by: Potato <[email protected]>

commit 81856dfad0ab26dffc5d9209ae297f8acd16ce9a
Author: Lonnie Liu <[email protected]>
Date: Mon Aug 18 11:28:28 2025 -0700

[wheel] when `RAY_DISABLE_EXTRA_CPP=1`, do not build cpp stuff (#55697)

this gives us a way to safely skip the ray-cpp building parts when building ray wheel.

Signed-off-by: Lonnie Liu <[email protected]>

commit b95fc3e0757a89dea38f243c9a29f3768f82b98f
Author: Sampan S Nayak <[email protected]>
Date: Mon Aug 18 23:47:39 2025 +0530

[core] Add logic to convert TaskProfileEvent to RayEvent before sending to event aggregator (#55138)

As part of oneEvent effort, all individual task event objects (such as task definition event, task execution event, etc) are being consolidated under one type: RayEvent.

This PR adds the translation logic to convert the `TaskProfileEvent` -> `rpc::events::RayEvent` object + tests to verify that the translation and subsequent section of the `TaskEventBufferImpl` correctly deal with the constructed RayEvent.
Signed-off-by: sampan <[email protected]>
Signed-off-by: Sampan S Nayak <[email protected]>
Co-authored-by: sampan <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Mengjin Yan <[email protected]>

commit 6c90c0de34f5b3f618db076c4f3197f78aefc8bf
Author: yi wang <[email protected]>
Date: Tue Aug 19 02:00:27 2025 +0800

[Data] explain API for dataset (#55482)

Introduce explain() for dataset, which outputs the logical plan and physical plan.

part of #55052

- [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [x] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [x] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

---------

Signed-off-by: my-vegetable-has-exploded <[email protected]>
Signed-off-by: Richard Liaw <[email protected]>
Co-authored-by: Richard Liaw <[email protected]>

commit a81c7c9fbd5ed70e8aaae5cc2f1bc3e284ec8723
Author: Timothy Seah <[email protected]>
Date: Mon Aug 18 10:38:44 2025 -0700

[train][tune] Train Controller is always actor + fix tune integration to enable this (#55556)

In the past, we used `RUN_CONTROLLER_AS_ACTOR_ENV_VAR` to toggle whether to run the controller as a separate actor (we want this in most cases) or on the current actor (we wanted this in Tune so we can propagate `ray.train.report` from Train to Tune using the `TuneReportCallback`).

However, in order to implement `get_all_reported_checkpoints` (https://github.com/ray-project/ray/pull/54555), we need to pass the Train Controller actor to all the Train Worker actors. This method wouldn't work when using Train from Tune because the Train Controller actor handle would be the Tune Trainable actor handle which does not have the async `get_all_reported_checkpoints` method.

This PR gets rid of `RUN_CONTROLLER_AS_ACTOR_ENV_VAR` once and for all by making all communication between Train and Tune happen through a lightweight `ray.util.Queue` actor instead of forcing Train and Tune to happen on the same process.

---------

Signed-off-by: Timothy Seah <[email protected]>
Co-authored-by: Timothy Seah <[email protected]>

commit 796858a91c9a98b7785fdf012096b4a3e5f22cca
Author: simonsays1980 <[email protected]>
Date: Mon Aug 18 19:10:31 2025 +0200

[RLlib] Set default to 'log_gradients=False' to stabilize tests (#55695)

Right now `log_gradients` is by default `True` and this appears to destabilize tests (see #47717). This PR switches the default to `False`.

Closes #47717

- [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [x] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

---------

Signed-off-by: simonsays1980 <[email protected]>

commit ae0d4fc04f7d56e77c080a24bf998a67a3e88631
Author: Doyoung Kim <[email protected]>
Date: Mon Aug 18 09:58:37 2025 -0700

[Serve] Update test_deploy_2.py with get_application_url (#55665)

We remove the hardcoded url within the test to use `get_application_url()`

---------

Signed-off-by: doyoung <[email protected]>

commit be423b042d0370456a8abead58fd6502eeb6c6d4
Author: Elliot Barnwell <[email protected]>
Date: Mon Aug 18 09:37:47 2025 -0700

[ci] allowing spaces in append args field on depsets (3/4) (#55625)

- Allowing for spaces in append args (splitting append arg flags before appending)
- adding a couple unit tests

---------

Signed-off-by: elliot-barn <[email protected]>
Co-authored-by: Lonnie Liu <[email protected]>

commit faf06e09e55558fb36c72e91a5cf8a7e3da8b8c6
Author: Dhyey Shah <[email protected]>
Date: Mon Aug 18 07:33:47 2025 -0700

[core] Follow-up to address comments of BaseException PR #55602 (#55690)

Address comments from #55602
- Moving the base exception and exception group tests into their own file so they can use a shared fixture
- Adding comment for SystemExit and KeyboardInterrupt behavior
- Adding tests to test behavior if user code raises SystemExit or KeyboardInterrupt

---------

Signed-off-by: dayshah <[email protected]>

commit e0d8e6f46a8734e16f28831941d937d5961c1d12
Author: simonsays1980 <[email protected]>
Date: Mon Aug 18 15:26:58 2025 +0200

[RLlib] - Fix `TensorType` (#55694)

commit 1e5094fd5cbfef1de738243b84436b94a7499304
Author: simonsays1980 <[email protected]>
Date: Mon Aug 18 15:13:05 2025 +0200

[RLlib - Offline RL] Fix bug in `return_iterator` in multi-learner settings. (#55693)

commit b830b8d3ee64f7c661d4bfa5fb0e7be99ff871a5
Author: simonsays1980 <[email protected]>
Date: Mon Aug 18 12:30:24 2025 +0200

[RLlib - Offline] Fix some bugs in the docs for IQL and CQL (#55614)

commit dde4dbad440ada233d5b3e13a990cf25c20ec60e
Author: Rui Qiao <[email protected]>
Date: Sun Aug 17 21:33:48 2025 -0700

[Serve.llm] Fix DPServer allocation to CPU node (#55688)

Signed-off-by: Rui Qiao <[email protected]>

commit 7321aeed2957a5a71ccb34c2212cd8f4c63a9fab
Author: Edward Oakes <[email protected]>
Date: Sun Aug 17 18:34:27 2025 -0500

[core] Remove unnecessary publisher dependency from raylet (#55678)

Signed-off-by: Edward Oakes <[email protected]>

commit 6561061f79b31be4f7cecb20e34bdc92e374ef16
Author: Jason Li <[email protected]>
Date: Sun Aug 17 11:56:41 2025 -0700

Fixing Circular Import in ray.train.v2.lightning.lightning_utils (#55668)

Importing `RayTrainReportCallback` from `ray.train.lightning._lightning_utils` in `ray.train.v2.lightning.lightning_utils` causes a circular import in the case that `ray.train.v2.lightning.lightning_utils` is loaded before `ray.train.lightning`. This PR removes the `ray.train.v2.lightning` module and migrates the changes upstream to the original `RayTrainReportCallback` class.

---------

Signed-off-by: JasonLi1909 <[email protected]>

commit 03b07db82ab52c5886edd94885fa12d7c30b7b39
Author: Dhyey Shah <[email protected]>
Date: Sat Aug 16 19:41:24 2025 -0700

[core] Fix test_failure on windows (#55687)

Mixing ray_start_regular and ray_start_regular_shared in the same file can lead to unexpected behavior where cluster state can unexpectedly carry over into setup for another test. Here on windows *test_put_error1*, *test_put_error2*, and *test_version_mismatch* are skipped so *test_export_large_objects* runs directly after *test_baseexception_actor_creation*, causing issues during its setup.

A follow-up will create another test file for all BaseException-related tests so they can use a shared cluster.
Signed-off-by: dayshah <[email protected]>

commit 9ae08276c6c466557281dca28477e9ad1d374687
Author: Dhyey Shah <[email protected]>
Date: Sat Aug 16 11:16:44 2025 -0700

[core] Update base exception group tests (#55684)

Signed-off-by: dayshah <[email protected]>

commit a44df1655f3031860f3afd4cc81fc0dc6ab5d6f0
Author: Lonnie Liu <[email protected]>
Date: Fri Aug 15 23:38:03 2025 -0700

[ci] release test: fix to use small for test init (#55677)

otherwise the permission is incorrect

Signed-off-by: Lonnie Liu <[email protected]>

commit 418f56258e2085a3f370696930a04ae83e7e0103
Author: kourosh hakhamaneshi <[email protected]>
Date: Sat Aug 16 03:19:20 2025 +0200

[serve.llm] Add reset_prefix_cache remote method to llm server (#55658)

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

commit 628df247832fa0e51274a6d53ae750eb9b54a794
Author: Rui Qiao <[email protected]>
Date: Fri Aug 15 17:12:20 2025 -0700

[serve.llm] Handle push telemetry race conditions (#55558)

Signed-off-by: Rui Qiao <[email protected]>

commit 4c6993ee347e3a4d1ff9a26fb3daddd9bf50783c
Author: Balaji Veeramani <[email protected]>
Date: Fri Aug 15 16:51:13 2025 -0700

[Data] Decouple actor and node autoscaling (#55673)

Actor pool autoscaling and node autoscaling are currently tied together in a single `Autoscaler` base class, even though they work mostly independently. This coupling makes testing harder (you have to mock unused dependencies), complicates the interface, and forces you to touch unrelated code when extending one type of autoscaling.

This PR splits `Autoscaler` into `ActorAutoscaler` and `ClusterAutoscaler` to simplify testing, reduce complexity, and make future extensions easier.

- [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [ ] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

Signed-off-by: Balaji Veeramani <[email protected]>

commit 9fdea0314ef90cedc341285398bb51d79475b6fd
Author: Rui Qiao <[email protected]>
Date: Fri Aug 15 15:16:27 2025 -0700

[Serve.llm] Support multi-node data parallel with set_dp_master_info() (#55653)

Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>

commit 5fbeff61f889af7eddb7ca7b55ec6a6c8939bc2b
Author: Edward Oakes <[email protected]>
Date: Fri Aug 15 16:48:53 2025 -0500

[core] Unify test directory layout on `.../tests/` (#55652)

We currently have multiple different patterns for test files:
- `*_test.cc` in the same directory as the implementation.
- `test/*_test.cc` (with `BUILD.bazel` in the test dir or sometimes in the parent dir).
- `tests/*_test.cc` (with `BUILD.bazel` in the test dir or sometimes in the parent dir).

Unifying on:
- `tests/*_test.cc`
- `tests/BUILD.bazel` for test targets

---------

Signed-off-by: Edward Oakes <[email protected]>

commit fc967a55018cf85d7f73381985273f429d14cb81
Author: Jiajun Yao <[email protected]>
Date: Fri Aug 15 13:30:44 2025 -0700

[Core] Simplify get_event_aggregator_grpc_stub to not depend on webui_url (#55640)

Signed-off-by: Jiajun Yao <[email protected]>

commit d6dce722f0ff25a55a3b3a4749bd32821bcccbec
Author: Edward Oakes <[email protected]>
Date: Fri Aug 15 14:54:28 2025 -0500

[serve] Fix easy `ray._private` dependency (#55659)

Signed-off-by: Edward Oakes <[email protected]>

commit 10af9d897bbdaae4202580ba14dea1d6efcb525b
Author: Elliot Barnwell <[email protected]>
Date: Fri Aug 15 12:37:21 2025 -0700

[ci] raydepsets: generating llm lock files (4/4) (#55500)

- generating llm lock files with raydepsets

---------

Signed-off-by: elliot-barn <[email protected]>

commit b819ed4add79492dcdc58d7df277bbd1d438f11b
Author: Dhyey Shah <[email protected]>
Date: Fri Aug 15 12:07:22 2025 -0700

[core] Fix objects_valid with except from BaseException (#55602)

We would encounter a ray check failure on `objects_valid` whenever a function throws an exception that extends from `BaseException` instead of `Exception`. Fixing that by just excepting `BaseException` instead of `Exception` when we are vulnerable to exceptions thrown from user Python code. We still have to special case `SystemExit` and `KeyboardInterrupt` because we can consider those as critical errors ourselves and treat them as worker shutdown or task cancellation signals respectively.

Closes https://github.com/ray-project/ray/issues/43411

Signed-off-by: dayshah <[email protected]>

commit 44e0aea628f1f221345aeddaafce3b82d91cf9fa
Author: simonsays1980 <[email protected]>
Date: Fri Aug 15 20:44:34 2025 +0200

[RLlib] Fix `ImportError` in Atari examples. (#54967)

Running Atari with RLlib results in the error described in #53836. This is related to the version of `gymnasium` installed when calling `ray[rllib]` and then later installing `gymnasium[atari,accept-rom-license]`. Using `gymnasium=1.1.1` resolves this error.

Closes #53836

- [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [x] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

---------

Signed-off-by: simonsays1980 <[email protected]>

commit 2cdb27e49d3c4935fe90236f9affa15b5696a42f
Author: Doyoung Kim <[email protected]>
Date: Fri Aug 15 11:07:19 2025 -0700

[Serve] Update route prefix assignment for ReplicaBase.reconfigure() (#55657)

Update assigning value that was slipped from #55407

---------

Signed-off-by: doyoung <[email protected]>

commit 616b9a19b42305ba5602e4f3bcab81c1e19cf3a0
Author: Edward Oakes <[email protected]>
Date: Fri Aug 15 13:05:36 2025 -0500

[core] Clean up `RayletIpcClientInterface` (#55651)

Splits out `raylet_ipc_client_interface.h` into its own target. Sub-interfaces that use the client should only depend on this interface, not the full `raylet_ipc_client` target.

This improves incremental builds. For example, now if `raylet_ipc_client.{h,cc}` changes (including any of its transitive dependencies), the core worker `store_provider` targets no longer need to be recompiled. They'll only be recompiled if `raylet_ipc_client_interface.h` changes, which should be much less frequent.

I've also moved the `FakeRayletIpcClient` into the source tree.
---------

Signed-off-by: Edward Oakes <[email protected]>

commit 8d6d9fa4c63e7d1e7ecd7f14347c1a565efe4d95
Author: Seiji Eicher <[email protected]>
Date: Fri Aug 15 10:56:53 2025 -0700

[serve.llm] Correct Pyright lints for Ray Serve LLM examples (#55284)

Signed-off-by: Seiji Eicher <[email protected]>

commit 0b77c72a0133d407fb58a9114764e652a37e963c
Author: Justin Yu <[email protected]>
Date: Fri Aug 15 10:48:54 2025 -0700

[data] Wrap batch index in a `BatchMetadata` class (#55643)

Wrap batch metadata in a dataclass that we can extend in the future.

Signed-off-by: Justin Yu <[email protected]>

commit a39bc679bace4dfaa334c88572effbc5b952a59f
Author: Lonnie Liu <[email protected]>
Date: Fri Aug 15 10:14:33 2025 -0700

[serve] pin the version of wrk used in serve ci base (#55650)

and clone with depth=1

Signed-off-by: Lonnie Liu <[email protected]>

commit 20c84e6193d22d29f25cc36e76ea455417349562
Author: akyang-anyscale <[email protected]>
Date: Fri Aug 15 09:56:09 2025 -0700

[serve] Add model composition serve benchmarks (#55549)

Model composition is a common paradigm we should also track performance for.

---------

Signed-off-by: akyang-anyscale <[email protected]>

commit c5a16768c71c354738fc4bef552bd4a58c6b3089
Author: Doyoung Kim <[email protected]>
Date: Fri Aug 15 09:43:09 2025 -0700

[Serve] Update test_http_routes to use get_application_url (#55623)

Updates one of the serve tests, test_http_routes, so it can start using get_application_url instead of hardcoded urls.

---------

Signed-off-by: doyoung <[email protected]>
Signed-off-by: Doyoung Kim <[email protected]>

commit c7a7d41b4bbd7509b0cb7cc112fd5ac9af5e55af
Author: Aleksei Starikov <[email protected]>
Date: Fri Aug 15 18:42:41 2025 +0200

[serve] Add a function with a Warning to migrate constants that use `or` expression. (#55464)

In the `serve` package, some of the constants that are initialized from environment variables silently replace falsy values such as `0` with their default values, even if a user sets them to `0` explicitly. In addition, they can also be set to negative values, which is likely not expected. The list of the constants:

```
PROXY_HEALTH_CHECK_TIMEOUT_S
PROXY_HEALTH_CHECK_PERIOD_S
PROXY_READY_CHECK_TIMEOUT_S
PROXY_MIN_DRAINING_PERIOD_S
--
RAY_SERVE_KV_TIMEOUT_S
```

It happens because of the `or value` structure.

This PR introduces:
- temporary function `get_env_float_non_zero_with_warning` with `FutureWarning`. The function shows a warning in the following format in case of an unexpected value:

```
FutureWarning: Got unexpected value `0.0` for `RAY_SERVE_PROXY_HEALTH_CHECK_TIMEOUT_S` environment variable! Starting from version `2.50.0`, the environment variable will require a positive value. Setting `RAY_SERVE_PROXY_HEALTH_CHECK_TIMEOUT_S` to `10.0`.
  PROXY_HEALTH_CHECK_TIMEOUT_S = get_env_float_non_zero_with_warning(
-- or
FutureWarning: Got unexpected value `-1.0` for `RAY_SERVE_PROXY_HEALTH_CHECK_TIMEOUT_S` environment variable! Starting from version `2.50.0`, the environment variable will require a positive value. Setting `RAY_SERVE_PROXY_HEALTH_CHECK_TIMEOUT_S` to `-1.0`.
  PROXY_HEALTH_CHECK_TIMEOUT_S = get_env_float_non_zero_with_warning(
-- or
FutureWarning: Got unexpected value `0.0` for `RAY_SERVE_KV_TIMEOUT_S` environment variable! Starting from version `2.50.0`, the environment variable will require a positive value. Setting `RAY_SERVE_KV_TIMEOUT_S` to `None`.
  RAY_SERVE_KV_TIMEOUT_S = get_env_float_non_zero_with_warning(
```

If the input value is positive, no warning will be emitted.
- `None` default value support for env variables (introduced for the `RAY_SERVE_KV_TIMEOUT_S`)
- `todo` comment for removing the function: `todo: replace this function with 'get_env_float_positive' for the '2.50.0' release.`

Closes #55454

- [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [x] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

---------

Signed-off-by: axreldable <[email protected]>

commit de1494e57497b6c57037edf83044ee507fb80159
Author: akyang-anyscale <[email protected]>
Date: Fri Aug 15 09:34:30 2025 -0700

[serve] Refactor the router and handle (#55635)

Refactor Serve deployment handle and router.
--------- Signed-off-by: akyang-anyscale <[email protected]> commit d95ef0c74138e5a529b5f4b0134177d5aa9bdee0 Author: Lonnie Liu <[email protected]> Date: Fri Aug 15 00:00:49 2025 -0700 [ci] release test: use rayci to perform test init (#55629) so that rayci buildid can be populated Signed-off-by: Lonnie Liu <[email protected]> commit fe54c9554106b1e4b89c52833b6251143b0092e5 Author: Qiaolin Yu <[email protected]> Date: Thu Aug 14 22:02:24 2025 -0700 [ci] Add hook to clean the Ray address file before the test run starts (#54715) Co-authored-by: Jiajun Yao <[email protected]> commit 486935db5ede79b419623f29e2593c76a0df57c9 Author: Cuong Nguyen <[email protected]> Date: Thu Aug 14 19:25:41 2025 -0700 [core] add test rules for container tests (#55622) The `core: container` test is pretty flaky on premerge and block PRs from time to time. This PR add a test rule to only run this test on a change that touches `python/ray/runtime_env`. Test: - CI Signed-off-by: Cuong Nguyen <[email protected]> commit c62889c8d2c72e4e3466f31995c43d2f0189b10e Author: goutamvenkat-anyscale <[email protected]> Date: Thu Aug 14 18:53:49 2025 -0700 [Train] - Bump up test size for test_data_integration (#55633) Signed-off-by: Goutam V <[email protected]> commit c7c7e7c8fb99bd1081fe4949ccdff2614e6ce8ca Author: Elliot Barnwell <[email protected]> Date: Thu Aug 14 17:45:05 2025 -0700 [ci] upgrading uv binary and updating test (2/4) (#55626) - upgrading uv from 0.7.20 -> 0.8.10 to gain parity with uv used compile llm lock files job - updating unit test Signed-off-by: elliot-barn <[email protected]> commit 69f421884419c8c39a363eeb6b459bd77b6f0017 Author: Doyoung Kim <[email protected]> Date: Thu Aug 14 17:35:01 2025 -0700 [Serve] Add route_prefix field to DeploymentVersion (#55407) This PR adds `route_prefix` to `DeploymentVersion` class to allow robust light weight config update with `route_prefix`. 
--------- Signed-off-by: doyoung <[email protected]> commit f8ee5c9629f99c88af1e919a8ba2191a0c07f607 Author: Lonnie Liu <[email protected]> Date: Thu Aug 14 16:44:58 2025 -0700 [ci] pipe through `RAYCI_DISABLE_JAVA` for manylinux base image building (#55606) so that when we do not need java, we can skip installing JDK in the image. Signed-off-by: Lonnie Liu <[email protected]> commit 078d055ad2520b433db28ddc5e48a45bdc0d64a2 Author: Elliot Barnwell <[email protected]> Date: Thu Aug 14 16:44:08 2025 -0700 [ci] raydepsets changing load to build (1/4) (#55627) updating cli command from load to build Signed-off-by: elliot-barn <[email protected]> commit 21bc4528339420623c2f2a1958c7fb68b5dd8a8c Author: Dhyey Shah <[email protected]> Date: Thu Aug 14 14:42:57 2025 -0700 [core] Fix ubsan for publisher_test (#55621) Signed-off-by: dayshah <[email protected]> commit 1c55991ce455632e1ab9839cb4c25f3e4ddc379c Author: Cuong Nguyen <[email protected]> Date: Thu Aug 14 14:10:44 2025 -0700 [core][otel] change+simplify the feature flag for open telemetry (#55592) Change and simplify the feature flag to enable open telemetry. This will enable us to enable open telemetry for the next Ray release version, without worrying about messing up previous Ray release versions. Test: - CI Signed-off-by: Cuong Nguyen <[email protected]> commit fc4ace25a81cf68b71e21c00f1be2532d5c6c148 Author: Kevin H. 
Luu <[email protected]> Date: Thu Aug 14 13:59:45 2025 -0700 [release] Script to build custom BYOD image (#55577) Add `custom_byod_build` as a python binary that the Buildkite jobs can call to build & push custom BYOD images --------- Signed-off-by: kevin <[email protected]> commit 61bc2e8139e21429d487b0824391c26dcd596cc3 Author: Lonnie Liu <[email protected]> Date: Thu Aug 14 12:56:37 2025 -0700 [ci] read gce credentials file from global config when building anyscale images (#55580) rather than using the hard-coded filename Signed-off-by: Lonnie Liu <[email protected]> Signed-off-by: Lonnie Liu <[email protected]> commit 49d336cb332da4cdfff894e95ea6f0189f1b05ff Author: Seiji Eicher <[email protected]> Date: Thu Aug 14 11:53:36 2025 -0700 [Serve.llm] Improve PrefixCacheAffinityRouter text normalization compat (#55588) Signed-off-by: Seiji Eicher <[email protected]> commit 37158a22a44edb10d499b53d1f38f00315234a14 Author: harshit-anyscale <[email protected]> Date: Fri Aug 15 00:21:29 2025 +0530 skip test task processor for windows (#55616) - skipping test task processor for windows to unblock Signed-off-by: harshit <[email protected]> commit 400ea7716c50afe006ab69a5398fa5d3c2e08373 Author: Seiji Eicher <[email protected]> Date: Thu Aug 14 11:46:59 2025 -0700 [serve.llm][docs] Documentation for prefix cache-aware router (#55218) Signed-off-by: Seiji Eicher <[email protected]> Signed-off-by: Seiji Eicher <[email protected]> Co-authored-by: angelinalg <[email protected]> commit 6d7234b1b54ebc8d77ed9a127ce02b9ff4f9854c Author: coqian <[email protected]> Date: Thu Aug 14 11:06:05 2025 -0700 [Data] Update the export API to refresh the dataset and operator states (#55355)   This PR is a revert of [#55333](https://github.com/ray-project/ray/pull/55333) and resolves conflict by [#55163](https://github.com/ray-project/ray/pull/55163) Original description: Some frequently used metadata fields are missing in the export API schema: - For both dataset and operator: state, 
execution start and end time These fields are important for us to observe the lifecycle of the datasets and operators, and can be used to improve the accuracy of reported metrics, such as throughput, which relies on the duration.  Summary of change: - Add state, execution start and end time at the export API schema - Add a new state enum `PENDING` for dataset and operator, to represent the state when they are not running yet. - Refresh the metadata when ever the state of dataset/operator gets updated. And the event will always contains the latest snapshot of all the metadata.  - [X] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [X] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [X] Unit tests - [ ] Release tests - [ ] This PR is not tested :( Signed-off-by: cong.qian <[email protected]> commit 6a9938a73ff6d39ee72dcb68667a52b0ba658e8b Author: Mengjin Yan <[email protected]> Date: Thu Aug 14 11:05:39 2025 -0700 [Core] Add Logic to Check Label Selector in PG Scheduling (#55599) Signed-off-by: Mengjin Yan <[email protected]> commit c4d990cafe01ce4f6caec38e814217310fcc0a1c Author: Lonnie Liu <[email protected]> Date: Thu Aug 14 11:02:48 2025 -0700 [ci] add rayci build id tags for release test images (#55605) in addition to current tags. 
first step to migrate to use rayci build id tags to stop release test jobs from cross-talking to each other Signed-off-by: Lonnie Liu <[email protected]> commit af41960a49e85863709ef36fb4968f0021d730b3 Author: Stephanie Wang <[email protected]> Date: Thu Aug 14 10:02:16 2025 -0700 [core][gpu-object] Add a user-facing call to wait for tensor to be freed (#55076) This adds a call `ray.experimental.wait_tensor_freed` that allows user code to check when a tensor that it put into Ray's GPU object store has been freed. Unlike the normal Ray object store, the GPU object store is just a Python data structure on the actor, which allows us to avoid copying. This means that the actor can keep a reference to an object in its store. The API call allows the actor to check when the object has been freed from the store, so that it can safely write to the tensor again. Closes #52341. --------- Signed-off-by: Stephanie wang <[email protected]> Signed-off-by: Stephanie Wang <[email protected]> Co-authored-by: Kai-Hsun Chen <[email protected]> commit f0b0aadd65b3a842ed42ef870ac3067ea42f30af Author: Lonnie Liu <[email protected]> Date: Thu Aug 14 10:01:39 2025 -0700 [image] add base-extra for aarch64 images (#55586) for easier use on ray cluster hosters like anyscale. 
Signed-off-by: Lonnie Liu <[email protected]> commit ea27578265182b3b721b0b6b5a9f2d6a49e6e61b Author: Lonnie Liu <[email protected]> Date: Thu Aug 14 10:01:25 2025 -0700 [ci] remove unused `use_base_extra` (#55604) added incorrectly in a past change Signed-off-by: Lonnie Liu <[email protected]> commit 7518fd8be262c5f1bdc8246e0a3c5cc7db5d1bd6 Author: Jun-Hao Wan <[email protected]> Date: Fri Aug 15 00:09:47 2025 +0800 [Doc][KubeRay] Add InteractiveMode description for `ray-job-quick-start.md` (#55570) Signed-off-by: win5923 <[email protected]> Signed-off-by: Kai-Hsun Chen <[email protected]> Co-authored-by: Kai-Hsun Chen <[email protected]> commit 6afaeda7dc7eb700076ae98b5b356568a293cde2 Author: simonsays1980 <[email protected]> Date: Thu Aug 14 17:08:01 2025 +0200 [RLlib] Add docs for Implicit Q-Learning. (#55422) commit 4b6dba34d50d647a7929b1e9079954511a69c759 Author: Lonnie Liu <[email protected]> Date: Thu Aug 14 00:59:20 2025 -0700 [ci] fix incorrect ml-baseextra depends_on (#55596) to depends on the right wanda job Signed-off-by: Lonnie Liu <[email protected]> commit e4410d09cd0de2a7b2e6e507c12b92d2741cd6ea Author: Nikhil G <[email protected]> Date: Wed Aug 13 22:52:11 2025 -0700 [serve.llm] fix: improve error handling for invalid model_id (#55589) Signed-off-by: Nikhil Ghosh <[email protected]> commit 02340e1f402b8ebde104e92c9941b149e5555acb Author: harshit-anyscale <[email protected]> Date: Thu Aug 14 10:21:53 2025 +0530 add support for async inference (#54824) This PR aims to provide basic support for asynchronous inference in the ray serve. RFC can be found at: https://github.com/ray-project/ray/issues/54652 The PR doesn't contains all the implementation pieces as having all the code changes in a single PR would be very difficult to review. Missing pieces are - implementation of failed and unprocessed task queue for the celery task processor - add more detailed and thorough tests for the same. 
These missing pieces will be taken care of in the subsequent PRs. --------- Signed-off-by: harshit <[email protected]> commit 4dd73213096635cf78a1a69db84f244bb05ec50f Author: lkchen <[email protected]> Date: Wed Aug 13 21:39:54 2025 -0700 [data.llm] Add FAQ to doc, explain STRICT_PACK strategy used in data.llm (#55505) Signed-off-by: Linkun <[email protected]> commit 15887001ded1eca621f6890952c5c2a90d4e58a8 Author: Joshua Lee <[email protected]> Date: Wed Aug 13 20:56:08 2025 -0700 [core] Store local_raylet_rpc_client in raylet_client_pool (#55490) Signed-off-by: joshlee <[email protected]> commit fd681ee6e3a74f08918eec34ea7a5d2f9b502f39 Author: Elliot Barnwell <[email protected]> Date: Wed Aug 13 20:36:49 2025 -0700 [ci] raydepsets: implementing build arg sets (2/2) (#55423) 1/2 here: https://github.com/ray-project/ray/pull/55408 - implementing get depset by name and optional build arg set - adding unit tests --------- Signed-off-by: elliot-barn <[email protected]> Signed-off-by: Elliot Barnwell <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> commit f677f564cc56c07e7c93d29c33e2f7314ef34fa1 Author: Dhyey Shah <[email protected]> Date: Wed Aug 13 19:42:02 2025 -0700 [core] Improve gcs publish perf + clean up publisher in general (#55560) This PR is focused on two things removing a lot of unnecessary copies when publishing from the GCS + when subscribing to the GCS from + cleaning up publisher related code, e.g. publish functions took callbacks that were always nullptr, always returned Status::OK, etc. There's no actual functional changes in this PR. 
Copy killing that matters: https://github.com/ray-project/ray/blob/4e5f03e7a1d06b9da8f3a9329400d426055f8ea4/src/ray/gcs/gcs_server/pubsub_handler.cc#L49-L59 Every GCS publish will result in an extra copy here because the `pubsub_reply` we create is heap allocated while the actual reply is arena allocated, so the swap will result in a copy of everything every time we publish to every subscriber. Also, there were multiple extra copies of messages inside gcs_pub_sub.cc when the PythonGcsPublisher publishes and when the PythonGcsSubscriber gets messages. --------- Signed-off-by: dayshah <[email protected]> commit 1699dc367f71ac05db8486ac70758090c37403a7 Author: Neil Girdhar <[email protected]> Date: Wed Aug 13 21:33:45 2025 -0400 Suppress type error (#50994) Signed-off-by: Neil Girdhar <[email protected]> Co-authored-by: matthewdeng <[email protected]> commit ceaa4fb6f5db3189f77a1ed0f2c407de47ce4792 Author: Rui Qiao <[email protected]> Date: Wed Aug 13 18:22:54 2025 -0700 [Serve.llm] Use DEFAULT_MAX_ONGOING_REQUESTS for DPServer (#55583) Signed-off-by: Rui Qiao <[email protected]> commit 54ae92386d2b4600e1a9327b4f83c4c48742a412 Author: Timothy Seah <[email protected]> Date: Wed Aug 13 17:40:01 2025 -0700 [train] Change DEFAULT variables from strings to bools (#55581) All of these constants are used as the default value of [`env_bool`](https://github.com/ray-project/ray/blob/master/python/ray/_private/ray_constants.py#L41), which returns a bool. Technically this is a no-op since "1" evaluates to True anyway, but this is misleading because "0" actually also evaluates to True. 
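The truthiness point in the commit message above can be checked in plain Python. This is a minimal sketch, not Ray's code: `env_flag_sketch` and the variable name `SKETCH_UNSET_FLAG` are hypothetical stand-ins, and the helper only assumes the general shape of "parse the environment variable if set, otherwise return the default unchanged".

```python
import os


def env_flag_sketch(key: str, default):
    """Hypothetical helper: parse an env var as a bool, else return `default` as-is."""
    if key in os.environ:
        return os.environ[key].lower() in ("1", "true")
    return default


# Make sure the illustrative variable is unset so the default path is taken.
os.environ.pop("SKETCH_UNSET_FLAG", None)

# A non-empty string default is truthy even when it reads "0".
string_default = env_flag_sketch("SKETCH_UNSET_FLAG", "0")
# A real bool default behaves the way the name suggests.
bool_default = env_flag_sketch("SKETCH_UNSET_FLAG", False)

print(bool(string_default), bool(bool_default))
```

With a string default, any `if` check on the result takes the "enabled" branch whenever the variable is unset, which is why bool defaults are the less surprising choice.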
Signed-off-by: Timothy Seah <[email protected]> Co-authored-by: Timothy Seah <[email protected]> commit 9838ad64d43dbd25b77acfd834500cd96f793e28 Author: yi wang <[email protected]> Date: Thu Aug 14 08:32:54 2025 +0800 [DOC][Tune] fix: remove extra space in tune documentation (#55125) Signed-off-by: my-vegetable-has-exploded <[email protected]> Co-authored-by: matthewdeng <[email protected]> commit 1216e15c32de9ab44cbc9c5532b0571c6499732f Author: Elliot Barnwell <[email protected]> Date: Wed Aug 13 17:06:31 2025 -0700 [ci] raydepsets: implementing build arg sets (1/2) (#55408) - converting build arg sets into a dictionary instead of a list - updating naming convention for depsets with build_arg_sets ( suffix: _${BUILD_ARG_SET} for depset name in the config) - adding unit tests --------- Signed-off-by: elliot-barn <[email protected]> Signed-off-by: Elliot Barnwell <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> commit ecc4c93af0308ccf4b5e08135865766e9a1fbd30 Author: Lonnie Liu <[email protected]> Date: Wed Aug 13 16:35:17 2025 -0700 [image] add base-extra layer (#55513) this the layer required to run on anyscale cloud and for running in ray release tests. we have been sourcing this layer from a tarball in s3; this change builds it from the source. Signed-off-by: Lonnie Liu <[email protected]> commit 3e34885814e4da9a83123e22b042a7ee684074ad Author: Kishanthan Thangarajah <[email protected]> Date: Wed Aug 13 19:21:05 2025 -0400 [serve] Support custom autoscaling at deployment level for ray serve (#55253) This PR adds initial changes to support custom auto scaling with ray serve. Two new classes (AutoscalingContext and AutoscalingPolicy) have been introduced as per discussions in https://docs.google.com/document/d/1KtMUDz1O3koihG6eh-QcUqudZjNAX3NsqqOMYh3BoWA/edit?usp=sharing. Related RFC https://github.com/ray-project/ray/issues/41135#issuecomment-3156717488 The changes will have two phases. 
Phase1 is to add required changes to support custom autoscaling at deployment level. Phase2 is to extend the changes to support custom autoscaling at application level. This PR is part of Phase1 (deployment level custom autoscaling). Related to #41135 - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [x] I've run `scripts/format.sh` to lint the changes in this PR. - [x] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Kishanthan Thangarajah <[email protected]> commit 2c7bd7d06930e5cc302a01c5baedef43911e3582 Author: Dhyey Shah <[email protected]> Date: Wed Aug 13 14:35:25 2025 -0700 [core][ci] Kill debug wheel step (#55571) Signed-off-by: dayshah <[email protected]> commit 52bef607fd4349e70a1874fb2d6a8a9f6d447111 Author: Matvei Pashkovskii <[email protected]> Date: Thu Aug 14 00:10:21 2025 +0300 [Serve.llm] Add LMCacheConnectorV1 support for kv_transfer_config (#54579) Signed-off-by: Matvei Pashkovskii <[email protected]> Signed-off-by: Kourosh Hakhamaneshi <[email protected]> Co-authored-by: Kourosh Hakhamaneshi <[email protected]> commit 32304ab50a5f1c94504d2610a338fef1e84ecef7 Author: Lonnie Liu <[email protected]> Date: Wed Aug 13 13:21:41 2025 -0700 [release test] remove "multi" test frequency (#55561) not used anywhere any more Signed-off-by: Lonnie Liu <[email protected]> commit c47048e6ebf1b7a705cdb1be18b027889623e1a4 Author: Cuong Nguyen <[email protected]> Date: Wed Aug 13 12:56:01 2025 -0700 [core][obsclean/02] de-static more internal ray metrics (#55537) Ray core 
currently offers two APIs for defining internal metrics: a static object-oriented (OO) API and a template/extern-based API. The OO API is also used for defining custom metrics at the Ray application level, and I personally find it easier to read. This series of PRs aims to unify all metric definitions under the OO API. --------- This PR migrates **all** metric from static to runtime definition, as part of the effort to eliminate all statically defined metrics. Currently, the OO interface attempts to register a metric at the same time its first value is recorded, due to the [C++ static initialization order fiasco](https://en.cppreference.com/w/cpp/language/siof.html), which is awkward and potentially inefficient. We can fix this by removing all statically defined metrics. Test: - CI --------- Signed-off-by: Cuong Nguyen <[email protected]> Signed-off-by: Cuong Nguyen <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> commit 6ebd7d013933dfa990b11ffcad63cfd6f78db6cd Author: iamjustinhsu <[email protected]> Date: Wed Aug 13 12:22:56 2025 -0700 [data] Sanitization of Dataset Metadata Export (#55379)   A couple of things that have been improved - updating structs should have string keys - More tests for bytes, bytearrays, dataclasses   - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. 
Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: iamjustinhsu <[email protected]> commit 3d44e3d17b56e993f1fd7407bdf1288c852c8c41 Author: Mengjin Yan <[email protected]> Date: Wed Aug 13 11:57:54 2025 -0700 [Core][TaskEventFollowup/03] Improve the Target Http Endpoint in Aggregator Agent (#55529) This PR improves the target http endpoint in the aggregator_agent.py: Merge the address and port as one env var to specify the target http endpoint Set the default value of the endpoint to be empty. And only when the endpoint is specified, we send the events out to the endpoint Update corresponding tests ----------- Signed-off-by: Mengjin Yan <[email protected]> Signed-off-by: myan <[email protected]> commit 8d810e2667fc728e45ca990ff7d7dc8547eae99b Author: Alexey Kudinkin <[email protected]> Date: Wed Aug 13 14:32:25 2025 -0400 [Data] Fixing `AutoscalingActorPool` to properly downscale upon completion of the execution (#55565)   In 2.48 change introduced debouncing handling that disallows downscaling for Actor Pool for 30s after latest upscaling to give AP Operator enough time to start utilizing upscaled actor. However, that affected ability of the Actor Pool to downscale upon completion of the execution: when operator completes execution it should start downscaling immediately. This change addresses that.  - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. 
Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Alexey Kudinkin <[email protected]> commit 64feab4b01583023cec89bc2d199b0ff0de4c3cd Author: Ryan O'Leary <[email protected]> Date: Wed Aug 13 18:09:39 2025 +0000 [Train] Implement a JaxTrainer to support SPMD with TPUs (#55207) This PR builds off previous efforts to add a `JaxTrainer` and the [ray-tpu package](https://github.com/AI-Hypercomputer/ray-tpu/tree/main) to implement support for a `JaxTrainer` in RayTrain that supports SPMD workloads with TPUs. Support for more types of workloads (i.e. better support for CPU and GPU) can be added incrementally. In order to support SPMD locality-aware scheduling at the TPU slice level, we alter the `WorkerGroup` construction in V2 Ray Train to optionally accept multiple placement groups specs to apply to a range of workers. This enables us to reserve the "TPU head" using a placement group with label selectors, retrieve its unique `ray.io/tpu-slice-name`, and then schedule the remaining workers on that slice in a separate placement group. --------- Signed-off-by: Ryan O'Leary <[email protected]> Signed-off-by: Andrew Sy Kim <[email protected]> Co-authored-by: Andrew Sy Kim <[email protected]> commit 6d318ce84ddeacf67dc0c66f6e2fb6f6a8fef2e4 Author: Rui Qiao <[email protected]> Date: Wed Aug 13 10:57:54 2025 -0700 [Serve.llm] Add missing data_parallel/__init__.py (#55573) Signed-off-by: Rui Qiao <[email protected]> commit 3c1314afb82128f30e5a445462c7277717e62863 Author: William Lin <[email protected]> Date: Wed Aug 13 10:55:47 2025 -0700 [docs] Add documentation for using type hints in Ray Core (#55013)     - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. 
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: will.lin <[email protected]>
Signed-off-by: Richard Liaw <[email protected]>
Co-authored-by: Richard Liaw <[email protected]>

commit a24defd4c4773879a834762ba414d3c0cea9b1e9
Author: Lonnie Liu <[email protected]>
Date: Wed Aug 13 10:51:07 2025 -0700

[release test] remove release image build step from postmerge (#55564)

they should be always building from release test pipeline directly

we used to run release tests on postmerge; we are no longer doing it any more.

also add oss tag for those steps.

Signed-off-by: Lonnie Liu <[email protected]>

commit dda42b2d97768dbebdbaf766a7ed2e2e2372cc8b
Author: William Lin <[email protected]>
Date: Wed Aug 13 08:36:19 2025 -0700

[core] Add return type to ActorClass.options (#55563)

Currently the following pattern throws many lint errors as `ActorDemoRay.options(name="demo_ray")` returns an instance of `ActorOptionWrapper` which messes with the IDE's static type checker:

```python
import ray
from ray import ObjectRef
from ray.actor import ActorProxy, ActorClass


class DemoRay:
    def __init__(self, init: int):
        self.init = init

    @ray.method
    def calculate(self, v1: int, v2: int) -> int:
        return self.init + v1 + v2


ActorDemoRay: ActorClass[DemoRay] = ray.remote(DemoRay)


def main():
    p: ActorProxy[DemoRay] = ActorDemoRay.options(name="demo_ray").remote(1)
    actor: ActorProxy[DemoRay] = ray.get_actor("demo_ray")
    a = actor.calculate.remote(1, 2)
    print(ray.get(a))
    return


if __name__ == "__main__":
    main()
```

This PR changes `ActorClass[T].options(...)` to return a new instance of `ActorClass[T]` instead, allowing IDEs to correctly infer the type of subsequent `.remote(...)` calls.

https://github.com/ray-project/ray/issues/54149

---------

Signed-off-by: will.lin <[email protected]>
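The typing idea in the last commit message can be illustrated without Ray. The sketch below is an assumption-laden stand-in, not Ray's implementation: `ActorClassSketch` and `DemoSketch` are hypothetical names, and `remote` here simply constructs the object locally. It only shows why having `options(...)` return the same generic type lets a static type checker keep track of `T` through the chained call.

```python
from typing import Any, Dict, Generic, Optional, Type, TypeVar

T = TypeVar("T")


class ActorClassSketch(Generic[T]):
    """Hypothetical generic wrapper; not Ray's ActorClass."""

    def __init__(self, cls: Type[T], options: Optional[Dict[str, Any]] = None) -> None:
        self._cls = cls
        self._options: Dict[str, Any] = dict(options or {})

    def options(self, **overrides: Any) -> "ActorClassSketch[T]":
        # Returning the same generic type preserves T for the next call in the chain.
        return ActorClassSketch(self._cls, {**self._options, **overrides})

    def remote(self, *args: Any, **kwargs: Any) -> T:
        # Stand-in for actor creation: builds the object in-process.
        return self._cls(*args, **kwargs)


class DemoSketch:
    def __init__(self, init: int) -> None:
        self.init = init

    def calculate(self, v1: int, v2: int) -> int:
        return self.init + v1 + v2


wrapper = ActorClassSketch(DemoSketch)
# A type checker can infer `demo: DemoSketch` here because options() keeps T.
demo = wrapper.options(name="demo_sketch").remote(1)
result = demo.calculate(1, 2)
```

If `options` instead returned an unrelated, non-generic wrapper type, the checker would lose `T` at that point and could not validate the arguments or return type of the following `.remote(...)` call.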

Signed-off-by: Andrew Grosser <[email protected]>

Added docs for Implicit Q-Learning.

3a3504c

Signed-off-by: simonsays1980 <[email protected]>

simonsays1980 marked this pull request as ready for review August 8, 2025 15:22

simonsays1980 requested review from a team as code owners August 8, 2025 15:22

gemini-code-assist bot reviewed Aug 8, 2025

simonsays1980 added 4 commits August 11, 2025 13:58

Fixed some formatting bugs in docs.

0def7d0

Signed-off-by: simonsays1980 <[email protected]>

Merge branch 'master' into docs-implicit-q-learning

8576fb6

Merge branch 'master' into docs-implicit-q-learning

b2520c0

Fixed two little typos.

bc6f673

Signed-off-by: simonsays1980 <[email protected]>

sven1977 changed the title ~~[RLlib] - Add docs for Implicit Q-Learning~~ [RLlib] Add docs for Implicit Q-Learning. Aug 14, 2025

sven1977 reviewed Aug 14, 2025

sven1977 approved these changes Aug 14, 2025

sven1977 enabled auto-merge (squash) August 14, 2025 13:49

github-actions bot added the go add ONLY when ready to merge, run all tests label Aug 14, 2025

sven1977 merged commit 6afaeda into ray-project:master Aug 14, 2025
7 checks passed

daiping8 pushed a commit to daiping8/ray that referenced this pull request Aug 15, 2025

[RLlib] Add docs for Implicit Q-Learning. (ray-project#55422)

856dd5a

daiping8 pushed a commit to daiping8/ray that referenced this pull request Aug 15, 2025

[RLlib] Add docs for Implicit Q-Learning. (ray-project#55422)

2086a5f

dioptre pushed a commit to sourcetable/ray that referenced this pull request Aug 20, 2025

[RLlib] Add docs for Implicit Q-Learning. (ray-project#55422)

10aaf03

Signed-off-by: Andrew Grosser <[email protected]>
