
dq/runtime/output_channel: relax requirement on double watermark/checkpoint #21459


Open
wants to merge 3 commits into
base: main

Conversation

yumkam
Collaborator

@yumkam yumkam commented Jul 21, 2025

Changelog entry

...

Changelog category

  • Not for changelog (changelog entry is not required)

Description for reviewers

Fixes YQ-4351. The two cases are similar but are handled slightly differently.
For checkpoints, we cannot lose a checkpoint and cannot reorder checkpoints relative to data. So if the channel contains any untransferred checkpoints, more checkpoints may be added, but no data may be placed between them.
For watermarks, we can move a watermark behind new data and merge several watermarks into one.
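
The rules above can be illustrated with a minimal, self-contained sketch. It is not the actual TDqOutputChannel code: the class, its method names, and the `WatermarkSketch`/`CheckpointSketch` structs are hypothetical stand-ins, rows are plain `int`s, and merging watermarks by keeping the largest timestamp is an assumption of this sketch. The relative order of a pending watermark and pending checkpoints is not modeled.

```cpp
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical stand-ins for NDqProto::TWatermark and NDqProto::TCheckpoint.
struct WatermarkSketch { std::uint64_t TimestampUs; };
struct CheckpointSketch { std::uint64_t Id; };

class OutputChannelSketch {
public:
    // Data may not follow an untransferred checkpoint: accepting it would
    // put data between checkpoints. Returns false when the row is rejected.
    bool PushData(int row) {
        if (!Checkpoints_.empty()) {
            return false;
        }
        Data_.push_back(row);
        return true;
    }

    // A pending watermark is merged with the new one. Keeping the larger
    // timestamp is an assumption made for this sketch.
    void PushWatermark(WatermarkSketch w) {
        if (!Watermark_ || Watermark_->TimestampUs < w.TimestampUs) {
            Watermark_ = w;
        }
    }

    // Checkpoints are queued in arrival order; none is merged or dropped.
    void PushCheckpoint(CheckpointSketch c) {
        Checkpoints_.push_back(c);
    }

    std::optional<int> PopData() {
        if (Data_.empty()) {
            return std::nullopt;
        }
        int row = Data_.front();
        Data_.pop_front();
        return row;
    }

    // The watermark is released only once all buffered data is gone, so it
    // effectively moves behind any data pushed after it.
    std::optional<WatermarkSketch> PopWatermark() {
        if (!Data_.empty() || !Watermark_) {
            return std::nullopt;
        }
        WatermarkSketch w = *Watermark_;
        Watermark_.reset();
        return w;
    }

    // Checkpoints leave in FIFO order, after the data that preceded them.
    std::optional<CheckpointSketch> PopCheckpoint() {
        if (!Data_.empty() || Checkpoints_.empty()) {
            return std::nullopt;
        }
        CheckpointSketch c = Checkpoints_.front();
        Checkpoints_.pop_front();
        return c;
    }

private:
    std::deque<int> Data_;
    std::optional<WatermarkSketch> Watermark_;
    std::deque<CheckpointSketch> Checkpoints_;
};
```

In this sketch a single optional slot is enough for watermarks because they can be merged, while checkpoints need a queue because each one must be delivered.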


github-actions bot commented Jul 21, 2025

2025-07-21 18:56:17 UTC Pre-commit check linux-x86_64-relwithdebinfo for d952a0e has started.
2025-07-21 18:56:21 UTC Artifacts will be uploaded here
2025-07-21 18:59:40 UTC ya make is running...
🟡 2025-07-21 20:11:31 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
37183 34301 0 4 2850 28

2025-07-21 20:14:52 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-07-21 20:22:57 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
454 (only retried tests) 358 0 0 76 20

🟢 2025-07-21 20:23:05 UTC Build successful.
🟢 2025-07-21 20:23:23 UTC ydbd size 2.2 GiB changed* by +15.4 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 5a230bc merge: d952a0e diff diff %
ydbd size 2 397 668 560 Bytes 2 397 684 312 Bytes +15.4 KiB +0.001%
ydbd stripped size 501 429 928 Bytes 501 433 576 Bytes +3.6 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation


github-actions bot commented Jul 21, 2025

2025-07-21 18:56:56 UTC Pre-commit check linux-x86_64-release-asan for d952a0e has started.
2025-07-21 18:57:00 UTC Artifacts will be uploaded here
2025-07-21 19:00:23 UTC ya make is running...
🟡 2025-07-21 20:48:49 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14745 14316 0 114 294 21

🟢 2025-07-21 20:50:10 UTC Build successful.
🟢 2025-07-21 20:50:37 UTC ydbd size 3.9 GiB changed* by +15.5 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 5a230bc merge: d952a0e diff diff %
ydbd size 4 215 503 176 Bytes 4 215 519 000 Bytes +15.5 KiB +0.000%
ydbd stripped size 1 460 960 184 Bytes 1 460 962 808 Bytes +2.6 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation


github-actions bot commented Jul 21, 2025

🟢 2025-07-22 14:53:48 UTC The validation of the Pull Request description is successful.

@yumkam yumkam marked this pull request as ready for review July 22, 2025 13:31
@yumkam yumkam requested review from a team as code owners July 22, 2025 13:31

github-actions bot commented Jul 22, 2025

2025-07-22 13:32:34 UTC Pre-commit check linux-x86_64-release-asan for 053a1d8 has started.
2025-07-22 13:32:47 UTC Artifacts will be uploaded here
2025-07-22 13:36:02 UTC ya make is running...
🟡 2025-07-22 15:42:35 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14750 14346 0 121 262 21

🟢 2025-07-22 15:43:51 UTC Build successful.
🟢 2025-07-22 15:44:22 UTC ydbd size 3.9 GiB changed* by +41.7 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 6618a16 merge: 053a1d8 diff diff %
ydbd size 4 215 911 000 Bytes 4 215 953 656 Bytes +41.7 KiB +0.001%
ydbd stripped size 1 461 060 696 Bytes 1 461 076 632 Bytes +15.6 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation


github-actions bot commented Jul 22, 2025

2025-07-22 13:32:48 UTC Pre-commit check linux-x86_64-relwithdebinfo for 053a1d8 has started.
2025-07-22 13:32:52 UTC Artifacts will be uploaded here
2025-07-22 13:36:20 UTC ya make is running...
🟡 2025-07-22 15:08:43 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
37189 34468 0 3 2691 27

2025-07-22 15:12:21 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-07-22 15:22:58 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
215 (only retried tests) 195 0 0 0 20

🟢 2025-07-22 15:23:06 UTC Build successful.
🟢 2025-07-22 15:23:24 UTC ydbd size 2.2 GiB changed* by +30.5 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 6618a16 merge: 053a1d8 diff diff %
ydbd size 2 397 933 224 Bytes 2 397 964 448 Bytes +30.5 KiB +0.001%
ydbd stripped size 501 468 232 Bytes 501 474 440 Bytes +6.1 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@@ -478,7 +482,7 @@ class TDqOutputChannel : public IDqOutputChannel {
 bool Finished = false;

 TMaybe<NDqProto::TWatermark> Watermark;
-TMaybe<NDqProto::TCheckpoint> Checkpoint;
+TDeque<NDqProto::TCheckpoint> Checkpoints;
Collaborator

Are two checkpoints possible only if inflight in the config is greater than 1? Right now only one checkpoint travels through the graph at a time.

Collaborator Author

@yumkam yumkam Jul 23, 2025

There is an edge case: in queries that do not write to a topic, the result is written to the channel.
A new checkpoint is started once the previous checkpoint has passed through all tasks and all sinks have reported their writes.
In this case there are no sinks: the last task has written to the channel, and if the channel gets stuck for some reason, a second checkpoint can arrive in it.
It is a bit of a shame to bring in a TDeque just for this, but[...]
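
The edge case in this reply can be replayed with a small sketch: a stalled channel from which nothing is popped while two checkpoints arrive in a row. Both storage types and the helper below are hypothetical. In particular, the single-slot type rejecting a second checkpoint is an assumption about how a one-element slot would behave, not a description of the previous code.

```cpp
#include <cstdint>
#include <deque>
#include <optional>

// Single-slot storage (hypothetical): holds at most one pending checkpoint.
struct SingleSlotSketch {
    std::optional<std::uint64_t> Pending;

    bool Push(std::uint64_t id) {
        if (Pending) {
            return false;  // the slot is still occupied, nothing was popped
        }
        Pending = id;
        return true;
    }
};

// Queue storage (hypothetical): keeps every pending checkpoint in order.
struct QueueSketch {
    std::deque<std::uint64_t> Pending;

    bool Push(std::uint64_t id) {
        Pending.push_back(id);
        return true;
    }
};

// Replays the stalled-channel scenario: checkpoint 1 arrives, the channel is
// never drained, then checkpoint 2 arrives. Returns whether checkpoint 2 was
// accepted by the given storage.
template <typename TStorage>
bool SecondCheckpointAccepted() {
    TStorage storage;
    if (!storage.Push(1)) {
        return false;
    }
    return storage.Push(2);
}
```

Under these assumptions `SecondCheckpointAccepted<SingleSlotSketch>()` is false and `SecondCheckpointAccepted<QueueSketch>()` is true, which is the difference the queue is meant to cover.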
