refactor: Integrate the materialized CTE into the plan and pipeline #18226

Merged: 78 commits into databendlabs:main on Jul 27, 2025

Conversation

SkyFan2002 (Member) commented on Jun 23, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR changes how materialized CTEs are executed. Previously, a temporary table was created for each materialized CTE in the bind phase, which had several drawbacks:

  1. The optimizer could not optimize the CTE and the main query jointly.
  2. The output of "explain" and "profile" was inaccurate.
  3. Automatic query-level memory spilling was not possible for materialized CTEs.

With this change, materialized CTEs are part of the plan and the pipeline, so they take part in planning and optimization together with the main query.
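
The idea can be pictured with a toy model. The sketch below is not Databend's implementation. It is a minimal Python illustration, assuming that a `Sequence` node runs its left child (a `MaterializedCTE`) to completion before its right child, and that each `CTEConsumer` reads the stored rows by name. `Scan`, `Concat`, `execute`, and the `store` dictionary are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Scan:
    """Leaf node: produces rows by calling `source`."""
    source: Callable[[], List[int]]


@dataclass
class CTEConsumer:
    """Reads rows that a MaterializedCTE node stored earlier."""
    cte_name: str


@dataclass
class MaterializedCTE:
    """Runs `child` once and stores its rows under `cte_name`."""
    cte_name: str
    child: object


@dataclass
class Sequence:
    """Runs `left` to completion, then returns the rows of `right`."""
    left: object
    right: object


@dataclass
class Concat:
    """Hypothetical two-input operator standing in for a join."""
    left: object
    right: object


def execute(plan, store: Dict[str, List[int]]) -> List[int]:
    """Evaluate a toy plan tree; `store` holds materialized CTE rows."""
    if isinstance(plan, Scan):
        return plan.source()
    if isinstance(plan, CTEConsumer):
        return list(store[plan.cte_name])
    if isinstance(plan, MaterializedCTE):
        store[plan.cte_name] = execute(plan.child, store)
        return []
    if isinstance(plan, Sequence):
        execute(plan.left, store)
        return execute(plan.right, store)
    if isinstance(plan, Concat):
        return execute(plan.left, store) + execute(plan.right, store)
    raise TypeError(f"unknown plan node: {plan!r}")


scan_calls = {"count": 0}


def numbers_10() -> List[int]:
    """Stand-in for scanning numbers(10); counts how often it runs."""
    scan_calls["count"] += 1
    return list(range(10))


plan = Sequence(
    left=MaterializedCTE("t1", Scan(numbers_10)),
    right=Concat(CTEConsumer("t1"), CTEConsumer("t1")),
)
result = execute(plan, {})
```

In this model the scan runs once even though `t1` is consumed twice. Because the CTE is an ordinary node in the tree, a planner can see it next to the main query, which is the property the summary describes.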

Example

explain with t1 as materialized (select number as a from numbers(10)), t2 as materialized (select a as b from t1) select t1.a from t1 join t2 on t1.a = t2.b;
----
-[ EXPLAIN ]-----------------------------------
Sequence
├── MaterializedCTE: t1
│   └── TableScan
│       ├── table: default.system.numbers
│       ├── output columns: [number (#3)]
│       ├── read rows: 10
│       ├── read size: < 1 KiB
│       ├── partitions total: 1
│       ├── partitions scanned: 1
│       ├── push downs: [filters: [], limit: NONE]
│       └── estimated rows: 10.00
└── Sequence
    ├── MaterializedCTE: t2
    │   └── CTEConsumer
    │       ├── cte_name: t1
    │       └── cte_schema: [number (#2)]
    └── HashJoin
        ├── output columns: [numbers.number (#0)]
        ├── join type: INNER
        ├── build keys: [t2.b (#1)]
        ├── probe keys: [t1.a (#0)]
        ├── keys is null equal: [false]
        ├── filters: []
        ├── build join filters:
        │   └── filter id:0, build key:t2.b (#1), probe key:t1.a (#0), filter type:bloom,inlist,min_max
        ├── estimated rows: 0.00
        ├── CTEConsumer(Build)
        │   ├── cte_name: t2
        │   └── cte_schema: [number (#1)]
        └── CTEConsumer(Probe)
            ├── cte_name: t1
            └── cte_schema: [number (#0)]

32 rows explain in 0.048 sec. Processed 0 rows, 0 B (0 row/s, 0 B/s)
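
To make the HashJoin node in the plan above concrete, here is a minimal Python sketch of an inner hash join with t2 as the build side and t1 as the probe side, as the plan reports. It illustrates the general technique and is not Databend's operator. `hash_join` and the row dictionaries are hypothetical, and the runtime filters listed in the plan (bloom, inlist, min_max) are omitted.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner hash join: index the build side, then stream the probe side."""
    table = defaultdict(list)
    for row in build_rows:
        key = row[build_key]
        # NULL keys never match, mirroring "keys is null equal: [false]".
        if key is not None:
            table[key].append(row)

    joined = []
    for row in probe_rows:
        key = row[probe_key]
        if key is None:
            continue
        for match in table.get(key, []):
            joined.append({**row, **match})
    return joined


# t1: select number as a from numbers(10)
t1 = [{"a": n} for n in range(10)]
# t2: select a as b from t1
t2 = [{"b": row["a"]} for row in t1]

joined = hash_join(build_rows=t2, probe_rows=t1, build_key="b", probe_key="a")
result = [row["a"] for row in joined]
```

Every value of `t1.a` has exactly one match in `t2.b`, so the sketch yields one output row per input row.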

#18420
#18125

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions bot added the pr-refactor label (this PR changes the code base without new features or bugfix) on Jun 23, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Jun 30, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Jul 23, 2025
@SkyFan2002 marked this pull request as ready for review on July 23, 2025 10:20
@SkyFan2002 added the ci-cloud label (Build docker image for cloud test) on Jul 23, 2025
Contributor

Docker Image for PR

  • tag: pr-18226-c69732b-1753268114

note: this image tag is only available for internal use.

@zhang2014 added the ci-benchmark label (Benchmark: run all test) on Jul 25, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Jul 25, 2025
@zhang2014 zhang2014 merged commit 005469a into databendlabs:main Jul 27, 2025
86 checks passed