Skip to content

fix: better experience for create table part of #92 #93

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

yihong0618
Copy link
Contributor

This patch is part of #92 for better user experience

reason:

  • logic the same as create_fts_index so it maybe fine
  • most of the demo case do not need add index
  • faster the test..

@breezewish
Copy link
Member

breezewish commented Jun 5, 2025

Thank you for the contribution!

Currently we have a plan to transit from the imperative way (e.g. create_fts_index) into a declarative + explicit way (define index in the model), in order to make the behavior more obvious when the code snippet is going to be executed again. The planned syntax is like:

class Model(...)
    col = .... (index=....)

table = db.create_table(Model, mode="overwrite")

# table = db.get_table(Model)
# table = db.create_table --> Always error when table exists

As you can see, the new syntax is very clear that, for each time the code is executed the final schema will be exactly what code is defined (with data cleared of course), and create is going to throw error if something already exists.

Meanwhile, the implicit + imperative API is going to be either deprecated, or enhanced to be more clear what will happen when it is executed for the second time:

table.create_xxx_index(...) # What will happen if it has already been executed once? Ideally
                            # we should only create index *once*, because duplicated index
                            # are not efficient. How to avoid duplication and can be learned
                            # from API? We have not decided yet :)

During the transition, we may not be able to merge this PR very soon, sorry for the inconvenience!


BTW, I see that what you would like is to allow opting-out the vector index. Considering that #92 is going to be fixed soon, do you have some other user cases that no vector index is better? We want to make sure that we have most use cases covered while keeping as simple as possible, so your feedback will be very valueable. Thank you!

@yihong0618
Copy link
Contributor Author

Thank you for the contribution!

Currently we have a plan to transit from the imperative way (e.g. create_fts_index) into a declarative + explicit way (define index in the model), in order to make the behavior more obvious when the code snippet is going to be executed again. The planned syntax is like:

class Model(...)

    col = .... (index=....)



table = db.create_table(Model, mode="overwrite")



# table = db.get_table(Model)

# table = db.create_table --> Always error when table exists

As you can see, the new syntax is very clear that, for each time the code is executed the final schema will be exactly what code is defined (with data cleared of course), and create is going to throw error if something already exists.

Meanwhile, the implicit + imperative API is going to be either deprecated, or enhanced to be more clear what will happen when it is executed for the second time:

table.create_xxx_index(...) # What will happen if it has already been executed once? Ideally

                            # we should only create index *once*, because duplicated index

                            # are not efficient. How to avoid duplication and can be learned

                            # from API? We have not decided yet :)

During the transition, we may not be able to merge this PR very soon, sorry for the inconvenience!


BTW, I see that what you would like is to allow opting-out the vector index. Considering that #92 is going to be fixed soon, do you have some other user cases that no vector index is better? We want to make sure that we have most use cases covered while keeping as simple as possible, so your feedback will be very valueable. Thank you!

make sense thanks for the info.

About no index the only reason is the create table for server less is too slow, after #92 fix I think all need index but for only quite few rows no index for vector is acceptable

@breezewish
Copy link
Member

@yihong0618 Thank you a lot for the info! Nice to hear that our fix could resolve your issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants