
Provide commands to create database file [Feature Request] #45


Open
chrisgulvik opened this issue Sep 17, 2024 · 8 comments

Comments

@chrisgulvik

Can you provide example commands to create a database?

I'd like to make a super small db for built-in functionality testing of the software within a container. For example, just adding 2 different species genomes (perhaps just 1 Gram Pos here and 1 Gram Neg here) and querying with 1 of them. This would also speed up any dev work you do and make GitHub Actions a possibility for you. Nothing fancy needed for commands to share, just notes in shell would be helpful.

If you added an example of how to create a GTDB database, that would also take the workload off you, ensure future utility of the pkg as taxonomy continues to change, and avoid future issues like #40 and #43.

@chrisgulvik
Author

I just saw @jfy133 also ran into this same need here. It looks like simply making a dmd db isn't enough, so any tips?

@jfy133

jfy133 commented Sep 19, 2024

It might just be that the particular genome being used during testing is not in the database (I'm not 100% sure), so at least instructions would indeed be helpful :)

@fullama
Contributor

fullama commented Jan 20, 2025

I could make a small db for testing, yeah. The thing is that the database versions are tied to a gunc version, so opening it up for users to make their own database might cause confusion. Maybe it would be fine, I don't know tbh. I could add a small test db in the next version release. Would an additional gunc test command work for ye maybe?

@jfy133

jfy133 commented Jan 20, 2025

Thanks for looking into this @fullama !

For clarity, the purpose for me (and I guess @chrisgulvik ) is not to allow users to make a custom database - a small test DB would indeed suffice.

A small database that is compatible with the standard gunc commands, rather than a specific gunc test, would be perfect: the closer the test is to simulating a 'real' GUNC run, the better. Ultimately I want to make sure that the wrapper of the standard GUNC commands used in 'real world' applications functions correctly. I just don't need the full database to do so.

So a separate gunc test command would not help here (for me) - as the test I want to make is of the normal GUNC commands themselves.
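
For illustration, a minimal sketch of the kind of smoke test I mean. Everything here is an assumption rather than something confirmed in this thread: the function name gunc_smoke_test is hypothetical, the small database file does not exist yet, and the -i / -r / -o flags (input FASTA, DIAMOND database file, output directory) should be checked against `gunc run --help` for your installed version.

```shell
#!/bin/sh
# Hypothetical smoke test around the standard `gunc run` command.
# Assumptions (not confirmed in this thread):
#   - flags: -i input FASTA, -r DIAMOND database file, -o output directory
#   - a small test database file is available at the path you pass in

gunc_smoke_test() {
    genome=$1
    db=$2
    outdir=$3

    # Fail early with exit code 2 if either input is missing or empty.
    [ -s "$genome" ] || { echo "missing genome: $genome" >&2; return 2; }
    [ -s "$db" ] || { echo "missing database: $db" >&2; return 2; }

    # Fail with exit code 3 if gunc itself is not installed.
    command -v gunc >/dev/null 2>&1 || { echo "gunc not on PATH" >&2; return 3; }

    mkdir -p "$outdir" || return 1

    # The exit status of gunc becomes the exit status of the test.
    gunc run -i "$genome" -r "$db" -o "$outdir"
}
```

It would be called as `gunc_smoke_test genome.fa small_test_db.dmnd out/` (file names are placeholders), so the container test exercises the same command as a real run, just against a much smaller database.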

@fullama
Contributor

fullama commented Jan 21, 2025

OK no prob. I've added it to the milestones for the next release. I'll update here when it's ready.

@krishn-cpu

# Install SQLite in the container (if not already available)
apt-get update && apt-get install -y sqlite3

# Create a SQLite database and a table for genome entries
sqlite3 test_genomes.db <<EOF
-- Define a table with species name, Gram stain type, and genome data
CREATE TABLE genomes (
    id INTEGER PRIMARY KEY,
    species_name TEXT NOT NULL,
    gram_stain TEXT NOT NULL,
    genome_data BLOB NOT NULL
);
EOF

# Insert a Gram-positive bacterial genome (Bacillus subtilis as an example)
sqlite3 test_genomes.db <<EOF
INSERT INTO genomes (species_name, gram_stain, genome_data)
VALUES ('Bacillus subtilis', 'Positive', 'GENOME_DATA_HERE');
EOF

# Insert a Gram-negative bacterial genome (Escherichia coli as an example)
sqlite3 test_genomes.db <<EOF
INSERT INTO genomes (species_name, gram_stain, genome_data)
VALUES ('Escherichia coli', 'Negative', 'GENOME_DATA_HERE');
EOF

# Query the database to verify inserted records
sqlite3 test_genomes.db <<EOF
-- Print all genome records
SELECT * FROM genomes;
EOF

@jfy133

jfy133 commented Apr 11, 2025

@krishn-cpu can you clarify? Is this an example workaround? Could you maybe provide a more 'real world' example? (I'm not sure what you mean by GENOME_DATA_HERE, for example.)
