This project is a modular C implementation of a hash-based database system supporting both primary and secondary indexing on structured records. It is designed for efficient storage, retrieval, and management of records using static hashing and block-based file management. The system demonstrates practical database indexing, file organization, and extensible design for real-world data applications.
- Block File Layer (BF): Abstracts low-level file operations, block allocation, reading, writing, and error handling. All modules interact with data through this layer.
- Primary Hash Index (HT): Implements a static hash table for fast access to records by primary key (e.g.,
id
). Provides creation, opening, closing, insertion, and search operations. - Secondary Hash Index (SHT): Implements secondary indexes (e.g., on
name
,surname
,address
) using separate static hash tables. Each secondary index maintains references to primary records and supports efficient secondary-key queries. - Record Management: Defines the structure and handling of records, including insertion, deletion, and retrieval. Records are fixed-format for performance and simplicity.
- Statistics Module: Collects and reports statistics on hash file usage, such as block counts, bucket occupancy, and overflow handling.
- Creation and initialization of primary and multiple secondary hash index files for each database.
- Insertion of records via command line or batch file import.
- Search and retrieval by primary key or any supported secondary key.
- Block-level management for efficient storage and access.
- Collection of hash file statistics (block usage, bucket distribution, overflow analysis).
- Batch operations using input data files and automated test scripts.
- Uses fixed-size blocks for all file operations, with metadata stored in the first block of each file.
- Employs static hashing for both primary and secondary indexes, with configurable bucket counts.
- Each database is stored in its own subdirectory under
db/
, with separate files for primary and each secondary index. - Secondary index entries reference the block location of the corresponding primary record.
- Overflow blocks are managed for buckets that exceed their capacity.
- Modular design separates block management, indexing, and record logic for maintainability.
src/
— C source files for all modules (main programs, hashfile, shashfile, etc.)include/
— Header files for all moduleslib/
— Precompiled libraries (e.g., BF_64.a)build/
— Compiled object files and executablesdb/
— Generated database files during execution (organized by database name)data/
— Input data files (e.g., records1K.txt)Makefile
— Build and run instructionstest
— Testing
- Database Creation:
- Use the
create <dbname> [buckets]
command to create a new database. This creates a primary index and three secondary indexes (onname
,surname
, andaddress
) in a subdirectory underdb/
.
- Use the
- Selecting a Database:
- Use the
use <dbname>
command to select a database for operations. Only one database can be in use at a time.
- Use the
- Inserting Records:
- Use
insert <id> <name> <surname> <address>
to add a record to the selected database. The record is inserted into the primary index and all secondary indexes. - Use
insertfile <filename>
to batch import records from a file (e.g., fromdata/records1K.txt
).
- Use
- Searching Records:
- Use
search <id>
to search for a record by primary key. - Use
searchs <field> <value>
to search by a secondary key (name
,surname
, oraddress
).
- Use
- Closing a Database:
- Use
close
to close the current database and release all resources.
- Use
- Exiting:
- Use
exit
to close any open indexes and exit the program.
- Use
$ make
$ ./build/main_kernel
Commands:
create <dbname> [buckets] Create a new database with optional bucket count (default 100)
use <dbname> Select a database to use
close Stop using the current database
insert <id> <name> <surname> <address>
Insert a record into the selected database
insertfile <filename> Insert records from a file into the selected database
search <id> Search for a record by id in the selected database
searchs <field> <value> Search for records by secondary field (name, surname, address)
help Show this help message
exit Exit the program
db> create mydb 100
Databases created.
db> use mydb
mydb> insert 1 John Smith Athens
Record inserted.
mydb> search 1
id: 1 name: John surname: Smith address: Athens
mydb> searchs surname Smith
id: 1 name: John surname: Smith address: Athens
mydb> close
db> exit
- Input files for batch import should be placed in the
data/
directory. - All generated database files are stored in
db/<dbname>/
. - Only one database can be in use at a time; use
close
before switching. - The system is modular and can be extended to support additional secondary indexes or record formats.