Multi Index Hash-based Database System

Overview

This project is a modular C implementation of a hash-based database system supporting both primary and secondary indexing on structured records. It is designed for efficient storage, retrieval, and management of records using static hashing and block-based file management. The system demonstrates practical database indexing, file organization, and extensible design for real-world data applications.

System Architecture

Block File Layer (BF): Abstracts low-level file operations, block allocation, reading, writing, and error handling. All modules interact with data through this layer.
Primary Hash Index (HT): Implements a static hash table for fast access to records by primary key (e.g., id). Provides creation, opening, closing, insertion, and search operations.
Secondary Hash Index (SHT): Implements secondary indexes (e.g., on name, surname, address) using separate static hash tables. Each secondary index maintains references to primary records and supports efficient secondary-key queries.
Record Management: Defines the structure and handling of records, including insertion, deletion, and retrieval. Records are fixed-format for performance and simplicity.
Statistics Module: Collects and reports statistics on hash file usage, such as block counts, bucket occupancy, and overflow handling.

Features & Supported Operations

Creation and initialization of primary and multiple secondary hash index files for each database.
Insertion of records via command line or batch file import.
Search and retrieval by primary key or any supported secondary key.
Block-level management for efficient storage and access.
Collection of hash file statistics (block usage, bucket distribution, overflow analysis).
Batch operations using input data files and automated test scripts.

Technical Approach

Uses fixed-size blocks for all file operations, with metadata stored in the first block of each file.
Employs static hashing for both primary and secondary indexes, with configurable bucket counts.
Each database is stored in its own subdirectory under db/, with separate files for primary and each secondary index.
Secondary index entries reference the block location of the corresponding primary record.
Overflow blocks are managed for buckets that exceed their capacity.
Modular design separates block management, indexing, and record logic for maintainability.

Project Structure

src/ — C source files for all modules (main programs, hashfile, shashfile, etc.)
include/ — Header files for all modules
lib/ — Precompiled libraries (e.g., BF_64.a)
build/ — Compiled object files and executables
db/ — Generated database files during execution (organized by database name)
data/ — Input data files (e.g., records1K.txt)
Makefile — Build and run instructions
test — Testing

How the System Works

Database Creation:
- Use the create <dbname> [buckets] command to create a new database. This creates a primary index and three secondary indexes (on name, surname, and address) in a subdirectory under db/.
Selecting a Database:
- Use the use <dbname> command to select a database for operations. Only one database can be in use at a time.
Inserting Records:
- Use insert <id> <name> <surname> <address> to add a record to the selected database. The record is inserted into the primary index and all secondary indexes.
- Use insertfile <filename> to batch import records from a file (e.g., from data/records1K.txt).
Searching Records:
- Use search <id> to search for a record by primary key.
- Use searchs <field> <value> to search by a secondary key (name, surname, or address).
Closing a Database:
- Use close to close the current database and release all resources.
Exiting:
- Use exit to close any open indexes and exit the program.

Example Usage

$ make
$ ./build/main_kernel
Commands:
  create <dbname> [buckets]         Create a new database with optional bucket count (default 100)
  use <dbname>                      Select a database to use
  close                             Stop using the current database
  insert <id> <name> <surname> <address>
                                    Insert a record into the selected database
  insertfile <filename>             Insert records from a file into the selected database
  search <id>                       Search for a record by id in the selected database
  searchs <field> <value>           Search for records by secondary field (name, surname, address)
  help                              Show this help message
  exit                              Exit the program

db> create mydb 100
Databases created.
db> use mydb
mydb> insert 1 John Smith Athens
Record inserted.
mydb> search 1
id: 1 name: John surname: Smith address: Athens
mydb> searchs surname Smith
id: 1 name: John surname: Smith address: Athens
mydb> close
db> exit

Notes

Input files for batch import should be placed in the data/ directory.
All generated database files are stored in db/<dbname>/.
Only one database can be in use at a time; use close before switching.
The system is modular and can be extended to support additional secondary indexes or record formats.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
.vscode		.vscode
data		data
include		include
lib		lib
src		src
tests		tests
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Multi Index Hash-based Database System

Overview

System Architecture

Features & Supported Operations

Technical Approach

Project Structure

How the System Works

Example Usage

Notes

About

Uh oh!

Languages

kondim23/multi-index-hash-db

Folders and files

Latest commit

History

Repository files navigation

Multi Index Hash-based Database System

Overview

System Architecture

Features & Supported Operations

Technical Approach

Project Structure

How the System Works

Example Usage

Notes

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Languages