Skip to content

Multi-Index Hash Database in C for efficient, flexible Creation, Insertion, and Search operations through kernel.

Notifications You must be signed in to change notification settings

kondim23/multi-index-hash-db

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Multi Index Hash-based Database System

Overview

This project is a modular C implementation of a hash-based database system supporting both primary and secondary indexing on structured records. It is designed for efficient storage, retrieval, and management of records using static hashing and block-based file management. The system demonstrates practical database indexing, file organization, and extensible design for real-world data applications.

System Architecture

  • Block File Layer (BF): Abstracts low-level file operations, block allocation, reading, writing, and error handling. All modules interact with data through this layer.
  • Primary Hash Index (HT): Implements a static hash table for fast access to records by primary key (e.g., id). Provides creation, opening, closing, insertion, and search operations.
  • Secondary Hash Index (SHT): Implements secondary indexes (e.g., on name, surname, address) using separate static hash tables. Each secondary index maintains references to primary records and supports efficient secondary-key queries.
  • Record Management: Defines the structure and handling of records, including insertion, deletion, and retrieval. Records are fixed-format for performance and simplicity.
  • Statistics Module: Collects and reports statistics on hash file usage, such as block counts, bucket occupancy, and overflow handling.

Features & Supported Operations

  • Creation and initialization of primary and multiple secondary hash index files for each database.
  • Insertion of records via command line or batch file import.
  • Search and retrieval by primary key or any supported secondary key.
  • Block-level management for efficient storage and access.
  • Collection of hash file statistics (block usage, bucket distribution, overflow analysis).
  • Batch operations using input data files and automated test scripts.

Technical Approach

  • Uses fixed-size blocks for all file operations, with metadata stored in the first block of each file.
  • Employs static hashing for both primary and secondary indexes, with configurable bucket counts.
  • Each database is stored in its own subdirectory under db/, with separate files for primary and each secondary index.
  • Secondary index entries reference the block location of the corresponding primary record.
  • Overflow blocks are managed for buckets that exceed their capacity.
  • Modular design separates block management, indexing, and record logic for maintainability.

Project Structure

  • src/ — C source files for all modules (main programs, hashfile, shashfile, etc.)
  • include/ — Header files for all modules
  • lib/ — Precompiled libraries (e.g., BF_64.a)
  • build/ — Compiled object files and executables
  • db/ — Generated database files during execution (organized by database name)
  • data/ — Input data files (e.g., records1K.txt)
  • Makefile — Build and run instructions
  • test — Testing

How the System Works

  1. Database Creation:
    • Use the create <dbname> [buckets] command to create a new database. This creates a primary index and three secondary indexes (on name, surname, and address) in a subdirectory under db/.
  2. Selecting a Database:
    • Use the use <dbname> command to select a database for operations. Only one database can be in use at a time.
  3. Inserting Records:
    • Use insert <id> <name> <surname> <address> to add a record to the selected database. The record is inserted into the primary index and all secondary indexes.
    • Use insertfile <filename> to batch import records from a file (e.g., from data/records1K.txt).
  4. Searching Records:
    • Use search <id> to search for a record by primary key.
    • Use searchs <field> <value> to search by a secondary key (name, surname, or address).
  5. Closing a Database:
    • Use close to close the current database and release all resources.
  6. Exiting:
    • Use exit to close any open indexes and exit the program.

Example Usage

$ make
$ ./build/main_kernel
Commands:
  create <dbname> [buckets]         Create a new database with optional bucket count (default 100)
  use <dbname>                      Select a database to use
  close                             Stop using the current database
  insert <id> <name> <surname> <address>
                                    Insert a record into the selected database
  insertfile <filename>             Insert records from a file into the selected database
  search <id>                       Search for a record by id in the selected database
  searchs <field> <value>           Search for records by secondary field (name, surname, address)
  help                              Show this help message
  exit                              Exit the program

db> create mydb 100
Databases created.
db> use mydb
mydb> insert 1 John Smith Athens
Record inserted.
mydb> search 1
id: 1 name: John surname: Smith address: Athens
mydb> searchs surname Smith
id: 1 name: John surname: Smith address: Athens
mydb> close
db> exit

Notes

  • Input files for batch import should be placed in the data/ directory.
  • All generated database files are stored in db/<dbname>/.
  • Only one database can be in use at a time; use close before switching.
  • The system is modular and can be extended to support additional secondary indexes or record formats.

About

Multi-Index Hash Database in C for efficient, flexible Creation, Insertion, and Search operations through kernel.

Topics

Resources

Stars

Watchers

Forks