📚 Library Digitization

This project implements a digitization system for books in the IITD library. It provides tools to extract distinct words from each book and supports keyword-based search using custom-built hash tables or sorting-based approaches. Implemented as part of COL106 Assignment 4, it demonstrates both data structure design and algorithmic efficiency.

🔍 Problem Overview

Each book consists of a list of words. The library system supports:

Extracting distinct words from books.
Performing keyword search across the library.
Printing books with their respective word sets.

Two core implementations are provided:

MuskLibrary: Uses MergeSort to extract unique words.
JGBLibrary: Uses custom-built HashTables with:
- Chaining (Jobs)
- Linear Probing (Gates)
- Double Hashing (Bezos)

🧠 Key Features

✅ Custom-built HashSet and HashMap with support for insert, find, and dynamic resizing.
🔁 Dynamic resizing based on load factor to conserve memory.
🧪 Two separate libraries:
- MuskLibrary: Simpler, sort-based method.
- JGBLibrary: Advanced, collision-handling hash table approach.
📖 Keyword search and book-wise word listing.

⏱ Time Complexities

Function	MuskLibrary	JGBLibrary (Hash-based)
`__init__`	`O(kW log W)`	`O(table size)`
`add_book`	N/A	`O(W)`
`distinct_words`	`O(D + log k)`	`O(D)`
`count_distinct_words`	`O(log k)`	`O(1)`
`search_keyword`	`O(k log D)`	`O(k)`
`print_books`	`O(kD)`	`O(kD)`

🧱 Project Structure

dynamic hash table.py # DynamicHashSet and DynamicHashMap with resizing

hash table.py # Base HashTable, HashSet, and HashMap implementations

library.py # MuskLibrary and JGBLibrary classes

prime generator.py # Prime number generator for resizing (do not modify)

main.py # Debugging and test file

⚠ Constraints & Notes

❌ No use of Python dict, set, or hash libraries.
📏 All hash tables are implemented from scratch.
✅ Follow specific print formatting as required by autograder.
🧪 Use DynamicHashSet and DynamicHashMap for space-efficient resizing.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
Problem statement - Library Digitization.pdf		Problem statement - Library Digitization.pdf
README.md		README.md
dynamic_hash_table.py		dynamic_hash_table.py
hash_table.py		hash_table.py
library.py		library.py
main.py		main.py
prime_generator.py		prime_generator.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

📚 Library Digitization

🔍 Problem Overview

🧠 Key Features

⏱ Time Complexities

🧱 Project Structure

⚠ Constraints & Notes

About

Uh oh!

Releases

Packages

Languages

Richat04/Library-Digitization

Folders and files

Latest commit

History

Repository files navigation

📚 Library Digitization

🔍 Problem Overview

🧠 Key Features

⏱ Time Complexities

🧱 Project Structure

⚠ Constraints & Notes

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages