A comprehensive Retrieval Augmented Generation (RAG) application built with Next.js, featuring document processing, website scraping, and AI-powered chat functionality.
⚡ Important Note:
- Your OpenAI API key and Qdrant credentials are never stored on our servers.
 - They are stored only on your own machine (localhost) or in your environment variables, such as .env.local.
 - Your credentials therefore stay under your control and are not shared with any third-party servers.
 
👉 You can use your API key safely: everything stays on your localhost.
- Multiple Data Sources: Support for text input, file uploads (PDF, CSV, TXT), and website scraping
 - Vector Database: Qdrant integration for efficient document storage and retrieval
 - AI-Powered Chat: OpenAI GPT integration for intelligent responses based on your data
 - Modern UI: Dark mode interface with light orange accent colors
 - Form Validation: React Hook Form with Zod validation for better user experience
 - Real-time Processing: Live feedback and progress indicators
 - Flexible Deployment: Support for both local Docker and Qdrant Cloud
 
- Frontend: Next.js 14, React, TypeScript
 - UI Components: shadcn/ui, Tailwind CSS
 - Forms: React Hook Form, Zod validation
 - Vector Database: Qdrant (Local Docker or Cloud)
 - AI Integration: OpenAI API, LangChain
 - File Processing: PDF parsing, CSV processing, web scraping
 
- Node.js 18+
 - OpenAI API key
 - Choose one of the following for the vector database:
   - Docker (for a local Qdrant instance)
   - Qdrant Cloud account (managed service)
 
 
- Clone the repository
 
git clone https://github.com/BCAPATHSHALA/RAGApplication.git
cd RAGApplication

- Install dependencies
 
pnpm install

- Choose your vector database setup:
 
For Local Docker:

3a. Start the Qdrant vector database

docker-compose up -d

This will start Qdrant on http://localhost:6333

3b. Set up environment variables

Create a .env.local file in the root directory:

QDRANT_URL=http://localhost:6333
OPENAI_API_KEY=your_openai_api_key_here

For Qdrant Cloud:

3a. Create a Qdrant Cloud account
- Visit https://cloud.qdrant.io/
 - Sign up for a free account
 - Create a new cluster
 - Get your cluster URL and API key
 
3b. Set up environment variables
Create a .env.local file in the root directory:
QDRANT_URL=https://your-cluster-url.qdrant.io
QDRANT_API_KEY=your_qdrant_api_key
OPENAI_API_KEY=your_openai_api_key_here

- Run the development server
 
pnpm run dev

- Open your browser

Navigate to http://localhost:3000

The application provides an intuitive interface for configuration:
- OpenAI API Key: Enter your OpenAI API key in the API Key section
 - Qdrant Configuration:
   - For local Docker: Use http://localhost:6333 (no API key needed)
   - For Qdrant Cloud: Enter your cluster URL and API key from the dashboard
 
You can also configure via environment variables:
- QDRANT_URL: Qdrant database connection URL
- QDRANT_API_KEY: Qdrant API key (required for cloud, optional for local)
- OPENAI_API_KEY: OpenAI API key for embeddings and chat

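The three variables above could be read and checked at startup roughly as follows. This is a minimal sketch, not the repository's actual code: `AppConfig` and `loadConfig` are hypothetical names, and the rule that any non-localhost URL counts as a cloud cluster is an assumption made for illustration.

```typescript
// Hypothetical sketch: AppConfig and loadConfig are not part of this repository.
interface AppConfig {
  qdrantUrl: string;
  qdrantApiKey: string | undefined;
  openaiApiKey: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): AppConfig {
  const qdrantUrl = (env.QDRANT_URL || "").trim();
  const openaiApiKey = (env.OPENAI_API_KEY || "").trim();
  const qdrantApiKey = (env.QDRANT_API_KEY || "").trim() || undefined;

  if (!qdrantUrl) throw new Error("QDRANT_URL is not set");
  if (!openaiApiKey) throw new Error("OPENAI_API_KEY is not set");

  // Assumption: any host other than localhost/127.0.0.1 is a cloud cluster
  // and therefore needs an API key.
  const host = new URL(qdrantUrl).hostname;
  const isLocal = host === "localhost" || host === "127.0.0.1";
  if (!isLocal && !qdrantApiKey) {
    throw new Error("QDRANT_API_KEY is required for a non-local Qdrant URL");
  }

  return { qdrantUrl, qdrantApiKey, openaiApiKey };
}
```

In a Next.js server context this would typically be called as `loadConfig(process.env)`, so that a missing variable fails early rather than at the first request.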
- OpenAI API Key: Enter your API key (must start with 'sk-')
 - Qdrant Setup: Configure either local Docker or Qdrant Cloud connection
 
Text Input:
- Paste text directly into the textarea
 - Minimum 10 characters required
 - Text will be chunked and indexed automatically
 
Website Scraping:
- Enter a valid URL (must include http:// or https://)
 - The system will scrape and index the website content
 
File Upload:
- Upload PDF, CSV, or TXT files (max 10MB)
 - Files are processed and chunked automatically
 - Progress feedback provided during processing
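
The input rules listed above (key prefix, minimum text length, URL scheme, file type and size) can be expressed as plain checks. The app itself uses React Hook Form with Zod; the sketch below deliberately avoids both so it stays self-contained, and every function name in it is hypothetical. Reading "10MB" as 10 × 1024 × 1024 bytes and trimming whitespace before counting characters are assumptions.

```typescript
// Hypothetical helpers that mirror the documented limits; not the app's Zod schemas.
const MAX_FILE_BYTES = 10 * 1024 * 1024; // assumption: "10MB" means 10 MiB
const ALLOWED_EXTENSIONS = ["pdf", "csv", "txt"];

// Each validator returns null when the value is acceptable, or an error message.
function validateApiKey(key: string): string | null {
  return key.slice(0, 3) === "sk-" ? null : "API key must start with 'sk-'";
}

function validateText(text: string): string | null {
  // Assumption: surrounding whitespace does not count toward the minimum.
  return text.trim().length >= 10 ? null : "Text must be at least 10 characters";
}

function validateUrl(value: string): string | null {
  try {
    const url = new URL(value);
    const ok = url.protocol === "http:" || url.protocol === "https:";
    return ok ? null : "URL must start with http:// or https://";
  } catch (e) {
    return "URL is not valid";
  }
}

function validateFile(name: string, sizeBytes: number): string | null {
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot + 1).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    return "Only PDF, CSV, and TXT files are supported";
  }
  if (sizeBytes > MAX_FILE_BYTES) return "File exceeds the 10MB limit";
  return null;
}
```

Returning a message instead of throwing keeps the checks easy to wire into form-level error display.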
 
- View real-time statistics of indexed documents
 - See total documents and chunks
 - Review recent data sources
 
- Use the chat interface to ask questions about your indexed data
 - The AI will provide responses based on the most relevant document chunks
 - Conversation history is maintained during the session
 - Create multiple chat sessions for different topics
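
To illustrate what "most relevant document chunks" means, here is a small in-memory sketch of similarity ranking. In the application this search is done by Qdrant, not by code like this; the README does not state which distance metric the collection uses, so cosine similarity is an assumption, and `StoredChunk`, `cosineSimilarity`, and `topK` are hypothetical names.

```typescript
// Illustration only: Qdrant performs this search server-side in the real app.
interface StoredChunk {
  text: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k chunks whose vectors are most similar to the query vector.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The selected chunk texts are what would then be passed to the chat model as context for the answer.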
 
- POST /api/index-text - Index text content
- POST /api/index-website - Scrape and index website
- POST /api/index-file - Process and index uploaded files
- POST /api/chat - Chat with indexed data
- GET /api/rag-store - Get indexed document statistics
- DELETE /api/delete-index - Delete all indexed data

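For client code, the routes listed above can be kept in one typed table. The sketch below covers only methods and paths, since the README does not document request or response bodies; `API_ROUTES`, `RouteName`, and `endpoint` are hypothetical names, and the default base URL assumes the local development server.

```typescript
// Methods and paths come from the endpoint list; payload shapes are not documented
// in this README and are intentionally left out.
const API_ROUTES = {
  indexText: { method: "POST", path: "/api/index-text" },
  indexWebsite: { method: "POST", path: "/api/index-website" },
  indexFile: { method: "POST", path: "/api/index-file" },
  chat: { method: "POST", path: "/api/chat" },
  ragStore: { method: "GET", path: "/api/rag-store" },
  deleteIndex: { method: "DELETE", path: "/api/delete-index" },
} as const;

type RouteName = keyof typeof API_ROUTES;

// Resolves a route name to its HTTP method and absolute URL.
function endpoint(name: RouteName, baseUrl: string = "http://localhost:3000") {
  const route = API_ROUTES[name];
  return { method: route.method, url: new URL(route.path, baseUrl).toString() };
}
```

For example, `const { method, url } = endpoint("ragStore")` followed by `fetch(url, { method })` would request the indexed document statistics.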
- Qdrant: High-performance vector database running in Docker
 - Port: 6333 (default)
 - Storage: Persistent volume for data retention
 - Configuration: No API key required
 
- Managed Service: Fully managed Qdrant instance
 - Scalability: Auto-scaling based on usage
 - Security: Built-in authentication and encryption
 - Global: Multiple regions available
 - Configuration: Requires cluster URL and API key
 
- Text Chunking: Recursive character text splitter (1000 chars, 200 overlap)
 - PDF Processing: LangChain PDF loader for text extraction
 - CSV Processing: Structured data handling with metadata
 - Web Scraping: Cheerio for clean HTML content extraction
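
The effect of the 1000-character chunk size and 200-character overlap can be seen with a simplified splitter. Note that this is a plain fixed-window splitter, not LangChain's recursive character splitter, which additionally tries to break on paragraph and sentence separators; `chunkText` is a hypothetical name.

```typescript
// Simplified fixed-window splitter; it does NOT reproduce LangChain's
// separator-aware recursive splitting.
function chunkText(text: string, chunkSize: number = 1000, overlap: number = 200): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlap; // each window starts this far after the previous one
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break; // the last window reached the end of the text
  }
  return chunks;
}
```

With the default settings, consecutive chunks share their last and first 200 characters, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.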
 
- OpenAI GPT-4: For generating contextual responses
 - LangChain: Document processing and RAG pipeline orchestration
 - Embeddings: text-embedding-3-large for high-quality vector representations
 
pnpm run dev

- Build the application:

pnpm run build

- Start with Docker Compose:

docker-compose up -d

- Deploy to Vercel:

vercel deploy

- Set environment variables in the Vercel dashboard:
  - QDRANT_URL (your Qdrant Cloud cluster URL)
  - QDRANT_API_KEY (your Qdrant Cloud API key)
  - OPENAI_API_KEY (your OpenAI API key)
 
For Qdrant Cloud:
- Use environment variables for sensitive configuration
 - Enable API key authentication
 - Monitor usage and costs in Qdrant Cloud dashboard
 - Consider data residency requirements
 
For Local Docker:
- Ensure persistent storage for production data
 - Configure proper backup strategies
 - Monitor resource usage and scaling needs
 - Secure network access to Qdrant instance
 
"Collection not found" errors:
- The collection is created automatically when you first index data
 - Ensure Qdrant is running and accessible
 - Check your Qdrant URL and API key configuration
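
One way to confirm that Qdrant is running and that the URL and API key are correct is to call its REST API directly. The sketch below uses Qdrant's `GET /collections` endpoint and its `api-key` header; the function names are hypothetical, and the global `fetch` assumes Node.js 18 or later, as required in the prerequisites.

```typescript
interface QdrantRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds the request for Qdrant's "list collections" endpoint.
function buildCollectionsRequest(qdrantUrl: string, apiKey?: string): QdrantRequest {
  const base = qdrantUrl.replace(/\/+$/, ""); // drop trailing slashes
  const headers: Record<string, string> = {};
  if (apiKey) headers["api-key"] = apiKey; // only needed when authentication is enabled
  return { url: base + "/collections", headers };
}

// Resolves to true when Qdrant answers with a 2xx status, false otherwise.
async function isQdrantReachable(qdrantUrl: string, apiKey?: string): Promise<boolean> {
  const { url, headers } = buildCollectionsRequest(qdrantUrl, apiKey);
  try {
    const response = await fetch(url, { headers });
    return response.ok;
  } catch (e) {
    return false; // network error: wrong host, wrong port, or Qdrant not running
  }
}
```

A `false` result for a local URL usually points to the Docker container not running, while a `false` result for a cloud URL is more often a wrong or missing API key.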
 
File upload failures:
- Check file size limits (10MB max)
 - Ensure supported file formats (PDF, CSV, TXT)
 - Verify OpenAI API key is valid
 
Chat not working:
- Ensure you have indexed some data first
 - Check OpenAI API key configuration
 - Verify Qdrant connection is working
 
- Use Qdrant Cloud for better performance and reliability
 - Index documents in smaller batches for large datasets
 - Monitor OpenAI API usage and costs
 - Consider chunking strategies for different document types
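
Indexing in smaller batches, as suggested above, could look like the following sketch. `toBatches` and `indexInBatches` are hypothetical helpers, and `indexBatch` stands in for whatever call actually embeds and stores a group of chunks; a suitable batch size depends on your data and rate limits and is not specified by this README.

```typescript
// Splits a list into consecutive groups of at most batchSize items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1 || Math.floor(batchSize) !== batchSize) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Processes batches one after another and returns how many were sent.
// indexBatch is a placeholder for the real embedding and upsert step.
async function indexInBatches<T>(
  items: T[],
  batchSize: number,
  indexBatch: (batch: T[]) => Promise<void>
): Promise<number> {
  let sent = 0;
  for (const batch of toBatches(items, batchSize)) {
    await indexBatch(batch); // sequential on purpose, to keep API usage predictable
    sent++;
  }
  return sent;
}
```

Running batches sequentially trades speed for steadier OpenAI and Qdrant usage, which also makes a failure easier to locate and retry.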
 
- Fork the repository
 - Create a feature branch
 - Make your changes
 - Add tests if applicable
 - Submit a pull request
 
MIT License - see LICENSE file for details
For issues and questions:
- Create an issue on GitHub
 - Check the troubleshooting section
 - Review the API documentation
 - Visit Qdrant Cloud Documentation
 
