
Local RAG is a fully local Retrieval-Augmented Generation (RAG) application for document processing and semantic search. It runs on Ollama local models and FAISS, and is designed for private enterprise data that never leaves the network.
Stack
Python, FastAPI, React.js, Ollama, FAISS, Sentence Transformers, LangChain-style retrieval, OCR.
Status
Active — work in progress. Functional end-to-end, with ongoing refinement of features and error handling.
Key features
- Multi-format document processing — PDF, DOCX, CSV, Excel, XML, and images.
- Semantic search with content-type awareness, so retrieval takes the type of each document into account.
- Batch processing for large files and many documents at once.
- Rate limiting and error handling for reliable API performance.
- Automatic backup and recovery for data integrity.
- Query expansion and results reranking for better search relevance.
- Faceted search — filter and navigate results.
- OCR support for extracting text from images and PDFs.
- Progress tracking and webhook notifications for long-running tasks.
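
As an illustration of the rate limiting item above, here is a minimal token-bucket sketch. The source does not say which algorithm the project uses, so the token bucket, the class name `TokenBucket`, and its parameters are all assumptions made for this example.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter; not taken from the Local RAG codebase."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and consume tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

An API handler could keep one bucket per client and reject a request with HTTP 429 whenever `allow()` returns False.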
Architecture
The system has two tiers. A Python FastAPI backend handles document ingestion, chunking, embeddings, FAISS vector storage, and LLM orchestration via Ollama. A React.js frontend provides the chat interface, file upload, model selection, and search UX. All inference runs locally, so no data is sent to external APIs.
The processing pipeline: document loader → chunker → local embedding model → FAISS vector store → retriever → local LLM via Ollama (for example, deepseek-r1:8b or phi4:14b).
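
The stages of this pipeline can be sketched in a few lines of standard-library Python. This is a toy under stated assumptions, not the project's implementation: a bag-of-words vector stands in for the local embedding model, a brute-force cosine search stands in for FAISS, and the names `chunk`, `embed`, `VectorStore`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=40, overlap=10):
    """Chunker: split text into overlapping windows of `size` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Stand-in embedding: L2-normalised bag-of-words (not a real model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


class VectorStore:
    """Stand-in for FAISS: exhaustive cosine-similarity search."""

    def __init__(self):
        self.items = []  # list of (vector, chunk_text) pairs

    def add(self, chunks):
        for text in chunks:
            self.items.append((embed(text), text))

    def search(self, query, k=3):
        q = embed(query)
        scored = [
            (sum(w * vec.get(tok, 0.0) for tok, w in q.items()), text)
            for vec, text in self.items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def build_prompt(question, hits):
    """Retriever output -> prompt that would be sent to the local LLM."""
    context = "\n\n".join(text for _, text in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The final step, sending the prompt to a model served by Ollama, is left out so the sketch runs without a model server.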
System requirements
- OS: Windows 10/11, macOS, or Linux
- CPU: 4+ cores recommended
- RAM: Minimum 8GB, 16GB+ recommended for large documents
- Storage: 10GB+ free space for application and models
- GPU: Required, with at least 4GB of dedicated GPU memory for model inference
Key learnings
- Local-first RAG is viable end-to-end with Ollama + FAISS — no cloud dependency required for private enterprise data.
- Content-type aware chunking and reranking matter more than embedding model choice for mixed document corpora.
- Batch processing and progress tracking are essential for usable enterprise workflows on large document sets.
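
To make the content-type aware chunking point concrete, here is a minimal sketch of one way to do it: tabular data is split by rows with the header repeated in every chunk, while prose is packed by whole paragraphs. The function names, the `CHUNKERS` mapping, and the size limits are assumptions for illustration and do not come from the repository.

```python
import csv
import io


def chunk_prose(text, max_chars=500):
    """Pack whole paragraphs into chunks of up to max_chars characters."""
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        # An oversized single paragraph is kept whole rather than cut mid-sentence.
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def chunk_csv(text, rows_per_chunk=50):
    """Split CSV by rows, repeating the header so each chunk is self-describing."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    chunks = []
    for start in range(0, len(body), rows_per_chunk):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(body[start:start + rows_per_chunk])
        chunks.append(out.getvalue())
    return chunks


# Hypothetical dispatch table: unknown types fall back to prose chunking.
CHUNKERS = {"text/csv": chunk_csv}


def chunk_document(text, content_type):
    return CHUNKERS.get(content_type, chunk_prose)(text)
```

Repeating the header keeps column names next to their values, so a retrieved chunk of rows stays interpretable on its own.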
Repository
GitHub repo: github.com/PowerAI-Labs/LocalRAG · Licensed under MIT.

