# Scholaris

> AI-powered bibliography and citation system for academic researchers. Converts PDFs, videos, audio, and images into searchable semantic documents (SPDF format). Runs 100% locally: no cloud processing, no data leaves your machine.

## Core Capabilities

- **Semantic Search**: Natural-language search across your entire research library. Cross-lingual (search in English, find results in Spanish). Cross-modal (one query searches PDFs, lecture recordings, and video simultaneously).
- **Auto-Citation**: Write your draft from memory, paste it in. Scholaris traces every claim to the exact page number in your library. Supports APA 7th, Chicago 17th, BibTeX.
- **SPDF Format**: Open document format built on SQLite. Stores full text, OCR, embeddings, images, transcripts, speaker segments, and metadata in a single portable file. Query with SQL. No vendor lock-in.
- **Privacy-First**: All AI models run locally on your hardware (NVIDIA or AMD GPU, or CPU-only). Documents never leave your machine. Designed for IRB-protected materials, unpublished manuscripts, and restricted archives.

## How It Works

1. Upload PDFs, video (MP4, MKV, WebM), audio (MP3, WAV, FLAC), images (PNG, JPG), or documents (DOCX, TXT, Markdown, EPUB).
2. Local AI processes everything: OCR (GLM-OCR), transcription (Parakeet/Whisper), speaker diarization (pyannote), semantic embeddings (Qwen3-VL).
3. Everything becomes a searchable SPDF file in your library.
4. Search by meaning across all media types. Get exact page numbers and timestamps.
5. Auto-cite your drafts with grounded, verified references.

## What Makes Scholaris Different

Elicit and Semantic Scholar search a cloud database of published abstracts to help you *discover* new research. Scholaris searches *your own library* (every page of every book, every minute of every recording) to help you *work with* sources you've already read. They say "this paper is relevant." Scholaris says "page 47, paragraph 3 directly supports your claim."
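
Because SPDF is built on SQLite, a page-level lookup of this kind can in principle be done with plain SQL. The following is a minimal sketch only: the `pages` table, its columns, and the helper names `build_demo_spdf` and `find_pages` are hypothetical, since the actual SPDF schema is not described here. It also uses a simple keyword match, not the semantic search Scholaris performs.

```python
import sqlite3


def build_demo_spdf(conn):
    """Create a stand-in for an SPDF file (hypothetical schema, demo rows)."""
    conn.execute("CREATE TABLE pages (doc_title TEXT, page INTEGER, text TEXT)")
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?)",
        [
            ("Field Notes", 46, "Background on the interview protocol."),
            ("Field Notes", 47, "Participants reported higher trust after the second session."),
            ("Methods Handbook", 12, "Sampling strategies for small cohorts."),
        ],
    )


def find_pages(conn, keyword):
    """Return (doc_title, page) pairs whose page text contains the keyword."""
    rows = conn.execute(
        "SELECT doc_title, page FROM pages "
        "WHERE text LIKE ? ORDER BY doc_title, page",
        (f"%{keyword}%",),  # parameterized to avoid SQL injection
    )
    return rows.fetchall()


# An in-memory database stands in for a real .spdf file on disk.
conn = sqlite3.connect(":memory:")
build_demo_spdf(conn)
print(find_pages(conn, "trust"))  # [('Field Notes', 47)]
```

With a real SPDF file, `sqlite3.connect` would take the file path instead of `":memory:"`, and the query would need to match whatever tables the file actually contains.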

## Target Users

- Dissertation writers with hundreds of sources accumulated over years
- Humanities and social science researchers (books, not just papers; Chicago 17th citations)
- Seminar leaders with lecture recordings and reading lists
- Field researchers with IRB-protected interview recordings and archival materials
- Research groups sharing a collective library

## Pricing

- **Free**: Import and browse SPDF documents, organize into libraries, semantic search, export citations.
- **Pro** (€20/month, 14-day free trial): Unlimited conversions, AI-powered OCR, transcription, speaker identification, cross-modal search, auto-citation.

## Technical Stack

- Frontend: Next.js 14 (TypeScript, Tailwind CSS)
- Backend: FastAPI/Python (Uvicorn)
- AI Models (all local): GLM-OCR 0.9B, Qwen3-VL-Embedding 2B, Parakeet TDT 0.6B, pyannote speaker-diarization 3.1, Silero VAD, Qwen3 0.6B (query expansion), jina-reranker-v2 (cross-encoder reranking)
- Database: SQLite (SPDF files), ChromaDB (vector search)
- Auth: Clerk
- Payments: Stripe

## Links

- Website: https://scholaris.joseluissaorin.com
- Pricing: https://scholaris.joseluissaorin.com/pricing
- Privacy Policy: https://scholaris.joseluissaorin.com/privacy
- Terms of Service: https://scholaris.joseluissaorin.com/terms
- Full documentation for LLMs: https://scholaris.joseluissaorin.com/llms-full.txt