- Updated: January 27, 2026
Introducing OCRBase: Advanced OCR Library Powered by PaddleOCR and LLM
OCRBase: Open‑Source OCR Library Redefines Document Scanning and Structured Data Extraction
OCRBase is an open‑source OCR library that converts PDFs and images into clean, machine‑readable Markdown or JSON by combining the high‑accuracy PaddleOCR‑VL‑0.9B model with LLM‑driven parsing, all accessible through a type‑safe TypeScript SDK and real‑time WebSocket updates.
Introduction – Why OCRBase Matters in 2024
In a world where billions of documents are digitized daily, extracting reliable text and structured data remains a bottleneck for developers, data scientists, and enterprises. Traditional OCR tools often deliver raw text without context, forcing teams to write custom parsers. OCRBase solves this problem by marrying state‑of‑the‑art OCR with large language model (LLM) post‑processing, delivering ready‑to‑use JSON schemas or Markdown summaries out of the box.
Built on the UBOS platform, OCRBase inherits the scalability and security standards required for enterprise workloads while remaining lightweight enough for startups and hobbyists.
Key Features and Capabilities of OCRBase
- Best‑in‑class OCR engine: Utilises PaddleOCR‑VL‑0.9B, delivering >95% character accuracy on multilingual documents.
- LLM‑powered parsing: After raw text extraction, an integrated LLM transforms free‑form text into structured JSON or Markdown according to user‑defined schemas.
- Type‑safe TypeScript SDK: Full IntelliSense support, React hooks, and compile‑time validation ensure developers catch errors early.
- Queue‑based processing: Designed for high‑throughput scenarios, OCRBase can handle thousands of documents concurrently.
- Real‑time WebSocket notifications: Clients receive live updates on job status, progress, and results without polling.
- Self‑hostable architecture: Deploy on‑premise using Docker and Bun, giving full control over data privacy.
- Extensible schema system: Define custom extraction rules (e.g., invoice fields, legal clauses) and receive clean JSON payloads.
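To make the schema idea concrete, here is a minimal sketch of how a consumer might check an extracted JSON payload against an invoice schema before using it. The `InvoiceFields` interface, its field names, and the `isInvoiceFields` guard are hypothetical illustrations and are not taken from the OCRBase SDK.

```typescript
// Hypothetical invoice schema; the field names are illustrative assumptions,
// not part of OCRBase's documented API.
interface InvoiceFields {
  invoiceNumber: string;
  issueDate: string; // a date string such as "2026-01-27"
  totalAmount: number;
  currency: string;
}

// Runtime type guard: narrows an unknown JSON value to InvoiceFields.
function isInvoiceFields(value: unknown): value is InvoiceFields {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.invoiceNumber === "string" &&
    typeof v.issueDate === "string" &&
    !Number.isNaN(Date.parse(v.issueDate)) &&
    typeof v.totalAmount === "number" &&
    Number.isFinite(v.totalAmount) &&
    typeof v.currency === "string"
  );
}

// Example payload, shaped the way an extraction result might arrive.
const payload: unknown = JSON.parse(
  '{"invoiceNumber":"INV-001","issueDate":"2026-01-27","totalAmount":1250.5,"currency":"EUR"}'
);

if (isInvoiceFields(payload)) {
  // Inside this branch the compiler knows the exact field types.
  console.log(`${payload.invoiceNumber}: ${payload.totalAmount} ${payload.currency}`);
} else {
  console.error("Payload does not match the invoice schema");
}
```

A guard like this complements compile-time types: the types describe what the code expects, while the guard confirms that the data produced by the LLM parser actually has that shape.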
Performance Benchmarks
Independent tests on a 4‑core VM show OCRBase processing a 10‑page invoice PDF in under 3 seconds, with 98% field‑level extraction accuracy when paired with a GPT‑4‑based parser. These numbers compare favorably with many commercial SaaS OCR APIs, which charge per page and lack real‑time feedback.
Quick‑Start Guide and Usage Examples
Getting OCRBase up and running takes just a few commands. Below is a minimal example using the TypeScript SDK.
bun add ocrbase
import { createOCRBaseClient } from "ocrbase";

const client = createOCRBaseClient({
  baseUrl: "https://your-instance.com",
});

// Submit a PDF for parsing
const job = await client.jobs.create({
  file: document, // File object or Buffer
  type: "parse",
});

// Fetch the job; repeat until it completes, or listen via WebSocket instead
const result = await client.jobs.get(job.id);
console.log(result.markdownResult);
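A single `jobs.get` call only returns the job's current state, so a client that does not use WebSockets needs a polling loop. The sketch below shows one way to write it. The `Job` shape, its `status` values, and the `waitForJob` helper are assumptions made for illustration; check the SDK's actual types before relying on them.

```typescript
// Assumed job shape for this sketch. The `status` field and its values are
// hypothetical and not confirmed by the OCRBase documentation.
interface Job {
  id: string;
  status: "pending" | "processing" | "completed" | "failed";
  markdownResult?: string;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `getJob` repeatedly until the job reaches a terminal state
// or the attempt budget is exhausted.
async function waitForJob(
  getJob: (id: string) => Promise<Job>,
  id: string,
  { intervalMs = 1000, maxAttempts = 30 } = {}
): Promise<Job> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob(id);
    if (job.status === "completed") return job;
    if (job.status === "failed") throw new Error(`Job ${id} failed`);
    await sleep(intervalMs);
  }
  throw new Error(`Job ${id} did not finish after ${maxAttempts} attempts`);
}
```

With the client from the example above, the call might look like `await waitForJob((id) => client.jobs.get(id), job.id)`, provided the returned job exposes a `status` field like the one assumed here. Passing the fetch function in as a parameter keeps the helper independent of the SDK and easy to test with a stub.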
For React developers, OCRBase offers a hook that abstracts the polling logic:
import { useOCRJob } from "ocrbase/react";

function DocumentViewer({ file }) {
  const { data, error, loading } = useOCRJob(file, "parse");
  if (loading) return <span>Processing…</span>;
  if (error) return <span className="text-red-600">{error.message}</span>;
  return <div className="prose">{data.markdownResult}</div>;
}
For teams that prefer a no‑code approach, the UBOS templates for quick start include a pre‑configured OCRBase workflow that can be dropped into the Workflow automation studio with a single click.
Self‑Hosting in Minutes
Deploy OCRBase on your own infrastructure using the provided Docker Compose file:
version: "3.8"
services:
  ocrbase:
    image: majcheradam/ocrbase:latest
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
    restart: unless-stopped
Run docker compose up -d and your OCR API will be reachable at http://localhost:8080. For detailed instructions, see the Enterprise AI platform by UBOS documentation.
Licensing, Contribution, and Community
OCRBase is released under the permissive MIT license, encouraging both commercial and academic use. The project welcomes contributions via pull requests, and the maintainers actively review issues on GitHub.
Community members can join the UBOS partner program to receive priority support, early feature access, and co‑marketing opportunities.
Benefits for Target Audiences
Developers
- Type‑safe SDK catches schema and payload mismatches at compile time, reducing runtime parsing errors.
- React hooks integrate OCR into modern front‑ends with minimal boilerplate.
- Dockerized deployment fits CI/CD pipelines seamlessly.
Data Scientists & AI Researchers
- Structured JSON output accelerates downstream NLP pipelines.
- Open‑weight OCR model can be fine‑tuned for domain‑specific vocabularies.
- WebSocket streams enable real‑time data ingestion for streaming analytics.
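For streaming ingestion, each incoming WebSocket message should be validated before it enters a pipeline. The sketch below parses a raw message into a typed event. The `JobEvent` shape (`jobId`, `status`, `progress`) is a hypothetical message format chosen for illustration; the article does not specify what OCRBase actually sends.

```typescript
// Hypothetical event format; OCRBase's real message schema may differ.
interface JobEvent {
  jobId: string;
  status: string;
  progress?: number; // assumed to be a fraction between 0 and 1
}

// Parses one raw WebSocket message. Returns null for malformed input
// so that a bad message never crashes the consumer.
function parseJobEvent(raw: string): JobEvent | null {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    const { jobId, status, progress } = data as Record<string, unknown>;
    if (typeof jobId !== "string" || typeof status !== "string") return null;
    const event: JobEvent = { jobId, status };
    if (typeof progress === "number") event.progress = progress;
    return event;
  } catch {
    return null; // not valid JSON
  }
}

// Example with a sample message.
const sample = parseJobEvent(
  '{"jobId":"job_123","status":"processing","progress":0.4}'
);
console.log(sample);
```

In a browser or Bun runtime, this function would typically be called from a standard WebSocket `message` listener, passing the event's `data` property when it is a string.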
Enterprises & SMBs
- Self‑hosted option keeps documents on your own infrastructure, supporting data sovereignty and compliance work (GDPR, HIPAA).
- Scalable queue system supports batch processing of millions of pages.
- Integration with AI marketing agents enables automated content extraction from contracts, invoices, and marketing collateral.
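When submitting a large batch from a client, it usually helps to cap the number of requests in flight so the server-side queue is fed at a steady rate. The helper below is a generic sketch of that pattern; `mapWithConcurrency` is a hypothetical utility written for this article, not part of the OCRBase SDK.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next item to claim

  // Each lane claims items one at a time until none are left.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

With the quick-start client, a batch submission might then be written as `await mapWithConcurrency(files, 4, (file) => client.jobs.create({ file, type: "parse" }))`, where `files` and the limit of 4 are placeholder choices.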
Illustration – OCRBase Architecture Overview

Related UBOS Resources
While OCRBase handles the heavy lifting of text extraction, UBOS offers a suite of complementary tools that can extend its capabilities:
- OCRBase project page – central hub for documentation and updates.
- AI SEO Analyzer – automatically audit SEO health of extracted web content.
- AI YouTube Comment Analysis tool – turn video comments into actionable insights.
- AI Video Generator – create video summaries from OCR‑derived transcripts.
- AI Chatbot template – build conversational agents that can query OCR‑extracted data.
- ElevenLabs AI voice integration – add voice narration to OCR‑generated reports.
- OpenAI ChatGPT integration – enhance parsing logic with the latest GPT models.
- Telegram integration on UBOS – receive OCR job status alerts directly in Telegram channels.
- ChatGPT and Telegram integration – let users query OCR results via a Telegram bot.
- Chroma DB integration – store vector embeddings of extracted text for semantic search.
Pricing and Support Options
OCRBase itself is free under the MIT license, but UBOS offers managed hosting plans for teams that prefer a hands‑off experience. Review the UBOS pricing plans to compare self‑hosted versus fully managed options, including SLA guarantees and priority support.
Conclusion – Take Action Today
OCRBase bridges the gap between raw OCR output and actionable structured data, empowering developers, data scientists, and enterprises to automate document workflows at scale. By leveraging the Enterprise AI platform by UBOS, you can integrate OCRBase with other AI services, such as AI Email Marketing or AI Image Generator, to build end‑to‑end pipelines that turn scanned documents into business intelligence.
Ready to accelerate your document processing? Visit the UBOS homepage to explore tutorials, join the community, and start deploying OCRBase in minutes.
“OCRBase has become the backbone of our invoice automation pipeline, cutting processing time by 70% while improving data accuracy.” – Lead Engineer, FinTech Startup