Kreuzberg is a Python library for text extraction from documents. It provides a unified interface for extracting text from PDFs, images, office documents, and more, with both async and sync APIs. Tries to Just Work, without complex configuration. No external API calls or cloud dependencies required. Lightweight processing without GPU requirements. Comprehensive support for documents, images, and text formats. Support for Tesseract, EasyOCR, and PaddleOCR.
A simple app (PWA) to extract text from images using Tesseract. No image upload. Everything runs locally on your device. Choose a image, edit the text if you must, then just copy and paste.
Looks like you can just clone the repo into a webroot and it'll work. Seems to work decently well.
A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, tables, charts, etc. The vision models just make sense! Uses gpt-4o-mini to look at and figure out what's in the images, but so far it doesn't seem to support self-hosted models.
The general logic:
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall or Limitless' Rewind.ai. With OpenRecall, you can easily access your digital history, enhancing your memory and productivity without compromising your privacy.
OpenRecall captures your digital history through regularly taken screenshots. The text and images within these screenshots are analyzed and made searchable, allowing you to quickly find specific information by typing relevant keywords into OpenRecall. You can also manually scroll back through your history to revisit past activities.
OpenRecall is 100% open-source, allowing you to audit the source code for potential backdoors or privacy-invading features. Works on Windows, macOS, and Linux, giving you the freedom to use it on your preferred operating system. Your data is stored locally on your device, and you have the option (soon to be implemented) to encrypt it with a password for added security. No cloud integration is required. OpenRecall is designed to work with a wide range of hardware, unlike proprietary solutions that may require specific certified devices.
Surya is a document OCR toolkit.
Built on top of PyTorch. Multiple models.
Normcap is a screen capture tool for the desktop. Specifically, it looks for text in the screencap and OCRs it for you.
A flatbed document and book scanner. Will also scan 3d objects that'll fit under the camera. Minimum of 13MP image resolution (4160 x 3120), can handle up to A3 size documents. Maximum document thickness: 10mm. Scanner camera's height above the document is adjustable. As fast as one second per scan. Portable - can be folded up for transportation. Can detect when you turn the page or change the document, look for the new page, and automatically take the next image. Abbyy OCR functionality built in. Scans to Word documents, PDF, Excel spreadsheets, or TIFF image files. Software for Windows (back to XP) and OS X.
Shows up as a UVC device under Linux (archived), so any image or video capture software that is UVC enabled can do the work for you.
Paperless-ngx is a document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper. Paperless-ngx forked from paperless-ng to continue the great work and distribute responsibility of supporting and advancing the project among a team of people.
Paperless-ngx is a webapp that indexes your scanned documents and allows you to easily search for documents and store metadata alongside your documents. Paperless-ngx does not control your scanner, it only helps you deal with what your scanner produces.
Store archived documents with an embedded OCR text layer, while keeping originals available.
A website which allows you to upload arbitrary images or document files to be run through optical character recognition software. Text is output to the bottom of the screen, suitable for cut-and-pasting. Uses the Cuneiform and Tesseract OCR packages on the back end (but not both at once).