
Scantools for Linux - add OCR to an existing PDF or create OCR'd PDFs from scans



Post by Krokkie » 16 Jan 2020, 06:18

Scantools for Linux - convert to PDF with OCR

Some users in the community may be interested in producing OCR'd PDFs. Solutions for this already exist (such as pdfbeads or pdf.py), but how about adding OCR on the fly while converting an existing scan to PDF, or adding OCR to a PDF you already have?

Scantools is a set of Linux PDF/A tools with the ability to perform OCR.

Scantools
https://cplx.vm.uni-freiburg.de/scantools/

Downloads are available here:
https://software.opensuse.org/package/scantools


Usage Examples:


Add OCR to an existing PDF with ocrPDF:

ocrPDF book.pdf -o bookocr.pdf

Produce an OCR'd PDF from a JPG scan with image2pdf:

image2pdf scan.jpg -p A4 -r fit -b -o scanocr.pdf
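
If you have a whole folder of files to process, the two commands above could be wrapped in a small shell function along these lines. This is only a rough sketch: to_ocr_pdf is a made-up helper name, not something shipped with scantools, and the "ocr.pdf" output suffix simply follows the naming used in the examples above. The ocrPDF and image2pdf command lines are copied unchanged from those examples; no other options are assumed.

```shell
# Sketch only: to_ocr_pdf is a hypothetical helper, not part of scantools.
# The ocrPDF and image2pdf invocations are copied verbatim from the usage
# examples in this post; no additional flags are assumed.
to_ocr_pdf() {
    for src in "$@"; do
        case "$src" in
            *ocr.pdf)
                # Looks like output from an earlier run; leave it alone.
                continue ;;
            *.pdf)
                out="${src%.pdf}ocr.pdf"
                if [ ! -e "$out" ]; then
                    ocrPDF "$src" -o "$out" || echo "ocrPDF failed: $src" >&2
                fi ;;
            *.jpg)
                out="${src%.jpg}ocr.pdf"
                if [ ! -e "$out" ]; then
                    image2pdf "$src" -p A4 -r fit -b -o "$out" || echo "image2pdf failed: $src" >&2
                fi ;;
            *)
                echo "skipping unsupported file: $src" >&2 ;;
        esac
    done
}
```

Called as to_ocr_pdf *.pdf *.jpg, it writes book.pdf to bookocr.pdf and scan.jpg to scanocr.pdf, and skips anything that already has an output file, so it should be safe to re-run. Note that a PDF and a JPG sharing the same base name would map to the same output, so keep those in separate folders.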
