My book processing workflow

Share your software workflow. Write up your tips and tricks on how to scan, digitize, OCR, and bind ebooks.

Moderator: peterZ

rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States

My book processing workflow

Post by rob »

My goal is to turn a physical book into a virtual book that will be reflowable on whatever device I like, or into a PDF that is customized for my particular device. Here are the steps involved in making that happen:

1. Scan book using DIY Book Scanner
2. Postprocess the scans into undistorted, cleaned-up images (PostProcessor software)
3. OCR (ABBYY FineReader)
4. Spellcheck and correct other, non-spelling errors (in ABBYY FineReader)
5. Export to PDF (ABBYY FineReader)
6. Extract data and convert to reflowable format (then a miracle occurs)
7. Convert to desired format (PDF or MOBI)

Right now I'm working on step 6...
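
For anyone wondering what step 6 might involve, here is a minimal sketch of one piece of it: turning hard-wrapped lines of extracted text back into reflowable paragraphs. It is not the actual step 6 code. The function name `reflow` and both heuristics (a blank line ends a paragraph; a trailing hyphen followed by a lowercase letter means a word was split across lines) are assumptions made for illustration, and it assumes single-column text that has already been pulled out of the PDF as a list of lines.

```python
import re


def reflow(lines):
    """Join hard-wrapped lines into paragraphs.

    Assumptions (illustrative, not from the original post):
    - a blank line marks the end of a paragraph
    - a line ending in letter + hyphen, followed by a line starting
      with a lowercase letter, is one word split across two lines
    """
    paragraphs = []
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: close the paragraph in progress, if any.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if re.search(r"[A-Za-z]-$", current) and line[0].islower():
            # Rejoin a word that was hyphenated at the line break.
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    return paragraphs


page = [
    "The quick brown fox jum-",
    "ped over the",
    "lazy dog.",
    "",
    "Second para-",
    "graph here.",
]
for paragraph in reflow(page):
    print(paragraph)
```

One known weakness of this heuristic: a genuine compound such as "well-known" that happens to break at its hyphen will lose the hyphen, so a dictionary check would be needed to handle that case properly.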
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.

Re: My book processing workflow

Post by rob »

I should mention that the step 6 I'm working on only handles single-column text. I'm not quite up to detecting multi-column text yet.
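
Since step 6 is limited to single-column text, a cheap guard that flags suspicious pages before processing could be useful. The sketch below is one possible heuristic, not a real layout detector: it assumes each text line is available as a bounding box `(x0, y0, x1, y1)`, and it treats a page as multi-column when enough lines have a neighbor sitting beside them (overlapping vertically, not horizontally). The function name, the box format, and the `min_fraction` threshold are all hypothetical.

```python
def looks_multi_column(boxes, min_fraction=0.3):
    """Guess whether a page has more than one text column.

    boxes: list of (x0, y0, x1, y1) tuples, one per text line
           (hypothetical input format; y grows downward or upward,
           it does not matter for this check).
    min_fraction: share of lines that must have a side-by-side
           neighbor before the page is flagged (an arbitrary default).
    """
    if len(boxes) < 2:
        return False
    flagged = 0
    for i, (ax0, ay0, ax1, ay1) in enumerate(boxes):
        for j, (bx0, by0, bx1, by1) in enumerate(boxes):
            if i == j:
                continue
            overlaps_vertically = ay0 < by1 and by0 < ay1
            disjoint_horizontally = ax1 <= bx0 or bx1 <= ax0
            if overlaps_vertically and disjoint_horizontally:
                # This line has another line beside it on the page.
                flagged += 1
                break
    return flagged / len(boxes) >= min_fraction


single = [(50, 100, 400, 112), (50, 115, 400, 127), (50, 130, 380, 142)]
double = [
    (50, 100, 220, 112), (240, 100, 410, 112),
    (50, 115, 220, 127), (240, 115, 410, 127),
]
print(looks_multi_column(single), looks_multi_column(double))
```

The threshold is there so that a stray page number or margin note beside one line does not flag the whole page. Pages that trip the check could simply be set aside for manual handling.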