Scanning Reference books

Post by vanille » 30 Sep 2013, 15:06

Can someone point me to the relevant tutorial or forum post for my issue? I have a lot of reference books to scan, and I'm hoping to end up with links in the table of contents that jump to the corresponding pages.

Thank you.

Re: Scanning Reference books

Post by Librum » 02 Oct 2013, 12:10


As in internal PDF links, or an HTML framework linking to page images, or pure OCR, or as...

Frankly, there is not enough detail about your desires to even hazard a guess.

Our (the Librum's) SOP is to make images and put them into a PDF, but not to run OCR ('find text'). Instead we make an ATOCI (ASCII Table Of Contents and Index). We have converted an ATOCI into HTML form to provide links, but normally do not. This system works well for us: it keeps the files small and in a standard format (PDF), and it lets us combine the indexes of several works (such as a set) into one master file that can be searched in a browser.

Is this something like you are considering?

"Ever wish to know what thy grandparents knew? Thee CAN!"

Re: Scanning Reference books

Post by abmartin » 04 Oct 2013, 17:15

If you are trying to have clickable links in the document (like Google Books sometimes has), you would probably need full Acrobat and a lot of manual work.

I prefer just using the outline features of the PDF and DjVu file formats, which means that I don't need to browse to the TOC page.

I make my files with djvubind. It is easy enough to use it for the indexing as well, but I prefer to wait until binding is done and then use a tool called djvusmooth, which lets me browse through the pages, create bookmarks as I go, and edit the outline as appropriate. Djvubind does a great job of running OCR and then binding the pages together. (I much prefer it to making PDF files.)
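If you already have the table of contents typed out and would rather not click through every page, here is a rough sketch of writing the outline as a text file instead. It assumes the djvused tool from DjVuLibre (not mentioned above, so treat it as my assumption), whose set-outline command reads a "(bookmarks ...)" list. The function name and the sample titles and page numbers are made up for illustration, and this only produces a flat, single-level outline.

```python
def djvu_outline(entries):
    """Build a flat djvused outline from (title, page_number) pairs.

    Hypothetical helper: page numbers are 1-based positions in the
    bound DjVu file, not the numbers printed on the book pages.
    """
    lines = ["(bookmarks"]
    for title, page in entries:
        # Escape backslashes and double quotes inside the quoted title.
        safe = title.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f' ("{safe}" "#{page}")')
    lines.append(")")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up table of contents for illustration only.
    toc = [("Preface", 5), ("Chapter 1", 11), ("Index", 240)]
    with open("outline.txt", "w", encoding="utf-8") as handle:
        handle.write(djvu_outline(toc))
```

The resulting file can then be applied with something like: djvused book.djvu -e 'set-outline outline.txt' -s (where book.djvu is a placeholder name). Opening the result in djvusmooth afterwards is a reasonable way to check that each bookmark lands on the right page.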


You can also make PDF files using PDFBeads. PDFtk has the ability to create bookmarks. I'm sure there is some other, simpler tool out there. (If memory serves, there's a GUI for PDFtk somewhere that might have bookmarking features.)
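For the PDFtk route, a minimal sketch of how the bookmarks could be generated from a typed-out table of contents. It assumes a PDFtk version whose update_info operation accepts BookmarkBegin records (the same format dump_data prints), and it assumes plain ASCII titles. The function name and the sample entries are hypothetical.

```python
def pdftk_bookmarks(entries):
    """Build pdftk update_info text from (title, level, page) tuples.

    Hypothetical helper: level 1 is a top-level entry, level 2 is
    nested under the previous level 1 entry, and page is the 1-based
    page position in the PDF file.
    """
    lines = []
    for title, level, page in entries:
        lines += [
            "BookmarkBegin",
            f"BookmarkTitle: {title}",
            f"BookmarkLevel: {level}",
            f"BookmarkPageNumber: {page}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up table of contents for illustration only.
    toc = [("Preface", 1, 5), ("Part I", 1, 9), ("Chapter 1", 2, 11)]
    with open("bookmarks.txt", "w", encoding="ascii") as handle:
        handle.write(pdftk_bookmarks(toc))
```

The file can then be applied with something like: pdftk scanned.pdf update_info bookmarks.txt output scanned_with_outline.pdf (both PDF names are placeholders). This gives an outline panel in the viewer rather than clickable links on the TOC page itself.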
