Freeware Windows workflow: 40Mb 400pg OCR'd A4 book pdfs

Moderator: peterZ

deadrocker
Posts: 1
Joined: 15 Mar 2016, 03:40
Number of books owned: 193
Country: united states of america

Re: Freeware Windows workflow: 40Mb 400pg OCR'd A4 book pdfs

Post by deadrocker »

The key point to understand is that “The end result is the same with a separate text layer in the PDF.” And, as was said earlier, “As others have said this works well on computers but less well on eBook readers.”
Mlambert
Posts: 2
Joined: 09 Sep 2015, 14:36
E-book readers owned: None
Number of books owned: 750
Country: USA

I'm in way over my head, I think.

Post by Mlambert »

I am a pretty good carpenter, so I decided to build my own Archivist. I have been a book collector for years (I'm also a hand bookbinder). Unfortunately, I am NOT a computer programmer. I don't know how to do anything with a computer using a command line interface. I can follow instructions pretty well, however, and I catch on fast.

Here is where I am now: the Archivist is built (including all electronics, lighting - the works) and I have scanned my first 103-page book. I have all of my pictures in a folder called Images on my SD card. I have installed Scan Tailor and I feel comfortable with how to use it.

The problem is, I can't seem to figure out how to get my images "assembled" as side-by-side pages, all nice and aligned. I tried to figure out Total Commander, but I can't make heads or tails of it. I'm not even sure it is what I need. Can I get some step-by-step help to figure out post-processing? Is anyone willing or able to patiently guide me through a workflow that I can succeed at? Thank you!
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: Freeware Windows workflow: 40Mb 400pg OCR'd A4 book pdfs

Post by duerig »

Mlambert, it sounds like you are almost there. The very last step of post-processing is to turn the pile of cleaned up images into a PDF (or similar format) so that you have the whole book in one file that is easy to read. There are a lot of options for doing this depending on how you want to use your ebook.

For me, the simplest freeware program I have used for this is IrfanView:

http://www.irfanview.com/

Run the program. Click 'Options -> Multipage images -> Create multipage PDF'. Click 'Add Images' and select all the images you want. I usually click 'Sort files' afterwards to make sure that they aren't in an odd order. Then click 'Create PDF image', and you can read the result in any PDF viewer. If you want pages side by side, that is just an option you can set in the PDF viewer.
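A note on why the sort step matters: plain alphabetical order puts "page10" ahead of "page2" when the numbers in the filenames aren't zero-padded. If you ever want to check the order outside of IrfanView, here is a minimal Python sketch of a number-aware sort. It has nothing to do with IrfanView's own code, and the function names and sample filenames are made up for illustration.

```python
import re

def natural_key(filename):
    """Split a name into text and number chunks so 'page2' sorts before 'page10'.

    re.split with a capture group always alternates text, digits, text, ...
    so the odd-indexed pieces are the digit runs.
    """
    parts = re.split(r"(\d+)", filename)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

def sort_pages(filenames):
    """Return the filenames in page order rather than plain alphabetical order."""
    return sorted(filenames, key=natural_key)

if __name__ == "__main__":
    # Hypothetical filenames, only to show the difference.
    pages = ["page10.tif", "page2.tif", "page1.tif"]
    print(sorted(pages))      # plain alphabetical order
    print(sort_pages(pages))  # number-aware order
```

If your scans are already numbered with leading zeros (page001, page002, ...), plain alphabetical order gives the same result and none of this is needed.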

One slight complication: I sometimes found that I had to click 'Add Images' multiple times, selecting a hundred or so images each time, because of some UI limitation. And when there are a ton of images, the easiest thing to do is to click the first image, then shift-click the last image to select everything in the range.
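To plan those hundred-at-a-time selections, it can help to know in advance which file starts and ends each group. The following is just a rough Python sketch of that bookkeeping; the batch size of 100 is taken from my "hundred or so" above and is not a documented limit, and the helper names and filenames are hypothetical.

```python
def batches(items, size=100):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

def batch_ranges(filenames, size=100):
    """Return (first, last) filename pairs: click the first, shift-click the last."""
    return [(group[0], group[-1]) for group in batches(filenames, size)]

if __name__ == "__main__":
    # Hypothetical, already sorted list of 250 page images.
    pages = [f"page{n:03d}.tif" for n in range(1, 251)]
    for first, last in batch_ranges(pages):
        print(f"select {first} through {last}")
```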

There are also paid options like ABBYY FineReader or Adobe Acrobat, which make this whole process much slicker, add nifty features (searchable PDFs with OCR, etc.), and of course cost a lot of money. :-) But this is a thread about freeware options.

-D
Mlambert
Posts: 2
Joined: 09 Sep 2015, 14:36
E-book readers owned: None
Number of books owned: 750
Country: USA

Re: Freeware Windows workflow: 40Mb 400pg OCR'd A4 book pdfs

Post by Mlambert »

Thanks, Jonathon. I will pursue these options. Mark