Re: Freeware Windows workflow: 40Mb 400pg OCR'd A4 book pdfs

Posted: 16 Mar 2016, 02:28
by deadrocker
The key point to understand is that "The end result is the same with a separate text layer in the PDF." And, as already noted, "As others have said this works well on computers but less well on eBook readers."

I'm in way over my head, I think.

Posted: 04 Apr 2016, 20:53
by Mlambert
I am a pretty good carpenter, so I decided to build my own Archivist. I have been a book collector for years (I'm also a hand bookbinder). Unfortunately, I am NOT a computer programmer. I don't know how to do anything with a computer using a command line interface. I can follow instructions pretty well, however, and I catch on fast.

Here is where I am at now: The Archivist is built (including all electronics, lighting - the works) and I have scanned my first 103-page book. I have all of my pictures in a folder called Images on my SD card. I have installed Scan Tailor and I feel comfortable with how to use it.

Problem is, I can't seem to figure out how to get my images "assembled" as side by side pages, all nice and aligned. I tried to figure out Total Commander, but I can't make heads or tails of it. I'm not even sure it is what I need. Can I get some step-by-step help to figure out post processing? Is anyone willing or able to patiently guide me through a workflow that I can succeed at? Thank you!

Re: Freeware Windows workflow: 40Mb 400pg OCR'd A4 book pdfs

Posted: 05 Apr 2016, 13:04
by duerig
Mlambert, it sounds like you are almost there. The very last step of post-processing is to turn the pile of cleaned up images into a PDF (or similar format) so that you have the whole book in one file that is easy to read. There are a lot of options for doing this depending on how you want to use your ebook.

For me, the simplest freeware program I have used for this is IrfanView:

http://www.irfanview.com/

Run the program. Click 'Options -> Multipage images -> Create multipage PDF'. Click 'Add Images' and select all the images you want. Usually I click 'Sort files' afterwards to make sure that they aren't in an odd order. Then click 'Create PDF image'. Then you can read it in any PDF viewer. If you want pages side by side, that is just an option you can set on the PDF viewer.

One slight complication: because of some UI limitation, I sometimes had to click 'Add Images' several times and select only a hundred or so images each time. And when there are a ton of images, the easiest thing to do is to click the first image, then shift-click the last image to select everything in the range.
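
For anyone who would rather script this step than click through it, here is a rough Python sketch of the same idea: put the page images in reading order, then write them into one multipage PDF. It is not part of the IrfanView workflow above. It assumes the Pillow library is installed, and the function names, the "*.tif" pattern, and the paths in the usage note are all made up for illustration. The sorting helper is there for the same reason as 'Sort files': so that page10 does not land ahead of page2.

```python
import re
from pathlib import Path


def natural_key(name):
    """Split a file name into text and number parts so 'page10' sorts after 'page2'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def sorted_pages(folder, pattern="*.tif"):
    """Return the image paths in `folder` in reading order."""
    return sorted(Path(folder).glob(pattern), key=lambda p: natural_key(p.name))


def build_pdf(folder, output_pdf, pattern="*.tif"):
    """Combine every matching image in `folder` into one multipage PDF."""
    # Pillow is imported here so the sorting helpers above work without it.
    from PIL import Image

    paths = sorted_pages(folder, pattern)
    if not paths:
        raise ValueError(f"no images matching {pattern} in {folder}")

    pages = []
    for path in paths:
        img = Image.open(path)
        # Pillow's PDF writer rejects some modes (e.g. RGBA), so normalise those.
        if img.mode not in ("1", "L", "RGB"):
            img = img.convert("RGB")
        pages.append(img)

    # The first image is the base document; the rest are appended as pages.
    pages[0].save(output_pdf, save_all=True, append_images=pages[1:])
```

Usage would be something like build_pdf("Images/out", "book.pdf"), where both paths are placeholders for wherever your cleaned-up images actually live. Unlike the dialog, this takes the whole folder in one go, so there is no need to add images a hundred at a time.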

There are also paid options like ABBYY FineReader or Adobe Acrobat which make this whole process much slicker, add nifty features (searchable PDFs with OCR, etc.), and of course cost a lot of money. :-) But this is a thread about freeware options.

-D

Re: Freeware Windows workflow: 40Mb 400pg OCR'd A4 book pdfs

Posted: 06 Apr 2016, 08:33
by Mlambert
Thanks, Jonathon. I will pursue these options. Mark