Best way to proceed with scanned magazines etc

Moderator: peterZ

sidney
Posts: 5
Joined: 19 Feb 2013, 10:27
Number of books owned: 0
Country: germany

Best way to proceed with scanned magazines etc

Post by sidney »

Usually, when I have a scanned book that is mostly text (perhaps with some black-and-white diagrams or illustrations), I use scantailor to split pages etc. and then use djvubind (together with tesseract or cuneiform) to produce a small, good-looking, searchable djvu file.
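For reference, the binding step of that text-only workflow can be sketched roughly like this. This is only a minimal sketch under assumptions: it assumes scantailor has already written its TIFF pages to a directory (the name `out` here is just an example), that djvubind takes that directory as its positional argument, and that the OCR engine is selected with `--ocr-engine`; check `djvubind --help` for the exact options in your version. The helper names `build_djvubind_command` and `bind` are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path

# OCR engines mentioned in the post above.
SUPPORTED_ENGINES = ("tesseract", "cuneiform")


def build_djvubind_command(tiff_dir, ocr_engine="tesseract"):
    """Return the argument list for binding a directory of page images."""
    if ocr_engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported OCR engine: {ocr_engine}")
    # Assumption: djvubind accepts --ocr-engine=ENGINE and the image
    # directory as a positional argument; verify with `djvubind --help`.
    return ["djvubind", f"--ocr-engine={ocr_engine}", str(Path(tiff_dir))]


def bind(tiff_dir, ocr_engine="tesseract"):
    """Run djvubind on the directory, failing early if it is not installed."""
    cmd = build_djvubind_command(tiff_dir, ocr_engine)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("djvubind is not on PATH")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the command; call bind("out") to actually run it.
    print(" ".join(build_djvubind_command("out")))
```

Building the command separately from running it makes it easy to inspect what would be executed before touching a whole directory of scans.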

However, with magazines or books that contain lots of pictures (which sometimes extend to the page margin), this does not produce what I want. Here are some of the problems:

- scantailor sometimes fails to recognize the content if there are pictures that extend to the page margin, or if there are printed notes or annotations in the margins. Then I have to adjust everything manually.
- I tried scantailor in mixed mode to separate text and images, but then I have to define most pictures manually, since the picture detection algorithm does not seem very mature.
- But how should I proceed afterwards? djvubind would "destroy" the nice color or grayscale pictures.

So what's the best way (detailed steps on Linux) to process magazines or books with lots of pictures to get a searchable pdf or djvu in which the pictures keep good color or grayscale quality and the file size is not too big?
sidney

Re: Best way to proceed with scanned magazines etc

Post by sidney »

I just realized that djvubind can also handle tif files in mixed mode while preserving colors. Nevertheless, what's your favourite way to process magazines or books with lots of pictures?