
input (image type) into scantailor

seasalt

Post by seasalt » 04 May 2011, 19:50

hello - does it make any difference to the output (in terms of OCR quality or output compression) if I use my flatbed scanner to scan to
- tiff
- png
- jpeg
(I have all 3 options on my scanner, plus direct to PDF)

most books are text with maybe a handful of illustrations
sometimes I have a book that is laid out entirely as picture instructions

thank you for your help

eL_PuSHeR
Posts: 125
Joined: 28 Jun 2010, 15:25

Re: input (image type) into scantailor

Post by eL_PuSHeR » 05 May 2011, 02:24

JPEG lossy compression sucks for text. I would use TIFF: quick, reasonably good compression, and a very well-known format. It also supports EXIF data. PNG compression is better, but there is no EXIF support (if you need it).

Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Re: input (image type) into scantailor

Post by Tulon » 05 May 2011, 02:34

I would actually go with PNG. That one is guaranteed to be lossless. TIFF supports both lossless and lossy compression, so you never know what you end up with.
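For anyone who already has TIFF files and wants to know which kind they are, here is a rough sketch that reads the Compression tag (tag 259) from the first image directory of a classic TIFF, using only the Python standard library. The function names `tiff_compression` and `describe` are made up for this example, the lossy/lossless grouping is a simplification (JPEG-in-TIFF is treated as lossy, which is the usual case), and BigTIFF files are not handled.

```python
import struct

# Compression tag values from the TIFF 6.0 spec and common extensions.
# Grouping them as lossless/lossy is a simplification for this sketch.
LOSSLESS = {1, 2, 3, 4, 5, 8, 32773, 32946}  # none, CCITT variants, LZW, Deflate, PackBits
LOSSY = {6, 7}                                # old-style JPEG, JPEG


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"   # little-endian file
    elif data[:2] == b"MM":
        endian = ">"   # big-endian file
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled)")
    (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i   # each IFD entry is 12 bytes
        tag, _type, _n = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == 259:
            # A single SHORT sits in the first 2 bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1  # the spec's default when the tag is absent: no compression


def describe(data: bytes) -> str:
    """Classify TIFF bytes as 'lossless', 'lossy', or 'unknown'."""
    value = tiff_compression(data)
    if value in LOSSLESS:
        return "lossless"
    if value in LOSSY:
        return "lossy"
    return "unknown"
```

To use it, pass the whole file's bytes, for example `describe(open("page.tif", "rb").read())` with `page.tif` standing in for one of your scans. The whole file is read because the image directory is often stored near the end.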
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.

seasalt

Re: input (image type) into scantailor

Post by seasalt » 05 May 2011, 04:39

thx Tulon and elpusher
also my scanner (Epson GT-1500)
has so many settings... it has home, office and professional mode.

if I want the BEST OCR (mainly non-fiction books, text and some illustrations)
are these settings correct?
1. Descreening filter ON (to reduce moire patterns - I don't know what they are)
2. Unsharp mask filter ON (it says it sharpens the image)
3. 24-bit colour (I read on the forum, in a tonydart comment about OCR, that the best is a colour scan and set black/dark grey object
options for 3:
Choose a color depth setting from the Image Type menu:
■ 24-bit Color for the highest quality color scans
■ Color Smoothing for color documents without photographs
■ 8-bit Grayscale for the highest quality scans of black-and-white photos or images
■ Halftone if you want to select special halftone patterns
■ Black&White for text or line art
2. Click to see more options, then choose Best or Draft for the Scanning Quality setting.
4. Choose 600 dpi - I read that OCR does not improve above 600 dpi, even though it is text and a few illustrations
options for the resolution setting (50 to 9600 dpi).
5. Specify a Target Size for the image: same as original, versus customising it

6. ON or Off???? Use the Adjustment settings to modify the image as necessary.
(You may have to use the scroll bar to see these options.)
- Auto Adjust – When Auto Adjust is on (the default), your software detects and analyzes the image with the document settings you've specified to determine the optimum settings for your scan.
- Histogram Adjustment – to adjust the highlight, shadow, and gamma input levels.
- Tone Correction – to choose a preset tone curve for specific effects or to change the tone curve manually.
- Image Adjustment – to adjust the color balance, saturation, brightness, and contrast settings.

Tulon

Re: input (image type) into scantailor

Post by Tulon » 05 May 2011, 05:55

Scan Tailor likes its input to be as raw as possible, that is the less "enhancements" done by the scanner software, the better.

eL_PuSHeR

Re: input (image type) into scantailor

Post by eL_PuSHeR » 06 May 2011, 02:40

My advice is to scan at 300 dpi (minimum) and without scanner filters. You can postprocess the output later in software. I also think scanning at resolutions higher than 600 dpi is quite pointless because the files become HUGE and hard to work with.
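To put rough numbers on the file size point, here is a back-of-the-envelope sketch of the uncompressed size of one scanned page. The function name `raw_scan_bytes` and the US Letter page size are assumptions for illustration only; real files on disk will be smaller once PNG or TIFF compression is applied.

```python
def raw_scan_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Approximate uncompressed size in bytes of one scanned page."""
    width_px = round(width_in * dpi)     # pixels across
    height_px = round(height_in * dpi)   # pixels down
    return width_px * height_px * bits_per_pixel // 8


# Assumed example page: US Letter (8.5 x 11 inches), 24-bit colour.
for dpi in (300, 600, 1200):
    size_mb = raw_scan_bytes(8.5, 11, dpi, 24) / 1_000_000
    print(f"{dpi} dpi: about {size_mb:.0f} MB per page before compression")
```

Doubling the resolution quadruples the pixel count, so a 24-bit page goes from about 25 MB at 300 dpi to about 101 MB at 600 dpi and about 404 MB at 1200 dpi, before compression.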
