B/W source - how to keep format?

eL_PuSHeR
Posts: 125
Joined: 28 Jun 2010, 15:25

Post by eL_PuSHeR » 04 Oct 2011, 16:33

Hello.

This is the first time I am using Scan Tailor with a source that is pure black & white (600 dpi). How can I make the output match the original? If I choose black & white, the result is either too dark or too washed out. If I choose color/grayscale, the output files get too big. Help.

Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Post by Tulon » 05 Oct 2011, 04:22

Can you post an example page?
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.


dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

Post by dingodog » 05 Oct 2011, 08:38

The best option is to use jbig2enc.

See how good the output is with jbig2enc:

http://ifile.it/si9lfye

eL_PuSHeR
Posts: 125
Joined: 28 Jun 2010, 15:25

Post by eL_PuSHeR » 05 Oct 2011, 08:53

What? That page isn't even processed and has bad dpi.

daniel_reetz
Posts: 2780
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Post by daniel_reetz » 05 Oct 2011, 09:04

Umm, that is definitely a processed image and the "DPI" is fine/large - are you referring to the moire effects that are happening because it was scanned on a flatbed?

Thanks for sharing this, dingodog!

dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

Post by dingodog » 05 Oct 2011, 09:06

eL_PuSHeR wrote: What? That page isn't even processed and has bad dpi.
Hmm? So you believe what Adobe tells you?

You should not believe it:
# pdfinfo el-factor-humano.pdf
Tagged: no
Pages: 1
Encrypted: no
Page size: 6578 x 4424 pts
File size: 458678 bytes
Optimized: no
PDF version: 1.4
The PPI is the same. The DPI changed, but DPI is not PPI.

I rotated the image with mtPaint, and mtPaint stripped the DPI info (the image still has the same PPI). No quality was lost.
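
As a rough illustration of why a viewer can report a "bad" resolution when the DPI tag is missing, here is a minimal Python sketch. It relies only on the fact that PDF page sizes are measured in points (1 pt = 1/72 inch). The pixel width of the embedded image is not stated in this thread, so `assumed_pixel_width` is a hypothetical value chosen for illustration, and the 600 dpi figure is taken from the original question, not from the PDF.

```python
POINTS_PER_INCH = 72  # PDF page sizes are expressed in points, 1 pt = 1/72 inch


def page_size_inches(width_pts, height_pts):
    """Convert a page size in points (as printed by pdfinfo) to inches."""
    return width_pts / POINTS_PER_INCH, height_pts / POINTS_PER_INCH


def effective_ppi(pixels, points):
    """Pixels per inch implied by placing `pixels` across `points` of page."""
    return pixels / (points / POINTS_PER_INCH)


# Page size reported by pdfinfo above.
width_in, height_in = page_size_inches(6578, 4424)
print(f"page: {width_in:.1f} x {height_in:.1f} in")

# ASSUMPTION (not stated in the thread): the image is 6578 pixels wide,
# i.e. it was placed at one pixel per point because no DPI tag was present.
assumed_pixel_width = 6578
print(f"implied resolution: {effective_ppi(assumed_pixel_width, 6578):.0f} ppi")

# The same pixels tagged with the scan resolution from the first post.
print(f"width at 600 dpi: {assumed_pixel_width / 600:.2f} in")
```

Under that assumption the pixel data is identical either way; only the physical size the viewer derives from it changes.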

Do you want the processed image alone?

I extracted it from the jbig2enc output:
- http://imageshack.us/photo/my-images/40 ... aspng.png/

The image still has the dithering, since jbig2enc cannot blank non-text areas the way Scan Tailor can.

Images of this type need to be pre-processed, adjusting the contrast and brightness values, before being encoded with jbig2enc (this also helps with Scan Tailor).

The original color image is needed.

eL_PuSHeR
Posts: 125
Joined: 28 Jun 2010, 15:25

Post by eL_PuSHeR » 05 Oct 2011, 09:17

Forgive me but I do not see what your point is. My issue is with Scan Tailor's BW output.

dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

Post by dingodog » 05 Oct 2011, 09:28

The same task can be performed with jbig2enc.

I'm experimenting with various blur values to remove the dithering under the text areas.

Blurring an already dithered (1-bit) image before processing it with jbig2enc helps clear the dithering.

I blurred the image you provided with a Gaussian blur (radius = 4)

and then encoded it with jbig2enc:

jbig2 -s -p -v -T 125 out.png && pdf.py output>out.pdf
I get a clearer result:
- http://ifile.it/ko5u714
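
To see why blurring before thresholding can clear dither dots while keeping solid text, here is a small pure-Python sketch on a toy image. It is not what jbig2enc or any image editor does internally. The kernel width (`sigma = radius / 2`) is an assumption, since editors define "radius" differently, and the cutoff of 125 only mirrors the `-T 125` in the command above on the assumption that it is the black/white threshold.

```python
import math


def gaussian_kernel(radius, sigma=None):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    # ASSUMPTION: sigma = radius / 2; real editors may map radius differently.
    sigma = sigma if sigma is not None else radius / 2.0
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve one row or column with the kernel, clamping at the edges."""
    radius = len(kernel) // 2
    last = len(values) - 1
    return [sum(w * values[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(kernel))
            for i in range(len(values))]


def gaussian_blur(image, radius):
    """Separable Gaussian blur: blur every row, then every column."""
    kernel = gaussian_kernel(radius)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def threshold(image, level):
    """Pixels darker than `level` become black (0), the rest white (255)."""
    return [[0 if v < level else 255 for v in row] for row in image]


# Toy 24x24 page: white paper, sparse dither dots, one solid "text" block.
SIZE = 24
page = [[255] * SIZE for _ in range(SIZE)]
for y in range(2, SIZE, 4):
    for x in range(2, SIZE, 4):
        page[y][x] = 0           # isolated dither dots
for y in range(8, 16):
    for x in range(8, 16):
        page[y][x] = 0           # solid block standing in for text

cleaned = threshold(gaussian_blur(page, radius=4), level=125)
for row in cleaned:
    print("".join("#" if v == 0 else "." for v in row))
```

In this toy case the isolated dots average out to near white and fall above the cutoff, while the centre of the solid block stays black. Real scans will need the radius and threshold tuned by eye.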

pejuko
Posts: 30
Joined: 17 Feb 2011, 17:06

Post by pejuko » 05 Oct 2011, 11:51

Try mixed mode. If the file size is a problem, you can reduce the number of gray tones with ImageMagick like this:
convert out/page.tif -depth 2 -quality 100 page.png
This creates page.png, which should have only 4 gray tones but usually looks good.
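
For anyone wondering where the 4 gray tones come from: 2 bits per pixel can encode 2^2 = 4 values. The sketch below shows the idea in plain Python by snapping each 8-bit gray value to the nearest of four evenly spaced levels. It illustrates the principle only and is not a claim about ImageMagick's exact rounding; `quantize_gray` is a made-up helper name.

```python
def quantize_gray(value, bits=2):
    """Snap an 8-bit gray value (0-255) to the nearest of 2**bits even levels."""
    top = (1 << bits) - 1                 # highest level index, 3 for 2 bits
    index = round(value * top / 255)      # nearest level index
    return index * 255 // top             # scale back to the 0-255 range


# Every possible 8-bit input collapses onto one of four tones.
tones = sorted({quantize_gray(v) for v in range(256)})
print(tones)  # [0, 85, 170, 255]
```

Fewer distinct tones compress better as PNG, which is why the output shrinks while anti-aliased edges and pictures keep some gradation.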

I got this result from Scan Tailor + convert:
http://www.4shared.com/photo/y6-7IKB8/out.html

Here I marked the text area below the picture as non-image, so the background disappeared, and I set the thickness to -30.
