B/W source - how to keep format?

Posted: 04 Oct 2011, 16:33
by eL_PuSHeR
Hello.

This is the first time I am using Scan Tailor with a source that is pure black & white (600 dpi). How can I make the output match the original? If I choose black & white, the result is either too dark or too washed out. If I choose color/grayscale, the output files get too big. Help.

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 04:22
by Tulon
Can you post an example page?

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 08:10
by eL_PuSHeR

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 08:38
by dingodog
The best option is to use jbig2enc.

See how good the output is with jbig2enc:

http://ifile.it/si9lfye

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 08:53
by eL_PuSHeR
What? That page isn't even processed and has bad dpi.

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 09:04
by daniel_reetz
Umm, that is definitely a processed image and the "DPI" is fine/large - are you referring to the moire effects that are happening because it was scanned on a flatbed?

Thanks for sharing this, dingodog!

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 09:06
by dingodog
eL_PuSHeR wrote: What? That page isn't even processed and has bad dpi.
Hmm? So you believe Adobe?

You shouldn't:
# pdfinfo el-factor-humano.pdf
Tagged: no
Pages: 1
Encrypted: no
Page size: 6578 x 4424 pts
File size: 458678 bytes
Optimized: no
PDF version: 1.4
The ppi is the same. The dpi changed, but dpi is not ppi.

I rotated the image with MTpaint, and MTpaint stripped the dpi info (but the image has the same ppi). No quality loss occurred.
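To make the point about stripped dpi metadata concrete, here is a minimal sketch of the arithmetic. It assumes (this is not stated in the thread) that the PDF was built at the 72 dpi fallback once the dpi tag was gone, so the page size in points equals the pixel count. The helper names are hypothetical. PDF points are 1/72 inch, and the 600 dpi figure comes from the first post.

```python
POINTS_PER_INCH = 72  # PDF user-space unit: 1 pt = 1/72 inch


def pts_to_inches(pts):
    """Physical size a PDF viewer reports for a length given in points."""
    return pts / POINTS_PER_INCH


def pixels_to_inches(pixels, dpi):
    """Physical size of a pixel row when printed at the given resolution."""
    return pixels / dpi


if __name__ == "__main__":
    # Page size reported by pdfinfo above.
    width_pts, height_pts = 6578, 4424

    # What a viewer shows: a very large page.
    print("viewer size (in):",
          round(pts_to_inches(width_pts), 2),
          round(pts_to_inches(height_pts), 2))

    # ASSUMPTION: 1 pixel was mapped to 1 pt, so the scan is 6578 x 4424
    # pixels. At the original 600 dpi this is an ordinary page size.
    print("size at 600 dpi (in):",
          round(pixels_to_inches(width_pts, 600), 2),
          round(pixels_to_inches(height_pts, 600), 2))
```

Under that assumption the pixel data is unchanged; only the resolution tag, and therefore the reported physical size, differs.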

Do you want the PROCESSED IMAGE alone?

I extracted it from the jbig2enc output:
- http://imageshack.us/photo/my-images/40 ... aspng.png/

The image shows dithering, since jbig2enc cannot BLANK non-text areas the way Scan Tailor can.

Images like this need to be pre-processed by adjusting contrast and brightness before being encoded with jbig2enc (this also helps with Scan Tailor).

The original color image is needed.

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 09:17
by eL_PuSHeR
Forgive me but I do not see what your point is. My issue is with Scan Tailor's BW output.

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 09:28
by dingodog
The same task can be performed with jbig2enc.

I'm experimenting with various blur values to remove the dithering under text areas.

Blurring an already dithered (1-bit) image before processing it with jbig2enc helps clear the dithering.

I blurred the image you provided with a Gaussian blur (radius=4) and then encoded it with jbig2enc:

Code:

jbig2 -s -p -v -T 125 out.png && pdf.py output>out.pdf
I get a clearer result:
- http://ifile.it/ko5u714
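Why blurring before thresholding clears dither can be seen on a toy image. The sketch below is an illustration only: it uses a 3x3 box blur in pure Python instead of the Gaussian blur (radius=4) used above, a tiny synthetic page instead of the real scan, and assumes that values below the threshold become black. All function names are hypothetical.

```python
def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1)^2 window, using only in-bounds pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            count = 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def threshold(img, t=125):
    """Binarize: values below t become black (0), the rest white (255)."""
    return [[0 if v < t else 255 for v in row] for row in img]


def make_test_page(size=12):
    """White page with sparse dither on the left and a solid stroke on the right."""
    page = [[255] * size for _ in range(size)]
    for y in range(0, size, 3):      # isolated black dither pixels
        for x in (0, 3):
            page[y][x] = 0
    for y in range(3, 9):            # solid block standing in for a text stroke
        for x in range(7, 11):
            page[y][x] = 0
    return page


if __name__ == "__main__":
    cleaned = threshold(box_blur(make_test_page(), radius=1), t=125)
    for row in cleaned:
        print("".join("#" if v == 0 else "." for v in row))
```

Isolated dither pixels are averaged with their white neighbours and end up above the threshold, while the solid stroke stays dark. The corners of the stroke are rounded off, which is the price of this approach and a reason to experiment with the blur radius.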

Re: B/W source - how to keep format?

Posted: 05 Oct 2011, 11:51
by pejuko
Try mixed mode. If file size is a problem, you can reduce the number of gray tones using ImageMagick like this:
convert out/page.tif -depth 2 -quality 100 page.png
This creates page.png, which should have only 4 gray tones but usually looks good.
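For anyone wondering what "4 gray tones" means in practice, here is a minimal sketch of reducing 8-bit gray values to 2 bits. It illustrates the idea only and is not claimed to match ImageMagick's exact rounding; the function name is hypothetical.

```python
def quantize_gray(value, bits=2):
    """Map an 8-bit gray value (0..255) to the nearest of 2**bits evenly spaced tones."""
    steps = (1 << bits) - 1                   # 3 intervals for 2 bits
    level = (value * steps + 127) // 255      # nearest level index, 0..steps
    return level * 255 // steps               # back to the 0..255 scale


if __name__ == "__main__":
    tones = sorted({quantize_gray(v) for v in range(256)})
    print(tones)  # the distinct tones left after quantization
```

With 2 bits per pixel instead of 8, the uncompressed pixel data is a quarter of the size, which is where most of the saving comes from before PNG compression is applied.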

I get this result from Scan Tailor + convert:
http://www.4shared.com/photo/y6-7IKB8/out.html

Here I marked the text area below the picture as non-image, so its background disappeared, and I set the thickness to -30.