djvu 1bit vs 8bits

0kelvin
Posts: 29
Joined: 10 Nov 2012, 17:14
Number of books owned: 0
Country: Brazil

Post by 0kelvin » 16 Jul 2019, 15:31

Just noticed something:

A single text page, whether saved as grayscale or 1-bit, results in essentially the same PDF with mixed raster content (MRC). The files may differ by a few bytes, but they look the same. With DjVu, however, the results differ: grayscale input gives somewhat bold text and a file about 3.5 times larger. It seems to add some anti-aliasing to the text. With 1-bit input the file is smaller and no blurry anti-aliasing is applied.

Some pages contain grayscale images, so they cannot be converted to 1-bit without destroying the images.
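To illustrate why a grayscale picture does not survive conversion to 1-bit, here is a minimal sketch that thresholds a row of gray levels. The function name `to_1bit` and the fixed threshold of 128 are assumptions made for illustration; real tools may use adaptive thresholding or dithering instead.

```python
def to_1bit(pixels, threshold=128):
    """Map every 8-bit pixel to pure black (0) or pure white (255)."""
    return [0 if px < threshold else 255 for px in pixels]


# A smooth gradient with 8 distinct gray levels, like a strip of a photograph.
gradient = list(range(0, 256, 32))
bw = to_1bit(gradient)

# Count how many distinct tones remain before and after thresholding.
print(len(set(gradient)), "levels before,", len(set(bw)), "levels after")
```

All the intermediate tones collapse into black or white, which is harmless for text but ruins a photograph.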

zbgns
Posts: 48
Joined: 22 Dec 2016, 06:07
E-book readers owned: Tolino, Kindle
Number of books owned: 600
Country: Poland

Re: djvu 1bit vs 8bits

Post by zbgns » 17 Jul 2019, 03:02

What software did you use to create the PDF and DjVu files?

0kelvin
Posts: 29
Joined: 10 Nov 2012, 17:14
Number of books owned: 0
Country: Brazil

Re: djvu 1bit vs 8bits

Post by 0kelvin » 17 Jul 2019, 14:20

PDF in Abbyy. djvu with djvu small mod.

cday
Posts: 243
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: djvu 1bit vs 8bits

Post by cday » 17 Jul 2019, 16:07

0kelvin wrote (17 Jul 2019, 14:20):
PDF in Abbyy. djvu with djvu small mod.

Which ABBYY FineReader version? Some years ago I looked briefly at MRC in both FineReader and Nuance OmniPage 18. Given the number of options and the interface design, it wasn't easy to understand how the MRC output images were produced, even after studying the limited information in the help system and PDF guides.

If the PDF output images and file sizes you are obtaining are essentially the same whether grayscale or 1-bit is selected, I would suggest that both outputs are actually being produced with the same internal settings. If you convert a page containing both a grayscale graphic and text, then with appropriate MRC settings the graphic should be rendered in grayscale and the text in 1-bit (or possibly grayscale, if that option is available; I'm not sure).

Do you have one or two pages you could upload, one with just text and one with mixed content, in case anyone has time to investigate what results can be obtained?

zbgns
Posts: 48
Joined: 22 Dec 2016, 06:07
E-book readers owned: Tolino, Kindle
Number of books owned: 600
Country: Poland

Re: djvu 1bit vs 8bits

Post by zbgns » 17 Jul 2019, 17:33

0kelvin wrote (17 Jul 2019, 14:20):
PDF in Abbyy. djvu with djvu small mod.

I do not have much experience with ABBYY FineReader. However, my guess is that for 1-bit pictures JBIG2 compression is applied and no MRC compression is necessary. For grayscale images, the algorithms separate the foreground (letters) from the background. The foreground is compressed with JBIG2, the background probably with JPEG 2000. Afterwards the layers are combined (together with masks) to produce the final picture. Since the background is probably white (or nearly white), it can be compressed very efficiently (or even omitted), and as a result the output is practically the same in both cases (1-bit and 8-bit pictures), because only the foreground matters.

Could you please share samples of the 1-bit and 8-bit source files so that we can verify whether my theory is correct?

Moreover, as far as I understand, DjVu compression works in a similar way, although it uses different compression algorithms. Maybe you could use ABBYY FineReader to create both the PDF and the DjVu and compare them? It seems to me that the ABBYY software also supports DjVu.
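The layer separation guessed at in this post can be sketched in a few lines. This is only a toy model under stated assumptions: the helper names `split_layers` and `is_blank`, the fixed threshold, and the paper-white fill are all hypothetical, and it is not how FineReader or any DjVu encoder actually segments a page.

```python
def split_layers(page, threshold=128, paper=255):
    """Split an 8-bit grayscale page (list of rows) into a 1-bit text mask
    and a background layer with the text pixels filled in as paper white."""
    mask = [[1 if px < threshold else 0 for px in row] for row in page]
    background = [
        [paper if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(page, mask)
    ]
    return mask, background


def is_blank(background, paper=255, tolerance=8):
    """True when every background pixel is within `tolerance` of paper white."""
    return all(paper - px <= tolerance for row in background for px in row)


# A tiny 8-bit "text-only page": dark strokes on near-white paper.
gray_page = [
    [255, 252, 10, 254],
    [253, 30, 20, 255],
    [255, 251, 40, 250],
]
# The same page already reduced to pure black and white.
bw_page = [[0 if px < 128 else 255 for px in row] for row in gray_page]

gray_mask, gray_bg = split_layers(gray_page)
bw_mask, bw_bg = split_layers(bw_page)

print("same text mask:", gray_mask == bw_mask)
print("backgrounds blank:", is_blank(gray_bg), is_blank(bw_bg))
```

In this toy case both inputs yield the same mask and an essentially empty background, which is consistent with the theory that a text-only page compresses to nearly the same file either way.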

0kelvin
Posts: 29
Joined: 10 Nov 2012, 17:14
Number of books owned: 0
Country: Brazil

Re: djvu 1bit vs 8bits

Post by 0kelvin » 17 Jul 2019, 20:18

Maybe a single page gets compressed differently? I compressed a whole book and the issue didn't occur.

I think it was a false alarm. When I view the DjVu at 100% zoom the quality is higher than with "fit width to the screen". Fit width seems to blur the edges of everything.
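A possible explanation for the blur at "fit width" is that the viewer resamples the page to fewer pixels, which turns hard black/white edges into gray ones. The sketch below assumes a simple 2x2 box average; the function `downscale_half` is hypothetical, and the filter a real DjVu viewer uses is not known from this thread.

```python
def downscale_half(image):
    """Halve width and height by averaging each 2x2 block of pixels."""
    out = []
    for y in range(0, len(image) - 1, 2):
        row = []
        for x in range(0, len(image[0]) - 1, 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total // 4)
        out.append(row)
    return out


# A 1-bit image stored as 0 (black) and 255 (white), with an edge that
# does not line up with the 2x2 blocks.
bw = [
    [255, 0, 0, 0],
    [255, 0, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]

small = downscale_half(bw)
levels = sorted({px for row in small for px in row})
print(small)
print("distinct levels after scaling:", levels)
```

Wherever a block straddles an edge, the average is a mid gray, so a strictly two-tone page shows soft edges once it is scaled down, while at 100% zoom no resampling is needed.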

Konos93a
Posts: 130
Joined: 19 Sep 2016, 10:00
E-book readers owned: kobo aura,kindle 1,kindle pw3,pocketbook inkpad 2
Number of books owned: 3000
Country: greece

Re: djvu 1bit vs 8bits

Post by Konos93a » 29 Jul 2019, 03:32

Can you post the file size and the dimensions?

Personally I use DjVu Small 0.4.4.

With a Canon A3300, one JPEG is 3.25 MB.

A book with 581 pages comes to 1.86 GB.

After Scan Tailor Advanced, the 581 pages take 79 MB (the file size depends on the size of the book).

The 581 B&W TIFFs, after DjVu Small, give a DjVu file of 4437 kB.

After ABBYY FineReader, DjVu + OCR is 7671 kB, and PDF + OCR (with B&W) and table of contents output is 11920 kB.
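To put these figures side by side, here is a small sketch that converts them to a per-page size and a reduction factor relative to the camera JPEGs. The sizes are the ones quoted in this post; the unit conversion (1 MB = 1024 kB, 1 GB = 1024 MB) and the labels are assumptions.

```python
pages = 581

# Sizes quoted in the post, converted to kB.
sizes_kb = {
    "camera JPEGs": 1.86 * 1024 * 1024,
    "Scan Tailor Advanced output": 79 * 1024,
    "DjVu (DjVu Small)": 4437,
    "DjVu + OCR (FineReader)": 7671,
    "PDF + OCR (FineReader)": 11920,
}

original = sizes_kb["camera JPEGs"]
for name, kb in sizes_kb.items():
    per_page = kb / pages      # average size of one page in kB
    factor = original / kb     # how many times smaller than the JPEGs
    print(f"{name:28s} {per_page:9.1f} kB/page  {factor:7.1f}x smaller")
```

The per-page column makes it easier to compare with a single test page, such as the one discussed at the start of the thread.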
