Introducing djvubind for djvu file creation

strider1551
Posts: 126
Joined: 01 Mar 2010, 11:39
Number of books owned: 0
Location: Ohio, USA

Re: Introducing djvubind for djvu file creation

Post by strider1551 »

Tulon wrote: Even though that's not optimal, the result should still look fine.
Hmm... yes. My concern was how it might affect compression efficiency, but the more I think about it the less I think it would even make a difference. I still plan to test things out on a full color image, just to double check.

Thank you for mentioning csepdjvu... I probably would have spent half the night wondering how to merge two separate encodings into one page without that hint. I've worked out a script that works on a test image (attached). Long story short: the test image went from 43.7 kB (tif -> ppm -> cpaldjvu) down to 11.6 kB.

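Code:

#!/bin/bash
# Split a mixed-mode page into a text layer and a graphics layer, then merge.

# Graphics layer: black (text) pixels are replaced with the fill colour.
convert -opaque black sample.tif sample_graphics.tif
# Text layer: every pixel that is not black is replaced instead.
convert +opaque black sample.tif sample_textual.tif
# Compress the text layer with JB2, then decode it to RLE for csepdjvu.
cjb2 sample_textual.tif temp_textual.djvu
ddjvu -format=rle -v temp_textual.djvu temp_textual.rle
# csepdjvu reads the RLE foreground followed by the PPM background.
convert sample_graphics.tif temp_graphics.ppm
cat temp_textual.rle temp_graphics.ppm > temp.mix
csepdjvu -vv temp.mix out.djvu
# Remove intermediates (sample.tif itself does not match sample_*).
rm sample_* temp*

So it looks like I can get that feature done in the very near future. Just gotta relearn how to make Debian packages before I go back to working on the code.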
Attachment: sample.tif (112.43 KiB)
Misty
Posts: 481
Joined: 06 Nov 2009, 12:20
Number of books owned: 0
Location: Frozen Wasteland

Re: Introducing djvubind for djvu file creation

Post by Misty »

strider1551 wrote: If we're talking a file created from scanned images, a djvu file will be significantly smaller in size. Djvu was made specifically for scanned images and makes use of jb2 compression, whereas the best compression for a pdf is Group4.

...

Edit:
I forgot to mention that several months ago I heard of a new compression for pdf's that is very similar to jb2 and would produce similar file sizes. For the life of me I can't remember the name of it. I do remember that there was a big issue of it being encumbered by patents. If it does take off, it would probably be a few years before it makes it into pdf reader software, and who knows when it would be accessible in the open source world if there are patent questions.

PDF has actually supported JBIG2 since PDF 1.4. DjVu's JB2 is based on JBIG2; I believe AT&T based it on a pre-standard version of JBIG2, before it was standardized in 2000. There's just a lack of open-source PDF utilities that support encoding JBIG2 PDFs. The only good one I've found is a Python script based on jbig2enc. If patents are an issue, I would assume that DjVu's JB2 is also patent-encumbered, since both are based on the same format.
The opinions expressed in this post are my own and do not necessarily represent those of the Canadian Museum for Human Rights.
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK
Contact:

Re: Introducing djvubind for djvu file creation

Post by Tulon »

It turns out jbig2enc does support lossy encoding. The heavy lifting is provided by the Leptonica library in this case. Here is a relevant link: http://www.leptonica.com/jbig2.html
I wouldn't worry about patents. Some relevant ones are listed at the end of the JBIG2 specification, but many of them are old and some can be worked around. Yet another reason I wouldn't care about patents is that I believe only Hello World type programs don't violate any patents.

Anyway, assembling PDFs with JBIG2 compression seems more complex than doing the same with DjVu, as there doesn't seem to be an equivalent to csepdjvu.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
dingodog
Posts: 110
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net
Contact:

Re: Introducing djvubind for djvu file creation

Post by dingodog »

On my system (Puppy Linux) I use:

- Scan Tailor to clean and split the scanned pages
- jbig2enc to further reduce the size of the scanned images, then pdf.py (with Python) to join all the JBIG2-encoded images into one PDF

It's very simple.
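
For anyone who wants to try that workflow, here is a rough sketch of the two commands involved. It only prints them rather than running them. The -s (symbol coding) and -p (PDF-ready output) flags and the default "output" basename are as I understand them from the jbig2enc README, so treat them as assumptions and check your installed version. The function name and the file names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) the commands for a jbig2enc PDF build.
# Usage: jbig2_pdf_commands OUT.pdf PAGE1.tif [PAGE2.tif ...]
jbig2_pdf_commands() {
    local out="$1"
    shift
    # Step 1: encode all pages; jbig2 writes files with the basename "output".
    printf 'jbig2 -s -p %s\n' "$*"
    # Step 2: pdf.py wraps the "output" files into a PDF on stdout.
    printf 'pdf.py output > %s\n' "$out"
}

# Example call with made-up file names:
# jbig2_pdf_commands book.pdf page1.tif page2.tif
```

File names are printed unquoted, so this is only good for looking at the commands, not for piping into a shell when names contain spaces.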
strider1551
Posts: 126
Joined: 01 Mar 2010, 11:39
Number of books owned: 0
Location: Ohio, USA

Re: Introducing djvubind for djvu file creation

Post by strider1551 »

The next version is available for download (both source and ebuild; the Debian package should be up by the weekend). I make note of it here because this version supports the mixed mode offered by Scan Tailor. If the mixed mode image contains only black and white, the image goes through minidjvu; otherwise the text version goes to cjb2 before being combined with the graphical version by csepdjvu. Better compression than before in either case. Oh, and the stupid problem of calling python3 is gone, so you can just use "djvubind".
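
The choice between the two paths can be sketched roughly like this. This is not djvubind's actual code: the function name is hypothetical, and using the number of distinct colours as the "only black/white" test is my own assumption about one way to do it.

```shell
#!/bin/bash
# Hypothetical helper: choose an encoding path from a page's colour count.
# Assumption: a page with at most 2 distinct colours is pure black/white.
choose_encoder() {
    local colours="$1"
    if [ "$colours" -le 2 ]; then
        echo "minidjvu"        # bitonal page: a single JB2 encode is enough
    else
        echo "cjb2+csepdjvu"   # mixed page: text via cjb2, merged by csepdjvu
    fi
}

# With ImageMagick installed, the count could come from identify,
# where %k is the number of unique colours in the image:
#   choose_encoder "$(identify -format '%k' page.tif)"
```

Keeping the decision in a function that takes a plain number makes it easy to check without having any image tools installed.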

The next few days are somewhat busy, and then I leave for a week long silent retreat. Feedback/bugs/complaints still welcomed, but any response will obviously be delayed until I get back if I don't get to it before the weekend.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States
Contact:

Re: Introducing djvubind for djvu file creation

Post by daniel_reetz »

blogged! thanks for the cool utility.
bkrpr
Posts: 9
Joined: 04 Mar 2014, 00:52

Re: Introducing djvubind for djvu file creation

Post by bkrpr »

also blogged and, for what it is worth, I'm in love with scantailor+djvubind. This is some fine work.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States
Contact:

Re: Introducing djvubind for djvu file creation

Post by daniel_reetz »

BKRPR, are you still working on this?
bkrpr
Posts: 9
Joined: 04 Mar 2014, 00:52

Re: Introducing djvubind for djvu file creation

Post by bkrpr »

daniel_reetz wrote: BKRPR, are you still working on this?
No longer actively under development. The lead developer had twins right after the last version released. Oddly, development slowed :)

At this point, Scantailor is what we were moving towards anyway, so there is not much incentive to pull developer time our way. If anyone else is interested in re-using the auto crop algorithms from our codebase, they might convert well into phatch actions since both codebases are Python. I can see a use in that, but I can't see much of a use to having two Scantailors out there right now.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States
Contact:

Re: Introducing djvubind for djvu file creation

Post by daniel_reetz »

bkrpr wrote: but I can't see much of a use to having two Scantailors out there right now.
Agreed, but as noted here Tulon is still pretty much a one-man show. It would be great to get him some development support, especially as the Scan Tailor userbase continues to expand.