Learning to Create Tiny DJVU files

daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Learning to Create Tiny DJVU files

Post by daniel_reetz »

dtic wrote: Very useful posts Richard!
Agreed 100%. These posts are, as far as I can see, a unique resource on the internet. Thanks for taking the time, RichardT.
jera2
Posts: 11
Joined: 28 Jul 2011, 00:09
E-book readers owned: Kobo
Number of books owned: 1200
Country: Canada

Post by jera2 »

I too want to chime in and express my thanks, Richard, for all your posts. Very informative, and if I ever get around to working with this area on my own, I will study your posts again!
RichardT
Posts: 27
Joined: 24 Apr 2012, 10:17
E-book readers owned: Kindle 3rd, Kindle Fire, iPad3
Number of books owned: 0
Country: USA

Post by RichardT »

Thanks, all, for the kind words! As I come across interesting or confusing books I'll be sure to make more posts, but in truth most books are very easy to compress with the techniques already here. Between Scan Tailor, ImageMagick, minidjvu, and DjVuLibre, we have a pretty complete suite of tools. I still wish a free replacement for msepdjvu existed, but the truth is that for complicated pages you get much better results splitting the imagery out by hand anyway.

Post by RichardT »

Hi all! It's been a year since my last post, and really not much has changed for my process. I scan a lot fewer books these days, since my whole library has been scanned and I try to avoid buying paper books unless that's the only form available.

One thing I've started doing since getting a large desktop machine is using a RAM disk. In a typical scenario, making tweaks to a book, I will split out a DjVu or PDF into one PNG per page, do some kind of image processing on the PNGs, and then recombine them into a DjVu or PDF. It has always bothered me that I have to write sometimes as many as 800 intermediate files to disk, then re-open and read them back in to edit, then erase them again as soon as I'm done.

Well, like I mentioned, I got a new desktop with lots of RAM to spare, so I tried making a RAM disk for those files. Now none of the intermediate files ever hit a physical disk! And it's faster as well.

All the main platforms can mount a RAM disk. I'm on OS X these days, and the command looks like:

Code:

diskutil erasevolume HFS+ 'RAM Disk' `hdiutil attach -nomount ram://xxxxxx`
.... where xxxxxx is the size you want in megabytes * 2048 (the ram:// size is counted in 512-byte sectors, so there are 2048 sectors per megabyte).
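To work out that number for a given size, a small shell helper can do the arithmetic. This is only a minimal sketch: ram_sectors is a made-up name, and the 4096 MB size is just an example. It prints the command rather than running it, so nothing gets mounted until you paste it yourself.

```shell
#!/bin/sh
# ram_sectors MB: print the number of 512-byte sectors needed for a
# RAM disk of MB megabytes (1 MB = 2048 sectors).
# Hypothetical helper name; it is not part of hdiutil or diskutil.
ram_sectors() {
    echo $(( $1 * 2048 ))
}

# Example: a 4096 MB (4 GB) RAM disk. This only prints the command.
echo "diskutil erasevolume HFS+ 'RAM Disk' \`hdiutil attach -nomount ram://$(ram_sectors 4096)\`"
```

For 4096 MB the printed command ends in ram://8388608.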

Anyway, if you are computer-savvy, have lots of RAM, and are careful to create the final product on a physical disk, it can be a nice way to handle all those temporary files.

Post by RichardT »

I know I've mentioned this before in some posts, but I used this technique today and I thought I'd add it to the thread as its own tip...

Tip: If all the text on the page is one color (say, blue text), you don't have to make a foreground color map. djvumake will create a minimal one for you.

So today the ugly scan I cleaned up had blue text on page 2. Here's what I did:
  1. Clean up the image in Scan Tailor or ImageMagick, and set it to black-and-white output. Let it make the text black.
  2. Generate an indirect DjVu file of all the pages with minidjvu like always. Now each page has its own DjVu file.
  3. Move the page that should have colored text out of the way. (e.g., mv p0002.djvu x0002.djvu)
  4. Create a DjVu page with Sjbz set to the b/w text page you just moved, and FGbz set to the hex value of the color you want the text to be.
So, that last step will look something like:

Code:

djvumake.exe p0002.djvu INFO=,,600 INCL=p0001.iff Sjbz=x0002.djvu FGbz=#5478bc
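
If you have several pages like this, steps 3 and 4 could be wrapped in a small shell function. This is a minimal sketch, assuming the same file naming as above (pNNNN.djvu pages and a shared dictionary such as p0001.iff). The name color_page_cmd and its argument order are made up for illustration, and the djvumake arguments are copied from the command above. It only prints the commands so you can check them before running anything.

```shell
#!/bin/sh
# color_page_cmd PAGE HEX DICT DPI
# Print (do not run) the commands for steps 3 and 4 of the list above.
# Hypothetical helper; the djvumake arguments mirror the example command.
color_page_cmd() {
    page=$1; hex=$2; dict=$3; dpi=$4
    # Accept exactly six hex digits, without the leading '#'.
    case $hex in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) ;;
        *) echo "bad color: $hex" >&2; return 1 ;;
    esac
    # Step 3: move the black-and-white page out of the way.
    echo "mv p$page.djvu x$page.djvu"
    # Step 4: rebuild the page with a single-color foreground.
    echo "djvumake p$page.djvu INFO=,,$dpi INCL=$dict Sjbz=x$page.djvu FGbz=#$hex"
}

# Example: page 2, blue text, 600 dpi.
color_page_cmd 0002 5478bc p0001.iff 600
```
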
Here's a close-up from my file today (left is original bad jpeg scan, right is much nicer product):
[Image: b3s.png, side-by-side comparison]
I know, the crisp "B3" on the right looks a little ragged in the close-up, but at 100% zoom it looks great. That's as good as it gets when starting with these horrible JPEG scans, unfortunately.

Sadly, I paid money for this scan... if anyone here happens to know a Wizards of the Coast exec, let them know they can hire me to make their scanned product offerings look great, if they want. ;)
Amjad
Posts: 8
Joined: 31 Mar 2010, 01:49

Post by Amjad »

This post is priceless.
Richard rocks!