Re: Preserving colored text
Posted: 12 Nov 2010, 08:57
My current best is 17.3 kB. The key to getting a smaller DjVu file is to rely on the layered structure of the DjVu format. Put simply, there is a foreground, a background, and a mask. The mask is a simple black-and-white image of the text, encoded with cjb2. The black portions of the mask take their color from the foreground image, and the white portions take theirs from the background image (or white by default). Both foreground and background are typically IW44 images made with c44. From here on I will only talk about the mask and foreground layers, since I'm only concerned with colored text.
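
To make the layer idea concrete, here is a toy sketch in plain bash. It is not how any DjVu decoder is actually written; the function name compose_row and the one-row "images" are made up for illustration. It only shows the rule described above: a black mask pixel takes the foreground color, a white mask pixel takes the background color, or white if there is no background.

```shell
#!/bin/bash
# Toy model of the mask/foreground/background rule. All data is invented.

# compose_row MASK FOREGROUND BACKGROUND
# Each argument is one row of pixels, separated by spaces.
# Mask values: 1 = black (text), 0 = white (paper).
compose_row() {
    local -a mask=($1) fg=($2) bg=($3)
    local -a out=()
    local i
    for i in "${!mask[@]}"; do
        if [ "${mask[$i]}" = "1" ]; then
            out+=("${fg[$i]}")           # black mask pixel: use the foreground color
        else
            out+=("${bg[$i]:-white}")    # white mask pixel: background, or white if none
        fi
    done
    echo "${out[*]}"
}

# The foreground is "smeared" (red spills under the paper pixels), but only
# the pixels under the mask show up in the result.
compose_row "1 1 0 0 1" "red red red black black" ""
# prints: red red white white black
```

This is why the foreground can be a rough, blurry image: its shape never shows, only its color under the black parts of the mask.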
To show this visually, this would be the mask plus the foreground creating the final DjVu file. Creating the foreground image is fairly straightforward: take the colorized original, threshold it to black and white, invert the result, then use that as the mask for c44. The code is below, and the final DjVu file size was 49.5 kB. The test_01.tif file is attached to a post above.
As always, fewer colors, and colors isolated into larger segments, produce smaller foreground images. In my case I don't want the true color of the text as captured; I just want red and black. So modifying the colors not only creates an image that I consider better looking, but also one that is smaller. The important part is to eliminate the stray black pixels in the red text and vice versa, not to preserve the shape of the text (the shape lives in the mask layer, not the foreground layer), hence the blurring and aggressive -fuzz settings. This time the final DjVu file size is 17.3 kB. Note that the bitonal-only DjVu file produced by cjb2 from this image was 12.3 kB, so adding color costs only about 5 kB more.
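
For anyone who wants the comparison as percentages, a small sketch follows. The helper name pct_change is my own; the three sizes are the ones quoted in this post. It assumes awk is installed.

```shell
#!/bin/bash
# pct_change OLD NEW: percent change from OLD to NEW, to one decimal place.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
pct_change() {
    LC_ALL=C awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 49.5 17.3   # first attempt vs. recolored version: prints -65.1
pct_change 12.3 17.3   # bitonal only vs. recolored version:  prints 40.7
```

So the recolored foreground cuts the file by roughly 65% compared with the first attempt, and the color costs about 41% on top of the bitonal-only file.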
Code:
# Create the iw44 foreground.
convert test_01.tif _test_01.ppm
convert _test_01.ppm -threshold 99% -negate _foreground_mask.pbm
c44 -dpi 600 -decibel 30 -mask _foreground_mask.pbm _test_01.ppm _foreground.djvu
djvuextract _foreground.djvu BG44=_foreground.iw4
# Create the text layer that will be colored.
convert test_01.tif -threshold 99% _text.tif
cjb2 -dpi 600 -lossy _text.tif _text.djvu
# Put it all together.
djvumake test_01.djvu INFO=,,600 Sjbz=_text.djvu FG44=_foreground.iw4
Code:
#! /bin/bash
# Create a better base image to work with. Bring out the black and red colors.
convert test_01.tif -fuzz 40% -fill black -opaque black -modulate 100,150,100 -fuzz 30% -fill red -opaque red _base.tif
# Isolate black and red colors to only the sections of the image where those colors should be.
# Note that we can "lose" the shape of the characters; all we need is red and black
# in the general areas where they belong.
convert _base.tif -fill white +opaque black -despeckle -blur 10 -fuzz 50% -fill white -opaque white -colors 2 _black.tif
convert _base.tif -fill white +opaque red -despeckle -blur 10 -fuzz 50% -fill white -opaque white -colors 2 _red.tif
composite -compose multiply _red.tif _black.tif _composite.ppm
# Create the iw44 foreground.
convert _composite.ppm -threshold 99% -negate _foreground_mask.pbm
c44 -dpi 600 -decibel 30 -mask _foreground_mask.pbm _composite.ppm _foreground.djvu
djvuextract _foreground.djvu BG44=_foreground.iw4
# Create the text layer that will be colored.
convert _base.tif -threshold 99% _text.tif
cjb2 -dpi 600 -lossy _text.tif _text.djvu
# Put it all together.
djvumake test_01.djvu INFO=,,600 Sjbz=_text.djvu FG44=_foreground.iw4
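
To see where the bytes go after a run, something like the sketch below can report the size of each intermediate file. The helper name size_kb is my own, and I am assuming 1 kB = 1000 bytes; a file manager that counts 1024 bytes per kB will show slightly smaller numbers.

```shell
#!/bin/bash
# size_kb FILE: print the size of FILE in kB (assuming 1 kB = 1000 bytes),
# to one decimal place.
size_kb() {
    local bytes
    bytes=$(wc -c < "$1") || return 1
    LC_ALL=C awk -v b="$bytes" 'BEGIN { printf "%.1f\n", b / 1000 }'
}

# Report on the files produced by the scripts above, skipping any that are missing.
for f in _foreground.iw4 _text.djvu test_01.djvu; do
    if [ -f "$f" ]; then
        echo "$f: $(size_kb "$f") kB"
    fi
done
```

Comparing _foreground.iw4 against _text.djvu shows directly how much of the final file is color and how much is the text mask.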