pdf container for jpeg 2000 images?

Post by dingodog »

I have a set of JPEG 2000 images (.jp2) and I want to insert all of them
into a multipage PDF.

The problem is that when I convert each jp2 image to PDF (using GraphicsMagick,
then pdftk to join the pages into a single multipage PDF)

Code:

for f in *.jp2; do gm convert "$f" "$f.pdf"; done && pdftk *.pdf cat output out.pdf

every image increases dramatically in size.

Do you know of any technique or software (on Linux) that can store jp2
images in a multipage PDF while preserving the original jp2 compression?

Re: pdf container for jpeg 2000 images?

Post by Lazy_Kent »

As far as I can see, GraphicsMagick doesn't support JPEG2000 compression.
http://www.graphicsmagick.org/GraphicsM ... s-compress

Try ImageMagick instead.
http://www.imagemagick.org/script/comma ... 4#compress

Re: pdf container for jpeg 2000 images?

Post by dingodog »

I forgot to mention that I already tried ImageMagick, also with the -compress JPEG2000 switch, but the result was the same (I have the JPEG 2000 delegates installed).

Regarding GraphicsMagick, I think the online manual is a bit outdated or incomplete, since it does support JPEG 2000 (I compiled it statically against libjasper). In fact, it encodes to and decodes from the jp2 format without any problem.

I noticed that Archive.org is able to produce a super compressed PDF starting from JPEG 2000 images.

reading the metadata of these pdf I found:

PDF Compressor Server - LuraTech Imaging GmbH - Recoded by LuraDocument PDF v2.28 (only for windows)

In the ImageMagick forums I found this thread about the same problem:

JPEG2000 to PDF without recompression
- http://www.imagemagick.org/discourse-se ... =1&t=10096

It discusses a special wrapper to include jp2 images in a PDF.

Re: pdf container for jpeg 2000 images?

Post by Bookstacks »

Perhaps Ghostscript can process the files for you?

Chris

Re: pdf container for jpeg 2000 images?

Post by dingodog »

Bookstacks wrote: Perhaps Ghostscript can process the files for you?
Chris

I read somewhere that Ghostscript can ONLY READ JPEG 2000 from a PDF, not WRITE JP2 into a PDF.

it seems that jpeg 2000 specs for pdf are patented and developers need to buy a licence to implement this encoder for pdf applications. This can explain why I have not found any linux application able to insert jp2 images without de/recompression in pdf

Re: pdf container for jpeg 2000 images?

Post by bitsgalore »

The reason for the increase in file size is that your source JP2 files probably contain lossily-compressed image data, and that ImageMagick/GraphicsMagick by default uses lossless compression while writing output. This means that the PDF contains a lossless representation of your original (lossy) image data, which explains the increase in file size (which can be huge!).
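
One way to check the "probably lossy" assumption is to look at the wavelet transform recorded in the COD marker segment of the JPEG 2000 codestream: the 9-7 filter is irreversible (so the data are lossy), while the 5-3 filter is reversible (lossless, unless the codestream was truncated afterwards). Below is a minimal sketch, assuming a standard JP2 file or raw codestream held in memory; the function name is made up for illustration.

```python
import struct

def wavelet_transform(jp2):
    """Report the wavelet filter named in the main-header COD segment.

    Hypothetical helper: `jp2` is the content of a .jp2 file (or a raw
    codestream) as bytes.
    """
    # The codestream starts with SOC (FF4F) immediately followed by SIZ (FF51).
    start = jp2.find(b"\xff\x4f\xff\x51")
    if start < 0:
        raise ValueError("no JPEG 2000 codestream found")
    pos = start + 2  # skip SOC, which has no length field
    while pos + 4 <= len(jp2):
        marker, length = struct.unpack(">HH", jp2[pos:pos + 4])
        if marker == 0xFF52:  # COD
            # Layout: marker(2) Lcod(2) Scod(1) SGcod(4), then SPcod:
            # levels, code-block width, height, style, transformation.
            if pos + 13 >= len(jp2):
                raise ValueError("truncated COD segment")
            return "reversible 5-3" if jp2[pos + 13] == 1 else "irreversible 9-7"
        if marker == 0xFF90:  # SOT: the main header is over
            break
        pos += 2 + length  # the length counts itself but not the marker
    raise ValueError("no COD segment in main header")
```

If this reports "irreversible 9-7" for the source images, a losslessly compressed PDF of the decoded pixels will almost certainly be larger than the JP2.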

I tested this myself (using ImageMagick rather than GraphicsMagick) with a 5 MB JP2 image, which I converted to PDF using:

Code:

convert test.jp2 testLossless.pdf

This produced a 14 MB PDF.

You can force lossy compression by using IM's "-quality" switch, which is explained here:

http://studio.imagemagick.org/script/co ... hp#quality

As a test I used the following command line:

Code:

convert test.jp2 -quality 75 testLossy.pdf

This resulted in a PDF of slightly under 3 MB.

Although this works, it's not an ideal solution, as it applies lossy compression to image data that were already lossily compressed in the first place, resulting in unnecessary loss of quality (as also pointed out by dingodog above). You'd really want IM to simply copy the original image codestream into the PDF unchanged, but it just doesn't support that.

However, if you're only using the PDFs as an access format and you plan to keep the original JP2 files it might be good enough (you may want to do some experimenting with the quality factor).

Some side notes:

  1. In the JP2 to PDF conversion with ImageMagick/GraphicsMagick you will lose any resolution info that was in the original JP2 (this is because of limitations of the JasPer library, which is used by IM/GM for encoding/decoding JPEG 2000 images).
  2. I noticed that the conversion in IM resulted in a PDF 1.3 file. However, the 'JPXDecode' filter that is used for JPEG 2000 data in PDF is only supported from PDF 1.5 onward. Probably not a huge problem, but you might run into some issues when opening these files in older PDF viewers.
  3. Also, be aware that some mobile devices do not yet support PDF documents with embedded JPEG 2000 image streams!
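
The version mismatch from the second note can be checked with a few lines of Python. This is only a rough sketch: it reads the version from the `%PDF-x.y` header (a `/Version` entry in the document catalog could override it) and does a plain byte search for the `/JPXDecode` filter name. The function name and the result keys are made up for illustration.

```python
import re

def pdf_jpx_report(data):
    """Compare the PDF header version with the use of JPXDecode.

    Hypothetical helper: `data` is the whole PDF file as bytes.
    """
    # The header is normally at the very start; viewers tolerate some
    # leading junk, so look within the first 1024 bytes.
    m = re.search(rb"%PDF-(\d)\.(\d)", data[:1024])
    if not m:
        raise ValueError("missing %PDF-x.y header")
    version = (int(m.group(1)), int(m.group(2)))
    # Stream dictionaries are stored uncompressed, so a byte search
    # finds the filter name when it is spelled in the usual way.
    uses_jpx = b"/JPXDecode" in data
    return {
        "version": version,
        "uses_jpx": uses_jpx,
        # JPXDecode was introduced in PDF 1.5.
        "consistent": (not uses_jpx) or version >= (1, 5),
    }
```

For example, `pdf_jpx_report(open("testLossy.pdf", "rb").read())` would flag a file that declares PDF 1.3 but contains JPXDecode streams as not consistent.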
Hope this is useful.

Cheers,

Johan

---------
Johan van der Knijff
KB / National Library of the Netherlands

Re: pdf container for jpeg 2000 images?

Post by bitsgalore »

Also:
dingodog wrote: it seems that jpeg 2000 specs for pdf are patented and developers need to buy a licence to implement this encoder for pdf applications. This can explain why I have not found any linux application able to insert jp2 images without de/recompression in pdf

I don't think so, since both JP2 and PDF are published as open ISO standards. See e.g.:

http://www.adobe.com/devnet/pdf/pdf_reference.html

And here for the JP2 file spec:

http://www.jpeg.org/public/15444-1annexi.pdf

A more likely explanation is that there's simply a lack of reliable open source JPEG 2000 libraries. Unlike formats like TIFF, JPEG or PNG, the open source choices you have when you're looking for JPEG 2000 support are pretty limited. The only ones I'm aware of are JasPer and OpenJPEG:

http://www.ece.uvic.ca/~mdadams/jasper/
http://www.openjpeg.org/

However, JasPer (which is also used by ImageMagick) has various performance issues, and support for a number of specific features in the JP2 format is limited. Also, it was originally developed as a reference implementation for JPEG 2000, and as such it was never intended to offer a highly performant or complete encoding/decoding solution.

OpenJPEG looks more promising, but from what I understand it has some unresolved issues as well. There appears to be a modest surge in its development activity lately though ...

---------
Johan van der Knijff
KB / National Library of the Netherlands

Re: pdf container for jpeg 2000 images?

Post by rubypdf »

To convert JPEG 2000 (including JPX and JP2) to PDF using the JPXDecode filter, please use iText or FreePic2Pdf.
OpenJPEG can read and write JPX and JP2; MuPDF uses OpenJPEG to decode JPEG 2000.
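
To show what such a tool does in principle, here is a minimal sketch in plain Python that wraps JP2 files in a PDF without touching the compressed data: each file is stored byte for byte as an image stream with `/Filter /JPXDecode`. This is not how iText or FreePic2Pdf are implemented; the function names and the `dpi` parameter are assumptions for illustration, the image size is read from the first `ihdr` box found, and no colour space is written because PDF lets JPXDecode images take it from the JPEG 2000 data.

```python
import struct

def jp2_size(data):
    """Return (width, height) from the JP2 image header ('ihdr') box."""
    pos = data.find(b"ihdr")
    if pos < 0:
        raise ValueError("no ihdr box found; not a JP2 file?")
    height, width = struct.unpack(">II", data[pos + 4:pos + 12])
    return width, height

def jp2_to_pdf(images, dpi=300.0):
    """Build a PDF with one page per JP2 file (each given as bytes)."""
    objs = [b"", b""]  # placeholders for object 1 (catalog) and 2 (pages)
    kids = []
    for data in images:
        w, h = jp2_size(data)
        pw, ph = w * 72.0 / dpi, h * 72.0 / dpi  # page size in points
        page_no = len(objs) + 1
        kids.append(b"%d 0 R" % page_no)
        content = b"q %.2f 0 0 %.2f 0 0 cm /Im0 Do Q" % (pw, ph)
        objs.append(
            b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 %.2f %.2f] "
            b"/Contents %d 0 R /Resources << /XObject << /Im0 %d 0 R >> >> >>"
            % (pw, ph, page_no + 1, page_no + 2))
        objs.append(b"<< /Length %d >>\nstream\n" % len(content)
                    + content + b"\nendstream")
        # The JP2 bytes go in unchanged; the viewer decodes them.
        objs.append(
            b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
            b"/Filter /JPXDecode /Length %d >>\nstream\n" % (w, h, len(data))
            + data + b"\nendstream")
    objs[0] = b"<< /Type /Catalog /Pages 2 0 R >>"
    objs[1] = b"<< /Type /Pages /Count %d /Kids [%s] >>" % (
        len(kids), b" ".join(kids))

    out = bytearray(b"%PDF-1.5\n")  # JPXDecode needs PDF 1.5 or later
    offsets = []
    for number, body in enumerate(objs, start=1):
        offsets.append(len(out))
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objs) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += (b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
            % (len(objs) + 1, xref_pos))
    return bytes(out)

# Usage sketch (file names are placeholders):
#   import glob
#   pages = [open(p, "rb").read() for p in sorted(glob.glob("*.jp2"))]
#   open("out.pdf", "wb").write(jp2_to_pdf(pages))
```

The resulting PDF is only slightly larger than the sum of the JP2 files, but it can only be displayed by viewers that have a working JPEG 2000 decoder.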

Re: pdf container for jpeg 2000 images?

Post by dingodog »

Many thanks.

I tried FreePic2Pdf and it worked fine on Linux (using Wine).

It correctly created a PDF from the jp2 images. However, probably because of JPEG 2000 decoding problems, Epdfview and Xpdf are not able to show the images inside the PDF, while the MuPDF viewer, which I compiled some time ago, is able to display them, as is Foxit PDF Reader.

I'll also try iText, to have a program I can script from the command line.