PDFMaker 0.3 - help beta test!

strider1551
Posts: 126
Joined: 01 Mar 2010, 11:39
Number of books owned: 0
Location: Ohio, USA

Re: PDFMaker 0.1 - help beta test!

Post by strider1551 » 01 Nov 2010, 12:58

Misty wrote: -opaque and +opaque will fill the unwanted regions with pure white?
That seems to be the default. You can make it explicit with the -fill option. Also, some threads I was just reading recommend putting the input image name before any of the options - not sure if that is still relevant or not.

Code: Select all

convert "page.tif" -fill white -opaque black "temp_graphics.tif"

Misty
Posts: 481
Joined: 06 Nov 2009, 12:20
Number of books owned: 0
Location: Frozen Wasteland

Re: PDFMaker 0.1 - help beta test!

Post by Misty » 03 Nov 2010, 12:32

Thanks.

I did discover a good way to eliminate the empty files. By using

Code: Select all

identify -format %k filename
ImageMagick will return a single number: the number of colours in the file. That can be read into a variable and used to decide which image layer files to delete, since only blank images will ever return 1.
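A minimal sketch of that cleanup step, assuming ImageMagick's `identify` is on the PATH and each layer file holds a single image. The function names and the file names in the usage comment are hypothetical, not taken from PDFMaker itself.

```shell
#!/bin/sh
# Sketch only: delete layer files that contain a single colour (blank pages).
# Assumes ImageMagick's `identify` is installed; all names here are hypothetical.

# Print the number of unique colours in an image (the %k format escape).
count_colours() {
    identify -format '%k' "$1"
}

# Succeed (exit status 0) only if the image has exactly one colour.
# If identify fails, the output is empty and the file is treated as not blank.
is_blank() {
    [ "$(count_colours "$1")" = "1" ]
}

# Remove every blank file among the arguments, reporting each removal.
remove_blank_layers() {
    for f in "$@"; do
        if is_blank "$f"; then
            echo "removing blank layer: $f"
            rm -f -- "$f"
        fi
    done
}

# Example (hypothetical file names):
# remove_blank_layers page001_graphics.tif page002_graphics.tif
```

Keeping the colour count in its own function makes it easy to swap in a different tool later without touching the deletion logic.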
The opinions expressed in this post are my own and do not necessarily represent those of the Canadian Museum for Human Rights.

Re: PDFMaker 0.1 - help beta test!

Post by Misty » 04 Nov 2010, 16:23

I've attached PDFMaker 0.2 to this post. It eliminates the ST Separator requirement by using ImageMagick instead. Unfortunately, this is much slower than ST Separator was, but it should be more compatible.

0.2 also includes preliminary code to pass the DPI on to pdf.py, but I haven't updated pdf.py yet to have it take advantage of it.

Does anyone know of a way to identify the number of colours in an image that would be faster than using ImageMagick's identify command?

Edit: I forgot to mention that I also added an option to customize JPEG compression quality. The default is 85. Specify quality with -quality X.
Attachments: pdfmaker 0.2.zip (5.84 MiB)

Lazy_Kent
Posts: 37
Joined: 26 Oct 2010, 10:06
Number of books owned: 0
Location: Moscow

Re: PDFMaker 0.1 - help beta test!

Post by Lazy_Kent » 05 Nov 2010, 08:37

Misty wrote: Does anyone know of a way to identify the number of colours in an image that would be faster than using ImageMagick's identify command?
GraphicsMagick is faster.

Code: Select all

% time identify -format %k 001.pnm
2
identify -format %k 001.pnm  6,79s user 0,33s system 99% cpu 7,188 total
% time gm identify -format %k 001.pnm
2
gm identify -format %k 001.pnm  2,30s user 0,11s system 98% cpu 2,440 total
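As a rough sketch of how a script could take advantage of this without making GraphicsMagick a hard dependency, the function below prefers `gm identify` when `gm` is available and falls back to ImageMagick's `identify` otherwise. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: count colours with GraphicsMagick when it is installed,
# otherwise fall back to ImageMagick. The function name is hypothetical.
count_colours_fast() {
    if command -v gm >/dev/null 2>&1; then
        # GraphicsMagick exposes identify as a gm subcommand.
        gm identify -format '%k' "$1"
    else
        identify -format '%k' "$1"
    fi
}

# Example (file name taken from the timing run above):
# count_colours_fast 001.pnm
```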

Re: PDFMaker 0.2 - help beta test!

Post by Misty » 05 Nov 2010, 09:03

That's quite an improvement. I hate to add another dependency, but I'll look at switching to GraphicsMagick. Thanks for the tip!

Edit: I wonder how GraphicsMagick's JPEG2000 compression is? Maybe I'll avoid the bug/quirk in ImageMagick. It would be nice to get JP2 compression reintegrated.

univurshul
Posts: 496
Joined: 04 Mar 2014, 00:53

Re: PDFMaker 0.2 - help beta test!

Post by univurshul » 08 Nov 2010, 10:50

Hi Misty,

Could you send me a small, copyright-free PDF made with PDFMaker, so I can try running the whole document through Illustrator's Live Trace as a complete vectorization test?

I suppose I'd like to try this on a compressed ebook with B/W line art and text.

Thanks.

Re: PDFMaker 0.2 - help beta test!

Post by Misty » 08 Nov 2010, 12:01

Sure, I'll send you one later today. B/W line art only, no colour illustrations?

Re: PDFMaker 0.2 - help beta test!

Post by univurshul » 08 Nov 2010, 13:46

A good mixed bag of media in the PDF would be great to try. No OCR on this yet, correct?

Also, if there's a way, could you send versions at a few different compression levels, with their settings noted, and maybe the original uncompressed? Take your time on this; it's kind of a tall order to ask someone to do this pro bono. I appreciate it.

It will be a PDF I routinely test and learn with, if that's OK. By the time there's a Mac UI running on PDFMaker, I'd have an idea how to mesh other ideas with or around it.

Thanks.

Re: PDFMaker 0.2 - help beta test!

Post by univurshul » 10 Nov 2010, 13:43

Another thing I was thinking about is how you developed PDFMaker with background removal/isolation.

Granted, I'm fairly green with all the ins and outs of PDF construction, but I did some tests with Acrobat and realized it can't differentiate and effectively remove the background from ST-generated images.

This isn't really noticed until the user decides to invert colors (e.g., white text on a black background).

My question is this: did you manage to differentiate/isolate text/line art/pictures from the ST white background? :P

Re: PDFMaker 0.2 - help beta test!

Post by Misty » 10 Nov 2010, 13:58

I isolated the background using ImageMagick/GraphicsMagick, which I use to preprocess the files for PDF conversion. They include a -transparent option, which lets a colour be defined as transparent. When creating the image layer, I use -transparent white to remove every white pixel in the image; that way, the text layer is visible outside the image. As Tulon notes, pure white is not used for anything other than backgrounds in ST's output. There are also a few other options that work on defined colours, like -opaque and +opaque, which select a colour (or any colour but the selected colour) and replace it with another colour. I create text-only pages by using -fill white +opaque black, which replaces anything that isn't pure black with a white background. You could do your colour swap with the -negate option, which would invert every colour - including the background.
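The layer split described above might look roughly like the following. This is a sketch, not PDFMaker's actual code: the function names and output file names are hypothetical, and chaining `-fill white -opaque black` (from the earlier post) with `-transparent white` in a single image-layer command is an assumption about how the steps fit together. PNG is assumed for the image layer because it can store transparency.

```shell
#!/bin/sh
# Sketch only: split a Scan Tailor page into a text layer and an image layer
# with ImageMagick's convert. All function and file names are hypothetical.

# Text layer: anything that is not pure black becomes white.
make_text_layer() {
    convert "$1" -fill white +opaque black "$2"
}

# Image layer: paint the pure-black text white, then make white transparent
# so the text layer stays visible around the pictures.
make_image_layer() {
    convert "$1" -fill white -opaque black -transparent white "$2"
}

# Usage: split_page INPUT TEXT_OUT IMAGE_OUT
split_page() {
    make_text_layer "$1" "$2" && make_image_layer "$1" "$3"
}

# Example (hypothetical file names):
# split_page page.tif page_text.tif page_graphics.png
```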

Unfortunately, it looks like I can't use GraphicsMagick for everything in PDFMaker. At the moment, it does not support transparency in PDFs. Do users mind having both ImageMagick and GraphicsMagick installed, or would people prefer to only need one installation even if that means a much slower conversion process?