
Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

hatatat
Posts: 12
Joined: 01 Apr 2012, 06:53
Number of books owned: 0
Country: Germany

Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by hatatat » 12 Apr 2016, 17:31

Hi, my workflow is the following:

I scan books with two cameras. The zoom factor and the distance to the scan bed may differ between the two cameras.
After scanning, I process the left pages and the right pages separately.

That is, I process the left pages in one Scan Tailor project and the right pages in another.

The left and right pages "meet" for the first time in ABBYY FineReader.

There, the jpg files for the left and right pages have different dimensions. As a result, the left and right pages in the final PDF document also have different widths.

Do you have any tips on how to get identical jpg dimensions for left and right pages? Not only should the jpg sizes (width and height) match, but the margins and the letter size should also be identical for left and right pages in the final PDF document.

Thanks in advance!

Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by Tulon » 13 Apr 2016, 04:26

That's easy to do with Scan Tailor Experimental (get it from here). Load both left and right images into the same project and select "Match size by scaling" on the Margins stage.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.

hatatat

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by hatatat » 13 Apr 2016, 16:00

Yes, thank you. This is a very good start.

But still I have a small concern.

So, if you have a book where every second page has a header, this does not lead to the desired result...

dtic
Posts: 464
Joined: 06 Mar 2010, 18:03

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by dtic » 13 Apr 2016, 17:55

An alternative approach:

1. Use BookCrop or a similar tool to batch crop R and L pages separately. Make sure to crop as close to the full pages as possible. That should make the height/width ratio pretty similar for R and L pages. For that to work you'll need to keep the book in the same spot during capture.
2. Run a script that resizes the larger batch (R or L) as close to the size of the smaller one as possible without changing the height/width ratio. You then have images that are very similar in size, and elements like headers will align pretty well between the images.
3. Then process all images with ScanTailor and set content selection for all pages to use the full page (whole image) and no extra margins.

Step 3 can be done manually through the ScanTailor Experimental GUI. Alternatively, my QuickPicZone has a second mode that automatically converts a batch of jpeg files to ScanTailor BW tif images. However, that requires the older ScanTailor Enhanced (since Experimental has no working command line mode). The QuickPicZone route can be handy if you scan a book with text only or with only a few pictures/illustrations: once the BW batch job is done, you can go back and quickly redo those few pages as mixed text/image pages in QuickPicZone. But for a book with loads of images/illustrations, working in the ScanTailor GUI is probably quicker.
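
As a rough illustration of step 2, here is a minimal sketch of the size calculation only. It assumes the cropped image sizes of each batch are already known as (width, height) pairs; the function names are made up for this example, and the actual resampling would be done with whatever image tool you prefer.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def median_size(sizes: List[Size]) -> Size:
    """Representative size of a batch: per-axis median
    (upper median for even counts), robust to a few odd crops."""
    widths = sorted(w for w, _ in sizes)
    heights = sorted(h for _, h in sizes)
    mid = len(sizes) // 2
    return widths[mid], heights[mid]


def scale_to_fit(src: Size, target: Size) -> Tuple[float, Size]:
    """Largest uniform scale at which src still fits inside target.
    Using one factor for both axes keeps the height/width ratio."""
    factor = min(target[0] / src[0], target[1] / src[1])
    return factor, (round(src[0] * factor), round(src[1] * factor))


def plan_resize(left: List[Size], right: List[Size]) -> Tuple[str, float]:
    """Pick the larger batch (by area) and the factor to shrink it by."""
    l, r = median_size(left), median_size(right)
    if l[0] * l[1] >= r[0] * r[1]:
        factor, _ = scale_to_fit(l, r)
        return "left", factor
    factor, _ = scale_to_fit(r, l)
    return "right", factor


if __name__ == "__main__":
    left_sizes = [(3000, 4000), (3000, 4000), (3000, 4000)]
    right_sizes = [(2400, 3210), (2400, 3210), (2400, 3210)]
    print(plan_resize(left_sizes, right_sizes))
```

The same factor is applied to every image in the larger batch, so headers and text lines keep their relative positions within that batch.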

Tulon

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by Tulon » 13 Apr 2016, 18:13

hatatat wrote: Yes, thank you. This is a very good start.

But still I have a small concern.

So, if you have a book where every second page has a header, this does not lead to the desired result...
"Match size by scaling" is meant to preserve the aspect ratio, so it should be fine. I didn't test this feature extensively though, so maybe I got something wrong. Did you actually get wrong results, or was your concern theoretical?

hatatat

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by hatatat » 14 Apr 2016, 15:53

Tulon wrote: "Match size by scaling" is meant to preserve the aspect ratio, so it should be fine. I didn't test this feature extensively though, so maybe I got something wrong. Did you actually get wrong results, or was your concern theoretical?
Yes, I get wrong results (with "Scan Tailor-experimental-2016-02-22").

Please find attached the original files ("Untitled 2_1.jpg", "Untitled 2_2.jpg"), the Scan Tailor project and the output I get ("Untitled 2_1.tif", "Untitled 2_2.tif").
Attachments:
Untitled 2_2.tif (Scan Tailor output)
Untitled 2_1.tif (Scan Tailor output)
bug01.zip (Scan Tailor project using "Scan Tailor-experimental-2016-02-22", 1.46 KiB)
Untitled 2_2.jpg (original)
Untitled 2_1.jpg (original)

dpc
Posts: 314
Joined: 01 Apr 2011, 18:05
Number of books owned: 0
Location: Issaquah, WA

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by dpc » 15 Apr 2016, 00:24

I handle this by shooting a calibration page for each camera that is a white page with a 2"x2" black square in the center. I then have a preprocessing step that looks at this first page of each set (L/R) and determines what the actual DPI is and scales the pages so that they are the same size.

Obviously I try to get this somewhat close with proper camera setup but the scaling based on the calibration pages ensures the pages end up being the same size and the rest of the post-processing pipeline knows what the DPI is.
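
A rough sketch of what such a preprocessing step might compute (an illustration, not dpc's actual code). It assumes the calibration shot is already loaded as a grayscale grid (rows of 0-255 values) and that the black square is the only dark region on the page; the function names and the threshold of 128 are hypothetical.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of pixel values, 0 = black, 255 = white


def square_bbox(gray: Gray, threshold: int = 128) -> Tuple[int, int, int, int]:
    """Inclusive bounding box (left, top, right, bottom) of all
    pixels darker than the threshold."""
    xs, ys = [], []
    for y, row in enumerate(gray):
        for x, value in enumerate(row):
            if value < threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        raise ValueError("no dark pixels found on calibration page")
    return min(xs), min(ys), max(xs), max(ys)


def measured_dpi(gray: Gray, square_inches: float = 2.0,
                 threshold: int = 128) -> float:
    """Pixels per inch, averaged over the square's width and height."""
    left, top, right, bottom = square_bbox(gray, threshold)
    width_px = right - left + 1
    height_px = bottom - top + 1
    return (width_px + height_px) / 2 / square_inches


def scale_factors(dpi_left: float, dpi_right: float) -> Tuple[float, float, float]:
    """Scale both sides down to the lower of the two DPIs.
    Returns (factor for left, factor for right, common DPI)."""
    target = min(dpi_left, dpi_right)
    return target / dpi_left, target / dpi_right, target
```

Scaling to the lower DPI avoids upsampling either side; the common DPI can then be passed on to the rest of the pipeline.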

hatatat

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by hatatat » 15 Apr 2016, 17:14

dpc wrote: I handle this by shooting a calibration page for each camera that is a white page with a 2"x2" black square in the center. I then have a preprocessing step that looks at this first page of each set (L/R) and determines what the actual DPI is and scales the pages so that they are the same size.

Obviously I try to get this somewhat close with proper camera setup but the scaling based on the calibration pages ensures the pages end up being the same size and the rest of the post-processing pipeline knows what the DPI is.
That sounds cool.

Could you perhaps share that preprocessing step?

Thanks in advance!

Tulon

Re: Scantailor - How to Get Identically Sized jpgs for Left and Right Side?

Post by Tulon » 19 Apr 2016, 06:07

hatatat wrote: Yes, I get wrong results (with "Scan Tailor-experimental-2016-02-22").
I have a good understanding of what causes this behaviour, yet it turns out to be surprisingly hard to fix. There won't be a quick fix, but I'll definitely keep this issue in mind.

For now, you may be able to work around this issue by manually adjusting the content box height on short pages.
