ST just for splitting pages

Scan Tailor specific announcements, releases, workflows, tips, etc. NO FEATURE REQUESTS IN THIS FORUM, please.

Moderator: peterZ

The Purple Parrot
Posts: 9
Joined: 17 Aug 2016, 01:42
E-book readers owned: IPOD
Number of books owned: 5000
Country: Israel

ST just for splitting pages

Post by The Purple Parrot »

Dear Pals,
Is there a way to use Scan Tailor just for splitting pages? I have found that ABBYY FineReader 12 is very good for everything else, but Scan Tailor seems to be superior at splitting the pages automatically.
TIA
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Re: ST just for splitting pages

Post by Tulon »

ST wasn't designed to be used like that. You can still do it, though not fully automatically, which probably defeats the point in your situation. The manual part is going to be setting the content box to cover each page.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
jaffamuffin
Posts: 22
Joined: 21 Oct 2011, 09:51
Number of books owned: 0

Re: ST just for splitting pages

Post by jaffamuffin »

Yes please. This is the most wanted feature; I asked about it years ago. Scan Tailor has by far the best page splitting algorithms, and the best way to correct any errors. The only way it can work currently is to run page split, then just set full content and 0 margins, and save the ST files in case you need to go back and edit.

If there were some kind of "point it at a directory and split all images within" setting, it would be amazing.

Alternatively, can anyone point me to the algorithms used, and perhaps a standalone utility using the same processes could be developed?
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Re: ST just for splitting pages

Post by Tulon »

I am no longer involved in Scan Tailor development, though I can point you to the relevant code.

The page splitting algorithm is roughly as follows:
  1. Do some pre-processing to suppress everything other than more or less vertical lines (see filters/page_split/VertLineFinder.cpp)
  2. Find such lines in the pre-processed image with a Hough Transform (see imageproc/HoughLineDetector.cpp)
  3. Use heuristics to pick one splitting line or two bounding lines (see filters/page_split/PageLayoutEstimator.cpp)
If we are splitting two-page images, in step 3 we just pick the most "central" of the lines found in step 2 and use that as the splitting line.
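
For anyone who wants to experiment outside Scan Tailor, the three steps can be sketched in plain Python. This is only a rough illustration of the idea, not a port of the Scan Tailor code: the pre-processing here is a simple "darker than both horizontal neighbours" test standing in for whatever VertLineFinder.cpp really does, the Hough accumulator only considers tilts of a few degrees, and all function names, thresholds and the synthetic test image are assumptions made up for the example.

```python
import math

def find_dark_vertical_pixels(gray, gap=2, threshold=60):
    """Step 1 (simplified stand-in): keep pixels clearly darker than the
    pixels `gap` columns to their left and right, i.e. thin dark vertical marks."""
    height, width = len(gray), len(gray[0])
    points = []
    for y in range(height):
        row = gray[y]
        for x in range(gap, width - gap):
            if (row[x - gap] - row[x] >= threshold
                    and row[x + gap] - row[x] >= threshold):
                points.append((x, y))
    return points

def hough_near_vertical(points, height, max_tilt_deg=5, min_votes=1):
    """Step 2: Hough-style voting restricted to near-vertical lines.
    A line is described by x0 (its column at mid-height) and its tilt in degrees."""
    cy = height / 2.0
    votes = {}
    for tilt in range(-max_tilt_deg, max_tilt_deg + 1):
        slope = math.tan(math.radians(tilt))
        for x, y in points:
            x0 = round(x - (y - cy) * slope)
            votes[(x0, tilt)] = votes.get((x0, tilt), 0) + 1
    return [(x0, tilt, n) for (x0, tilt), n in votes.items() if n >= min_votes]

def pick_central_line(lines, width):
    """Step 3 (two-page case): pick the line closest to the image centre,
    preferring more votes and less tilt when distances tie."""
    if not lines:
        return None
    centre = (width - 1) / 2.0
    return min(lines, key=lambda l: (abs(l[0] - centre), -l[2], abs(l[1])))

def split_column(gray):
    """Return the column to split at, or None if no line was found."""
    height, width = len(gray), len(gray[0])
    points = find_dark_vertical_pixels(gray)
    # Assumed heuristic: a line must cover at least 60% of the image height.
    lines = hough_near_vertical(points, height, min_votes=(height * 3) // 5)
    best = pick_central_line(lines, width)
    return None if best is None else best[0]

# Synthetic 40x30 greyscale "scan": light paper, a dark book edge at
# column 3 and a dark gutter at column 21.
page = [[200] * 40 for _ in range(30)]
for row in page:
    row[3] = 30
    row[21] = 30
print(split_column(page))
```

On the synthetic image both dark columns are found as lines, and the one nearer the centre (the gutter) is chosen. Real scans are far noisier, which is why the actual code does more careful pre-processing and uses further heuristics in step 3.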
jaffamuffin
Posts: 22
Joined: 21 Oct 2011, 09:51
Number of books owned: 0

Re: ST just for splitting pages

Post by jaffamuffin »

Thank you. I will look into this at some point.