ST cutting off edges of content?

Scanallthebooks
Posts: 38
Joined: 01 Dec 2016, 19:05
Number of books owned: 0
Country: Denmark

ST cutting off edges of content?

Post by Scanallthebooks »

Hey guys, back with yet another question about ST. It's not a huge issue, but I've noticed that the automatic content selection is a bit aggressive and will more often than not cut off the edges of the content. At the bottom this is usually the lower part of the letters "g", "y" or "p"; at the right it can be the right part of the letters "e", "r" or "f". At the top (or bottom) it can be the edges of the page numbers.

Here's an example:
[Attached image: contentcut.png]
Notice how the very bottom part of the letter "g" is cut off slightly, and after I've adjusted the selection box manually it's restored.

My question is whether there is a way to adjust how "aggressive" ST is in selecting content. It's a bit cumbersome going through every page adjusting the content selection manually, but that's what I've been doing, since I've been unable to find any automatic way to prevent the edges of the text from being cut off.

Thanks in advance. ;)
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Re: ST cutting off edges of content?

Post by Tulon »

It's a known issue caused by estimating the content box at a reduced resolution. It's not terribly hard to fix, except no one is currently working on ST, as far as I know.
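
To illustrate the kind of effect described above, here is a minimal Python sketch. It is not Scan Tailor's actual code: the majority-vote downscaling, the function names and the `pad` parameter are all assumptions made for the example. It shows how a bounding box estimated on a reduced image can miss a thin stroke (such as the tail of a "g"), and how growing the box by one reduced-resolution pixel would cover it again.

```python
# Illustrative sketch only; NOT Scan Tailor's real algorithm.
# Images are lists of rows, 1 = ink, 0 = background.

def downscale(img, f):
    """Shrink by factor f; a block counts as ink only if at least half of it is ink (assumed rule)."""
    h, w = len(img), len(img[0])
    out = []
    for by in range(0, h, f):
        row = []
        for bx in range(0, w, f):
            block = [img[y][x]
                     for y in range(by, min(by + f, h))
                     for x in range(bx, min(bx + f, w))]
            row.append(1 if sum(block) * 2 >= len(block) else 0)
        out.append(row)
    return out

def bbox(img):
    """Inclusive (top, left, bottom, right) box around all ink pixels, or None if blank."""
    ys = [y for y, row in enumerate(img) if any(row)]
    xs = [x for row in img for x, v in enumerate(row) if v]
    if not ys:
        return None
    return (min(ys), min(xs), max(ys), max(xs))

def bbox_via_reduced(img, f, pad=0):
    """Estimate the box on the reduced image, then map it back to full resolution.
    `pad` (hypothetical) grows the box by that many reduced pixels on each side."""
    h, w = len(img), len(img[0])
    top, left, bottom, right = bbox(downscale(img, f))
    return (max(0, (top - pad) * f),
            max(0, (left - pad) * f),
            min(h - 1, (bottom + 1 + pad) * f - 1),
            min(w - 1, (right + 1 + pad) * f - 1))

# A 16x16 "page": a solid text block plus a one-pixel descender below it.
page = [[0] * 16 for _ in range(16)]
for y in range(4, 12):
    for x in range(4, 12):
        page[y][x] = 1
page[12][6] = 1  # thin tail hanging below the block

print(bbox(page))                         # (4, 4, 12, 11): true box includes row 12
print(bbox_via_reduced(page, 4))          # (4, 4, 11, 11): the tail in row 12 is lost
print(bbox_via_reduced(page, 4, pad=1))   # (0, 0, 15, 15): padded box covers the tail
```

The padded box is looser than the true one, so a real fix would more likely refine the edges at full resolution; the sketch only shows why the clipping is systematic rather than occasional.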

This problem may also be limited to ST experimental or at least be less noticeable on classic ST.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
Scanallthebooks
Posts: 38
Joined: 01 Dec 2016, 19:05
Number of books owned: 0
Country: Denmark

Re: ST cutting off edges of content?

Post by Scanallthebooks »

Thanks for the answer Tulon. So it's a small issue, easy to fix, but creates sooo much extra work. Dammit, I just wish I knew how to program so I could fix it! :oops:

Unfortunately I can't go back to classic ST after using experimental; it's just so much faster and more fluid.

So...do I dedicate my life to learning how to program and fixing ST, or do I go insane working through hundreds of pages manually selecting content? :lol:
alraban
Posts: 3
Joined: 17 Aug 2016, 08:08
Number of books owned: 1500
Country: USA

Re: ST cutting off edges of content?

Post by alraban »

FWIW, I work around this (and a few other) auto selection issues by not using the content auto-selection at all (I kept losing footnotes or image portions).

If your pages are fairly close to the same position in the scan field (I use a form feed scanner so they're always relatively close) you can get very good results by selecting just inside the actual physical page margins on one page and then applying that selection to all pages. If your pages alternate or you want better text centering, you can select on one page and apply to "every other" page, and then do the same thing on the following page.

Using that method, I only have to do manual adjustment for a handful of pages with unusually large images or that are out of alignment with the rest. Remember to turn off the extra marginal padding if you use this method.
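
In Scan Tailor itself this is done through the GUI, but the logic of the approach can be sketched in a few lines of Python. This is purely conceptual: the function name and the box tuples are hypothetical and do not correspond to anything in Scan Tailor.

```python
# Conceptual sketch of "apply to all" vs. "apply to every other page".
# Boxes are (top, left, bottom, right) tuples; all names are hypothetical.

def assign_boxes(num_pages, box_first, box_second=None):
    """Return one content box per page.
    With one box, every page gets it; with two, pages alternate between them."""
    if box_second is None:
        return [box_first] * num_pages
    return [box_first if i % 2 == 0 else box_second for i in range(num_pages)]

left_page = (50, 40, 950, 640)    # box drawn by hand on the first page
right_page = (50, 60, 950, 660)   # box drawn by hand on the following page

same_for_all = assign_boxes(3, left_page)
alternating = assign_boxes(4, left_page, right_page)
```

Pages that differ from the rest (large images, misaligned scans) would then be the only ones needing an individual override.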
dtic
Posts: 464
Joined: 06 Mar 2010, 18:03

Re: ST cutting off edges of content?

Post by dtic »

Another alternative, with some similarity to alraban's suggestion, is to use BookCrop to first crop all pages and then use Scan Tailor Enhanced (an older fork, not the same as Scan Tailor Experimental) and a script to process them from the command line with the whole page as content. With that method there are no manual steps inside Scan Tailor at all, only one in BookCrop. Works well for text-only scans.
L.Willms
Posts: 134
Joined: 21 Sep 2016, 10:51
E-book readers owned: Tolino Shine
Country: Germany
Location: Frankfurt/Main, Germany

Re: ST cutting off edges of content?

Post by L.Willms »

Scanallthebooks wrote: 21 Dec 2016, 13:03 or do I go insane working through hundreds of pages manually selecting content? :lol:
I do an automatic selection first, and then check one page after the other and apply manual corrections if needed.

Same with page separation or skewing.
Scanallthebooks
Posts: 38
Joined: 01 Dec 2016, 19:05
Number of books owned: 0
Country: Denmark

Re: ST cutting off edges of content?

Post by Scanallthebooks »

L.Willms wrote: 01 Jan 2018, 11:34
Scanallthebooks wrote: 21 Dec 2016, 13:03 or do I go insane working through hundreds of pages manually selecting content? :lol:
I do an automatic selection first, and then check one page after the other and apply manual corrections if needed.

Same with page separation or skewing.
Yeah, that's what I've been doing too. The problem is that every page needs corrections on all 4 sides, so for a 200-page book you basically need to do 800 margin adjustments in order to prevent clipping of the edges. Add that to the usual necessary adjustments and suddenly you need to spend a LOT of time on each book simply because the automatic content detection has a small flaw.

According to Tulon there is apparently a simple fix, but without knowledge of C++ it is beyond my abilities. Maybe one day I can learn some basic C++ and fix it, if nobody else picks up the mantle. ;)

EDIT: Oh hold on, I just saw that somebody is developing Scan Tailor Advanced, a new version of ST! This is amazing, perhaps the problem is fixed/can be fixed there! Excuse me while I try out this new version. :D
L.Willms
Posts: 134
Joined: 21 Sep 2016, 10:51
E-book readers owned: Tolino Shine
Country: Germany
Location: Frankfurt/Main, Germany

Re: ST cutting off edges of content?

Post by L.Willms »

Scanallthebooks wrote: 01 Jan 2018, 12:55 the problem is that every page needs corrections on all 4 sides,
in that case I think that you are worrying too much. It's OK when the content box is drawn very tight.

I have to do corrections on some pages when a spot is erroneously taken for a piece of text, so that I have to draw the content box tighter around the actual text. Or when there is a title page with a graphic frame, and the automatic guessing takes the frame for a dark shadow to be removed.
Scanallthebooks
Posts: 38
Joined: 01 Dec 2016, 19:05
Number of books owned: 0
Country: Denmark

Re: ST cutting off edges of content?

Post by Scanallthebooks »

Well, the bug does result in parts of letters being unnecessarily cut off on every single page, simply because the content box is being erroneously drawn at a lower resolution. This is not something that happens just some of the time...it happens all the time on every single page. I guess we're all different; to me it's an issue. ;)