Scan Tailor

StevePoling · Post by **StevePoling** » 19 Feb 2010, 15:45

Tulon wrote: My co-worker proposed a nice and simple algorithm for preliminary text line tracing that works on blurred grayscale images, yet doesn't require gaps between words to be filled. This makes it possible to use an ordinary Gaussian blur rather than the much slower anisotropic one used by the authors of the coupled snakes approach.

Excuse my ignorance, but isn't a blurred grayscale image the same as a low-pass-filtered image? If you do a 2D Fourier Transform on the image, chop off the high-frequency stuff, then transform back to the spatial domain, isn't that the same thing as the blurred grayscale? And if it makes sense to apply algorithms here, would it make equally valid sense to apply your algorithm in the frequency domain?

Mind you, I've only actually done signals work in 1-D, so if that last paragraph sounds like I'm on crack, don't be surprised.

smiles and cheers,

steve
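
For readers who want to check the blur/low-pass question numerically, here is a minimal 1-D sketch in pure Python. It is not from Scan Tailor; the function names, the circular (wrap-around) boundary handling, and the test signal are assumptions made for illustration. It shows that blurring by circular convolution with a Gaussian kernel gives the same result as multiplying the spectra and transforming back. Note that a Gaussian attenuates high frequencies smoothly, whereas literally chopping them off is a hard cutoff that produces ringing, so the two are related but not identical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, O(n^2), fine for short signals."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
            for f in range(n)]

def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n) for f in range(n)) / n
            for k in range(n)]

def gaussian_kernel(n, sigma):
    """Circular Gaussian centred on index 0, normalised to sum to 1."""
    weights = [math.exp(-0.5 * (min(k, n - k) / sigma) ** 2) for k in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_spatial(signal, kernel):
    """Blur by circular convolution in the spatial domain."""
    n = len(signal)
    return [sum(signal[(i - k) % n] * kernel[k] for k in range(n)) for i in range(n)]

def blur_frequency(signal, kernel):
    """Same blur, done by multiplying the two spectra (convolution theorem)."""
    product = [s * k for s, k in zip(dft(signal), dft(kernel))]
    return [value.real for value in idft(product)]

if __name__ == "__main__":
    step = [0.0] * 16 + [1.0] * 16           # a sharp edge
    kernel = gaussian_kernel(len(step), 2.0)
    spatial = blur_spatial(step, kernel)
    frequency = blur_frequency(step, kernel)
    print(max(abs(a - b) for a, b in zip(spatial, frequency)))  # tiny rounding error
```

The same identity holds in 2-D, which is why a Gaussian blur can be described as a low-pass filter.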

Tulon · Post by **Tulon** » 19 Feb 2010, 18:17

StevePoling wrote: Excuse my ignorance, but isn't a blurred grayscale image the same as a low-pass-filtered image? If you do a 2D Fourier Transform on the image, chop off the high-frequency stuff, then transform back to the spatial domain, isn't that the same thing as the blurred grayscale? And if it makes sense to apply algorithms here, would it make equally valid sense to apply your algorithm in the frequency domain?

Mind you, I've only actually done signals work in 1-D, so if that last paragraph sounds like I'm on crack, don't be surprised.

smiles and cheers,

steve

It sounds right, but the algorithm in question is inherently spatial. It works on blurred images because it assesses the similarity of pixels along a path. My co-worker used this algorithm to trace layers in wood. It doesn't require blurred words to be connected because it goes column by column rather than line by line.
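
The thread does not give the actual algorithm, so the following is only a rough sketch of the general idea described above: walk the image column by column and, at each step, move to the neighbouring row whose gray value is most similar to the current pixel. The function name `trace_line`, the three-neighbour search window, and the tie-breaking rule are all assumptions, not Tulon's co-worker's method.

```python
def trace_line(image, start_row):
    """Greedy column-by-column trace (illustrative sketch only).

    image: list of rows, each a list of gray values (0-255).
    Returns one row index per column, starting at start_row in column 0.
    """
    height, width = len(image), len(image[0])
    row = start_row
    path = [row]
    for col in range(1, width):
        current = image[row][col - 1]
        # Only consider the rows directly above, level with, and below.
        candidates = [r for r in (row - 1, row, row + 1) if 0 <= r < height]
        # Prefer the most similar gray value; on a tie, prefer the smaller jump.
        row = min(candidates,
                  key=lambda r: (abs(image[r][col] - current), abs(r - row)))
        path.append(row)
    return path

if __name__ == "__main__":
    # A dark band (value 50) drifting downwards on a light background (255).
    band_rows = [1, 1, 2, 2, 3, 3]
    image = [[50 if band_rows[c] == r else 255 for c in range(6)] for r in range(5)]
    print(trace_line(image, 1))
```

Because each step only compares a pixel with its neighbours in the next column, the trace needs no connected component running the whole width of the page, which fits the remark that gaps between words need not be filled.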

rob · Post by **rob** » 27 Feb 2010, 23:36

For anyone who is interested in the details of the algorithm I used to dewarp the images, I have put them up on my latest blog post.

--Rob

dtic · Post by **dtic** » 08 Mar 2010, 07:30

Hi.
Since it is my first post: what a great forum you've all got going here. Very helpful.

Scan Tailor is great, but I hope for two enhancements: dual-core processing to speed things up and command-line control for more automation. I hope those can be built into ST, but from reading the dev's previous posts, such new features won't be added soon. Maybe some smaller tweaks could make it easier to do the same thing via helper scripts? Two such ideas:

1. I tested making two installs of ST and running them simultaneously. Works fine! So it is already possible to run two ST instances at the same time on half of the scans each. To speed that up it would help if some very small features were added: right clicking in the right pane and doing "insert after..." brings up a dialog where only one image can be selected. Allow multiselect (and thereby multiimport) there. That way, users can do these steps: start the first ST instance with half of the scans, do all the settings, save the project, start generating output, start second ST instance, load the project, batch remove old images, batch import the rest of the scans, start generating output.

2. If more hotkeys are added for key functions in ST, then others can make AutoIt or AutoHotkey scripts that operate on the ST GUI to automate the process further. For example, the "launch batch processing" buttons would need hotkeys.

spamsickle · Post by **spamsickle** » 08 Mar 2010, 20:23

I'm not sure what you're hoping to accomplish by your first request, or even exactly what steps you're executing. To me, if you're going to run two instances of ST simultaneously, it makes most sense to have them working on two different books. Trying to do half of one book in one instance and half of the same book in another instance, and combining them afterward, seems like more trouble than it's worth. You'd definitely want to combine them before the page formatting step, so the sizes could match up. But maybe I'm not understanding.

I used to process all the right-hand images in one instance, and all the left-hand images in another (run back-to-back rather than simultaneously), and combine them with scripts after the fact. I no longer do that, because the resulting output page sizes occasionally didn't match. I don't think a right/left split would work well with what you're proposing, because the additions would have to be interleaved, but that didn't seem like what you were proposing anyway.

Your suggestion for hotkeys seems reasonable.

I've been looking into adding an "apply to" dialog to the content selection step. I still intend to do that, but I've found another enhancement that I intend to try first.

The "split page" step seems to be designed for books that were scanned on flatbed scanners, but often seems to fail when it shouldn't on books that came from our type of DIY scanners. Often, I find, the split page filter will either mark an image as "single page (no split)" or will split the image at the outer edge of a page, with the page, gutter, and portion of facing page passing on to the subsequent filters as the "page". This seems to be the true source of many of the failures I notice in the content selection step: a bit of the gutter is tagged as content, which will (if not corrected) result in a dark line marring the final output. If I correctly split the page at the gutter, and identify the real page correctly, 9 times out of 10 the content selection step that failed before will correctly bracket the text I want.

My plan is to add a fourth option to the "split page" filter, to identify the input as coming from our DIY-style scanner. If this option is selected, I'll use the same quick-and-dirty algorithm I used in my YAPP page puller to do the page splitting, instead of the algorithms Scan Tailor currently employs. It's doing gray scales and downsampling and Hough transforms, and circles and arrows with a paragraph on the back of each one to identify pages, which would be fine if it wasn't failing as often as it does.
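
The YAPP algorithm is not described in this thread, so the sketch below is not it. It only illustrates one simple way a splitter could avoid the failure described above (splitting at the outer edge of a page): look for the darkest column, but only within a central band of the image where the gutter is expected. The names `find_gutter` and `split_at`, and the choice of the middle third as the search band, are hypothetical.

```python
def find_gutter(image, center_divisor=3):
    """Return the column index of the darkest column near the image centre.

    image: list of rows, each a list of gray values (0-255).
    Only the middle 1/center_divisor of the width is searched, so a dark
    outer edge cannot be mistaken for the gutter. (Assumed heuristic.)
    """
    height, width = len(image), len(image[0])
    column_means = [sum(row[c] for row in image) / height for c in range(width)]
    band = max(1, width // center_divisor)
    lo = (width - band) // 2
    hi = lo + band
    return min(range(lo, hi), key=lambda c: column_means[c])

def split_at(image, col):
    """Split the image into a left and a right page at the given column."""
    left = [row[:col] for row in image]
    right = [row[col:] for row in image]
    return left, right

if __name__ == "__main__":
    # 12 columns wide: dark outer edge at column 0, gutter shadow at column 6.
    row = [10] + [200] * 5 + [40] + [200] * 5
    image = [list(row) for _ in range(3)]
    gutter = find_gutter(image)
    left, right = split_at(image, gutter)
    print(gutter, len(left[0]), len(right[0]))
```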

I think this will result in more steps which don't require manual intervention, which is my goal with an application like this. I don't think it will take care of everything, so I still plan to implement an "apply to" on the content step.

I'll take a look at the possibility of adding hotkeys while I'm at it. I haven't used Qt before, but I'm beginning to become familiar with it. I don't think they should be too difficult to add, but I'll know more in a day or two.

dtic · Post by **dtic** » 09 Mar 2010, 15:01

Good point on page formatting (sizing) problems if processing two book halves separately. I didn't think of that. So let's forget 1 then.

I can follow the other changes you describe, but I don't know enough (yet!) about the process and ST to give feedback. Thanks for taking a look at hotkeys. In addition to a hotkey for "launch batch processing", hotkeys for the "apply to..." suboptions would ease things. When running some tests with ST I found myself using "apply to all" a lot. One more thought: do the "apply to..." options need a popup at all? There's free space in the left pane. It would clutter the UI but save clicks.

One confusing thing in ST is the animated icons (dots circling) in the left pane during batch processing. Say that you work through steps 1-3 in turn, doing a batch process for each. When you then go to step 4 and do a batch process there, animated icons appear for each of steps 1-4. What does that mean? Is the previously done processing now redone? Or does ST build on steps completed already? I assume the latter, but then the icons are misleading, right?

[Attachment: a.png]

spamsickle · Post by **spamsickle** » 09 Mar 2010, 16:11

If nothing has changed upstream that requires new computation, it will pass along previously computed values. It's still going through the earlier steps, to check if the processing is needed, so the animated icons aren't really misleading.
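
As a rough illustration of that behaviour (and not Scan Tailor's actual code), here is a sketch of a pipeline in which every stage is visited on each run but only recomputes when its upstream input or its own settings have changed. The `Stage` class, the `run_pipeline` function, and the per-page cache layout are assumptions made for this example.

```python
class Stage:
    """One processing stage with a per-page cache (hypothetical design)."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.cache = {}       # page -> ((upstream, settings), result)
        self.recomputed = 0   # how many times real work was done

    def run(self, page, upstream, settings):
        key = (upstream, settings)
        cached = self.cache.get(page)
        if cached is not None and cached[0] == key:
            return cached[1]              # nothing changed: pass the old value along
        result = self.func(upstream, settings)
        self.cache[page] = (key, result)
        self.recomputed += 1
        return result

def run_pipeline(stages, page, source, settings):
    """Visit every stage in order; each one decides whether to recompute."""
    value = source
    for stage in stages:
        value = stage.run(page, value, settings[stage.name])
    return value

if __name__ == "__main__":
    stages = [Stage("a", lambda x, s: x + s),
              Stage("b", lambda x, s: x * s),
              Stage("c", lambda x, s: x - s)]
    settings = {"a": 1, "b": 2, "c": 3}
    print(run_pipeline(stages, "page1", 10, settings))
    print(run_pipeline(stages, "page1", 10, settings))   # all stages hit the cache
    settings["b"] = 3                                     # change a mid-pipeline setting
    print(run_pipeline(stages, "page1", 10, settings))
    print([s.recomputed for s in stages])
```

In this sketch the earlier stages are still walked through on every run, which matches the explanation for why their icons animate, while the cache is kept per page, so some pages can need reprocessing at a stage when others do not.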

dtic · Post by **dtic** » 10 Mar 2010, 03:10

Ok, but processing and checking whether processing is needed are two different things. This would be less confusing: after batch processing a step, display a green checkmark icon for it. When batch processing e.g. step 4, animate the icon for step 4 and for any previous steps that haven't yet been processed (i.e. steps without the green checkmark icon).

Tulon · Post by **Tulon** » 10 Mar 2010, 05:58

dtic wrote: Ok, but processing and checking whether processing is needed are two different things. This would be less confusing: after batch processing a step, display a green checkmark icon for it. When batch processing e.g. step 4, animate the icon for step 4 and for any previous steps that haven't yet been processed (i.e. steps without the green checkmark icon).

It's not that simple. Some pages may require reprocessing on a given stage, while others may not. This animation was intended to tell users that they don't have to pass each and every processing stage. A common workflow is to jump right to stage 4, going back when necessary.

dtic · Post by **dtic** » 11 Mar 2010, 07:24

Ok, that I didn't know. But it seems compatible with adding checkmarks. As long as the checkmarks correctly signal to users that the settings and previous outcome for that step will remain intact, any reprocessing (behind the scenes) need not be signalled.

In cases where users jump directly to stage 4 and start a batch process, animated icons for all stages 1-4 make sense, agreed.

Anyway, this is a very minor GUI issue and maybe not worth the time to change so I'll stop going on about it. ST is already great!
