Re: Scan Tailor

Posted: 17 Feb 2010, 02:36
by Mandor
And once again, sorry for my English. :oops:
DSpider wrote:1. A feature to ignore headers and footers (such as page numbers for instance). This would work great for "pageless" OCR-ing (such as continuous text files).
I think this is a task for the OCR program, not for image processing. There is a special option in Abbyy FR to ignore headers and footers when exporting text.
DSpider wrote:2. An "UNDO" feature with a Ctrl+Z shortcut and maybe a button to the left of the screen.
But why? :o There are no "unrevertable" operations in ST! Please give an example...
DSpider wrote:3. An extra step after 6 Output. Maybe called "Edit" or "Retouch" or something like that capable of basic monochrome editing.
Yes, this sounds good, but it's against the "policy" of this program. For example, if I lose the result files, I can run ST with the source scans and the saved project and get the same result as last time. With your addition this would be impossible. And this step can be done in your favorite graphics editor.
DSpider wrote:PS: I don't know if it's a bug or not but every page I've scanned is slightly tilted to the left. You can observe this when selecting content.
Example, please. I didn't observe such behaviour.
horndude77 wrote:- On the page layout step I'd like to be able to specify the output page size instead of just the margins.
Can you give an example? What's the point of this request? It sounds very strange to me.
horndude77 wrote:- Be able to skip steps. For example most of the time I probably don't need the fix orientation and split page steps.
You can skip these two steps: "Fix orientation" doesn't need any special attention if you don't want to rotate pages, and in the "split page" step you can set all pages to the "single page" type, which takes 4-5 mouse clicks.

Re: Scan Tailor

Posted: 17 Feb 2010, 10:42
by horndude77
Mandor wrote:
horndude77 wrote:- On the page layout step I'd like to be able to specify the output page size instead of just the margins.
Can you give an example? What's the point of this request? It sounds very strange to me.
I want the output to be the same size as the physical page. The current interface finds the content and puts a margin around that. What I'd like is for it to find the content and center it on the page size of my choosing. For example if I have a 9 inch x 12 inch page and I scan at 600dpi I'd like the output to be 5400 pixels x 7200 pixels.
Mandor wrote:
horndude77 wrote:- Be able to skip steps. For example most of the time I probably don't need the fix orientation and split page steps.
You can skip these two steps: "Fix orientation" doesn't need any special attention if you don't want to rotate pages, and in the "split page" step you can set all pages to the "single page" type, which takes 4-5 mouse clicks.
Fair enough.
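
The page-size request above comes down to simple arithmetic, sketched here in Python. This is only an illustration of the idea, not Scan Tailor code: the function names are made up, and raising an error when the content does not fit is an assumption (a real tool might trim, scale, or warn instead).

```python
def page_size_px(width_in, height_in, dpi):
    """Convert a physical page size in inches to pixels at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def center_offset(page_px, content_px):
    """Return the top-left offset that centers the content box on the page.

    Raises ValueError if the content does not fit; what a real tool should
    do in that case is deliberately left open here.
    """
    page_w, page_h = page_px
    content_w, content_h = content_px
    if content_w > page_w or content_h > page_h:
        raise ValueError("content is larger than the requested page")
    return (page_w - content_w) // 2, (page_h - content_h) // 2


# The 9 x 12 inch page at 600 dpi from the post above.
page = page_size_px(9, 12, 600)
# A hypothetical detected content box of 4800 x 6600 pixels.
offset = center_offset(page, (4800, 6600))
```

Here `page` is (5400, 7200), matching the figures in the post, and `offset` is (300, 300) for the hypothetical content box.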

Re: Scan Tailor

Posted: 17 Feb 2010, 12:11
by StevePoling
Mandor wrote:
DSpider wrote:PS: I don't know if it's a bug or not but every page I've scanned is slightly tilted to the left. You can observe this when selecting content.
Example, please. I didn't observe such behaviour.
I've seen a phenomenon that might be close to this. In images that are subject to keystoning, the lines of text radiate in a divergent (as opposed to parallel) direction. If you make the left edge vertical, then the lines atop the page run "up hill" and the lines near the bottom of the page run "down hill." I have seen ScanTailor rotate (not tilt) the image to make some lines nearer the bottom run parallel. This in turn causes the lines near the top to run "up hill." This shows up as a slight bias toward counter-clockwise rotation of the image. It also causes the left edge to rotate a few degrees out of vertical. This doesn't afflict every image, just those images that have the worst problem with keystoning.

I'd like it if ScanTailor could "tilt" the image so that the right edge of the image appears further away and the left edge of the image appears closer (or vice versa). One could tweak this until the eye sees lines of text running parallel. At present ScanTailor rotates the image about an axis emerging from the center of the page directly out of the computer screen. I'd like the ability to rotate on an axis going through the center of the image vertically (on the computer screen) to correct keystone distortions.
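
A rough sketch of the "tilt" described above, in Python: rotate each point about the vertical line through the image center, then apply a simple perspective divide. This is not how ScanTailor works and not a complete keystone corrector. The function name and the `focal` parameter (an assumed viewing distance in pixels) are invented for illustration, and a real implementation would invert the mapping and resample pixels.

```python
import math


def tilt_about_vertical_axis(x, y, width, height, angle_deg, focal):
    """Map pixel (x, y) as if the page were rotated about its vertical center line.

    A positive angle pushes the right edge away from the viewer and pulls
    the left edge closer. `focal` is a hypothetical viewing distance in pixels.
    """
    cx, cy = width / 2.0, height / 2.0
    theta = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    # Rotation about the vertical axis: x is foreshortened and depth z appears.
    xr = dx * math.cos(theta)
    z = dx * math.sin(theta)
    # Perspective divide: points further away (z > 0) shrink toward the center.
    scale = focal / (focal + z)
    return cx + xr * scale, cy + dy * scale


# Corners of a hypothetical 1000 x 1000 image, tilted by 10 degrees.
corners = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
tilted = [tilt_about_vertical_axis(x, y, 1000, 1000, 10, 2000) for x, y in corners]
```

With a positive angle the right edge comes out shorter than the left one, so horizontal lines converge toward the right. Tweaking the angle until the text lines look parallel is the manual adjustment the post asks for.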

Re: Scan Tailor

Posted: 17 Feb 2010, 16:22
by daniel_reetz
I'm very interested in this feature discussion, but I think somebody has to point out that Tulon is very unlikely to have the time and interest to meet all the demands here. The more programmers we can get actually implementing stuff, the better. I wish I had better programming skills; some of these things I would like to tackle myself.

Re: Scan Tailor

Posted: 17 Feb 2010, 20:01
by Mary
I am wondering whether the dewarping algorithm implemented in the latest Scan Tailor is based on CTM or on the coupled snakes model?
If it is based on CTM or CTM2, in what respects does it differ from CTM? I am asking because I did not get the same results with Scan Tailor for the images in the DFKI Dewarping Contest Dataset (http://www.dfki.uni-kl.de/~shafait/downloads.html) as the results of the CTM method (didcontest-results-CTM2.zip) available at the same link.
For example, for image img_1238_bin the output of Scan Tailor is quite different from the output of CTM2. I can send the output image from Scan Tailor if that helps.

Re: Scan Tailor

Posted: 18 Feb 2010, 02:59
by Mandor
horndude77 wrote:I want the output to be the same size as the physical page.
Understood, but I don't think Tulon will agree to add such an option. What should be done if the specified margins are smaller than the calculated largest page? Trim pages? Align to the largest page without warning?
I suggest running a batch conversion on the files produced by ST, with XnView for example (I can help if you have trouble). I'm forced to do almost the same thing: crop the images before sending them to ST and convert them to CCITT G4 afterwards. Nevertheless, ScanTailor is my favorite program and I have reconciled myself to these shortcomings.

And yes, Tulon will not accept suggestions for the next several months.

Re: Scan Tailor

Posted: 18 Feb 2010, 15:19
by Tulon
Hi there,

Time for a quick status update. I returned from my vacation, and from time to time I do work on ST. That's the good news. The bad news is that my involvement in the community will be scaled down significantly. In particular, feature requests will be completely ignored. Trust me, that's the only way for me to keep working on ST. Otherwise, I am doing tech support and tutoring. That makes me unhappy. When I am unhappy, I don't want to code. Simple as that.

Re: Scan Tailor

Posted: 18 Feb 2010, 15:27
by spamsickle
That's cool. I expect to implement the feature I want ("Apply to" dialog for content selection) myself, though it will take time. Thanks for providing such a great foundation, and try to cheer up!

Re: Scan Tailor

Posted: 18 Feb 2010, 18:02
by daniel_reetz
Tulon wrote:Hi there,

Time for a quick status update. I returned from my vacation, and from time to time I do work on ST. That's the good news. The bad news is that my involvement in the community will be scaled down significantly. In particular, feature requests will be completely ignored. Trust me, that's the only way for me to keep working on ST. Otherwise, I am doing tech support and tutoring. That makes me unhappy. When I am unhappy, I don't want to code. Simple as that.
Thanks for the update, Tulon, and welcome back. We want nothing more than for you to keep working on ST as you see fit. Cheers, and thanks, as always.

Tech support and tutoring are things we do well elsewhere. Maybe it's time for a "Scan Tailor Tech Support and Tutoring" thread. Feature requests will likely be ignored wherever they land. :)

Re: Scan Tailor

Posted: 18 Feb 2010, 19:49
by Tulon
Mary wrote:I am wondering whether the dewarping algorithm implemented in the latest Scan Tailor is based on CTM or on the coupled snakes model?
If it is based on CTM or CTM2, in what respects does it differ from CTM? I am asking because I did not get the same results with Scan Tailor for the images in the DFKI Dewarping Contest Dataset (http://www.dfki.uni-kl.de/~shafait/downloads.html) as the results of the CTM method (didcontest-results-CTM2.zip) available at the same link.
For example, for image img_1238_bin the output of Scan Tailor is quite different from the output of CTM2. I can send the output image from Scan Tailor if that helps.
Dewarping comprises two independent aspects:
1. Text line tracing.
2. Transformation model.

Coupled snakes are only used for text line tracing. Supposedly they are quite good at it, so they can be used to improve any other dewarping algorithm by replacing its own text line tracing method.
Having said that, ST's algorithm (developed by Rob) uses neither coupled snakes nor a CTM-like transformation model. However, I've been working myself on a text line tracer that's also going to use coupled snakes as its last step. My co-worker proposed a nice and simple algorithm for preliminary text line tracing that works on blurred grayscale images, yet doesn't require gaps between words to be filled. This makes it possible to use an ordinary Gaussian blur rather than the much slower anisotropic one used by the authors of the coupled snakes approach.
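
For readers wondering why an ordinary Gaussian blur is cheap: it is separable, so a 2D blur can be done as one 1D pass over the rows followed by one over the columns. The Python sketch below shows only that generic property. It is not Scan Tailor's code and says nothing about the tracer described above; the function names, the border clamping, and the 3-sigma kernel radius are all assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Normalized 1D Gaussian kernel; the radius defaults to 3 * sigma."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve one row or column, clamping indices at the borders."""
    radius = len(kernel) // 2
    last = len(values) - 1
    return [
        sum(k * values[min(max(i + j - radius, 0), last)]
            for j, k in enumerate(kernel))
        for i in range(len(values))
    ]


def gaussian_blur(image, sigma):
    """Blur a grayscale image (a list of rows) with two 1D passes."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]
    # Transpose, blur what are now the columns, and transpose back.
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# A single bright pixel in a small hypothetical 5 x 5 image.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
blurred = gaussian_blur(image, 1.0)
```

The two passes cost work proportional to the kernel length per pixel, rather than to its square as a direct 2D convolution would.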