select contents problems
Moderator: peterZ
I've tested a couple of pages. For some of them, select content works well, but for others it only selects a couple of paragraphs and omits a lot of other information on the page. That's OK for a couple of pages because I can fix it myself, but with hundreds of pages this is a big problem. How can I solve it? Is the problem my input? Is it due to the lack of proper lighting on the pages?
Re: select contents problems
jinjin12 wrote: is the problem my input?
I can't tell until I've seen it. Also, tell me which input DPI you are specifying.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
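To make the arithmetic in that last sentence concrete, here is a minimal sketch. The function name and the sample values (300 DPI input, factor of 2) are illustrative assumptions, not something taken from Scan Tailor itself, and the word "usually" above means the real output may differ.

```python
def expected_output_dpi(input_dpi: float, enhancement_factor: float) -> int:
    """Estimate output DPI as input DPI times the resolution enhancement factor.

    Hypothetical helper for illustration only; it just performs the
    multiplication described in the post.
    """
    if input_dpi <= 0 or enhancement_factor <= 0:
        raise ValueError("input_dpi and enhancement_factor must be positive")
    return round(input_dpi * enhancement_factor)


# Assumed example values: a 300 DPI scan with an enhancement factor of 2.
print(expected_output_dpi(300, 2))
```

If a viewer reports 96 DPI for such a file, that would point to missing DPI metadata rather than to the actual pixel density.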
Re: select contents problems
Content selection is also a big issue for me, but only on really old books with tons of marks and smudges in the margins. I tried playing with it, and I think the biggest problem with content selection is finding the text blocks. The algorithm that's currently used finds a lot of false positives (like dots next to the content areas), but I'm not the one to fix it.
Here's a research paper on SWT (Stroke Width Transform), which would work perfectly for detecting text quickly: http://docs.google.com/viewer?a=v&q=cac ... JZ6g&pli=1
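To illustrate the false-positive problem with stray dots, here is a rough sketch. It is not SWT and not Scan Tailor's actual algorithm; it swaps in a much simpler heuristic (dropping connected components below a pixel-count threshold before computing the content box). The function name, the `min_pixels` parameter, and the tiny sample page are all assumptions made for the example.

```python
from collections import deque


def content_box(page, min_pixels=4):
    """Return (top, left, bottom, right) around all dark 4-connected
    components that have at least min_pixels pixels, or None if none do.

    page is a list of rows, where 1 means a dark pixel and 0 means background.
    """
    rows = len(page)
    cols = len(page[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    box = None
    for r in range(rows):
        for c in range(cols):
            if page[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one component with a breadth-first search.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and page[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) < min_pixels:
                continue  # Treat tiny components as specks and ignore them.
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            comp = (min(ys), min(xs), max(ys), max(xs))
            if box is None:
                box = comp
            else:
                box = (min(box[0], comp[0]), min(box[1], comp[1]),
                       max(box[2], comp[2]), max(box[3], comp[3]))
    return box


# Assumed sample: one stray dot in the corner plus two text-like blocks.
page = [
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
]
print(content_box(page))                # dot ignored
print(content_box(page, min_pixels=1))  # dot stretches the box to the corner
```

A plain size threshold like this would also throw away legitimate small marks such as punctuation or the dot on an "i", which is one reason a stroke-based method like the one in the paper is attractive.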
Re: select contents problems
Hello, I love Scan Tailor. Thank you, Tulon and others, for such great work!
I have some basic questions that I couldn't find answered in the user guide or through forum search:
1) Content box: should I include the header and footer / page number, or strictly the "content block"?
2) If I select only the content block (no headers or footers), will I still get the header/footer content in my output TIFF, or is it removed?
3) If I select the content block, then in the next step, page layout, should I set the outer/hard margin to the exact size of my book?
Thank you for any help!
Cheers
Re: select contents problems
seasalt wrote: 1) Content box: should I include the header and footer / page number, or strictly the "content block"?
Everything you want to preserve is meant to be there.
seasalt wrote: 2) If I select only the content block (no headers or footers), will I still get the header/footer content in my output TIFF, or is it removed?
Margins are cleared in all output modes except "Color / Grayscale" with "White margins" unchecked.
seasalt wrote: 3) If I select the content block, then in the next step, page layout, should I set the outer/hard margin to the exact size of my book?
It doesn't make sense for an e-book to have margins as large as in the original. You would be wasting screen space that way. Just choose the margins you are comfortable with.
Re: select contents problems
Thank you, Tulon.
When I use select contents, it appears to be "assessing / rendering" one page at a time.
Is there a way ST can "assess/render" all pages first, so that I can then use W to validate each page before any processing is executed?