
Dragging select content box?

PaulFraser
Posts: 6
Joined: 23 Jul 2012, 09:05
E-book readers owned: Ipad, Kindle3
Number of books owned: 0
Country: UK

Dragging select content box?

Post by PaulFraser » 04 Sep 2012, 03:54

I suspect I know the answer to this, having searched the forum, but here goes...
Is there any way of dragging the select content box? I set up the select content box on page 1 and applied it to all the other pages, as the auto-detect function didn't work properly.
Even if it had, this would let me keep all the pages the same size without match sizing.
If this isn't possible in ScanTailor, is there any other way of doing this, e.g. exporting after splitting and deskewing and then importing again to do the margins?
Thanks.

dtic
Posts: 464
Joined: 06 Mar 2010, 18:03

Re: Dragging select content box?

Post by dtic » 04 Sep 2012, 12:07

I am not sure if I understand your question. Do you want to manually choose the size and position of the blue select content area for one page and then apply the same selection to all images? That is possible to do in ScanTailor through these steps:
- Go to step 4 (Select Content)
- Click the first page in the project
- Click "manual"
- Resize and move the blue selection box the way you want
- Click "apply to...", "all pages" and then "ok"
- Click the round "play-button" to start processing

xorpt
Posts: 42
Joined: 24 Feb 2012, 01:37
E-book readers owned: Sony PRS-T1
Number of books owned: 2000

Re: Dragging select content box?

Post by xorpt » 04 Sep 2012, 20:21

I think this is missing functionality in ScanTailor: after applying a manual content box so that it is the same size on all pages, you cannot move the box. All you can do is resize it (at least in ScanTailor classic). It would be nice to be able to move the box around on the page without resizing it.
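To make the requested behaviour concrete, here is a minimal sketch of "move without resizing". It is not ScanTailor code: the `Box` type, the `move_box` function and the pixel values are all hypothetical, chosen only to illustrate the idea.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical content box in page pixel coordinates."""
    left: int
    top: int
    width: int
    height: int


def move_box(box: Box, dx: int, dy: int, page_w: int, page_h: int) -> Box:
    """Shift the box by (dx, dy) and keep its size unchanged.

    The position is clamped so the box stays inside the page.
    """
    left = min(max(box.left + dx, 0), max(page_w - box.width, 0))
    top = min(max(box.top + dy, 0), max(page_h - box.height, 0))
    return Box(left, top, box.width, box.height)


# Assumed example values: a 2000x3000 box on a 2480x3508 page.
box = Box(left=100, top=150, width=2000, height=3000)
moved = move_box(box, dx=40, dy=-25, page_w=2480, page_h=3508)
print(moved)
```

Only `left` and `top` change, so every page keeps the same content size while the box follows a misaligned page.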


Re: Dragging select content box?

Post by PaulFraser » 05 Sep 2012, 03:01

Thanks for your replies:

dtic -- yes, that is what I am doing, but I would like to drag the box around on the subsequent pages without altering its size (nearly every page is misaligned; also, this is not so much for OCR as to capture a good image for subsequent PDF file creation).
xorpt -- yes, thought so. Looks like I am going to have to learn to program :)


Re: Dragging select content box?

Post by dtic » 05 Sep 2012, 08:58

If the selection errors are systematic then try tweaking the image capture process. E.g. can the cameras be zoomed more to exclude parts of the cradle or book edges that ST might interpret as content? Can the cradle be improved (painted black, smoothed)? Would it help to batch crop the images before plugging them into ST? These might be things you've tried already but I thought them worth mentioning anyway.

Being able to directly drag the selection box would be a good feature, I agree on that. If you have the coding skills to work on that then I hope you go for it!


Re: Dragging select content box?

Post by PaulFraser » 23 Sep 2012, 14:07

Hi, the images were from a flatbed scan of a dismantled book. Heresy here, I know, but until I get a DIY scanner... :)

DrCheap
Posts: 48
Joined: 07 Jan 2012, 19:27
E-book readers owned: pdf
Number of books owned: 750

Re: Dragging select content box?

Post by DrCheap » 23 Sep 2012, 16:01

Three things that seem to cause errors with the content selection and page splitting:

1. Wrong DPI. If your pixel settings on the project are way off, ScanTailor seems to have trouble with auto-splitting pages and with content recognition.

2. Glare spots or similar items outside the page area. Zooming in so your original images do not include these is a great help, but consider glass, lighting, and other ways of reducing such glare.

3. Books with very rough page edges. Rough-cut pages cause a lot of errors in ScanTailor. Not much you can do about this. These books are the bane of ScanTailor and my scanning.

Now, since you scanned off a flatbed scanner from a dismantled book, your images should be really clear and clean with plain clear white around the page areas. If so, ScanTailor ought to be able to auto-recognize your content area really well.

Now, as to managing the issue manually, if I have to deal with messy images (a colleague brought back some archive document images that were not well shot, for example), then I select the whole-image manual page split option (the one on the far left that does not split the pages at all). This ensures the page splitting does pretty much zero to change the original image. I then use the content selection option, first on Auto. If that produces lots of errors and bad images, then I start from the top, manually select, and use the "apply to this and all following images" option, which hopefully will be good for a large number of images. I go down checking every few images, then as they shift a little, redo it manually and use the same "and all following images" option.
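The effect of repeating "apply to this and all following images" can be sketched as follows. This only models the bookkeeping, not ScanTailor itself; `propagate_boxes`, the page indices and the box tuples are hypothetical.

```python
def propagate_boxes(page_count, manual):
    """Return one box per page.

    `manual` maps a page index to a box set by hand. Each manual box
    applies to its own page and to every following page until the
    next manual box. Pages before the first manual box get None.
    """
    boxes = []
    current = None
    for index in range(page_count):
        if index in manual:
            current = manual[index]
        boxes.append(current)
    return boxes


# Assumed example: a box set by hand on page 0, corrected on page 3.
first = (100, 120, 2000, 3000)   # (left, top, width, height)
second = (130, 110, 2000, 3000)  # same size, slightly shifted
boxes = propagate_boxes(6, {0: first, 3: second})
for index, page_box in enumerate(boxes):
    print(index, page_box)
```

Each manual correction costs one selection and then covers every page until the images drift again, which is why this is faster than selecting page by page.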

If that won't work then it will require one-by-one manual selection of content, which I won't normally do. Instead, it is usually faster to just rescan the book / document and process it properly.

But before you do everything else, try first to see if inputting the proper pixel settings for the project fixes it. ScanTailor works much better with the correct pixel settings for your images.
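One quick way to check whether the DPI setting is plausible is to convert the pixel dimensions to a physical size and compare that with the real page. A minimal sketch; the function name and the pixel dimensions are assumptions for illustration (4960 x 7016 is what an A4 sheet would measure at 600 DPI).

```python
MM_PER_INCH = 25.4


def physical_size_mm(width_px, height_px, dpi):
    """Convert pixel dimensions to millimetres for a given DPI."""
    width_mm = width_px / dpi * MM_PER_INCH
    height_mm = height_px / dpi * MM_PER_INCH
    return round(width_mm, 1), round(height_mm, 1)


# Assumed example image: 4960 x 7016 pixels.
right_dpi = physical_size_mm(4960, 7016, 600)
wrong_dpi = physical_size_mm(4960, 7016, 300)
print("at 600 DPI:", right_dpi)
print("at 300 DPI:", wrong_dpi)
```

If the computed size is about twice (or half) the size of the real page, the DPI entered for the project is probably wrong.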

Feel free to post links to some of the trouble page images so we can see your troubles, as well.


Re: Dragging select content box?

Post by PaulFraser » 23 Sep 2012, 17:02

Thanks for your comprehensive post. I'll post a couple of screenshots when I get back from work.
In the meantime, I scanned the images with FineReader 11 as 600 dpi uncompressed TIFFs, which ScanTailor imported without any problems; it split and deskewed them fine too.


Re: Dragging select content box?

Post by PaulFraser » 27 Sep 2012, 06:12

User error: the select content works fine. However, I would still like to see the function, as it would help produce a PDF with identically sized pages. I realise this is possible with the matching page size function, but does this not distort the page type/images?


Re: Dragging select content box?

Post by PaulFraser » 28 Sep 2012, 04:16

OK, I realise that I am probably reinventing the wheel here, but I found a workaround, well at least for same page size if not content selection. Here it is for anyone that finds this thread:
Get rid of all the margins.
Centre all the pages and make all the pages the same size using the cover: this gave me pages that were too big so I deselected the cover and this caused the page size to change on all of the pages to the next largest page, which I found was a scan of the spine at the end of the file. I deselected this image too and then all the pages assumed the size of the next largest page, which was the back cover. This coincidentally gave me about a five mil margin on all of the rest of the inner pages, apart from one or two which had oversize adverts on them. Job done.
So, in a nutshell, set one page to the size you want and then apply that to all the others; if this page is smaller than some of the others, such as the cover, those larger pages will have to be deselected first.
Like I said, reinventing the wheel.
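The workaround above can be sketched in a few lines. This assumes the shared page size is the largest width and largest height among the pages still taking part, which matches what I saw but is not taken from ScanTailor's source; the function names and the sizes (in mm) are hypothetical.

```python
def target_page_size(sizes, excluded=()):
    """Largest width and height among the pages that are not excluded."""
    kept = [size for index, size in enumerate(sizes) if index not in excluded]
    return max(w for w, _ in kept), max(h for _, h in kept)


def centring_margins(size, target):
    """Margins (left, top, right, bottom) that centre a page in the target size."""
    width, height = size
    target_w, target_h = target
    extra_w = max(target_w - width, 0)
    extra_h = max(target_h - height, 0)
    left, top = extra_w // 2, extra_h // 2
    return left, top, extra_w - left, extra_h - top


# Assumed sizes in mm: front cover, two inner pages, back cover.
sizes = [(160, 240), (140, 215), (141, 216), (150, 226)]
with_cover = target_page_size(sizes)
without_cover = target_page_size(sizes, excluded={0})
margins = centring_margins(sizes[1], without_cover)
print(with_cover, without_cover, margins)
```

Deselecting the largest page drops the target to the next largest one, and the smaller inner pages pick up the difference as margins.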
