Scan Tailor

Scan Tailor specific announcements, releases, workflows, tips, etc. NO FEATURE REQUESTS IN THIS FORUM, please.
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Re: Scan Tailor

Post by Tulon » 26 Nov 2009, 19:56

If you don't want any margins, you can have that by specifying zero margins on the Page Layout stage and unchecking the "Align with others" checkbox.

I am going to translate an excerpt from the Russian documentation of the Page Layout stage to make it clear to everyone:
At this stage, margins are added to the content box. There are two types of margins: hard ones and soft ones. Hard margins are what you see between the solid lines. They are specified by the user. You can drag any solid line, be it an inner or an outer edge, or you can set numeric margin sizes.
Soft margins are what you see between the solid and dashed lines. These margins are added automatically to make a page the same size as the others. If you see dashed lines, it means that somewhere in the project there is a page with that width, and a page (possibly a different one) with that height.
Unchecking the "Align with others" checkbox excludes this page from the usual "make all pages the same size" process. Such a page won't grow (that is, no soft margins will be added to it) and won't cause other pages to grow to match its size. When you work with camera-based shots taken from slightly different distances, you probably want this checkbox unchecked on all pages, which you can do using the "Apply To ..." button.
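
The hard/soft margin rule described above can be illustrated with a small Python sketch. This is not Scan Tailor's actual code; the `Page` class, its field names, and the `soft_margins` function are hypothetical, and the sketch only tracks total margin per axis rather than how it is split between sides.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # All names here are illustrative, not taken from Scan Tailor's source.
    content_w: float               # content box width
    content_h: float               # content box height
    hard_lr: float                 # left + right hard margins (user-specified)
    hard_tb: float                 # top + bottom hard margins (user-specified)
    align_with_others: bool = True

    def hard_size(self):
        """Size of the page after hard margins are added to the content box."""
        return (self.content_w + self.hard_lr, self.content_h + self.hard_tb)


def soft_margins(pages):
    """Return (extra_width, extra_height) of soft margin for each page.

    Only pages with align_with_others=True take part: they define the
    common target size and they are the only ones that grow to reach it.
    """
    aligned = [p.hard_size() for p in pages if p.align_with_others]
    target_w = max((w for w, _ in aligned), default=0.0)
    target_h = max((h for _, h in aligned), default=0.0)

    result = []
    for p in pages:
        w, h = p.hard_size()
        if p.align_with_others:
            result.append((target_w - w, target_h - h))
        else:
            # Excluded pages neither grow nor make other pages grow.
            result.append((0.0, 0.0))
    return result


pages = [
    Page(100, 150, 10, 10),                          # tallest aligned page
    Page(120, 140, 10, 10),                          # widest aligned page
    Page(200, 100, 0, 0, align_with_others=False),   # e.g. a sideways table
]
margins = soft_margins(pages)
```

Note that the widest and the tallest page are different pages in this example, which matches the remark that the dashed lines may come from two different pages.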
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.

DSpider

Re: Scan Tailor

Post by DSpider » 27 Nov 2009, 19:42

Does it come with a manual? A readme or something?

For instance, if I'm at the Select Content section (#4), will it go through deskew automatically?

Tulon

Re: Scan Tailor

Post by Tulon » 28 Nov 2009, 06:08

DSpider wrote: Does it come with a manual? A readme or something?
We currently have next to no documentation in English. A few people volunteered to translate (parts of) it from Russian, but so far almost nothing has been produced.
DSpider wrote: For instance, if I'm at the Select Content section (#4), will it go through deskew automatically?
Yes, it will. Scan Tailor's stages are like a pipeline. If an item has reached a certain stage, it can only have done so by passing through all of the previous ones.
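
As a rough illustration of the pipeline idea (a sketch, not Scan Tailor's implementation): the stage names below are the ones used in this thread, "Fix Orientation" is an assumed label for the rotation step, and `stages_to_run` is a made-up helper.

```python
# Stage order as discussed in this thread; "Select Content" is stage #4.
# "Fix Orientation" is an assumed name for the step where images are rotated.
STAGES = [
    "Fix Orientation",
    "Split Pages",
    "Deskew",
    "Select Content",
    "Page Layout",
    "Output",
]


def stages_to_run(target, completed=0):
    """List the stages a page must pass through to reach `target`.

    `completed` is how many leading stages the page has already passed.
    A page cannot skip ahead: every earlier stage is included.
    """
    last = STAGES.index(target)
    return STAGES[completed:last + 1]


needed = stages_to_run("Select Content")
```

So asking for Select Content on an unprocessed page implies Deskew (and everything before it) runs first.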

spamsickle
Posts: 596
Joined: 06 Jun 2009, 23:57

Re: Scan Tailor

Post by spamsickle » 28 Nov 2009, 10:29

Dropping the margins to zero and unchecking "Align with others" worked like a charm. I understand now why the "widest" and "tallest" buttons are on the Page Layout display too. Thank you.

phaedrus
Posts: 56
Joined: 04 Mar 2014, 00:52

Re: Scan Tailor

Post by phaedrus » 28 Nov 2009, 12:40

Tulon wrote: We currently have next to no documentation in English. A few people volunteered to translate (parts of) it from Russian, but so far almost nothing has been produced.
I've put up the English 'Quick Start guide' and 'Tips for Scanning' on the Wiki today. I started on the main user guide, but it's now 05:30 and I've been up for a while, so there's still quite a lot to do. I'm also not on a machine with Scan Tailor, so I can't check some things or obtain English images of certain operations. That part is therefore still very much a work in progress, and any assistance from others would be most welcome. I don't read Russian natively; I'm using Google or Babel to do the initial rough work and then figuring it out from there, so I'm sure almost anyone could do it!

Cheers, P.

daniel_reetz
Posts: 2786
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Scan Tailor

Post by daniel_reetz » 28 Nov 2009, 20:17

I'm a reasonably skilled Russian reader, and I plan to help, but the demands on me with the end of the fall semester coming up have been severe. I will help as I have time. Between Phaedrus and me, it should get done reasonably quickly.

phaedrus

Re: Scan Tailor

Post by phaedrus » 29 Nov 2009, 01:48

Hey Daniel, that'd be great. I've done the majority of the page texts now, but I'm not on a machine with ST, so I'm guessing a little at some of the explanations (I've not used it enough to work entirely from memory!). I expect that what I've translated may not be the best description of a task in some places, so if you had a chance to proofread and rearrange my mess(es), I'm sure things could make a good deal more sense. There's also a bit of tidying up to do, and possibly some file renaming etc. that I may do later if I get a chance. Obviously one needs a machine with ST loaded to create replacement example images with English text, so if anyone else had an opportunity to do that, it'd be good...

FWIW I found that Google Translate generally did a closer job (IMO) than the 'fish'; at least it made more initial sense to me, so what I needed to do was a little easier to figure out.

P.

DSpider

Re: Scan Tailor

Post by DSpider » 29 Nov 2009, 11:47

Sometimes, if the page number is at the bottom and there are just a few lines at the top of the page, Scan Tailor doesn't get the page number and I have to extend the content box.

Other times Scan Tailor sees the tips of my fingers and includes them as content. Or parts of the area between the pages, if I don't press hard enough on the book. Probably because it doesn't approximate the size of the page or where the page ends (the edge), at least not yet anyway. :)

Some scans show a few thin lines at the edge of pages. Scan Tailor sometimes includes those as well...

phaedrus

Re: Scan Tailor

Post by phaedrus » 30 Nov 2009, 16:30

DSpider wrote: Sometimes if the page number is on the bottom and there are just a few lines at the top of the page, Scan Tailor doesn't get the page number and I have to extend the content box.
I often find this, even if there are more than a few lines on the page. However I did try reducing my input DPI as per Tulon's suggestion and this helped in cases where ST wouldn't recognise even some complete pages. It may be worth a try to see if it would help in your case, I have been using 300 x 300 with some success (with the exception of the page number issue).

I've very rarely had it detect anything else and certainly not my fingers which do appear in quite a few images!

Also FWIW I've done a reasonable amount more on the English User Guide etc, although I doubt there's much more there that would help you.

Cheers, P.

spamsickle

Re: Scan Tailor

Post by spamsickle » 01 Dec 2009, 17:18

Until such time as I can feel comfortable enough with the Scan Tailor code to make meaningful changes to it, I'm working with the same version as everyone else -- I've compiled it, but haven't modified it.

I thought I'd just toss out a couple of tips I've come up with.

When I'm going through page-by-page on the "select content" step, to make sure the page numbers have been selected and the gutters in the middle of the page have not, I prefer to have the mouse in one hand (to move the edges of the selection) and the keyboard in the other (to "page down" from one page to the next). For some reason, the "thumbnail" view on the right doesn't scroll by default when I'm using the keyboard, but I've found that if I go back to a previous step and, say, deskew a page, then the scrolling will work thereafter for the "page down" key.

Often, I'll have a table or an illustration in the book that was printed sideways on its own page, and I'd prefer to view it on my monitor without tilting my head or flipping the monitor. However, if I just rotate it back to reading orientation, it's now wider than it is tall, while the other pages are taller than they are wide. In this case, I keep the same margins on the Page Layout step, but uncheck the "Align with others" box. That keeps this "wide page" from being considered when Scan Tailor decides how wide to make the other pages, and I get a table in my ebook that's oriented for easy reading.

Usually, flipping a table page back confuses ST a little bit -- it wants to treat the wide page as two pages side by side, and will actually change the single thumbnail into two. If this happens, just go back to the "split pages" step and manually specify that it's a single page, and ST will probably be able to handle the content selection as it should.

Here's the process I've settled on for the "output" step. For most of the pages, the default black and white works fine, once I uncheck the "Despeckle" box. Covers and end papers of the book jackets I usually want in color. I'll start at the front of the book, and do the first one or two color pages, and the first black and white page. Then I'll let it run on auto to output the rest of the pages in black and white without despeckling.

Once that's done, I'll scroll through the thumbnails manually, using the scroll wheel on my mouse, looking for pages with illustrations. If the illustration is a line drawing, no action is required. If it's a photo or something else with shades of gray, I'll check the "mixed" mode for output and let ST grind through and output the page again. Usually, it finds the pictures automatically, and I can go on to the next. If it misses a picture, or part of one, I click the "Picture Zones" tab in the middle pane and draw around the parts it missed myself. I can zoom into the page using the scroll wheel on my mouse, and once I've zoomed in I can push the page around by clicking and holding the left mouse button. Clicking and releasing drops another "point" in my picture outline. The outline is completed by bringing the mouse close to the first point (the new point will "glow") and clicking to close the outline. Once the outline is complete, the points can be moved around, and new points can be inserted. The outline can be deleted by right clicking and deleting from the context menu. Once all the pictures are identified, clicking in the "output" tab does what it says it will do.

When I get to the end, I do the last page or two in color (end flap and cover), and I'm done.

Finally, I've gotten in the habit of saving my work. I've only crashed ST once, but I hate having to do things twice. I save the ScanTailor Project in the same directory as the input JPEGs, but I guess you can put them anywhere. I let ST rotate my images now, then go straight to an auto "select content" step. As soon as that's run, I save the project before I do any tweaking, and save periodically as I'm making subsequent changes. It's also informative to look at the XML file ST generates for additional insight into how it does things internally.
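
For anyone who wants to poke at that project file, here is a minimal Python sketch that just counts how often each XML element appears. It assumes nothing about Scan Tailor's actual schema: the tag names in the sample string are made up for the demonstration, and the file name in the usage note is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(root):
    """Count occurrences of every element tag under (and including) root."""
    return Counter(el.tag for el in root.iter())


def summarize_project(path):
    """Parse a saved project file from disk and count its element tags."""
    return tag_counts(ET.parse(path).getroot())


# Demonstration on an in-memory document. These tag names are invented
# and are NOT Scan Tailor's real project format.
sample = "<project><pages><page id='1'/><page id='2'/></pages></project>"
counts = tag_counts(ET.fromstring(sample))

for tag, n in counts.most_common():
    print(f"{tag}: {n}")
```

On a real project you would call something like `summarize_project("mybook.ScanTailor")` (path made up here) and then open the file in a text editor to look at the tags that turn up most often.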
