Fastest computer build for Scan Tailor?

Adam32
Posts: 28
Joined: 28 Jun 2014, 08:55
Number of books owned: 500
Country: United Kingdom

Fastest computer build for Scan Tailor?

Post by Adam32 » 05 Nov 2014, 04:36

I have been using Scan Tailor for a few months, but it is pretty slow on my PC. At the moment I am running a Core 2 Duo overclocked to 4 GHz with 8 GB of RAM. I am now going to build a new PC and am willing to spend up to £1500 ($2400). Any idea what sort of build I should go for to get maximum speed in Scan Tailor? I was thinking of one of these processors:

Intel Core i7-4790K - 4.4 GHz
AMD FX-9590 - 5 GHz

Also, what about going for 64 GB of RAM? Will this have any noticeable effect?


Any ideas what sort of build for my price will get me the fastest speeds on Scan Tailor? Also any idea what sort of improvement in speed I will achieve over my current build?

dpc
Posts: 311
Joined: 01 Apr 2011, 18:05
Number of books owned: 0
Location: Issaquah, WA

Re: Fastest computer build for Scan Tailor?

Post by dpc » 05 Nov 2014, 16:33

There's a comment here from the author of ST:
http://www.diybookscanner.org/forum/vie ... =21&t=2626

Note that the post is over two years old, and I believe there have been efforts since then to modify ST to use more CPU cores.

I don't know about memory usage. You could launch the app and then look at its working set with PerfMon. Perhaps an SSD would be beneficial?

It would be cool to have a set of common images that could be used to benchmark ST on a variety of machines to note the typical performance improvement seen by adding more RAM or a faster hard drive.
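
A benchmark like the one suggested above could start from a small timing harness. The following is only a sketch: the function names (`time_runs`, `summarize`, `command_runner`) are made up for illustration, and the actual Scan Tailor command line to be timed is not given in this thread, so it is left to the user to supply.

```python
import statistics
import subprocess
import time


def time_runs(run_once, repeats=3):
    """Call run_once() `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to a few comparable numbers."""
    return {
        "runs": len(durations),
        "best_s": min(durations),
        "median_s": statistics.median(durations),
    }


def command_runner(command):
    """Wrap an external command (a list of strings) as a zero-argument callable.

    The command itself is an assumption supplied by the user, for example
    whatever invocation processes the shared set of test images.
    """
    return lambda: subprocess.run(command, check=True)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without Scan Tailor installed.
    print(summarize(time_runs(lambda: sum(range(1_000_000)), repeats=5)))
```

Running the same image set and the same command on each machine, and comparing the median rather than a single run, would make results from different posters easier to compare.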

Adam32
Posts: 28
Joined: 28 Jun 2014, 08:55
Number of books owned: 500
Country: United Kingdom

Re: Fastest computer build for Scan Tailor?

Post by Adam32 » 05 Nov 2014, 18:55

Thanks for your reply, but I have already read that topic. I was really hoping to hear some benchmarks from other users and opinions on their own rigs.

jbaiter
Posts: 98
Joined: 17 Jun 2013, 16:42
E-book readers owned: 2
Number of books owned: 0
Country: Germany
Location: Munich, Germany

Re: Fastest computer build for Scan Tailor?

Post by jbaiter » 06 Nov 2014, 10:40

Currently Scan Tailor only runs on a single thread. To get the best performance, your best option is to use a script that splits your Scan Tailor project into multiple configuration files and launches multiple processes to work through them. This is the way spreads does it, but I think there's also an AutoHotkey script somewhere here on the forums that does the same thing.
Then it's a matter of getting as much processing power as you can get :-)
spreads: Command-line workflow assistant
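
The split-and-launch approach described above can be sketched roughly as follows. This is not how spreads or the AutoHotkey script actually do it. The names `split_evenly`, `run_chunk` and `run_parallel` are hypothetical, and the worker command is deliberately left as a parameter because the exact Scan Tailor invocation is not given in this thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split items into up to `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    # Drop empty chunks when there are fewer items than parts.
    return [c for c in chunks if c]


def run_chunk(command_prefix, chunk):
    """Launch one worker process for one chunk and return its exit code.

    command_prefix is a user-supplied list of strings (an assumption);
    the chunk's file names are appended as extra arguments.
    """
    return subprocess.run(command_prefix + chunk).returncode


def run_parallel(command_prefix, items, workers):
    """Run one external process per chunk, all at the same time."""
    chunks = split_evenly(items, workers)
    # Threads are enough here: each one only waits on its external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: run_chunk(command_prefix, c), chunks))


if __name__ == "__main__":
    pages = [f"page_{n:03d}.tif" for n in range(1, 11)]
    # Show how ten pages would be shared between four workers.
    print([len(chunk) for chunk in split_evenly(pages, 4)])
```

Setting `workers` to the number of cores is the obvious starting point, so the same sketch covers 2, 4, 8 or 12 cores without changes.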

dpc
Posts: 311
Joined: 01 Apr 2011, 18:05
Number of books owned: 0
Location: Issaquah, WA

Re: Fastest computer build for Scan Tailor?

Post by dpc » 06 Nov 2014, 14:05

Something I've wondered about is how do you break up the work between multiple running instances of ST for the content selection stage? Let's say you have a 300 page book scanned and you launch four instances of ST. Each instance would then work on 75 pages, right?

Doesn't ST look for the page with the largest content rect, add the margin width/height, and that is the output stage's image size for all of the pages? If the largest content selection rect differs between these four jobs, the size of the output images will vary. How would the largest final image size be conveyed to each of the other three running instances so that all of the output images are the same size?
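
To make the concern concrete, here is a small worked example of the sizing rule as described in the question above. It is a model of that description, not confirmed Scan Tailor behaviour: it assumes the output size is the largest content width and height over all pages in a job, plus a margin on each side. The function name and the numbers are invented for illustration.

```python
def output_size(content_rects, margin_w, margin_h):
    """Largest content width and height in the job, plus margins on both sides.

    content_rects is a list of (width, height) tuples in pixels.
    """
    max_w = max(w for w, _ in content_rects)
    max_h = max(h for _, h in content_rects)
    return (max_w + 2 * margin_w, max_h + 2 * margin_h)


if __name__ == "__main__":
    rects = [(1000, 1500), (1020, 1480), (990, 1510), (1010, 1490)]
    job_a, job_b = rects[:2], rects[2:]

    print(output_size(job_a, 50, 50))  # size chosen by the first instance
    print(output_size(job_b, 50, 50))  # size chosen by the second instance
    print(output_size(rects, 50, 50))  # size a single instance would choose
```

Under this model the two jobs pick different sizes, and neither matches what one instance working on the whole book would pick, which is exactly the mismatch the question is about.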

dtic
Posts: 464
Joined: 06 Mar 2010, 18:03

Re: Fastest computer build for Scan Tailor?

Post by dtic » 06 Nov 2014, 18:59

Adam32, in addition to what jbaiter wrote, check this thread. My old ScanTailor_multi_core runs in Windows and operates on the Scan Tailor GUI. I haven't used it in a long time, but then again Scan Tailor hasn't been updated much either, so you could give it a try. In Linux environments there is Spreads, as mentioned by jbaiter. On the link above there is also a post by user mhr with a Python script (haven't tried it).

Another alternative for Windows is to first crop to whole book pages, then command line process all images (in four parallel processes) to black and white text, and finally go back and adjust picture zones with QuickPicZone only for the images that need them. If what you scan is mostly text with an image or graph here and there, that workflow can be very quick. It requires that you script the command line processing step yourself, though.
dpc wrote: Something I've wondered about is how do you break up the work between multiple running instances of ST for the content selection stage? Let's say you have a 300 page book scanned and you launch four instances of ST. ... Doesn't ST look for the page with the largest content rect, add the margin width/height, and that is the output stage's image size for all of the pages? If the largest content selection rect differs between these four jobs, the size of the output images will vary. How would the largest final image size be conveyed to each of the other three running instances so that all of the output images are the same size?
My old ScanTailor_multi_core simply didn't adjust for that. With it you did steps 1-5 in one ST instance and then split into two or four instances for the last, most processing-intensive step, so the outputs would vary a bit in size. Not sure what Spreads does. But you can in principle ensure the same page size for all four instances like so: after step 5, save and parse the project file to find the largest right and left pages and copy the settings for those pages; split the project into four; pad each of the four with clones of the maximum R/L pages at the end, and also clone the jpg images; run the four processes, which will now each adjust the size to the maximum R/L image (the clones); when done, remove the clones and join everything into one folder. I had something like that working in a test script that expanded on ScanTailor_multi_core, but it was messy, and scripting the Scan Tailor GUI is tricky, so I abandoned that and moved on to command line processing instead.
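
The clone-padding idea described above could be sketched roughly as follows. This is not the abandoned test script. It works on plain in-memory page records instead of a real Scan Tailor project file, and it assumes that "largest" means largest content area. The `Page` fields and all function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Page:
    name: str
    side: str            # "L" or "R"
    width: int           # content width in pixels
    height: int          # content height in pixels
    is_clone: bool = False


def largest_per_side(pages):
    """Return {"L": page, "R": page} holding the largest-area page for each side."""
    best = {}
    for page in pages:
        current = best.get(page.side)
        if current is None or page.width * page.height > current.width * current.height:
            best[page.side] = page
    return best


def split_and_pad(pages, parts):
    """Split pages into up to `parts` jobs and append clones of the largest L/R pages to each."""
    best = largest_per_side(pages)
    size = max(1, -(-len(pages) // parts))  # ceiling division, at least one page per job
    jobs = []
    for start in range(0, len(pages), size):
        job = list(pages[start:start + size])
        for side in sorted(best):
            largest = best[side]
            job.append(replace(largest, name=f"clone_{largest.name}", is_clone=True))
        jobs.append(job)
    return jobs


def join_without_clones(jobs):
    """Merge the processed jobs back into one page list, dropping the padding clones."""
    return [page for job in jobs for page in job if not page.is_clone]


if __name__ == "__main__":
    pages = [Page(f"p{n:03d}", "L" if n % 2 else "R", 1000 + n, 1500) for n in range(1, 9)]
    jobs = split_and_pad(pages, 4)
    print([[p.name for p in job] for job in jobs])
    print(join_without_clones(jobs) == pages)
```

Because every job now contains a copy of the largest left and right page, each instance would settle on the same maximum, and the clones are discarded once processing is done. In a real script the matching image files would also have to be copied and later deleted, as noted in the post.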

Adam32
Posts: 28
Joined: 28 Jun 2014, 08:55
Number of books owned: 500
Country: United Kingdom

Re: Fastest computer build for Scan Tailor?

Post by Adam32 » 07 Nov 2014, 05:48

I have tried the AutoHotkey multi-core script and it is a definite improvement in speed, although the problem is that it only works on the last stage. Is there any way of using multiple cores at the select content stage as well?

Also, is there any way to support more cores, such as if someone is using 8 or 12 cores?

cday
Posts: 243
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: Fastest computer build for Scan Tailor?

Post by cday » 07 Nov 2014, 13:12

Equalising page sizes: an earlier thread rather incidentally detailed a novel method of equalising the page sizes by 'printing' the images to a PDF file using the print to file function in Adobe Acrobat. In that case the pages all became A4, but the output page size can in fact be configured somewhere in Acrobat.

http://www.diybookscanner.org/forum/vie ... =19&t=2981
Ricardo wrote: One thing I found was the printing to pdf file process also makes the pages all the same size if after processing you end up with some pages different sizes...
There are a number of programs including freeware software utilities that can 'print' to a PDF file, so this might conceivably provide another way to equalise page sizes that could be run on all the pages in a book as a separate process in the workflow.

Edit:

A Google search shows that there are many freeware PDF printers or 'virtual printers' that output to a PDF file; in a quick test on an old version of CutePDF Writer a custom output page size could be set in 'Postscript Options'.

dtic
Posts: 464
Joined: 06 Mar 2010, 18:03

Re: Fastest computer build for Scan Tailor?

Post by dtic » 07 Nov 2014, 15:16

Adam32 wrote: I have tried the autohotkey multi core and it is a definite improvement on speed, although the problem is it only works on the last stage. Is there anyway of using multicore at the select content stage as well?
That is possible, but I won't spend time on it since I have no use for it. The two biggest hurdles for such an update are, first, that the script would need to work with different user settings at each stage (not needed if someone updates it for themselves and hardcodes their own settings, of course) and, second, that it is kind of a hassle to make AutoHotkey control the ST GUI: ST uses Qt, which means that AutoHotkey can't find and use the controls directly and must simulate mouse clicks and key presses.
Adam32 wrote: Also is there anyway to support more cores, such as if someone is using 8 core or 12 core?
That would be easier. If you have coded anything in AutoHotkey before, you may be able to modify the parts of the code that cover 2 or 4 cores yourself to allow for 8 or 12 cores.