
Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 05 Apr 2017, 23:43
by Mohib
Nice and unexpected! Thanks very much for letting me know and, also, good call on the black page test. So thanks for that too.

I was always focused on reflections of the camera and its support (which were obvious, as they sit right above the platen), and it never struck me that the platen handle could also cause a reflection. I was just making sure it was out of the field of view and forgot my high-school optics class!

However, it seems my test holes to find the best place for the platen handle turned out to be a simple solution to the handle reflection problem when scanning large coffee table books!

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 06 Apr 2017, 10:33
by Mohib
In regard to my thoughts on how Scan Tailor can easily (and automatically) resolve the image scale problem based on my automated method for this issue (outlined earlier in this thread and on ScanTailor's github here: ... -290304141), I just noticed that in this thread:

"Scantailor - How to Get Identically Sized jpgs for Left and Right Side?"
viewtopic.php?f=20&t=3296&p=19883&hilit ... are#p19883

dpc also mentions using calibration pages, and also for scaling, to compensate for differences between left and right camera-platen distances:
I handle this by shooting a calibration page for each camera that is a white page with a 2"x2" black square in the centre. I then have a pre-processing step that looks at this first page of each set (L/R) and determines what the actual DPI is and scales the pages so that they are the same size. Obviously I try to get this somewhat close with proper camera setup but the scaling based on the calibration pages ensures the pages end up being the same size and the rest of the post-processing pipeline knows what the DPI is.
The only difference between this and my image scaling problem (which is not a fixed DPI correction) would be, as I explained earlier in this thread and on the github (with some additional details to take into account), to shoot the same calibration pages at both the start and the end of the scanning runs of the left and right pages (since I scan one side of a book at a time).

Then the scale factor between the starting and ending calibration pages (for each side) can be calculated automatically (using Scan Tailor's auto-content method, if clear calibration images are used) and distributed across all pages (per dtic's script earlier in this thread). It can be folded into the scale factor Scan Tailor is already calculating in Step 5 to apply the margins, so no special image processing pass (and the extra time that takes) is needed just to scale-correct the pages.

In this way, this solution would:

- not increase post-processing time with an additional image processing pass,

- be fully automated (like for dpc), and

- allow for additional degrees of freedom in scanner design (and so allow new creative designs) since the platen-camera distance no longer needs to be fixed (or, in the related problem, left and right camera placement does not need to be 100% accurate, which was the issue dpc was using calibration shots for).
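
To make the idea concrete, here is a minimal Python sketch (not the AutoHotkey script from this thread, and not Scan Tailor code) of distributing the start-to-end scale difference linearly across the pages and folding it into a per-page scale factor that already exists. The function names, the pixel values and the notion of a single "margin scale" number per page are assumptions for illustration only.

```python
from __future__ import annotations


def drift_scale_factors(first_px: float, last_px: float, page_count: int) -> list[float]:
    """Linearly interpolate one correction factor per page.

    first_px / last_px: size in pixels of the same calibration target,
    shot at the start and at the end of the run (hypothetical inputs).
    The first page gets 1.0 and the last page gets first_px / last_px.
    """
    if first_px <= 0 or last_px <= 0:
        raise ValueError("calibration sizes must be positive")
    if page_count < 1:
        raise ValueError("need at least one page")
    total = first_px / last_px
    if page_count == 1:
        return [1.0]
    step = (total - 1.0) / (page_count - 1)
    return [1.0 + step * i for i in range(page_count)]


def combined_scale(margin_scale: float, drift_scale: float) -> float:
    """Fold the drift correction into a scale factor that is applied anyway,
    so each page is resampled only once."""
    return margin_scale * drift_scale


# Illustrative numbers only: the target measured 1200 px at the start
# and 1150 px at the end of a 5-page run.
factors = drift_scale_factors(first_px=1200, last_px=1150, page_count=5)
print([round(f, 4) for f in factors])
```

Because the two factors are simply multiplied, the correction costs no extra pass over the image data, which is the point of doing it inside a step that already rescales.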

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 06 Apr 2017, 11:30
by Mohib
For those looking to build the scanner, I've attached a draft PDF with the parts list (and sources) and schematics for this version.
I'll update the document with construction instructions, but I think they should be pretty obvious.

Although 90% of the construction steps from the original version have been eliminated, if you still need some help, you can refer to the old instructions ignoring what's not relevant any more. The old instructions are here:


Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 06 Apr 2017, 12:10
by Mohib
And here are a few construction pics that should help clear up anything not obvious.
P4020178 - @33%.jpg
Closeup of top platen and handle
P4020179 - @33%.jpg
Close up of bottom of platen and handle assembly bolt.
P4020185 - @33%.jpg
Platen handle bolt washers (one rubber and one steel) under the platen. Bolt goes through holes 1, 2 or 3 in platen.
P4020184 - @33%.jpg
Cable inside platen handle holding it together (so no glue needed) anchored in cross-dowel
P4020186 - @33%.jpg
Felt tip mark made on platen handle cable while pulling it tight (other side of the cable was anchored in the other cross dowel). Mark indicates where to make the extreme edge of the loop after pulling it all out and making the loop on this side using a standard wire rope swage (see image in parts list document)
P4020182 - @33%.jpg
Left hand side of the vertical support cable with turnbuckle and short, secondary cable.
P4020180 - @33%.jpg
Right hand side of the vertical support cable (wraps over the T onto the left side)
P4020187 - @33%.jpg
How the two ends of the vertical support cable are anchored
P4060197 - @33%.jpg
Inside cap at bottom of vertical post showing bolt fastening it to existing threaded hole in Manfrotto clamp
P4060198 - @33%.jpg
Cap at bottom of vertical post dismantled
P4060194 - @33%.jpg
Horizontal support dismantled (normal)
P4060195 - @33%.jpg
Horizontal support dismantled (with extension for light)
P4020189 - @33%.jpg
Lock nuts to hold the macro focus rail tight to the cap.
Scanner 900 - Dismantled for transport - B&W - @33%.jpg
Dismantled for transport (platen is wrapped in the white towel)

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 06 Apr 2017, 20:55
by recaptcha
I wonder if there's a better way to stabilize the platen more, as well as the book.

Perhaps there could be adjustable bumpers to line the book up against. And maybe the platen could be mounted in some loose or adjustable way on the right hand side?

Just thinking out loud...

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 06 Apr 2017, 23:52
by Mohib
Ever since I made the first version 3 years ago I thought the same thing, because we all naturally think if mechanical systems are rigid and channelled they will work more efficiently. However, after several attempts, I realised there were deeper human forces at play, as every attempt reduced scanning efficiency.

Firstly let me explain the "problem." It's not the platen that's the real problem but the book drifting left and/or tilting a bit. This seems to be generally caused by pressing the platen into the spine. This causes the book to nudge left. Now this is not as big a problem as it seems because, firstly, you don't need to zoom too tight into the page so can tolerate some drift and, secondly, you can "roll the spine" right to shove the page into the view for several pages. So basically you only have to move the book back every 30 pages or so and it just takes a few seconds to do, so it's not a big delay.

Now I thought if I could stop this, then I could focus better on scanning without distractions or interruptions. However, as I said, whatever I tried, even some good solutions, slowed down scanning, sometimes dramatically. It was very confusing to say the least, until I realised page turning is the most time-consuming part of scanning and it is also the one with the human element in play. This meant it had to be made as natural as possible -- better than even "ergonomic" -- and the more natural it was, the less we had to focus on the actions.

It dawned on me that the scanner already has the most natural action possible: turning a page as though you were actually reading the book. Something we can do subconsciously without thinking. And anything you "added" to the setup interfered with the fluidity of this motion and the subconscious understanding of how to do it. For example, adding a backstop behind the spine seems obvious until you experience it interfering with the natural rocking back and forth of the binding as you turn the pages (similar, but not identical, to the problem some V platens have with larger spines). And it quickly becomes so irritating you just want to shove the backstop out of the way so it stops interfering with your rhythm. Similarly, anchoring the platen so it stays fixed (using cables or whatever) and doesn't drift left interferes with the natural motion of letting it slide off the book and lifting it upwards, because your arm movement is constrained (notwithstanding that it's almost impossible to stabilise it laterally with thick books).

It's like riding a bicycle. It's all very smooth and natural, but if you hang two grocery bags off the handle bars (even if they are exactly the same weight) the whole experience changes completely and is no longer subconsciously effortless, but instead captures your full attention, demanding your constant focus, and now feels very unnatural.

So despite the scanner looking very "loose" because all the parts are disconnected (i.e. the platen, book, camera are all free floating relative to each other), I think it is precisely this freedom that lets it simulate the very natural and subconscious motion of turning the pages of a book, as though you're reading it, with no interference to your body. And so in the end, I decided the few seconds that would be saved from keeping the book or platen stable were totally overwhelmed by the sheer loss of fluidity and thus scanning throughput.

I think it's this fluidity that lets you scan between the "course correction" interruptions (fixing the book position) at 1,400 pages per hour with just one camera: it's because page turning is so efficient. The mere fact that this speed is attained shows that the "problems" are not really problems at all but, I think, just some necessary "inefficiency" for the overall optimal operation of the system.

Perhaps if others built the scanner and experimented with it, more ideas might come forward as they have something they can play with hands on. It's not hard to build! :)

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 07 Apr 2017, 22:33
by Mohib
dtic wrote:
30 Mar 2017, 17:00
Since page 100 was the first left side page to be shot the phone presumably gave it a filename (a datestamp or incremental number) that makes it sort first among the left side page images. In that case the script should work correctly also on the left side images.
Yes of course, that makes sense.

Thanks for this script. I tried it and it seems to do the job, although I've not tested it with an actual book yet, because for that I have to scan a book without adjusting the macro focus rail. (I want to ensure there are no other issues when ScanTailor does its own scaling for margins, but I think it should be ok.) However, I did have a set of 95 right hand pages from a test a while ago, before I got the macro focus rail, and the result looks good (see comparison below showing the last page, 191, compared to the first page, 3).

I made one change in your script, adding a quality factor to the GraphicsMagick call just to be sure the images are not downgraded, so the conversion line is now:

Code:

RunWait "%gm%" convert "%A_LoopFileFullPath%" -resize %thisresize%`% -quality 100 "%A_LoopFileFullPath%",,hide
However, I also redid the rest of the script to make it more user friendly and idiot proof so it's easier for me to use on a regular basis without having to hack the code each time. The code is below if anyone wants to use it.

NOTE: The script processes the files using their modification times (rather than file names) to ensure no random effects from directory listings that may not be alphabetical (which can happen on some Windows machines).

NOTE: The script assumes the first and last images in the directory are calibration images and excludes them from the scaling process and calculations. To make the calibration images, simply place a ruler on the first page and scan it, then remove it and scan all pages, then rescan the last page in the book with the ruler on it again. You can then simply measure the pixels taken for 6 or so inches in each calibration image and use those counts when prompted.
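
For anyone not on Windows, the two notes above can be sketched in Python roughly as follows: order the images by modification time, then set aside the first and last as calibration shots. This is only an illustration of the stated intent, not a port of the script, and `pages_to_scale` is a made-up name.

```python
from __future__ import annotations

from pathlib import Path


def pages_to_scale(folder: str) -> list[Path]:
    """Return the .jpg files in `folder`, oldest first, without the first
    and last images (assumed to be the two calibration shots)."""
    # Sort by modification time; the file name only breaks ties.
    images = sorted(
        Path(folder).glob("*.jpg"),
        key=lambda p: (p.stat().st_mtime, p.name),
    )
    if len(images) < 3:
        raise ValueError("need two calibration images plus at least one page")
    return images[1:-1]
```

Sorting before dropping the calibration shots matters: if the directory listing is not in shooting order, the first and last files listed are not necessarily the calibration images.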

Although this script eliminates the scaling issue (and so one less thing I need to pay attention to while scanning, which should increase net throughput a bit), the macro focus rail is still very useful as it allows setting zoom factors that reduce/eliminate interpolation (like 2x as opposed to 2.5x, which would decrease quality) and then positioning the camera physically so the page fills the frame.

Also, despite this script helping, I still feel the very small change to ScanTailor to address the scaling issue (we're talking a few lines of code and a couple of option settings -- per my ideas earlier in this thread -- in Step 5, "Margins") would be worth the little effort, not just because the entire task can be automated (so it's not error prone), but also for the extra degree of freedom it allows in designing scanners (1 or 2 camera versions), which could spur new, innovative designs. It doesn't add anything to the processing time, as the scale factor calculated can simply be incorporated into the scale factor ScanTailor is already calculating at this step for each page, and then it can apply the one combined scale factor to the page.
A Short History of the Ismailis - p 011 -- p 199 - 51%.jpg (387.36 KiB)

Code:

#singleinstance, force

; AutoHotkey (v1) script using the GraphicsMagick command-line tool

; Full path to graphicsmagick gm.exe
; Example: C:\Program Files\GraphicsMagick-1.3.25-Q8\gm.exe
gm = C:\Program Files (x86)\GraphicsMagick-1.3.25-Q16\gm.exe

; Get path to folder with jpg images to process (all left or all right)
; File modified date and time of each image must increase from first (largest) to last (smallest) page scanned
; ALL pages must be scanned (including blank) with equal number of pages in both left and right directories
imagesFolder := ""
FileSelectFolder, imagesFolder,, 2, Choose folder with either ALL left or ALL right pages
if (ErrorLevel = 1) {
    ; folder dialog was cancelled
    ExitApp
}
imagesFolder := RegExReplace(imagesFolder, "\\$")

GetCalibrationSizes:
; Get number of pixels of calibration image (6" ruler) placed on first page and last page of book
firstImageSize := 0
while (firstImageSize = 0 or firstImageSize = "") {
    InputBox, firstImageSize, First Calibration Page, Enter number of pixels taken by a 6`" ruler placed on the first page of the book (first calibration page),, 450, 130
    if (ErrorLevel = 1) {
        ExitApp
    }
    if (firstImageSize = 0 or firstImageSize = "") {
        MsgBox,, ERROR, Zero or blank is not allowed
    }
}

lastImageSize := 0
while (lastImageSize = 0 or lastImageSize = "") {
    InputBox, lastImageSize, Last Calibration Page, Enter number of pixels taken by a 6`" ruler placed on the last page of the book (second calibration page),, 450, 130
    if (ErrorLevel = 1) {
        ExitApp
    }
    if (lastImageSize = 0 or lastImageSize = "") {
        MsgBox,, ERROR, Zero or blank is not allowed
    }
    if (firstImageSize < lastImageSize) {
        MsgBox,, ERROR, Last/smallest image is greater than first/largest image. Please enter the values again.
        goto GetCalibrationSizes
    }
}

if (!FileExist(gm) or !FileExist(imagesFolder)) {
    MsgBox,, ERROR, gm.exe or the images folder was not found
    ExitApp
}

; Calculate scale difference (percentage) between first and last pages
totalScaleFactor := (firstImageSize / lastImageSize) * 100

; noOfFiles = number of pages in directory, so the scale can be distributed across all pages
; Create file list sorted by modified date/time so script can be used after renaming and is not dependent on original image file name.
; Using time stamp also ensures processing is not random if directory listing is not alphabetic
fileList := ""
noOfFiles := 0
Loop, Files, %imagesFolder%\*.jpg
{
    ; date is prepended to each entry so the list can be sorted by it
    fileList = %fileList%%A_LoopFileTimeModified%`t%A_LoopFileName%`n
    noOfFiles++
}
if (noOfFiles < 3) {
    MsgBox,, ERROR, Need two calibration images plus at least one page
    ExitApp
}

; Sort by date first, so the calibration images really are the first and last entries
Sort, fileList
; remove first calibration image from file list
fileList := SubStr(fileList, InStr(fileList, "`n") + 1)
; remove last calibration image from file list (keeps the trailing linefeed)
fileList := SubStr(fileList, 1, InStr(fileList, "`n", false, -1, 1))
; reduce count by 2 because 2 calibration images removed
noOfFiles := noOfFiles - 2

; 8228 = 4 (Yes/No buttons) + 32 (question icon) + 8192 (task modal)
MsgBox, 8228, WARNING, All files in the directory`n`n%imagesFolder%`n`nwill be processed and over-written.`n`nOk to proceed?
IfMsgBox, No
    ExitApp

; Initialize the progress bar style
Progress, B2 M FM10 FS10 WM10 WS10 W700, Starting, PRESS ESCAPE TO CANCEL

Loop, Parse, fileList, `n
{
    ; Omit the last linefeed (blank item) at the end of the list.
    if (A_LoopField = "")
        continue
    ; ignore first image because it is not to be scaled
    if (A_Index = 1)
        continue

    ; Split into two parts at the tab char.
    StringSplit, FileItem, A_LoopField, %A_Tab%

    thisImageScaleFactor := (totalScaleFactor - 100) * (A_Index / noOfFiles) + 100
    amountDone := Floor(A_Index / noOfFiles * 100)
    ; Update the progress bar value
    Progress, %amountDone%
    ; Display the progress bar at the new value with new text
    Progress,, Processing: %A_Index%/%noOfFiles% -- Scale: %thisImageScaleFactor% -- %FileItem2%
    RunWait, "%gm%" convert "%imagesFolder%\%FileItem2%" -resize %thisImageScaleFactor%`% -quality 100 "%imagesFolder%\%FileItem2%",, Hide
    ; Make sure we actually exit when done, so the last file is not processed twice in case the blank check above fails (as that seems to happen sometimes)
    if (A_Index = noOfFiles)
        break
}

Progress, Off
ExitApp

; Allow escape to cancel
Esc::ExitApp

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 08 Apr 2017, 06:05
by Konos93a
i will try to build something similar. i will post back when i finish it.
saw a video of you scanning. you are much faster than i thought.
i prefer glass and not plastic panel.
even though i want a focus lock camera and a stable distance between lens and glass, you have done a great job.

is it possible the auto focus solve the problem with the stable distance ?
if u use autofocus how many pages in 5 minutes can you get

after you take photos you process with scantaylor and "match size by scaling"
is that enough?

saw the pdf you had uploaded.
what programs do u use for cad, and pdf making?

sorry if i make questions you already answered in your posts.
i am more a guy who builds stuff with wood, aesthetically unprofessional, trying to learn electronics with arduino, using linux and unfamiliar with programming.

have a nice day

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 08 Apr 2017, 10:54
by Mohib
Konos93a wrote:
08 Apr 2017, 06:05
you have done a great job.
Konos93a wrote:
08 Apr 2017, 06:05
i prefer glass and not plastic panel.
Yes I think that will be better -- and it won't scratch as easily either -- but I'm concerned the weight of 9mm-10mm glass might "feel" tiring and slow the speed. Also I think, for books with text close to the gutter, it's best not to try the bevel in the left edge of the platen (which I've been thinking about, and which is in the schematics as "optional"), as I also find the straight edge helpful for the same reasons JThomas mentioned about his scanner:
JThomas wrote:
21 Apr 2012, 23:41
The pressure from the bottom edge of the glass serves to press the opposing page out of the line of sight. (Second picture)
Of course given the design, one can test several different platens very easily.
Konos93a wrote:
08 Apr 2017, 06:05
even i want focus lock camera
If you have an iPhone I recommend the camera software "Camera+ Free" and "5-in-1 Superbundle" ($2.99) I've listed on page 15 in the Parts List + Schematics PDF (and in the first post of this thread). Be careful to get the right version. Otherwise, see the feature list, also on page 15, if you're looking at other camera software for iPhone or Android.
Konos93a wrote:
08 Apr 2017, 06:05
... a stable distance between lens ...

is it possible the auto focus solve the problem with the stable distance ?
if u use autofocus how many pages in 5 minutes can you get
Auto-focus will not give a stable distance between platen and camera -- it just focuses at whatever the distance is.

As for speed, I've not tested it on auto-focus, but I would not use auto-focus as that will slow down every shot quite a bit. If you've got a very thick book (say over 250 pages), you might want to reset the focus every 250 pages (125 sheets), but otherwise I don't think the change in distance between the start and end of a 250 page book is enough to cause blurring. I'll try some tests today and post the results.
Konos93a wrote:
08 Apr 2017, 06:05
after you take photos you process with scantaylor and "match size by scaling"
is that enough?
Scan Tailor **does not** fix the scaling problem as far as I've seen (I'm using v0.9.11.1 of Scan Tailor) and I've been discussing how it **could** fix the problem.

But the script in my last post, updated from the original version kindly provided by dtic, **does solve the problem** from what I can see. But you must scan every page -- even blank pages in the book, but excluding the covers and blank pages at the very start and very end -- from start to finish for it to fix the scaling properly.
Konos93a wrote:
08 Apr 2017, 06:05
what programs do u use for cad ,and pdf making?
An old copy of Ashlar's products and Acrobat.

Re: Original ~600pg/hr, very portable scanner now achieving ~900pg-1100pg/hr

Posted: 08 Apr 2017, 12:05
by dtic
Mohib wrote:
07 Apr 2017, 22:33
I also redid the rest of the script to make it more user friendly and idiot proof so it's easier for me to use on a regular basis without having to hack the code each time. The code is below if anyone wants to use it.
Nice update. If you begin using this regularly you will probably want to optimize for speed. Try replacing the "RunWait" command with "Run". The script will then in practice run a lot of parallel GraphicsMagick tasks. If there are too many, it may slow the computer temporarily due to the CPU load. But the batch will finish markedly quicker than with RunWait. An intermediary alternative is to use Run but on each loop count the number of ongoing gm.exe processes and, if there are >Y processes ongoing, wait X seconds and check again.
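
As a rough illustration of the capped-parallelism idea, here is a Python sketch rather than AutoHotkey. Instead of polling the number of running gm.exe processes it uses a fixed-size worker pool, which is a different mechanism with the same effect: at most `max_parallel` conversions run at once. The function names and the default of 4 are arbitrary choices for this example; the gm arguments mirror the conversion line used in the script.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def gm_resize_command(gm_path, image_path, scale_percent):
    """Build the GraphicsMagick call: resize in place at quality 100."""
    return [
        gm_path, "convert", image_path,
        "-resize", f"{scale_percent}%",
        "-quality", "100",
        image_path,
    ]


def resize_all(jobs, gm_path, max_parallel=4, run=subprocess.run):
    """Run one gm process per (image_path, scale_percent) job,
    with at most `max_parallel` running at the same time.

    Returns the process return codes in job order.
    `run` is injectable so the function can be tried without gm installed.
    """
    def work(job):
        image_path, scale_percent = job
        result = run(gm_resize_command(gm_path, image_path, scale_percent))
        return result.returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(work, jobs))
```

Each worker thread just waits on its own gm process, so the pool size directly limits how many conversions load the CPU at once.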