
Methods To Sense The 3D Surface/Structure Of A Book

DIY Book Scanner Skunk Works. Share your crazy ideas and novel approaches. Home of the "3D structure of a book" thread.
Anonymous1

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by Anonymous1 » 14 Feb 2011, 16:14

It's in here: http://www-personal.umich.edu/~nikitan/files/PDF/.
That paper looks pretty good. As I'm basically making a dewarping grid like Scan Tailor uses, I might as well take a look at it.

I'm going to try my original method (skewing chunks of the image to fit into a grid), your method, and ST's method, just to see which one works the best in this case.

How's your lens correction and such coming along? I might have to implement your lens distortion algorithms, as the 3D meshes that I'm getting seem to be a bit off.

steve1066d
Posts: 296
Joined: 27 Nov 2010, 02:26
E-book readers owned: PRS-505
Number of books owned: 1250
Location: Minneapolis, MN
Contact:

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by steve1066d » 14 Feb 2011, 17:30

For lens correction: I'm fine with just using the spherical correction that I have in BSW. It seems to get things close enough, so I'm not pursuing that any further right now.

For perspective correction: I think the way I documented earlier is the right approach. There's a little bit of tweaking I can still do (such as making the focal point of the transform the center of the image instead of the upper-left corner).

Finally, I've also got correction for the angle of a page region relative to the camera. I think it's working, but on the page I was trying it on, the only visible effect was that the page came out a bit wider. However, the pages I was testing with didn't have anything near the gutter, so it could be working perfectly; I'll need to test it on some other images.

So my current plan of attack is to use your rendered "perfect" images and see how close I get to perfect results. I've also got some work to do to make my line-height function work more generically, and I might do a little GUI program to pick the filter parameters. Imagine some sliders where you can pick the hue to match, how close the hue must match, and the minimum saturation. That way the detection can be tailored to different colors of lasers and different setups.
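
Something along these lines, very roughly (in Python rather than Java, just to sketch the idea; the function and parameter names are placeholders, not anything that exists in BSW):

Code:

import cv

def laser_mask(bgr_image, target_hue, hue_tolerance, min_saturation, min_value=40):
    # Keep pixels whose hue lies within hue_tolerance of target_hue and whose
    # saturation/brightness are high enough to be a laser line. OpenCV's
    # 8-bit hue range is 0..179; the window is clamped rather than wrapped
    # to keep the sketch short.
    cols, rows = cv.GetSize(bgr_image)
    hsv = cv.CreateMat(rows, cols, cv.CV_8UC3)
    mask = cv.CreateMat(rows, cols, cv.CV_8UC1)
    cv.CvtColor(bgr_image, hsv, cv.CV_BGR2HSV)
    lo = max(target_hue - hue_tolerance, 0)
    hi = min(target_hue + hue_tolerance, 179)
    cv.InRangeS(hsv, cv.Scalar(lo, min_saturation, min_value),
                cv.Scalar(hi, 255, 255), mask)
    return mask

Hooking sliders up to target_hue, hue_tolerance, and min_saturation would then just be UI plumbing.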

One thing that might be causing your warping to be off: if you are calculating the height at a point, you may actually need the height at the destination, not at the source. What I ended up doing was taking my height map and transforming that first.
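
A toy 1-D version of what I mean (the linear displacement model here is made up; it's just to show the source-vs-destination distinction):

Code:

def dewarp_row(src_row, height_at_dst, scale):
    # Dewarp by *pulling* pixels from the source: for each destination index
    # we look up the height at that destination (i.e. a height map that has
    # already been transformed into destination space) and use it to decide
    # which source pixel to fetch.
    out = []
    for x_dst in range(len(src_row)):
        x_src = int(round(x_dst + scale * height_at_dst[x_dst]))
        x_src = min(max(x_src, 0), len(src_row) - 1)  # clamp to the row
        out.append(src_row[x_src])
    return out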

One thing to keep in mind is that what I come up with can be used in other tools as well: you can just use BSW as a command-line program to do the warping if you don't want to implement it all in Python (or, if I like your solution better, maybe I'll just call something in Python from within BSW :) ).
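
For example, calling an external dewarper from Python is just a subprocess call (I'm glossing over the exact arguments here, so treat these as placeholders rather than BSW's real command line):

Code:

import subprocess

def run_external_dewarp(jar_path, script_path):
    # Placeholder invocation; only the subprocess mechanics are the point here.
    subprocess.check_call(['java', '-jar', jar_path, script_path])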

I've got my code checked into subversion if you want to look at it.
Steve Devore
BookScanWizard, a flexible book post-processor.

atarkri
Posts: 14
Joined: 08 Jan 2011, 13:29

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by atarkri » 14 Feb 2011, 19:35

Two things: first a bit of code, and then some general discussion.

Thanks to blender's posting of his binarization code, I was able to write up a procedure to extract the slope of the laser line at each point:

Code:

#!/usr/bin/env python

import os, sys, math
import cv
import Image, Gnuplot

imagePath = sys.argv[1]
thresh = float(sys.argv[2])
crease_guess = int(sys.argv[3])

#image = cv.LoadImage(imagePath)
raw_image = cv.LoadImageM(imagePath)
rows = cv.GetSize(raw_image)[1]
cols = cv.GetSize(raw_image)[0]

channelG =          cv.CreateMat(rows, cols, cv.CV_8UC1)
channelB =          cv.CreateMat(rows, cols, cv.CV_8UC1)
imageSubtracted =   cv.CreateMat(rows, cols, cv.CV_8UC1)
imageMorph_close =  cv.CreateMat(rows, cols, cv.CV_8UC1)
tmp =               cv.CreateMat(rows, cols, cv.CV_8UC1)
imageThreshold =    cv.CreateMat(rows, cols, cv.CV_8UC1)

#cv.Split(raw_image, cv.CreateImage(cv.GetSize(raw_image), 8, 1), channelG, channelB, None)
# Copy source channels 1 and 2 of the (B, G, R) image into the two
# single-channel mats used for the subtraction below.
cv.MixChannels([raw_image], [channelG, channelB], [(1, 0), (2, 1)])

cv.Sub(channelG, channelB, imageSubtracted)

cv.MorphologyEx(imageSubtracted, imageMorph_close, tmp,
  cv.CreateStructuringElementEx(3, 3, 1, 1, cv.CV_SHAPE_RECT), cv.CV_MOP_CLOSE, 10 )

cv.InRangeS(imageMorph_close, cv.Scalar(thresh), cv.Scalar(255), imageThreshold)

image = Image.fromstring('L', cv.GetSize(imageThreshold), imageThreshold.tostring())
pixels = image.load()

#data = [] # why only use one array...?
top_curve = []
btm_curve = []

# NOTE: columns with no laser pixel append nothing, so the two curves can
# end up shorter than the image width; fine for clean renders.
for x in xrange(image.size[0]):
  for y in reversed(xrange(image.size[1])):
    if pixels[x, y] == 255:
      #data.append(y) # finds the first from the bottom
      btm_curve.append(y)
      break
 
  for y in xrange(image.size[1]):
    if pixels[x, y] == 255:
      #data.append(-y) # finds the first from the top
      top_curve.append(y)
      break

#
## Separate the image into two halves, on either side of the middle crease
#

# The crease will be the lowest point between the two peaks
# TODO: implement a proper simulated annealing algorithm, or other local min search
creases = [top_curve[crease_guess], btm_curve[crease_guess]] # initial guess for CIMG0918.JPG
crease_indexes = [crease_guess, crease_guess]

for i in xrange(crease_guess-200, crease_guess+200): # lol magic numbers
  if top_curve[i] >= creases[0]:
    creases[0] = top_curve[i]
    crease_indexes[0] = i
    
  if btm_curve[i] <= creases[1]:
    creases[1] = btm_curve[i]
    crease_indexes[1] = i

print 'Top crease at index, value:    %d, %d'%(crease_indexes[0], creases[0])
print 'Bottom crease at index, value: %d, %d'%(crease_indexes[1], creases[1])

#
## Find the "tops" of the curves, which indicate the highest z-value
#

from itertools import count, izip 
 
left_maximums    = [0,0]
left_maxindexes  = [0,0]
right_maximums   = [0,0]
right_maxindexes = [0,0]

left_maximums[0], left_maxindexes[0]  = min(izip(top_curve[:crease_indexes[0]:], count()))
left_maximums[1], left_maxindexes[1]  = max(izip(btm_curve[:crease_indexes[1]:], count())) 
right_maximums[0], right_maxindexes[0] = min(izip(top_curve[crease_indexes[0]::], count()))
right_maximums[1], right_maxindexes[1] = max(izip(btm_curve[crease_indexes[1]::], count())) 
right_maxindexes[0] += crease_indexes[0]
right_maxindexes[1] += crease_indexes[1]

print 'Left-Top z-max at index, value:     %d, %d'%(left_maxindexes[0], left_maximums[0])
print 'Left-Bottom z-max at index, value:  %d, %d'%(left_maxindexes[1], left_maximums[1])
print 'Right-Top z-max at index, value:    %d, %d'%(right_maxindexes[0], right_maximums[0])
print 'Right-Bottom z-max at index, value: %d, %d'%(right_maxindexes[1], right_maximums[1])

#
## Extract the slope for both pages. First we will smooth with a 
## gaussian, then we will compute our slopes.
#

def gaussianSmooth(noisyInArray, sigma, mu, distance):
  gaussian = []
  smoothOutput = []
  
  if len(noisyInArray) <= distance:
    return noisyInArray
  
  for x in xrange(-distance, distance):
    #gaussian.append( 2.718281828459**(-((x-mu)**2.0)/(2.0*sigma**2.0))/(2.0*3.141592653589*sigma**2.0)**.5 )
    gaussian.append(math.e ** (-((x-mu) ** 2.0)/(2.0*sigma ** 2.0)) / (2.0*math.pi*sigma ** 2.0) ** 0.5 )

  # put in 'distance' of the values first, remember to disregard them later
  for i in xrange(1, distance): 
    smoothOutput.append(noisyInArray[i-1])

  ## Gaussian smooth, note we will lose 'distance' values on each end
  for i in xrange(distance, len(noisyInArray)-distance): # read-ahead a bit
    tmp_sum = 0
    for j in xrange(-distance, distance):
      tmp_sum += gaussian[j+distance] * noisyInArray[i+j] # 'j+distance' because we can't access a negative array index
    smoothOutput.append(tmp_sum)
    
  for i in xrange(1, distance): 
     smoothOutput.append(noisyInArray[len(noisyInArray)-distance+i-1])
     
  return smoothOutput

def gaussianPrimeSmooth(noisyInArray, sigma, mu, distance):
  gaussian = []
  smoothOutput = []
  
  if len(noisyInArray) <= distance:
    return noisyInArray
  
  for x in xrange(-distance, distance):
    #gaussian.append( 2.718281828459**(-((x-mu)**2.0)/(2.0*sigma**2.0))/(2.0*3.141592653589*sigma**2.0)**.5 )
    gaussian.append(2.0*(x-mu)*math.e ** (-((x-mu)**2.0)/(2.0*sigma**2.0)) /
                     (2.0*sigma ** 2.0 * (2.0*math.pi*sigma ** 2.0) ** 0.5)  )

  # put in 'distance' of the values first, remember to disregard them later
  for i in xrange(1, distance): 
    smoothOutput.append(noisyInArray[i-1])

  ## Gaussian smooth, note we will lose 'distance' values on each end
  for i in xrange(distance, len(noisyInArray)-distance): # read-ahead a bit
    tmp_sum = 0
    for j in xrange(-distance, distance):
      tmp_sum += gaussian[j+distance] * noisyInArray[i+j] # 'j+distance' because we can't access a negative array index
    smoothOutput.append(tmp_sum)
    
  for i in xrange(1, distance): 
     smoothOutput.append(noisyInArray[len(noisyInArray)-distance+i-1])
     
  return smoothOutput

# Compute slopes (i.e. change in height)
#  Note: Let (*) denote convolution. Gaussian smoothing computes 'g (*) p'.
#  By the derivative property of convolution, (g (*) p)' == g' (*) p == g (*) p',
#  so convolving with the Gaussian derivative gives the smoothed slope directly.
distance = 50
topLeft_slopes =  gaussianPrimeSmooth(top_curve[:crease_indexes[0]+distance:], 10.0, 0.0, distance)
btmLeft_slopes =  gaussianPrimeSmooth(btm_curve[:crease_indexes[1]+distance:], 10.0, 0.0, distance)
topRight_slopes = gaussianPrimeSmooth(top_curve[crease_indexes[0]-distance::], 10.0, 0.0, distance)
btmRight_slopes = gaussianPrimeSmooth(btm_curve[crease_indexes[1]-distance::], 10.0, 0.0, distance)


# Data still too noisy. Apply more gaussian smoothing.
# TODO: two successive Gaussian smooths equal a single one with
#       sigma = sqrt(sigma1**2 + sigma2**2), so this could be folded into one pass.
topLeft_smoothSlopes =  gaussianSmooth(topLeft_slopes, 30.0, 0.0, distance)
btmLeft_smoothSlopes =  gaussianSmooth(btmLeft_slopes, 30.0, 0.0, distance)
topRight_smoothSlopes = gaussianSmooth(topRight_slopes, 30.0, 0.0, distance)
btmRight_smoothSlopes = gaussianSmooth(btmRight_slopes, 30.0, 0.0, distance)

plot1 = Gnuplot.Gnuplot()
plot1.title('Top curve')
#plot1.plot(data[::2])
#plot1.plot(top_curve, topleft_smoothed)
plot1.plot(top_curve)

plot2 = Gnuplot.Gnuplot()
plot2.title('Bottom curve')
#plot2.plot(data[1::2])
plot2.plot(btm_curve)

plot7 = Gnuplot.Gnuplot()
plot7.title('Top Left curve slope')
plot7('set yrange [-2:2]')
plot7.plot(topLeft_smoothSlopes)

plot8 = Gnuplot.Gnuplot()
plot8.title('Bottom Left curve slope')
plot8('set yrange [-2:2]')
plot8.plot(btmLeft_smoothSlopes)

plot9 = Gnuplot.Gnuplot()
plot9.title('Top Right curve slope')
plot9('set yrange [-2:2]')
plot9.plot(topRight_smoothSlopes)

plot10 = Gnuplot.Gnuplot()
plot10.title('Bottom Right curve slope')
plot10('set yrange [-2:2]')
plot10.plot(btmRight_smoothSlopes) 

cv.NamedWindow('Test', 0)
cv.ResizeWindow('Test', 1000, 600)
cv.ShowImage('Test', imageThreshold)

cv.WaitKey(0)

The program takes three parameters: the image file, the minimum value for the binary threshold, and an initial guess at where the crease of the book might be. (It segments the curves into top-left/top-right and bottom-left/bottom-right.) Such a procedure may be useful for finding the normals at each point along the laser line, etc.
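
For example, once the smoothed slopes are out, the in-image-plane normal at each point of the curve follows directly (a small sketch, not part of the script above; true surface normals would additionally need the laser/camera geometry):

Code:

import math

def curve_normals(slopes):
    # For slope m = dy/dx the tangent direction is (1, m), so a unit normal
    # in the image plane is (-m, 1) / sqrt(1 + m^2).
    normals = []
    for m in slopes:
        length = math.sqrt(1.0 + m * m)
        normals.append((-m / length, 1.0 / length))
    return normals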

----
I am suspicious that having the lasers at 45 degrees may be giving slightly inaccurate data, and I'm at a loss as to how to rectify it.

The choice of projecting the laser lines at 45 degrees gives us a measure of the curl not only in the x-direction but also in the y-direction. Depending on the structure of the curled page, these two measures may be present in varying ratios across the page. Contrast this with laser lines projected at 90 degrees (straight down): such lines would measure the x-direction only and could simply be "straightened" to account for the book's curvature (though the perspective distortion would still be present).
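
To make the 90-degree case concrete, "straightening" would amount to shifting each column so that the detected laser row lands on a reference row; roughly (a toy sketch, variable names mine):

Code:

def straighten_columns(pixels, laser_row, ref_row, width, height):
    # pixels is indexed as pixels[y][x]; laser_row[x] is the detected laser
    # position in column x. Shifting each column so that row ends up at
    # ref_row removes the x-direction curl only.
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        dy = ref_row - laser_row[x]
        for y in range(height):
            src_y = y - dy
            if 0 <= src_y < height:
                out[y][x] = pixels[src_y][x]
    return out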

Dan, do you think you could post four pictures, {a book lying open, a chessboard} with two laser lines projected {at 45, at 90}?

----
I imagine we could do away with needing to calibrate the setup with a chessboard/QR codes/etc. If we were to strap the lasers to the camera, we might be able to get the depth information from the camera's perspective. I imagine we will need to cast a grid of some sort, but we may be able to get by with two lines as before. Such a setup may require the laser lines to be projected at 90 degrees from the camera sensor plane (straight out from the camera).

With such information, we should be able to handle perspective correction, lens-distortion correction, and book dewarping, all from the laser lines.

Dan, think you can oblige? I may just build such a rig myself. Any quick tips based on your previous efforts?

daniel_reetz
Posts: 2786
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States
Contact:

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by daniel_reetz » 15 Feb 2011, 01:09

"Dan, think you can oblige?"
Yeah, I can, though it might be tomorrow evening before I get those pics.

I have enough parts to build you a rig identical to the ones I'm building for these guys. I can just mail it along.
d

Anonymous1

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by Anonymous1 » 15 Feb 2011, 01:19

This, I will play with. The morphology thing you were mentioning seems to work, so I'll dissect that and check out how I can automate the whole process. I want it to be able to threshold without any user input.
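
One candidate for getting rid of the manual threshold is Otsu's method (a rough sketch using the same old cv API as the script above, assuming the binding exposes CV_THRESH_OTSU; untested on the laser images):

Code:

import cv

def auto_threshold(gray_mat):
    # Let Otsu's method pick the threshold on the single-channel difference
    # image instead of a hand-tuned value.
    mask = cv.CreateMat(gray_mat.rows, gray_mat.cols, cv.CV_8UC1)
    cv.Threshold(gray_mat, mask, 0, 255,
                 cv.CV_THRESH_BINARY + cv.CV_THRESH_OTSU)
    return mask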

The laser setup is fine AFAICT. The positions are actually irrelevant, just as long as the angles of the lasers are known. It's just a simple intersection of a plane and a mesh if you think about it, and the calibration images give you all the necessary information about the distortion. Getting depth information from the camera's perspective is just as good as getting it from a different perspective.
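
The intersection itself is cheap, too. In camera coordinates the laser defines a known plane n.X = d, each lit pixel defines a ray X = t*r from the camera centre, and solving for t gives the 3D point on the page (my notation, not code from anyone's tool):

Code:

def intersect_laser_plane(ray_dir, plane_normal, plane_d):
    # Solve plane_normal . (t * ray_dir) = plane_d for t, then return the
    # 3-D point t * ray_dir. Returns None if the ray is (nearly) parallel
    # to the laser plane.
    denom = sum(n * r for n, r in zip(plane_normal, ray_dir))
    if abs(denom) < 1e-9:
        return None
    t = float(plane_d) / denom
    return tuple(t * r for r in ray_dir)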

Here's a test image. All of the lasers are pointing at the same two points. Red is 90 degrees, green is 45, and blue is 30. I ran out of preset lasers, so I made a yellow one, which is on the same plane and pointing at the same point as the camera. It's a bit grainy, but you can still see what's happening. The camera was off by a millimeter or so in my original render, so sorry about that:
[Attachment: default.png, "Tons of lasers."]
I could make an animation too, but that's complete overkill...

Anonymous1

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by Anonymous1 » 15 Feb 2011, 01:25

Once I finish recompiling tons of Linux stuff on my university's really fast server, I can set up a rendering farm and make these renders pretty quickly (nobody uses that poor Intel Xeon, and they have five of 'em; why not?). Here's a less grainy render:
[Attachment: default.png, "More renders..."]
Steve, I think I'll just translate to Python. I can think a bit better when I can read my code, and it's really simple to work with. But it's pretty nice that you have BSW doing all this stuff. C++ is not my thing, so I can't help the ST group. Java's manageable, as it doesn't have that many strange things (the compiling, the header files, and those annoying symbols that randomly show up. Ugh).

I might actually translate my code into Java or C++ once it works well, since Python is showing its speed limitations (I'm not sure if 1 second is fast for Python, but I always feel like it could be faster...).

Anyways, I think we should actually dewarp something before merging or splitting solutions ;)

steve1066d
Posts: 296
Joined: 27 Nov 2010, 02:26
E-book readers owned: PRS-505
Number of books owned: 1250
Location: Minneapolis, MN
Contact:

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by steve1066d » 15 Feb 2011, 01:37

I've made a bit more progress. I cleaned up the code to calculate the height of the points on the book. It now uses the mid-point of the two laser lines as the zero position.

Here's the starting image (after the barrel correction):
[Attachment: Resize of IMG_0885.jpg]
Here's the image after correcting for the distance to the camera:
[Attachment: Resize of IMG_0885_4.jpg]
Here's the image after trying to correct for the angle of the area to the camera. It isn't quite right:
[Attachment: Resize of IMG_0885_5.jpg]
My code is checked into subversion if you want to take a look:
https://bookscanwizard.svn.sourceforge. ... nwarp.java
Steve Devore
BookScanWizard, a flexible book post-processor.

daniel_reetz
Posts: 2786
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States
Contact:

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by daniel_reetz » 15 Feb 2011, 01:39

WOW, Steve, that is some *excellent* work! Congrats! Now I have to add Java to the list of languages I'm slowly, slowly learning. :)

atarkri
Posts: 14
Joined: 08 Jan 2011, 13:29

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by atarkri » 15 Feb 2011, 01:40

These simulated-book renders are approaching works of art.

Anonymous, any chance you can upload the model/associated files somewhere so that we can all play along? If bandwidth is an issue I can help host, and I'm sure others have space too.

While I'm requesting, can we get a code dump from your line-straightening attempts? Can't hurt.

Anonymous1

Re: Methods To Sense The 3D Surface/Structure Of A Book

Post by Anonymous1 » 15 Feb 2011, 12:50

You won't be able to get these results using Anonymous's internal renderer, as it is biased (it uses shortcuts to increase speed, but it doesn't look as realistic). You'll need LuxRender, as I don't know how else to make a laser with Anonymous's internal rendering engine.

I've made a git repository, so if you want to play with the files, check them out with git:

Code:

git clone git://github.com/Anonymous3D/Laser-based-Book-Dewarping.git

The BLEND file has no textures, so the plywood might not look as epic, but it should still work perfectly. Just as a note, there is a super tiny cube inside the laser columns. That's a super-bright light, and it is forced through a slit. If you move the lasers, make sure to move the cube too, or else it will wash out your whole scene.

Also, the script can detect if it's being run from within Anonymous. If you just paste it into Anonymous's text editor, it'll make a 3D model of the book image.
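
(The detection is just the usual try/except import of the host application's Python module, along these lines; the module name below is a generic stand-in:)

Code:

try:
    import host_module  # stand-in name for the host application's Python API
    INSIDE_HOST = True
except ImportError:
    INSIDE_HOST = False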
