Image registration via barcodes


rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States
Contact:

Image registration via barcodes

Post by rob »

I don't know why I didn't think of this before. While it may not help with page curling, I think it will help with automatic page rotation and deskewing. Stick some 2D barcodes at the corners of pages, find the barcodes using off-the-shelf open-source software, and use the results to compute rotation and skew. Here's a sample image I just shot:
[Image: IMG_0007_small.png. Sample image, not actual resolution.]
See the QR barcodes? Ideally they would be perfectly aligned with the book, say by shoving the page deep into the spine.

Now use the ZXing ("Zebra Crossing") software by Google to find the barcodes and print out the locations of the "finders" for each one. Each QR-formatted barcode has four finders, three large and one small. They are squares within squares. The ZXing software finds the center of each square.

Here's the quick and dirty code I used, in case anyone is interested (also uses the sixlegs PNG reader library):

Code:

/*
 *  Copyright (C) 2010 robertbaruch
 *
 *  This program is free software: you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation, either version 3 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package barcode;

import com.google.zxing.BinaryBitmap;
import com.google.zxing.DecodeHintType;
import com.google.zxing.ResultPoint;
import com.google.zxing.client.j2se.BufferedImageLuminanceSource;
import com.google.zxing.common.BitMatrix;
import com.google.zxing.common.DetectorResult;
import com.google.zxing.common.HybridBinarizer;
import com.google.zxing.multi.qrcode.detector.MultiDetector;
import com.google.zxing.qrcode.detector.Detector;
import com.sixlegs.png.PngImage;
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Hashtable;

public class Main
{
    public static void main(String[] args) throws Exception
    {
        File file = new File(args[0]);
        BufferedImage buffImage = new PngImage().read(file);
        long t = System.currentTimeMillis();
        BufferedImageLuminanceSource source = new BufferedImageLuminanceSource(buffImage);
        HybridBinarizer binarizer = new HybridBinarizer(source);
        BinaryBitmap bitmap = new BinaryBitmap(binarizer);
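        // Binarize once and reuse the matrix for both the size report and detection.
        BitMatrix matrix = bitmap.getBlackMatrix();

        System.out.println("Black matrix is " + matrix.getWidth() + "x" + matrix.getHeight());

        // MultiDetector looks for every QR code in the image, not just the first one.
        MultiDetector detector = new MultiDetector(matrix);
        Hashtable<DecodeHintType, Object> hints = new Hashtable<DecodeHintType, Object>();
        hints.put(DecodeHintType.TRY_HARDER, Boolean.TRUE);
        DetectorResult[] results = detector.detectMulti(hints);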

        System.out.println("Results in " + (System.currentTimeMillis()-t) + " msec");
        for (DetectorResult result : results)
        {
            System.out.println("Result:");
            for (ResultPoint point : result.getPoints())
            {
                System.out.println("  Point " + point.getX() + "," + point.getY());
            }
        }
    }
}
And, the result on this sample image (at the original resolution):

Code:

Black matrix is 3264x2448
Results in 953 msec
Result:
  Point 2190.0,298.5
  Point 2347.5,303.5
  Point 2342.0,461.0
  Point 2206.5,434.5
Result:
  Point 678.5,252.0
  Point 833.5,255.5
  Point 828.0,412.5
  Point 695.5,387.5
Note the processing time: a second or so!

Each result has four points. A correctly rotated barcode will have the small finder (the "alignment" finder) at the lower right, and the other finders (the "positioning" finders) at the other three corners. You can see in the image that the alignment finder is actually on the lower left, which already tells us the image is rotated. But even better, the points in each result given by the ZXing library always come in the same order: positioning finder at lower left, positioning finder at upper left, positioning finder at upper right, alignment finder (lower right).

Thus, the vector from the first to the second positioning finder is supposed to point straight up. Call that -90 degrees (where 0 degrees points to the right). In the example above, the vector points at 1.82 degrees, to the right and slightly down, which means the image would have to be rotated 91.82 degrees counterclockwise to get the barcodes facing the right way. Now, obviously if you do that to the above image, the page would be upside-down. And that's correct, because I turned the barcode page upside-down to get the barcodes on the right side. Obviously I should have printed the barcodes on the other side of the page. But let's just pretend I did it the right way :)

The other barcode shows a 1.29 degree vector, for a 91.29 degree CCW rotation. Why the difference? Parallax! Because the camera's focus is not at infinity, parallel rays are not parallel, which means parallax. So we've extracted another very important feature of the image.

How about the perpendicular vectors -- the vectors from positioning finders 2 to 3? They should be at zero degrees, but the first is at 92.0 degrees, and the second is also at 92.0 degrees. The angles between the two vectors are thus 90.18 degrees and 90.71 degrees. Why aren't these angles exactly 90 degrees? Keystoning! Perpendicular lines are not perpendicular, which means keystoning. Yet another important feature of the image extracted via the barcodes.
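For anyone who wants to check the angle arithmetic, here is a minimal sketch in plain Java. It hard-codes the finder centers of the first result listed above and uses atan2 with the image's y axis pointing down. The class and method names (FinderAngles, angleDegrees) are made up for illustration and are not part of ZXing.

```java
// Sketch only: names are illustrative, not part of ZXing.
final class FinderAngles {
    /** Angle in degrees of the vector (x1,y1) -> (x2,y2); 0 = right, image y axis points down. */
    static double angleDegrees(double x1, double y1, double x2, double y2) {
        return Math.toDegrees(Math.atan2(y2 - y1, x2 - x1));
    }

    public static void main(String[] args) {
        // First three finder centers of the first barcode, in the order ZXing printed them.
        double[][] p = { {2190.0, 298.5}, {2347.5, 303.5}, {2342.0, 461.0} };

        // Finder 1 -> 2 should point straight up (-90 degrees) in a correctly rotated image.
        double up = angleDegrees(p[0][0], p[0][1], p[1][0], p[1][1]);
        // Finder 2 -> 3 should point to the right (0 degrees).
        double across = angleDegrees(p[1][0], p[1][1], p[2][0], p[2][1]);

        System.out.printf("finder 1->2: %.2f deg, CCW rotation needed: %.2f deg%n", up, up + 90.0);
        System.out.printf("finder 2->3: %.2f deg, angle between the two: %.2f deg%n", across, across - up);
    }
}
```

Feeding in the second result's points instead gives the other barcode's angles the same way.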
[Image: PreviewScreenSnapz001.png. Detected angles.]
From this data, a little hand-waving should result in an appropriate transform to get the image rotated properly and dekeystoned. Since I have a degree in mathematics, I am allowed to say that the solution is obvious and is left up to the student.

Kidding! I haven't done the math at this point, because I was excited enough to post immediately, in case it inspires someone.

--Rob
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States
Contact:

Re: Image registration via barcodes

Post by daniel_reetz »

If you printed them at a known distance from each other, you could also compute DPI automagically!
cratylus
Posts: 30
Joined: 04 Mar 2014, 00:52

Re: Image registration via barcodes

Post by cratylus »

Does it matter that the barcodes are not moving with the focal plane?
StevePoling
Posts: 290
Joined: 20 Jun 2009, 12:19
E-book readers owned: SONY PRS-505, Kindle DX
Number of books owned: 9999
Location: Grand Rapids, MI
Contact:

Re: Image registration via barcodes

Post by StevePoling »

rob wrote:From this data, a little hand-waving should result in an appropriate transform to get the image rotated properly and dekeystoned. Since I have a degree in mathematics, I am allowed to say that the solution is obvious and is left up to the student.
Bravo.

Reminds me of the old joke of a psych experiment where they let a mathematician into a room with a fire in the wastebasket and a bucket of water on the table next to it. The mathematician sees the fire, sees the bucket, and rushes to extinguish the fire.

Round two. They usher the mathematician into a room with a sink and water faucet, empty bucket on the table, and wastebasket fire. He sees the situation, rushes to fill the bucket with water, then places it on the table.

He then states, "The problem has been reduced to a previously solved case."
rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States
Contact:

Re: Image registration via barcodes

Post by rob »

Ah yes, the mathematician who sees a flock of black sheep from an airplane and says that there is at least one sheep, black on top.

As to the DPI, Dan is absolutely correct, the DPI is now trivial to compute. My barcodes were printed 828 pixels apart at 150 dpi, or 5.52 inches. I suppose I could have chosen something more convenient, but I sort of placed them at random, although in a line. Taking the same point on each barcode (I'll choose the first point in the list), they are 1512 pixels apart in the photo. This yields a resolution of 274 pixels/in, which is usual for my setup! Well done!
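The DPI arithmetic can be sketched in a few lines of Java. The points are the first finder of each result from the output earlier in the thread, and the 5.52 inch separation is the figure quoted above. DpiEstimate and dpi are hypothetical names used only for this example.

```java
// Sketch only: names are illustrative.
final class DpiEstimate {
    /** Pixels per inch, given two image points and their known physical separation in inches. */
    static double dpi(double x1, double y1, double x2, double y2, double inches) {
        return Math.hypot(x2 - x1, y2 - y1) / inches;
    }

    public static void main(String[] args) {
        double inches = 828.0 / 150.0; // printed separation: 828 pixels at 150 dpi
        // First finder of each detected barcode, as printed by the detector.
        double estimate = dpi(2190.0, 298.5, 678.5, 252.0, inches);
        System.out.printf("%.0f pixels/in%n", estimate);
    }
}
```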

Incidentally, each barcode can actually decode to something different. Not sure I want to spoil the secret by saying what my barcodes decode to, but you could assign a different phrase to each barcode so that you even know which barcode is which.

Anyway, here's the solution to the transformation problem I posed above:

x' = (ax + by + c) / (gx + hy + 1)
y' = (dx + ey + f) / (gx + hy + 1)

where (x, y) is the known position of a point on the barcode sheet, (x', y') is the corresponding measured position in the image, and a through h are coefficients found via a linear least-squares solution to:
[Image: eqn.png, the linear system for the coefficients a through h]
For (x,y) you would use the positions of the barcode finders that you know, because you created the barcode sheet in the first place. For (x',y') you would use the measured positions of the finders, and you know which corresponds to which. The system above then has twice as many rows as there are finder points detected across all your barcodes. In our example, we have 8 points, and so 16 rows. This overdetermines the system (16 equations in 8 unknowns), which is why we use a least-squares fit. We can use Michael Thomas Flanagan's Java Scientific Library to do the linear least-squares estimation.
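As a rough illustration of the fit, here is a self-contained Java sketch. It is not the code used for the result in this post: instead of Flanagan's library it forms the normal equations by hand and solves them with Gaussian elimination, which keeps the example short but is less numerically robust for pixel-scale coordinates (a QR or SVD solver with normalized coordinates would be a safer choice). Multiplying out the denominators of the two equations gives, for each point pair, the rows

ax + by + c - g x x' - h y x' = x'
dx + ey + f - g x y' - h y y' = y'

which is what the code stacks up. HomographyFit, fit, and apply are hypothetical names, and the sample points in main are made up.

```java
import java.util.Arrays;

// Sketch only: names and sample data are illustrative.
final class HomographyFit {
    /**
     * Least-squares fit of {a,b,c,d,e,f,g,h} so that
     * x' = (ax+by+c)/(gx+hy+1) and y' = (dx+ey+f)/(gx+hy+1).
     * src[i] = {x, y}, dst[i] = {x', y'}; needs at least 4 point pairs.
     */
    static double[] fit(double[][] src, double[][] dst) {
        double[][] ata = new double[8][8]; // A^T A
        double[] atb = new double[8];      // A^T b
        for (int i = 0; i < src.length; i++) {
            double x = src[i][0], y = src[i][1], u = dst[i][0], v = dst[i][1];
            // Two linear rows per point pair, one for x' and one for y'.
            accumulate(ata, atb, new double[] {x, y, 1, 0, 0, 0, -x * u, -y * u}, u);
            accumulate(ata, atb, new double[] {0, 0, 0, x, y, 1, -x * v, -y * v}, v);
        }
        return solve(ata, atb);
    }

    private static void accumulate(double[][] ata, double[] atb, double[] row, double rhs) {
        for (int r = 0; r < 8; r++) {
            atb[r] += row[r] * rhs;
            for (int c = 0; c < 8; c++) ata[r][c] += row[r] * row[c];
        }
    }

    /** Gaussian elimination with partial pivoting; overwrites m and b. */
    private static double[] solve(double[][] m, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            double[] tmpRow = m[col]; m[col] = m[pivot]; m[pivot] = tmpRow;
            double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
            if (Math.abs(m[col][col]) < 1e-12)
                throw new IllegalArgumentException("degenerate point configuration");
            for (int r = col + 1; r < n; r++) {
                double factor = m[r][col] / m[col][col];
                b[r] -= factor * b[col];
                for (int c = col; c < n; c++) m[r][c] -= factor * m[col][c];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double sum = b[r];
            for (int c = r + 1; c < n; c++) sum -= m[r][c] * x[c];
            x[r] = sum / m[r][r];
        }
        return x;
    }

    /** Applies the fitted transform to one point. */
    static double[] apply(double[] h, double x, double y) {
        double w = h[6] * x + h[7] * y + 1.0;
        return new double[] {(h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w};
    }

    public static void main(String[] args) {
        // Made-up "known" points, pushed through a made-up transform to stand in for measurements.
        double[] truth = {1.1, 0.1, 2.0, -0.05, 0.9, 3.0, 0.01, 0.02};
        double[][] src = {{0, 0}, {1, 0}, {1, 1}, {0, 1}, {2, 1}, {0.5, 2}};
        double[][] dst = new double[src.length][];
        for (int i = 0; i < src.length; i++) dst[i] = apply(truth, src[i][0], src[i][1]);
        System.out.println(Arrays.toString(fit(src, dst))); // should be close to truth
    }
}
```

With real data, src would hold the eight finder positions from the barcode sheet layout and dst the eight detected centers, giving the 16 rows mentioned above.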

Doing this, and flipping the image to account for my flipped barcodes, results in this transformed image. I also stuck some straight red lines in there so you can see if the image is truly rotated and dekeystoned:
[Image: out-small.png]
Is it corrected? Well, nearly. Vertically, everything seems OK. Horizontally, there seem to be some problems. I suspect this is because there is a great degree of vertical separation between the barcodes, but very little in the horizontal direction, leading to a poor estimation of the skew. Still, it's pretty good. It's better than good. It's fscking fast. So much faster than my original algorithm that I'm going to toss that original algorithm. Kick it to the curb. Circular-file it. Send it to /dev/null. Crumple it up until it's all sharp points and stick it where the sun doesn't shine. (Go on, I dare you to click)

As to the question of whether the barcode's z-separation from the actual page plane is important... I suspect a little bit, but not enough to matter. The barcode page will always be at the same z-distance from the camera, while the pages will move up and down due to the thickness of the book. And since we're transforming the image based on the barcodes, the transformed pages will be smaller or larger. That is important, since one now requires some kind of page size equalization algorithm.

But remember: you would need one anyway, since even without the barcodes, the pages will move in height with respect to the camera. So I would argue that the problem is no worse with respect to page size, but we have successfully extracted the rotation and keystoning problem from the page size problem and the line curling problem (which are probably made easier).

I HAS A HAPPY!

--Rob
StevePoling
Posts: 290
Joined: 20 Jun 2009, 12:19
E-book readers owned: SONY PRS-505, Kindle DX
Number of books owned: 9999
Location: Grand Rapids, MI
Contact:

Re: Image registration via barcodes

Post by StevePoling »

Doesn't the expression, "fold it three ways and put it where the sun don't shine" come from some Russian novelist?

I'm looking at your barcodes in light of the air-platen. And I think that if all the angles were kept square, one could put barcodes on the side pieces. Probably won't work b/c my air-platen isn't square enough.

The Z-axis discrepancy can be handled if you move the QR barcodes to the surface of the platen. This limits the Z discrepancy to the thickness of the plexiglass. I like the idea of 4 barcodes so that when you set the focus and zoom for the camera, you make sure you can see all four barcodes sharply in the corners then start snapping.
cratylus
Posts: 30
Joined: 04 Mar 2014, 00:52

Re: Image registration via barcodes

Post by cratylus »

rob wrote:As to the question of whether the barcode's z-separation from the actual page plane is important... I suspect a little bit, but not enough to matter. The barcode page will always be at the same z-distance from the camera, while the pages will move up and down due to the thickness of the book. And since we're transforming the image based on the barcodes, the transformed pages will be smaller or larger. That is important, since one now requires some kind of page size equalization algorithm.

But remember: you would need one anyway, since even without the barcodes, the pages will move in height with respect to the camera. So I would argue that the problem is no worse with respect to page size, but we have successfully extracted the rotation and keystoning problem from the page size problem and the line curling problem (which are probably made easier).

--Rob

Rob, this is very cool. I guess I was wondering if the barcodes would eventually become out of focus if you were scanning a particularly thick book.

My new question is this: doesn't this method assume that the position of the scanned page, relative to the barcodes, is the same page after page after page? And, if so, what happens if that relative positioning changes? How would this method handle text slanting one way on one page and then the other on the next page, as some poorly printed books do? Wouldn't it "miscorrect" for the second page? It seems to me that what you'd really need for ROTATION is a pair of barcodes on EACH PAGE. Keystoning is another story.

Thoughts?


Joel
rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States
Contact:

Re: Image registration via barcodes

Post by rob »

cratylus wrote:I guess I was wondering if the barcodes would eventually become out of focus if you were scanning a particularly thick book.
Hmm, that's a possibility I hadn't considered. Steve is probably right, putting the barcodes on the platen, say via a transparent sticky, would help that problem. However, I've never been able to get the platen to remain stable with respect to the book itself. There's always some wobble. I would much prefer the barcodes to be in reference to the book.

Given that the barcodes are meant to be read in a variety of environments, perhaps being slightly out of focus won't matter much.
cratylus wrote:How would this method handle text slanting one way on one page and then the other on the next page, as some poorly printed books do?
Yes, but I consider that a pathological case. If the goal is to get the image exactly the way the page is, then if the printing on the page itself is misaligned, so too should be the printing on the image. In that case, you would need an algorithm to detect rotation by means of text lines. But again, I think that's a pathological case where extra processing would always be called for.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States
Contact:

Re: Image registration via barcodes

Post by daniel_reetz »

It seems to me that an ideal implementation would be a thin-but-stiff card that could be inserted in the book between the first or last page and the cover. It could even have a low-tack adhesive on it, like a Post-It note, so that it would remain in place. You could easily print such a card on a laser printer using heavy paper.
rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States
Contact:

Re: Image registration via barcodes

Post by rob »

Yes, I think a stiff card is the best. I tried a quick experiment where the barcode page was 1.5 inches away from the page (in height), and saw no perceptible focusing issue. The barcode detector detected the barcodes properly.

I also did an experiment with three barcodes, two on the side and one on top. The result was that the sides and the top were corrected, but the bottom was not! Apparently it is sufficient to have two barcodes on one side of the page in order to correct the vertical, but it may be necessary to have one barcode on top and one on the bottom to correct the horizontal.