Using a Reverse Warp Calculation to Dewarp


Moderator: peterZ

Mark Main
Posts: 17
Joined: 19 Oct 2010, 04:34
Number of books owned: 0
Country: United States


Post by Mark Main »

I think that I have an elegant solution for accurate dewarping. I’m getting ready for a new job, so I want to offer this to the community before it gets lost. If this is already a known idea, I apologize; I haven't had time to investigate. I woke up with the idea in my head, spent a few hours working on it, and now here I am writing to you.

This idea assumes that lines (e.g. snake lines) will be used on the book to accurately judge the warp and that an algorithm will be selected to dewarp. What I present here is a way to improve the accuracy of that algorithm.

I originally tried to come up with ways to calculate the dewarping and move every dot from the Source to its new coordinate position (“Cell”) on a New Grid. That forces you to deal with compression, where more than one dot wants to land in the same Cell on the New Grid, and with decompression, where blank spots are left behind and we have to guess what their values should be.

That’s not the best way. Instead, do it backwards: reverse the math. Don’t move the Source to the New Grid; derive the New Grid from the Source. But increase the resolution of the New Grid first, and then reduce it in size after you have derived all the values.
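To make the difference concrete, here is a minimal sketch on a single row of pixels, using a plain stretch or squeeze as a stand-in for a real dewarp. The function names `forward_map` and `inverse_map` and the scale factors are illustrative assumptions, not part of any existing tool.

```python
def forward_map(src, scale):
    """Push each source pixel to its destination cell (forward mapping)."""
    n_out = round(len(src) * scale)
    out = [None] * n_out      # None marks a cell that nothing was written to
    hits = [0] * n_out        # how many source pixels landed in each cell
    for x, value in enumerate(src):
        dx = min(n_out - 1, int(x * scale))
        out[dx] = value       # a later pixel overwrites an earlier one
        hits[dx] += 1
    return out, hits


def inverse_map(src, scale):
    """Pull each destination cell from the source (inverse mapping)."""
    n_out = round(len(src) * scale)
    return [src[min(len(src) - 1, int(dx / scale))] for dx in range(n_out)]


src = [10, 20, 30, 40]
stretched, _ = forward_map(src, 1.5)      # decompression leaves holes
_, squeeze_hits = forward_map(src, 0.5)   # compression causes collisions
pulled = inverse_map(src, 1.5)            # every cell gets exactly one value

print(stretched)      # [10, 20, None, 30, 40, None]
print(squeeze_hits)   # [2, 2]
print(pulled)         # [10, 10, 20, 30, 30, 40]
```

With forward mapping the stretched row has unfilled cells and the squeezed row has two source pixels competing for each cell. With inverse mapping every destination cell is visited once, so neither problem can occur.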

I truly believe this will accurately do the job and here’s how:
  1. Build a “New Grid” that is 900% the size of the “Source”, i.e. 3 times the width and 3 times the height. I chose 900% because it gives a clean 3x3 block of finer cells for every Source pixel, which makes the concept easy to talk about. Any perfect square (e.g. 3x3, 4x4, 5x5, etc.) makes the math easy. Using 1600% (4x4 expansion) or 2500% (5x5 expansion) would produce even better results, but for this discussion, I'll assume 900%.
     We now have a New Grid with 9 times as many cells. It’s a much finer resolution, which allows dots to be positioned more accurately by the dewarp algorithm.

  2. For every cell on this finer New Grid, reverse calculate where it would reside on the Source and then take the value of the Source at that position.
     Yes, this means that a black dot on the Source will be duplicated on the New Grid. A perfect move would duplicate it 9 times, but depending on where the dewarping math places the coordinates, it could be more or fewer than 9. What is important to note here is that the RGB value placed in each New Grid cell is the best estimate of what the RGB value should be for that location, because we went and found it in the Source based upon the math.

  3. Now we just need to reduce the size of the New Grid so that it’s back to normal size. This is easy: take each 3x3 block and blend (average) its gray levels or colors into a single new RGB value. This accurately reduces the gray scale and colors from 900% down to 100%.
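The three steps above can be sketched in a few lines of plain Python for a grayscale image stored as a list of rows. This is only an illustration under stated assumptions: `dewarp_supersampled` is a made-up name, and `inverse_warp` stands for whatever function the chosen dewarp algorithm provides to map a New Grid position back to Source coordinates.

```python
import math


def dewarp_supersampled(src, inverse_warp, k=3, background=255):
    """Fill a grid k times finer per side by inverse lookup, then average.

    inverse_warp(u, v) is assumed to return the Source coordinates (x, y)
    for the output position (u, v), both measured in pixels.
    """
    h, w = len(src), len(src[0])

    # Steps 1 and 2: build the finer grid and pull every cell from the Source.
    fine = [[background] * (w * k) for _ in range(h * k)]
    for fy in range(h * k):
        for fx in range(w * k):
            # Centre of the fine cell, expressed in output pixel units.
            u, v = (fx + 0.5) / k, (fy + 0.5) / k
            sx, sy = inverse_warp(u, v)
            ix, iy = math.floor(sx), math.floor(sy)
            if 0 <= ix < w and 0 <= iy < h:
                fine[fy][fx] = src[iy][ix]   # nearest Source pixel

    # Step 3: average each k x k block back down to one output pixel.
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = sum(fine[y * k + j][x * k + i]
                        for j in range(k) for i in range(k))
            row.append(total / (k * k))
        out.append(row)
    return out


# A black pixel next to a white one, shifted by a third of a pixel.
shifted = dewarp_supersampled([[0, 255]], lambda u, v: (u + 1 / 3, v))
print(shifted)   # [[85.0, 255.0]]
```

In the example, two of the three fine columns under the first output pixel still fall on the black Source pixel and one falls on the white one, so the result is a dark gray (85) rather than a hard black or white.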
FINAL COMMENT

By increasing the New Grid size, we increase the accuracy of whichever dewarping algorithm is used. Regardless of the algorithm, when we use math to flatten something that was curved, it helps to increase the resolution of the New Grid. Increasing the resolution first and then shrinking the size back down later allows us to more accurately determine the best shade of gray or blended color for each dot. This is because the finer resolution lets the lines of the letters appear smoother: the math does a better job of using gray scale along the edges and corners of each line. I think that making the New Grid resolution 5x5 times greater than the Source is the best choice; it's not overly large, but still offers much improvement over 3x3. I think there will be diminishing returns beyond 5x5; perhaps 7x7 and more is not going to see a much greater increase in quality. Whatever algorithm is selected to dewarp, the logic provided here should improve its final rendering quality because it does a better job of "averaging" the best RGB choice for a line along its edges.

Here is an analogy for why it works: when you measure something with a ruler and the actual length falls right between two of the tiny marks, it's always better to estimate the final decimal digit based upon where it happens to fall. Your best guess is far more accurate than simply truncating that final decimal to zero or rounding it up; guessing it to be .5 or .8 or whatever is better, it's more accurate. That is what is going on here. When moving these dots around based upon the algorithm, there are going to be many times where a black dot was placed to the left or right (or up/down) simply because some fraction of a fraction in the math said so. Increasing the resolution and then shrinking it again later allows for more shades of gray to be determined all along the edges when appropriate.
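The ruler analogy can be put into numbers with a small sketch. Assume one output pixel covers the span from 0 to 1, with a letter edge somewhere inside it: black to the left of the edge, white to the right. The helper name `edge_pixel` and the edge position 0.4 are made up for illustration.

```python
def edge_pixel(edge, k):
    """Gray value of the pixel spanning [0, 1), estimated from k
    evenly spaced sub-samples. Left of `edge` is black (0), right of
    it is white (255)."""
    samples = [0 if (i + 0.5) / k < edge else 255 for i in range(k)]
    return sum(samples) / k


# The edge sits 40% of the way across the pixel, so the ideal value
# is 60% white: 0.6 * 255 = 153.
print(edge_pixel(0.4, 1))   # 255.0  (one sample: all-or-nothing)
print(edge_pixel(0.4, 3))   # 170.0
print(edge_pixel(0.4, 5))   # 153.0
```

A single sample has to call the pixel fully white. With 3 or 5 sub-samples the pixel comes out as a gray close to the 60% white it actually contains. For this particular edge position 5 sub-samples happen to hit the ideal value exactly; other positions will be close rather than exact.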
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK

Re: Using a Reverse Warp Calculation to Dewarp

Post by Tulon »

It's been invented before :)

Mapping from destination to source pixels is a standard way of doing any geometric transformation.
Instead of an explicit upscale->transform->rescale approach, people have been doing something called "area mapping", which has the same effect. You project your destination pixel to source image coordinates, producing an arbitrary quadrilateral. Then (for speed) you approximate it with a square or a rectangle and see which source pixels, and what fraction of each, it covers. Then you assign your destination pixel the weighted sum of those source pixels.
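As a rough sketch of the rectangle approximation described above (not Scan Tailor's actual code; `area_sample` is a hypothetical helper), the weighted sum for one destination pixel could look like this, assuming the projected footprint has already been reduced to an axis-aligned rectangle in source coordinates:

```python
import math


def area_sample(src, x0, y0, x1, y1):
    """Coverage-weighted mean of the source pixels under the rectangle
    [x0, x1) x [y0, y1), given in source pixel coordinates.
    Returns None if the rectangle lies entirely outside the image."""
    h, w = len(src), len(src[0])
    total = 0.0
    weight = 0.0
    for iy in range(max(0, math.floor(y0)), min(h, math.ceil(y1))):
        # Fraction of this pixel row covered by the rectangle.
        wy = min(y1, iy + 1) - max(y0, iy)
        for ix in range(max(0, math.floor(x0)), min(w, math.ceil(x1))):
            # Fraction of this pixel column covered by the rectangle.
            wx = min(x1, ix + 1) - max(x0, ix)
            total += src[iy][ix] * wx * wy
            weight += wx * wy
    return total / weight if weight > 0 else None


# A footprint straddling a black and a white pixel half and half.
print(area_sample([[0, 255]], 0.5, 0.0, 1.5, 1.0))   # 127.5
```

The weights are exact overlap areas, so no intermediate enlarged image is needed. A footprint that covers two thirds of the black pixel and one third of the white one, such as `area_sample([[0, 255]], 1/3, 0.0, 4/3, 1.0)`, comes out at about 85.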

And yes, Scan Tailor does it already.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.