What is math behind image correction?


Moderator: peterZ

gemurdock
Posts: 3
Joined: 16 Apr 2015, 10:59
Number of books owned: 0
Country: United States

What is math behind image correction?

Post by gemurdock »

I am going to develop software that takes images from a phone camera and turns them into ebooks for Kindle. I am currently writing it in Java, but may switch to Python. The only thing holding me back is that I like GUI development in Java better.

Anyway, the project already uses the Tesseract OCR engine and I love it. However, I can't seem to get Leptonica working correctly, so I may just write the image correction part myself. The only problem is that I don't know how this is done mathematically, and I was hoping someone could point me in the right direction. Does anyone know the math behind fixing images that are distorted by curved pages?

Example image:

Image
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: What is math behind image correction?

Post by duerig »

There are several different kinds of image correction that you would have to run on this scan. The easiest is perspective dewarping, which corrects for the camera viewing the page at an oblique angle. If the camera and book stay in a fixed position relative to each other, you can use a chessboard or other calibration image to do this.
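To make the perspective step concrete, here is a minimal sketch, assuming you already know where four reference points (for example, the page corners or the outer chessboard corners) land in the photo. A perspective correction of a flat page is a 3x3 homography, and four point pairs are enough to solve for it. The function names and the pixel coordinates are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot to keep the solve stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4_points(src, dst):
    """Return H as 9 floats (row-major, last entry fixed to 1) mapping src to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]


def apply_homography(h, point):
    """Map one (x, y) point through H, including the perspective divide."""
    x, y = point
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical page corners in the photo, and the rectangle they should become.
photo_corners = [(120.0, 80.0), (980.0, 140.0), (1040.0, 1350.0), (60.0, 1300.0)]
flat_corners = [(0.0, 0.0), (600.0, 0.0), (600.0, 800.0), (0.0, 800.0)]
H = homography_from_4_points(photo_corners, flat_corners)
```

To actually warp an image you would normally solve for the inverse mapping (flat to photo) and sample the photo once per output pixel, so that the output has no holes.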

Second is the lens distortion of the camera. This is also fairly easy because the optical distortion inside your camera will always be consistent. There may even be an in-camera setting to correct or calibrate for this.
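As a rough illustration of the lens step, here is a sketch that assumes the common one-term radial model, where a point at squared distance r^2 from the optical center is scaled by (1 + k1 * r^2). The coefficient k1 and the center are assumptions that would come from calibrating your own camera; real lenses often need more terms than this.

```python
def distort(point, k1, center=(0.0, 0.0)):
    """Apply one-term radial distortion to an ideal (undistorted) point."""
    x, y = point[0] - center[0], point[1] - center[1]
    scale = 1.0 + k1 * (x * x + y * y)
    return (center[0] + x * scale, center[1] + y * scale)


def undistort(point, k1, center=(0.0, 0.0), iterations=20):
    """Invert distort() by fixed-point iteration.

    The model has no simple closed-form inverse, so start from the distorted
    point and repeatedly divide by the scale computed at the current guess.
    This converges quickly when k1 * r^2 is small.
    """
    xd, yd = point[0] - center[0], point[1] - center[1]
    x, y = xd, yd
    for _ in range(iterations):
        scale = 1.0 + k1 * (x * x + y * y)
        x, y = xd / scale, yd / scale
    return (center[0] + x, center[1] + y)


# Coordinates here are normalized (roughly -1..1 across the image), and
# k1 = -0.1 is an arbitrary example value for mild barrel distortion.
bent = distort((0.5, 0.4), k1=-0.1)
straight = undistort(bent, k1=-0.1)
```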

But the hardest thing is to correct for the curvature of the page. There is a lot of academic literature on this, but most of it comes down to the same steps: assume that the text should all be in straight lines, detect the curvature of the text, use that to estimate the shape of the page as a cylinder or other regular surface, and finally use that shape to dewarp it. You want to search for terms like "document rectification" and "dewarping". ScanTailor also does this, so you can look at the code there.
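The last step, using the estimated shape to dewarp, is the easiest one to show. Here is a minimal sketch, assuming the page really is a cylinder-like surface and that you already have its cross-section profile as sampled (x, z) pairs, with z being the depth of the page at horizontal position x. Estimating that profile from the text lines is the hard part and is not shown. Flattening then just means replacing x with the arc length along the profile.

```python
import math


def flatten_profile(xs, zs):
    """Return the flattened horizontal position of each profile sample.

    xs, zs: equal-length lists describing the page cross-section.
    The result is the cumulative arc length along the profile, which is
    where each sample would sit if the page were pressed flat.
    """
    flat = [0.0]
    for i in range(1, len(xs)):
        step = math.hypot(xs[i] - xs[i - 1], zs[i] - zs[i - 1])
        flat.append(flat[-1] + step)
    return flat


# Hypothetical profile: a quarter circle of radius 100, sampled finely.
radius, samples = 100.0, 1000
angles = [0.5 * math.pi * i / samples for i in range(samples + 1)]
xs = [radius * math.sin(a) for a in angles]
zs = [radius * math.cos(a) for a in angles]
flat = flatten_profile(xs, zs)
# flat[-1] approximates the true arc length, radius * pi / 2.
```

Note that the photo only spans 100 units horizontally here, while the flattened page is about 157 units wide. That stretching near the gutter is exactly what a curved-page correction has to recover.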

But there is no single way to do it because it is really just a heuristic. Even the best algorithms produce odd or bogus results fairly often, especially on pages where the typography does not match the assumptions above, such as a map or title page.

I have also done some work on using line-focus lasers to dewarp pages more reliably. I think this can give much better consistency than content-based heuristics.

I will post some papers that try to dewarp based on content here when I get a chance, so you can look through them yourself. The one that made the most sense to me used a two-phase dewarping. I will link that.

-D
gemurdock

Re: What is math behind image correction?

Post by gemurdock »

duerig, as far as I could tell, the content-based and laser-based approaches seemed similar, but if laser is more accurate overall then I would rather program for that, since lasers are inexpensive. Where could I find some resources on that in particular? What is the premise behind it?
duerig

Re: What is math behind image correction?

Post by duerig »

Take a look at these threads for Laser Scan stuff:

http://www.diybookscanner.org/forum/vie ... =17&t=3066
http://www.diybookscanner.org/forum/vie ... =17&t=3079

More generally, I really liked this paper on dewarping:

Webpage: http://academic.research.microsoft.com/ ... 03335.aspx
PDF: http://iit.demokritos.gr/~bgat/3337a209.pdf

It takes as input two arcs and the lines between them and then rectifies that region using a 'coarse' algorithm. The authors of the paper find the arcs by looking at the content of the image and assuming that the lines of letters should be straight. But you can substitute laser lines to find the arcs instead. It works both ways. They also propose a 'fine' algorithm which does touch-up, but I haven't looked into it much since the results with just the 'coarse' algorithm were so good when I was using it for laser dewarping.
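To give a feel for the "two arcs in, rectangle out" idea, here is a simplified sketch. It is not the paper's exact algorithm, just one plausible reading of it: each arc is given as a list of (x, y) points in the photo, both arcs are resampled at equal arc-length steps, and each output column is the straight line between matching points on the top and bottom arcs. All function names are made up for illustration.

```python
import math


def resample_by_arc_length(curve, n):
    """Return n points equally spaced by arc length along a polyline."""
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    out, seg = [], 0
    for i in range(n):
        target = cum[-1] * i / (n - 1)
        # Advance to the polyline segment that contains the target length.
        while seg < len(curve) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = curve[seg], curve[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def build_source_map(top, bottom, width, height):
    """For each output pixel [row][col], give the photo (x, y) to sample.

    top, bottom: polylines tracing the upper and lower arcs in the photo.
    width, height: size of the rectified output, both at least 2.
    """
    top_pts = resample_by_arc_length(top, width)
    bot_pts = resample_by_arc_length(bottom, width)
    grid = []
    for row in range(height):
        v = row / (height - 1)  # 0 on the top arc, 1 on the bottom arc
        grid.append([(tx + v * (bx - tx), ty + v * (by - ty))
                     for (tx, ty), (bx, by) in zip(top_pts, bot_pts)])
    return grid


# Hypothetical arcs, e.g. traced from two laser lines across the page.
top_arc = [(0.0, 0.0), (50.0, 10.0), (100.0, 0.0)]
bottom_arc = [(0.0, 200.0), (50.0, 210.0), (100.0, 200.0)]
source_map = build_source_map(top_arc, bottom_arc, width=5, height=3)
```

The map gives fractional photo coordinates, so the final image would be produced by sampling the photo at each of them, ideally with bilinear interpolation.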

If you want to dive into this area, I would suggest that you take a look at the references they describe as related work in the introduction. There are a ton of papers here, all of which have lots of equations to unravel. :)