Are DIY Book Scanners better than commercial scanners?


elbourne
Posts: 6
Joined: 11 Oct 2019, 09:35
E-book readers owned: Freedom Scientific Openbook
Number of books owned: 10000
Country: USA

Are DIY Book Scanners better than commercial scanners?

Post by elbourne »

I am new here and I love what y'all are doing.

I am legally blind and have been scanning books for over 25 years. I OCR them and run them through voice synthesis, so I can read them. In the old days I used a flat bed scanner. I now use a product by Freedom Scientific, which is a camera mounted on a stand that works in conjunction with their proprietary software. It does a pretty good job, but I have recently been wondering if there is something better out there.

In my search I have found this site and love the idea of building a rig myself and using open source software for processing.

My question is, how does a DIY book scanner compare with a commercial scanner? Is it comparable to a $525.00 CZUR or an $8,000 ATIZ, or something else?

I'm thinking especially regarding accuracy and speed.

What is going to give me the best OCR results? I have seen the open source Tesseract OCR program, but have not tested it enough, and I do not have FineReader. What is your experience?

Thank you so much.
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: Are DIY Book Scanners better than commercial scanners?

Post by duerig »

Overall, a DIY book scanner is going to give higher quality than a flatbed scanner near the gutter. Away from the gutter, the sensor of a flatbed scanner will likely yield better results. But with a flatbed you will always have distorted or cut-off text near the gutter, since the page curves there.

DIY scanners will also give higher quality than the inexpensive overhead scanners. Those inexpensive overhead scanners usually have cell-phone-style cameras without great resolution. And worst of all, there is nothing that holds the pages flat. So the scanners use one of a number of heuristics to dewarp the page, and these heuristics work great some of the time and terribly at other times. They tend to work best on standard printed pages full of text in straight lines. And they tend to fall down if there is artwork or other non-straight text on the page.

DIY scanners usually have a platen that holds the pages flat. The purpose of a DIY scanner is to provide an ideal photographic environment with flat pages and even lighting in order to take a picture. At this point, the best DIY scanners succeed in this and the limiting factor is usually the quality of the camera.
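Since the camera ends up being the limiting factor, it can help to estimate how many pixels per inch a given camera actually puts on the page. The sketch below is only an illustration: the function name, the fill fraction, and the example numbers are assumptions, not measurements from any particular rig.

```python
def effective_ppi(pixels_along_edge: int,
                  page_edge_inches: float,
                  fill_fraction: float = 1.0) -> float:
    """Estimate pixels per inch on the page.

    pixels_along_edge: sensor pixels along one edge of the frame.
    page_edge_inches:  length of the page edge that runs along that frame edge.
    fill_fraction:     share of the frame edge the page occupies (assumed value).
    """
    if page_edge_inches <= 0:
        raise ValueError("page_edge_inches must be positive")
    if not 0 < fill_fraction <= 1:
        raise ValueError("fill_fraction must be in (0, 1]")
    return pixels_along_edge * fill_fraction / page_edge_inches


# Hypothetical numbers: a 4000-pixel frame edge, a 10 inch page edge,
# and the page filling three quarters of the frame.
print(effective_ppi(4000, 10.0, 0.75))
```

A camera with more pixels, or a page that fills more of the frame, raises this figure, which is one way to compare candidate cameras before buying.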

The $8000 commercial scanner will likely get higher quality than a DIY scanner, though that is likely mostly down to the high end cameras used in such a rig. Spending the money on high end cameras with a DIY scanner will yield high quality results that are probably comparable. Though at a certain price point of cameras, you can end up spending a lot anyhow.

And the hundred thousand dollar and million dollar scanners will almost certainly give much better results than a DIY scanner. And they will scan your books automatically and probably tie your shoelaces as well. :)

There are diminishing returns here. If you have an overhead scanner and are getting good results with what you have, then there is probably no reason to try to build a DIY scanner. If you just have a copy stand, the benefit you might see from a DIY scanner might end up mainly being a matter of throughput or ease of scanning (DIY scanners often have scanning rates higher than 1000 pages per hour).
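To put the throughput figure in perspective, here is a rough back-of-the-envelope sketch. The 1000 pages per hour rate comes from the paragraph above; the 300-page book is an assumed example, and the function name is made up for illustration.

```python
def scan_minutes(total_pages: int, pages_per_hour: float) -> float:
    """Minutes needed to scan total_pages at a steady pages_per_hour rate."""
    if pages_per_hour <= 0:
        raise ValueError("pages_per_hour must be positive")
    return total_pages * 60 / pages_per_hour


# Assumed example: one 300-page book at 1000 pages per hour.
print(scan_minutes(300, 1000))  # 18.0
```

This counts only time at the scanner. Setup, page turning mistakes, and post-processing are not included.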

Tesseract seems reasonably accurate when I've used it. It does take a long time to run, but that is compute time, so there is no need for you to sit and wait as it runs. I don't use OCR much myself, though, so I can't comment beyond that.
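Because the OCR step is just compute time, one way to run it unattended is a small batch script that calls the `tesseract` command line tool on every image in a folder. This is a minimal sketch, assuming Tesseract is installed and on the PATH, that the scans are PNG files, and that the text is English. The function names and folder layout are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(image: Path, output_base: Path, lang: str = "eng") -> List[str]:
    """Build a `tesseract IMAGE OUTPUTBASE -l LANG` command line.

    Tesseract appends ".txt" to the output base itself.
    """
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_folder(image_dir: Path, text_dir: Path, pattern: str = "*.png") -> int:
    """Run Tesseract on each matching image; return how many were processed."""
    text_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for image in sorted(image_dir.glob(pattern)):
        # One text file per page, named after the image file.
        command = build_command(image, text_dir / image.stem)
        subprocess.run(command, check=True)  # raises if tesseract fails
        count += 1
    return count
```

Calling `ocr_folder(Path("scans"), Path("text"))` would then work through a whole book's worth of page images while you do something else, leaving one `.txt` file per page for the voice synthesis step.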

Best of luck.

-Jonathon Duerig
elbourne
Posts: 6
Joined: 11 Oct 2019, 09:35
E-book readers owned: Freedom Scientific Openbook
Number of books owned: 10000
Country: USA

Re: Are DIY Book Scanners better than commercial scanners?

Post by elbourne »

Thanks Jonathon. That is very helpful.

I think I will probably end up building a DIY scanner. My setup does relatively well, but I think I might could get better results with something like y'all are doing here. And plus, it looks very fun. :)