Raspberry PI camera?


BillGill
Posts: 139
Joined: 18 Dec 2016, 17:13
E-book readers owned: Calibre, FBReader
Number of books owned: 7000
Country: USA

Raspberry PI camera?

Post by BillGill »

It has occurred to me that a good camera for scanning books might be a Raspberry Pi camera. I did a really quick check online and found this one on Amazon: https://www.amazon.com/Raspberry-Pi-Cam ... B01ER2SKFS At $25 plus a Raspberry Pi kit ($50 at the top of the link) you could have a relatively inexpensive 8 MP camera.

You might have to configure the software, but there is a strong community of Raspberry Pi fans who have probably developed software that could be used to control the camera and download the pictures to your main computer.

Now that I have thought about this I may wind up having to actually do something with it. Of course I have been thinking of another project, so maybe not. We shall see.

Bill
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: Raspberry PI camera?

Post by duerig »

I have played around with this. There are a few limitations of Raspberry Pi cameras: the 8 MP resolution, the complete lack of zoom, and the fact that a single Raspberry Pi can only handle one camera at a time. But they are cheap, and they have manually set focus control. They are also relatively more reliable than the CHDK-based solutions, since they are designed up front to work with a computer controller rather than having that added in later by a third party.

There are two possible ways to control dual cameras with the Raspberry Pi. First, there is a custom board somebody created that acts as a camera multiplexer and is controlled with GPIO pins.

The second is that you can actually use three Raspberry Pi boards: two Raspberry Pi Zeros and a Raspberry Pi 3 'hub' board. Each Raspberry Pi Zero controls one camera and talks to the hub using Ethernet-over-USB. The hub then consolidates the images and/or provides a UI. I got a proof-of-concept working for this at one point, but it would be neat to see it taken further.
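To make the "hub consolidates the images" step concrete, here is a minimal sketch of one way the hub could merge the two per-camera image lists into reading order. It is not the proof-of-concept code mentioned above. The function name, the file paths, and the assumption that shot N from each Zero belongs to the same page turn (left page first) are all hypothetical.

```python
from itertools import zip_longest


def interleave_pages(left_shots, right_shots):
    """Merge two per-camera capture lists into reading order.

    Assumes shot i from each camera was taken on the same page turn and
    that the left-page image comes first within each spread.
    """
    ordered = []
    for left, right in zip_longest(left_shots, right_shots):
        # zip_longest pads the shorter list with None; skip the padding.
        if left is not None:
            ordered.append(left)
        if right is not None:
            ordered.append(right)
    return ordered


# Hypothetical file names as they might arrive from the two Pi Zeros.
left = ["zero_a/0001.jpg", "zero_a/0002.jpg"]
right = ["zero_b/0001.jpg", "zero_b/0002.jpg"]
print(interleave_pages(left, right))
```

How the files actually reach the hub (copying over the Ethernet-over-USB link, a shared folder, and so on) is left out, since the thread does not say how that was done.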

-Jonathon Duerig
BillGill

Re: Raspberry PI camera?

Post by BillGill »

I just got a Raspberry PI camera and am starting to play around with it. The lack of zoom doesn't bother me, since I am currently using a single camera system with the camera placed so that it just covers the platen with no zoom. I am hoping the 8 MP resolution will provide a better input to the OCR, so that the output will be closer to the original, without needing so much editing.

One of the things I have noticed about it is that there are a lot of cables coming out of it. Power, monitor, mouse, and keyboard. Some of those can be removed by using a wireless mouse and keyboard. Connecting it to another computer would add one more cable. So there may be some cable control issues.

Also I will need to work with the software to figure out the best control strategy. I may be able to download some good stuff from the Raspberry PI community. I may have to learn Python if I have to do any programming for myself. That may be frustrating, but I have done quite a bit of programming, so it shouldn't be too much trouble. Just the usual frustration of learning a new language.

Bill
BillGill

Re: Raspberry PI camera?

Post by BillGill »

I got started working with the Raspberry Pi camera yesterday. I think maybe AMT's USB camera https://forum.diybookscanner.org/viewt ... =15&t=3535 might be a better choice than a Pi camera. If, as he says, there is free software to control the camera, then you get to bypass a bunch of stuff in processing the pictures.

I have been able to get pictures using the Pi Cam, but I am still having problems getting the physical set up right. The thing about the PI Cam is that it has a manual focus. It comes with the focus set at infinity. If you want a closer focus you have to manually set it. In theory this is relatively simple. My camera came with a focus tool that slips over the lens and allows the focus to be easily adjusted. The problem I have encountered is that when I have the camera installed in a case I can't get the focus tool on the lens. Here is a photo of the camera installed in a standard style of case.
PiCamFocus.jpg
Notice that the lens (that's that little black thing with a dot in the middle peeping through the hole in the case) is totally enclosed, with no way to reach it with the adjustment tool. The tool is the white plastic circle at the bottom of the picture. I thought at first I could enlarge the hole the lens is peeking through, but drilling one big enough for the tool would damage the camera mount, so that isn't an option.

One option would be to mount the camera separately, but that would leave the electronics exposed. I really don't like exposed electronics. I can probably come up with a solution, but it is obviously not a real simple job.

As far as I can see the USB camera is the same as the Pi Camera, with a different interface. So AMT will just need to come up with a realistic mounting method.

Bill
BillGill

Re: Raspberry PI camera?

Post by BillGill »

Ok, I don't have a full setup using a Pi Camera for scanning, but I have gotten far enough to do some test scans. I had some problems coming up with a mount that (almost) satisfies me. I decided that the main item of interest was not exactly the look of the images, but what the OCR saw in the images. So I used an image from the latest book I have been scanning. Then I took a fresh scan of the same page using the Pi Camera. I then ran both of them through the OCR program. I am using ABBYY FineReader 14. I saved the output as Word 16 files (.docx). Then I took screen captures of the files and here they are.

This is the text output from an image taken by a Canon Elph 160.
Canontext.jpg
This is the text from an image taken by the Pi Camera.
PiCamtest.jpg
As you can see, the text converted from the Pi Camera is much better than that from the Canon. I want to emphasize that scans from this book seem to be low quality. That, I think, makes a good case for using the Pi Cam.

This does not mean that I am pushing the use of a Pi Cam for scanning. There seems to be considerable effort required in getting the whole system working. Using the USB camera suggested by AMT may be a better way, since it appears that it can be integrated into a PC.

The system I am using right now takes one image at a time, and then it has to be reset. There are ways to have the image captured on command, I just have to find the best one and install the correct software in the Raspberry PI.
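One way "captured on command" could look in Python is sketched below. This is only a sketch under assumptions, not the setup described above. It assumes the legacy `picamera` library is installed on the Pi and that the camera is the 8 MP v2 module, whose full frame is 3280 x 2464. The function names, the `scans` folder, and the Enter-key trigger are made up for illustration. The capture loop takes the camera call as a parameter so it can be tried out without a camera attached.

```python
import os


def capture_pages(capture_fn, wait_fn, out_dir, max_pages=1000):
    """Save one numbered image per trigger until wait_fn() returns False.

    capture_fn(path) must write one image to path.
    wait_fn() blocks until the operator asks for the next shot and
    returns False when scanning should stop.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for page in range(1, max_pages + 1):
        if not wait_fn():
            break
        path = os.path.join(out_dir, "page%04d.jpg" % page)
        capture_fn(path)
        saved.append(path)
    return saved


def wait_for_enter():
    """Return True on Enter, False if the operator types q."""
    return input("Enter = capture, q = quit: ").strip().lower() != "q"


def main():
    # Imported here so capture_pages() can be tested on a machine without a camera.
    from picamera import PiCamera  # assumption: legacy picamera library is installed

    camera = PiCamera()
    camera.resolution = (3280, 2464)  # assumption: full frame of the 8 MP v2 module
    try:
        capture_pages(camera.capture, wait_for_enter, "scans")
    finally:
        camera.close()

# On the Pi, call main() to start scanning.
```

The Enter key is just the simplest trigger to show. A foot pedal or a push button on a GPIO pin could replace `wait_for_enter` without touching the capture loop.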

Bill
dpc
Posts: 379
Joined: 01 Apr 2011, 18:05
Number of books owned: 0
Location: Issaquah, WA

Re: Raspberry PI camera?

Post by dpc »

Wow. Interesting results. It still would be worthwhile to see the image files directly from the two cameras to try to figure out why there was such a big difference in the final two files you've posted. Do you have the original page images available?

The OCR'ed results are so dissimilar that if I didn't know better I'd say those came from two different pages! That's the problem with ABBYY though. Even with the better camera you've still got a bit of manual editing to do on that page. I finally just gave up trying to produce OCR'ed ebooks and went to searchable PDF to archive my library. Sure, the files are larger but storage is a lot cheaper than my time to proofread and manually edit thousands of pages.
BillGill

Re: Raspberry PI camera?

Post by BillGill »

I didn't post the original files because the files are so large. I would have to run them through Photoshop to compress them down to a size that I can upload to the forum. So the differences might be lost to the compression. I want to emphasize that this book was worse than average. I spent almost 18 hours doing the editing. The one I did just before this one was a much better scan and it took about 14 and a half hours. My impression is that it was a longer book. It had a few more pages, but I think the type used was more condensed.

For reference, I just checked: the earlier, easier book wound up being 140 pages in Word. The bad one wound up being 88 pages in Word.

OK, I just tried compressing the files enough to load them into the forum. It looks like a losing proposition. I could probably do it, but it would take more time than I am really interested in.

I prefer to go for an EPUB output. The file can then be transferred to my device and is easily readable on many devices. PDF may be good, but I'm not sure about it. At least not enough to use them for general reading. I am trying to migrate my library to my tablet so I will be able to take it with me if I wind up having to go into an assisted living facility.

Bill
dpc

Re: Raspberry PI camera?

Post by dpc »

OK, well thanks for trying. You might be able to upload your large images to one of the free image hosting sites out there and post a link here. You have to be wary of compression artifacts though. Some of these sites will allow large image files to be uploaded, but they will recompress your file to a more lossy JPEG so that it's much smaller in size.

Wikipedia: List of image-sharing websites
zbgns
Posts: 61
Joined: 22 Dec 2016, 06:07
E-book readers owned: Tolino, Kindle
Number of books owned: 600
Country: Poland

Re: Raspberry PI camera?

Post by zbgns »

Maybe the resolution of the images you load into FineReader is too high? I mean especially the images taken by the Canon. It needs to be taken into consideration that OCR systems do not like DPI that is either too low or too high. Usually 300 DPI is optimal, as OCR engines are trained to work with images of this resolution. This applies to FineReader too: https://abbyy.technology/en:kb:images_r ... n_size_ocr
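To put rough numbers on the DPI point, here is a small worked example. The figures are assumptions, not measurements from the scans in this thread: it assumes each camera's frame covers 10 inches along its long side, that the 8 MP Pi Camera frame is 3280 pixels wide, and that the 20 MP Canon frame is 5152 pixels wide.

```python
def effective_dpi(pixels_long_side, inches_long_side):
    """Pixels per inch when the frame's long side spans the given length."""
    return pixels_long_side / inches_long_side


def scale_to_target(dpi, target_dpi=300.0):
    """Resize factor that brings an image from dpi to target_dpi."""
    return target_dpi / dpi


FRAME_INCHES = 10.0  # assumption: the frame covers 10 inches along its long side

# Assumed long-side pixel counts for the two cameras discussed in this thread.
for name, pixels in [("Pi Camera (8 MP)", 3280), ("Canon (20 MP)", 5152)]:
    dpi = effective_dpi(pixels, FRAME_INCHES)
    print("%s: %.0f DPI, scale by %.2f to reach 300 DPI" % (name, dpi, scale_to_target(dpi)))
```

Under those assumptions the Pi Camera lands near 328 DPI and the Canon near 515 DPI, so only the Canon image would need substantial downscaling before OCR. With a different framing distance the numbers shift accordingly.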

With higher DPI, OCR systems may be overreactive and interpret garbage (spots, stains and so on) as letters, numbers and symbols. I guess that this may be the case with the output from the Canon image. Misrecognized text attributes like bold and italic may also signal this. So it seems probable to me that the image from the Canon was 'too good' for FineReader, whereas the Pi Camera output was of too low quality. In both cases it resulted in a significant number of OCR errors, although in different places. FineReader is a very good OCR technology, so I would expect 1 - 3 errors per page in a typical situation. Otherwise there must be something wrong with the quality of the input images.

I also think that FineReader wrongly identified the block of text in both cases and included something that creates a vertical line after binarization of the image. Maybe it is the edge of the paper or part of the gutter. Characters like ']', '|', '1' in places where there were originally line breaks may indicate this.

As I understand it, you performed OCR on raw images taken by both cameras with no preprocessing. Maybe it is worth considering Scan Tailor in order to obtain images that are binarized, deskewed, with proper DPI and with no distortions? If you do not like Scan Tailor for any reason, you may use e.g. Photoshop for this test. After this, the accuracy of OCR on raw and preprocessed images may be compared. I also hope that you will find a solution to upload sample images.
BillGill

Re: Raspberry PI camera?

Post by BillGill »

I attributed the improvement of the OCR to the fact that the image from the Pi Camera has much better resolution. However the Pi Camera has 8 MPX resolution and the Canon has 20 MPX resolution. So that is backward from what I was thinking. However the image size is the other way around. The Canon image is 1.59 MB. The Pi Camera image is 4.41 MB. The images are both JPGs. The 2 cameras must be using different compression algorithms. I suspect that the PiCamera image has enough better detail that Finereader does a better job. Possibly cropping the overscan off of the images would help. I may try that.
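The file sizes and megapixel counts above can be turned into a rough "bytes per pixel" figure, which shows how much harder the Canon JPG is compressed. This is back-of-the-envelope arithmetic only: it uses the nominal 20 MP and 8 MP figures rather than exact pixel dimensions, and it treats 1 MB as 1,000,000 bytes (the ratio comes out the same either way).

```python
def bytes_per_pixel(file_megabytes, megapixels):
    """Average stored bytes per pixel for a compressed image file."""
    return (file_megabytes * 1_000_000) / (megapixels * 1_000_000)


canon = bytes_per_pixel(1.59, 20)  # Canon: 1.59 MB file, 20 MP sensor
picam = bytes_per_pixel(4.41, 8)   # Pi Camera: 4.41 MB file, 8 MP sensor

print("Canon:     %.3f bytes/pixel" % canon)
print("Pi Camera: %.3f bytes/pixel" % picam)
print("Pi Camera keeps about %.1fx more data per pixel" % (picam / canon))
```

That works out to about 0.08 bytes per pixel for the Canon against about 0.55 for the Pi Camera, close to a 7x difference. It fits the guess that the two cameras compress very differently, though it does not by itself prove that compression is what hurt the OCR.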

Bill