So long suckers! I'm scanning the easy way...

Book scanning methods that involve taking books apart.

Moderator: peterZ

mjklin
Posts: 17
Joined: 07 Jan 2010, 12:18

So long suckers! I'm scanning the easy way...

Post by mjklin »

Prove me wrong, diybookscanner.org:

I've been lurking for years trying to find a workable non-destructive way to scan books. I'm encouraged by the progress but frankly it's still way behind where I thought it would be by this point. And in the meantime other options have become more viable.

Maybe this is an unstated assumption on this board, but for me my goal has been to have a working station that I could either take into a library or have in a vehicle outside a library. That way I could take care of everything in one shot, especially if said library was far away.

I'm not seeing anything like this available at the moment. So from now on I'm just gonna do it the easy way:

1) Find desired used (good condition) book on Amazon.com for less than five bucks
2) Receive book in mail
3) Cut off binding with portable band saw
4) Scan pages with Fujitsu ScanSnap
5) Run resulting PDF through ABBYY FineReader for OCR'd text
6) Clean up header/footers and other junk with regex program, e.g. RegexBuddy
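For anyone curious what step 6 looks like outside a GUI tool, here is a minimal Python sketch. The two patterns are assumptions about what running headers and page numbers might look like in OCR output (a bare number on its own line, or an all-caps title next to a number); real books will need their own patterns, and `strip_headers_footers` is just a name made up for this example.

```python
import re

# Assumed pattern: a line holding nothing but a page number.
PAGE_NUMBER = re.compile(r"^\s*\d{1,4}\s*$")

# Assumed pattern: an all-caps running header with a page number on
# either side, e.g. "12  THE GREAT BOOK" or "CHAPTER ONE  13".
RUNNING_HEADER = re.compile(
    r"^\s*(?:\d{1,4}\s+[A-Z][A-Z .,'-]+|[A-Z][A-Z .,'-]+\s+\d{1,4})\s*$"
)


def strip_headers_footers(text: str) -> str:
    """Drop lines that look like page numbers or running headers."""
    kept = []
    for line in text.splitlines():
        if PAGE_NUMBER.match(line) or RUNNING_HEADER.match(line):
            continue  # skip the junk line, keep everything else as-is
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "12  THE GREAT BOOK\nIt was a dark night.\n\n13\nThe rain fell."
    print(strip_headers_footers(sample))
```

Blank lines are kept on purpose so paragraph breaks survive; anything fancier (footnotes, hyphenated line ends) would need extra passes.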

This process is my gold standard for ease and convenience. I challenge you to make an easier workflow! See you when I check back in a few years' time...
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: So long suckers! I'm scanning the easy way...

Post by duerig »

One of the promises of the laser scanning work is the creation of much more mobile scanners. The software there is at a good state right now, so we just have to figure out the best hardware arrangement. But even if we design a great set of hardware, it won't be as fast or efficient as your destructive method.

Personally, I have found that I've spent more time on post-processing than actually scanning books. While destructive scanning greatly improves the efficiency of the physical scan process, it doesn't help that much with post-processing. The big win, IMO, comes with completely automating post-processing. I have never used ABBYY, but I hear it is a great program. If it requires no manual intervention, then you are probably about as efficient as it is possible to get. My hope is to achieve that using open source tools (Tesseract, DJVU, etc.).
rkomar
Posts: 98
Joined: 12 May 2013, 16:36
E-book readers owned: PRS-505, PocketBook 902, PRS-T1, PocketBook 623, PocketBook 840
Number of books owned: 3000
Country: Canada

Re: So long suckers! I'm scanning the easy way...

Post by rkomar »

Have you told the library owners that you're going to saw the spines off their books?
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: So long suckers! I'm scanning the easy way...

Post by duerig »

Hahaha. I think the idea is to buy a cheap used book from Amazon *instead* of going to the library. For a rare out of print volume, I have a feeling he would have to skip the band saw.
mjklin
Posts: 17
Joined: 07 Jan 2010, 12:18

Re: So long suckers! I'm scanning the easy way...

Post by mjklin »

duerig: I've never been able to set up a workable system for photographing books beyond just using an iPhone. For me there are just too many variables, most notably page curl near the binding. I too am looking forward to new technologies in this space. I had high hopes for the scanner from that Dutch company but apparently it was awful.

rkomar: I may have been unclear. The idea is to buy the cheap copy from Amazon rather than borrow it from the library. Seriously, have you seen the prices on used books? I might spend more in gas getting to the library...
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: So long suckers! I'm scanning the easy way...

Post by duerig »

mjklin: The idea of laser scanning is that you take two photos of the page. One with lasers on and one as a normal scan. The lasers let you determine the shape of the page and dewarp the image to straighten the text and get rid of page curl. You can see the results as I have improved the algorithm here: http://www.diybookscanner.org/forum/vie ... =17&t=3079
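To give a rough feel for the dewarping idea, here is a toy Python sketch. It is not the algorithm from the linked thread. It assumes the laser line has already been turned into a page height for each image column (`height`, in pixel units), and that the camera looks straight down with no perspective. Where the page slopes, one pixel of image covers more than one pixel of paper, so the row is resampled at even steps along the paper surface. The names `interp` and `flatten_row` are made up for this illustration.

```python
import math


def interp(x, xs, ys):
    """Linear interpolation of ys over increasing xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]


def flatten_row(row, height):
    """Resample one image row at even steps along the curved page surface.

    row:    pixel intensities across the page (one value per column)
    height: assumed page height at each column, same length as row
    """
    # Distance along the paper up to each column: every one-pixel step
    # sideways covers sqrt(1 + dh^2) of paper when the page slopes.
    arc = [0.0]
    for i in range(1, len(height)):
        dh = height[i] - height[i - 1]
        arc.append(arc[-1] + math.sqrt(1.0 + dh * dh))

    # One output pixel per unit of paper length.
    n_out = int(round(arc[-1])) + 1
    cols = list(range(len(row)))
    out = []
    for k in range(n_out):
        target = arc[-1] * k / (n_out - 1) if n_out > 1 else 0.0
        src = interp(target, arc, cols)      # image column at that spot on the paper
        out.append(interp(src, cols, row))   # intensity at that column
    return out
```

A flat page (all heights equal) comes back unchanged, while a sloped region gets stretched out, which is what un-squashes the text near the binding. A real implementation would also have to handle perspective, both image axes, and noise in the laser measurements.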

Now that the post-processing software works pretty well, the hope is to come up with one or more designs that are simple and/or portable. You end up with a copy stand with a couple of cheap lasers attached. You hold the book under the camera with your hands and turn the pages. No plates or levers or moving pieces are required. Afterwards, all the post-processing is completely automated. You fire it off and then come back when it is done to find a DJVU book with all the pages cropped, deskewed, dewarped, and OCR'd.

We'll see how well this turns out in the long run. Right now, I am very optimistic.

Again, even in the best case I don't think this will go as quickly as an auto-fed scanner scanning flat pages and with a different automated post-processing setup. But I think it is about as close as you can get in the non-destructive world without a high-speed camera and cutting edge image-processing.
dpc
Posts: 379
Joined: 01 Apr 2011, 18:05
Number of books owned: 0
Location: Issaquah, WA

Re: So long suckers! I'm scanning the easy way...

Post by dpc »

Your step #5 (and likely step #6) is going to take far more of your time than you realize. I have a theory that this is the reason why a lot of the people who were very involved in the posts on this website in the past mysteriously disappear after they build a scanner. There's a realization that acquiring images is far and away the easiest part of the process of producing a digital book, and they give up once they find out how time-consuming the post-processing phase can be.

Take it from someone who bought ABBYY FineReader Pro thinking he could produce an eBook from scanned images by clicking a few buttons and letting the application do the dirty work running overnight: it takes a lot of hand-holding! After a week of fighting formatting errors and tweaking things with Sigil I finally realized that my time was far more important to me. Now I just archive the books from my home library using Adobe Acrobat, creating searchable PDFs.

I wish you the best of luck. If the destructive scanning doesn't work out you can always use that portable bandsaw to cut steaks from a cheap side of beef.
rkomar
Posts: 98
Joined: 12 May 2013, 16:36
E-book readers owned: PRS-505, PocketBook 902, PRS-T1, PocketBook 623, PocketBook 840
Number of books owned: 3000
Country: Canada

Re: So long suckers! I'm scanning the easy way...

Post by rkomar »

LOL! I have a Plustek OpticBook scanner that I bought quite a while ago. With it, I learned how much work it was producing an ebook after scanning. It's for that reason that I'm dragging my feet on actually finishing a camera-based scanner. I know that the fun part is getting it working, and the dreary part comes as soon as it's finished.