Help Wanted - Will pay for expertise in batch processing of JPEGS

Moderator: peterZ

imperialist1960
Posts: 5
Joined: 08 Mar 2016, 03:29
Number of books owned: 5000
Country: United States

Help Wanted - Will pay for expertise in batch processing of JPEGS

Post by imperialist1960 »

I run a website that is an archival resource of scanned technical material and is free to the public.

The content was acquired and scanned 15 years ago. The JPEG images were manually processed by volunteers who burned out before the project was completed. I am looking for someone who has automated, or can automate, a system so that the images are straightened, descreened, whitened, sharpened, etc.

There are perhaps 60 books of about 400 pages each. I'd have to count, but that's the ballpark.
There is no deadline, but progress must be steady and consistent.

Do you know how to process JPEGs to create straightened, cleaned images?
The pages were scanned after the spines were removed: first one at a time, then by small-batch feeder, and the last were done on a large office-class machine after hours while the boss was away. The last 50% were done in the time it took to do a chapter at first...

This is a PAID gig. Please forward this post to anyone you know.

We would need a single volume done to our satisfaction as proof of ability to proceed.
Once set up, how hard can it be? Hard enough for us. We need help!
www.imperialclub.com
100,000+ pages of electrons that salute America's Most Carefully Built Car
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: Help Wanted - Will pay for expertise in batch processing of JPEGS

Post by duerig »

I think that for your particular use case, this script might do everything you need:

http://www.fmwconcepts.com/imagemagick/ ... /index.php

Play around with the options until you find something you like. Then just start it running on your images and come back to it in a week and see what it looks like.

Since you used a document feeder, you don't have to worry about cropping, which is a tricky problem when post-processing images from camera-based scanners. This script sharpens, cleans the background, deskews (called unrotate in the script), etc.
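
For the "start it running and come back in a week" part, a minimal sketch of a resumable batch driver might look like the following. It assumes the cleanup tool can be called as a command taking an input and an output path; `./cleanup_script` and the `{src}`/`{dst}` template are hypothetical placeholders, not the real script name or its options, so substitute whatever settings you settle on. The output folder should sit outside the input folder.

```python
import subprocess
from pathlib import Path

# Hypothetical placeholder command: swap in the real script and its options.
# The output path is assumed to be the last element of the template.
COMMAND_TEMPLATE = ["./cleanup_script", "{src}", "{dst}"]

def plan_jobs(src_dir, dst_dir, template=COMMAND_TEMPLATE):
    """Return one command (a list of strings) per JPEG that still needs processing."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    jobs = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        dst = dst_dir / src.relative_to(src_dir)
        if dst.exists():
            continue  # finished on an earlier run, so the batch can be resumed
        jobs.append([part.format(src=src, dst=dst) for part in template])
    return jobs

def run_jobs(jobs):
    """Run each command in turn and return the commands that failed."""
    failed = []
    for cmd in jobs:
        # Make sure the output folder for this page exists.
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        try:
            ok = subprocess.run(cmd).returncode == 0
        except OSError:
            ok = False  # e.g. the placeholder script does not exist
        if not ok:
            failed.append(cmd)
    return failed
```

Calling `run_jobs(plan_jobs("raw_scans", "cleaned"))` (folder names are made up) would process one book folder at a time, and rerunning it after an interruption skips pages that already have an output file.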

-D
imperialist1960
Posts: 5
Joined: 08 Mar 2016, 03:29
Number of books owned: 5000
Country: United States

Re: Help Wanted - Will pay for expertise in batch processing of JPEGS

Post by imperialist1960 »

Fantastic - I will look into this - really appreciate the tip.
imperialist1960
Posts: 5
Joined: 08 Mar 2016, 03:29
Number of books owned: 5000
Country: United States

Re: Help Wanted - Will pay for expertise in batch processing of JPEGS

Post by imperialist1960 »

The answer to my question was a "Document Scanning Service Bureau".

Found them on Google (they're all over the place).
They do what this list exists to teach one to DIY: they will scan for a fee, or process already-scanned images for a fee.
In this case, processing of scanned JPEGs will run about $0.02-$0.05 each, depending on complexity.

Raw scans are processed start to finish at $0.08-$0.12 per sheet via a commercial process, using equipment that runs the loose sheets through a high-speed mechanical feeder.

Prices are probably lower in lower cost metro areas (I'm in SF Bay Area, where everything's expensive).
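
As a rough worked estimate, here is the arithmetic for the processing-only price applied to the ballpark collection size from the first post. It assumes one JPEG per page and exactly 60 books of 400 pages, which is only the estimate given earlier, not a real count; prices are kept in whole cents to avoid rounding surprises.

```python
BOOKS = 60            # ballpark from the first post, not an actual count
PAGES_PER_BOOK = 400  # likewise a ballpark

def cost_range_dollars(units, low_cents, high_cents):
    """Return (low, high) total cost in dollars for a per-unit price range given in cents."""
    return units * low_cents / 100, units * high_cents / 100

pages = BOOKS * PAGES_PER_BOOK
low, high = cost_range_dollars(pages, 2, 5)  # $0.02-$0.05 per processed image
print(f"{pages:,} pages: ${low:,.0f} to ${high:,.0f}")
```

The per-sheet price for raw scans would be figured the same way, counting physical sheets rather than page images.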
joseph73
Posts: 24
Joined: 10 Jan 2013, 22:02
Number of books owned: 1000
Country: USA

Re: Help Wanted - Will pay for expertise in batch processing of JPEGS

Post by joseph73 »

I can do this. I'm studying for a big exam and have extra time. I have a pretty extensive (mostly automated) workflow for processing books: background yellowing removal, text line straightening, then output to a PDF file. It has to be tweaked for each book, as the quality of the text differs. Some ink is dark, some light. Some books' pages are degrading, etc. I tend not to use text binarization, as it degrades the text quality too much. Leaving it in grayscale or color means more CPU processing and slightly bigger file sizes, but *MUCH* better text quality. File sizes are 8-10 MB for text only, maybe 50-100 MB for full-color books, for an average of 300-400 pages. JPEG 2000 could reduce that size if needed. Books are searchable. I use full-spectrum lights and a Foveon sensor for better color images; those files work best for post-processing, but I can process any image file.
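
To illustrate the grayscale-versus-binarization trade-off described above, here is a toy sketch on a single row of pixel values. It is not the workflow from the post: it assumes 8-bit grayscale input, and the `black`, `white`, and `threshold` values are made-up numbers that would need tuning per book.

```python
def whiten(pixels, black=40, white=200):
    """Levels stretch: values at or below `black` become 0, at or above `white` become 255.

    Everything in between is scaled linearly, so soft character edges stay gray.
    """
    scale = 255 / (white - black)
    return [[min(255, max(0, round((p - black) * scale))) for p in row]
            for row in pixels]

def binarize(pixels, threshold=128):
    """Hard threshold: every pixel becomes pure black or pure white."""
    return [[255 if p >= threshold else 0 for p in row] for row in pixels]

# One row: dingy paper, a soft character edge, dark ink, darker ink.
row = [[215, 130, 60, 30]]
print(whiten(row))    # paper goes to white, the edge pixel stays mid-gray
print(binarize(row))  # the edge pixel is forced to white, losing the soft edge
```

The stretch cleans up the background while keeping the intermediate tones that make text edges look smooth, which is the quality the hard threshold throws away.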