Let's Make A DIY Book Scanner Test Chart

JonEP
Posts: 81
Joined: 19 Apr 2010, 15:09

Re: Let's Make A DIY Book Scanner Test Chart

Post by JonEP »

I like the idea of making a DIY Book Scanner Test Chart, but at this point, for me, the mechanics of photographing the book take a distant second place to the entire conundrum of post-processing, in terms of what I find problematic with the DIY Book Scanning Endeavor (to continue the Winnie-the-Pooh style capitalization theme, which I Very Much Like :D ). My current scanning rig more or less does the trick for me (I'd switch out for better cameras, and figure out how to get the front and back of my platen to come down with equal force, but no complaints otherwise). Scan Tailor is so tantalizingly almost working, but also so confoundingly Often Needing My Personal Attention when I'd like it to be automatic, that I'm far more interested in thinking about a Post Processing Workflow Test Chart of some sort....
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Let's Make A DIY Book Scanner Test Chart

Post by daniel_reetz »

What do you see as the difference between a post-processing test chart (as you envision) and the vision of a complete-system chart as I laid out earlier in the thread?

I hear you on the post-processing. If your rig is stable, have you looked much at steve1066d's new app? It allows for the manual specification of pretty much everything.
kasslloyd
Posts: 41
Joined: 19 Dec 2010, 21:25

Re: Let's Make A DIY Book Scanner Test Chart

Post by kasslloyd »

As in the other thread about something similar, creating a mockup book of free art from, say, Wikimedia Commons in Adobe InDesign is something I could quickly produce, if that is a viable way to create a test case.

To test against the dot patterns of screening with pictures, as in printed on a press. At a press they use a specialized piece of software/hardware called an Imagesetter... to simulate the dot patterns as if it was printed on an offset printer. I don't know how to do that on normal laser printers, even very high end ones. You can approximate colors and looks for proofing in Acrobat, but that's not the same.

I don't think it's necessary to worry about the halftone patterns since I don't think many scanners here will be scanning at high enough DPI to really make those visible...

Thoughts?
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Let's Make A DIY Book Scanner Test Chart

Post by daniel_reetz »

I think we need something like that, plus some test elements. Like, as mentioned earlier in the thread, some lorem ipsum at various sizes and various places across the page.

I'd love to see something mocked up if you're motivated to do it - we can perfect it as we go along.
reggilbert
Posts: 49
Joined: 28 Sep 2010, 19:57
Number of books owned: 3000
Location: Buffalo, New York

Re: Let's Make A DIY Book Scanner Test Chart

Post by reggilbert »

kasslloyd wrote: At a press they use a specialized piece of software/hardware called an Imagesetter... to simulate the dot patterns as if it was printed on an offset printer
I believe an imagesetter is the machine that outputs either positive images (on film or paper, pasted up on pages and given to the print shop to be printed), or negative images (on film, produced by the printer from either the customer's positive or from a digital file, used to shine light through to burn the metal plates that are put on the press) or, perhaps nowadays (this was just coming in when I left the publication business ten years ago), the metal plates themselves (produced by the printer to put directly on the press).

For the printing house it is not a question of simulating dot patterns. It is a question of photos being given a pattern because their machinery cannot print photos without one. Regular offset printers, from three-foot-high machines you can still find in some small urban print shops to the multi-story monsters that produce newspapers and books, are what you might call "ink or no ink" affairs, that is, they are not capable of the subtlety of film paper with resolution so high that no dot is apparent in a picture. In the case of black and white, offset printers are effectively binarizers. There is an equivalent process even with color printing (you can see a dot pattern even in the National Geographic, the epitome of sophisticated color printing). So you couldn't just slap a photo into a layout because the camera work needed to produce film or plate somewhere along the line has to turn that photo into "dot / no dot" form. That process has to be carried out by the customer, because you don't want to apply a dot pattern to a whole page (or more likely, a paste-up of four, eight or more pages) that includes type. But how the customer carries out the process is dependent on the printer's printing machinery, which requires a given resolution as applied to printing pictures, specified in lines per inch.
kasslloyd wrote: I don't know how to do that on normal laser printers, even very high end ones. You can approximate colors and looks for proofing in Acrobat, but that's not the same.
Exactly. What I have been trying to get at in this thread are two problems.

One problem -- only a possible problem, I'm just raising the issue -- is that books contain photos of differing lines-per-inch / dots-per-inch resolution. Might those different types of printed pictures yield different results under different lighting / camera setting scenarios?

Another problem, what I think you are getting at above, is how to reproduce book-like pictures on a test page. We are adding two layers of rasterizing / screening, whatever the appropriate technical term is, to the image -- scanning for inclusion in the test page, then the printing of the test page.

I am hoping that someone knows a graphics professional who could help us figure out:

1) Do we need to concern ourselves with the lpi / dpi of the specific photos included on a test page and, if so, how can we obtain pictures we know have that / those resolution(s)?

I seem to recall that print houses have a loupe and some chart that helps them do this so they can figure out if their customers have produced workable master images.

2) What issues are involved with producing any dot-patterned printed image on a test page?

I have not been able to reproduce a book image even close to faithfully in test scans and prints. I have scanned black and white book photos at 100, 200, 300, and 600 dpi and printed them at 600 dpi and 1200 dpi (in other words, eight test images). All printouts yield an image with a visible dot pattern different from the original. That visible dot pattern is similar to, but not quite the same as, the so-called "moire" pattern created in printed photos of printed photos, in other words, a photo that had to be screened twice.

Bottom line, so far, no way to create an image of a book photo that would make any sense to include in a test page.
kasslloyd wrote: I don't think it's necessary to worry about the halftone patterns since I don't think many scanners here will be scanning at high enough DPI to really make those visible...
Now I am not sure what we are talking about. If this is true (and maybe it is), then all of the above is unnecessary. But my understanding is that the lpi / dpi of book images is low, speaking relatively, generally below the 300 dpi often discussed as optimum on this forum (300 dpi, oddly not any higher, is the ideal dpi for OCR). Or maybe the issue is moot only at certain relationships of source / scanning resolution, whether higher, lower, or certain multiples of either.

We really need a graphics professional to weigh in here. Anybody know one who could run an eye over this thread?
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Let's Make A DIY Book Scanner Test Chart

Post by daniel_reetz »

OK, let's unpack.

0. If we relied on professional input and opinion to get things done, this project and most of its innovations would never have gotten off the ground. We shouldn't wait for input that almost invariably says "that's impossible" - it's not, this is disruptive technology. That said, I'm not going to ignore any incoming advice.

1. If you want the standard to be exactly like a book page, then the simplest and best thing to use is an already-printed book page. It already has all the properties you describe, the exact type of printing you are interested in, and probably the mixed content, too. The major problem becomes that we can't print it, because we don't have a million-dollar press. So we're back to people buying books for every person in the forum, or mailing a book around. Or stealing Gideon Bibles from hotel rooms.

2. Cameras are fundamentally different imaging devices than flatbed scanners and drum scanners, so most knowledge from that domain doesn't apply. For example, the major reason we don't usually see moire patterns on photographed images, but we DO see them on flatbed-scanned images, is because of an element called an OLPF (optical low-pass filter) in the optical path of the camera. That's one example. If you want to get technical, I build cameras and camera systems at my job and spent the last 4 years setting up, modifying, and calibrating displays to exacting standards for neuroscience research. As someone who works with that kind of thing day in and day out, my opinion is that we don't need to get that technical to have a good, working standard.

3. As I mentioned before, if you are interested in testing just the resolution limits of the camera, then the best thing is to look at the highly controlled test images and information here and here. The MTF charts they make available specify the exact ability of the camera to resolve image content at different frequencies and different spatial scales.

4. Personally (and this is where we might just have to make two charts and fork this project), I am more concerned with creating a common standard than with imitating a book. Fundamentally we need a point of reference that is diagnostic. Specific problems with specific books can be handled on a case-by-case basis. No scanner can perfectly handle all books; we need to accept that. A test chart can give us an idea of whether or not a builder has got white balance right, how on-axis their cameras are, how even their lighting is, what kind of glare problems they have, and a host of other setup issues that have little to do with cameras and printing issues. A well-designed chart - not necessarily a well-printed one - would also give us a good idea of how much their imager is resolving. We can even use it to straighten out their images as Rob has done here.

5. We have almost no control over the print drivers in our consumer level printers. In print shops, they have a RIP that gives them exquisite control over the print hardware. We don't have that kind of control. Which leads me to

6. The answer is that we don't apply any screening to the images ourselves. We create an image at a very high resolution (say 1200 or 2400 DPI) which is a higher resolution than any printer is capable of printing. Then we send that image to the printer at maximum quality settings (my printer has two modes: High Quality and Ink Saver) and let the print driver process it for the specific quirks and capabilities of that hardware (after all, that is precisely the job of a print driver). The resulting print will be the maximum that the hardware can do under ordinary conditions. More on this below.

or 7. We do exactly that, but have it printed at a "professional" shop and mail them out. If we do that, we still need to share the source file so people who can't or won't wait can go their own way.

So to address your specific questions, as a non-print-professional-but-most-definitely-comfortable-with-cameras-right-down-to-CMOS-level person who is deeply invested in the capture side of all this:
reggilbert wrote: 1) Do we need to concern ourselves with the lpi / dpi of the specific photos included on a test page and, if so, how can we obtain pictures we know have that / those resolution(s)?
WE specify the "DPI" of the digital document. If we make a document, say, 10x10 inches, and put a 10,000x10,000 pixel photograph in it, the image will be 1000 DPI (really, PPI). We can synthetically generate that photograph, take our own pictures, or render some kind of MTF chart. We can also render an image at the desired resolution that has every other line black and white, or use other tricks to deliberately produce moire. Creating images is not difficult in any way. Having set an image with enough pixels on the page (the SCREEN image vs the PRINTED image), it is up to the print driver/hardware to do screening as it deposits each little bit of ink on the page.
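A minimal sketch of the arithmetic above, in Python with only the standard library. The function names (`pixels_per_inch`, `line_pattern`, `write_pgm`) and the choice of PGM as an output format are illustrative assumptions, not anything specified in this thread. It computes the density of a 10,000-pixel image placed at 10 inches and builds the "every other line black and white" pattern as raw grayscale rows.

```python
def pixels_per_inch(pixels, inches):
    """Pixel density of an image placed at a given physical size."""
    return pixels / inches


def line_pattern(width, height, line_px=1):
    """Rows of 8-bit grayscale pixels forming horizontal black/white
    lines, each line_px pixels tall, starting with black."""
    black = bytes([0]) * width
    white = bytes([255]) * width
    return [black if (y // line_px) % 2 == 0 else white
            for y in range(height)]


def write_pgm(path, rows):
    """Save the rows as a binary PGM (P5) file, a simple grayscale
    format that many image tools can open and convert."""
    width, height = len(rows[0]), len(rows)
    with open(path, "wb") as f:
        f.write("P5\n{} {}\n255\n".format(width, height).encode("ascii"))
        f.writelines(rows)


if __name__ == "__main__":
    # The example from the post: 10,000 pixels across 10 inches.
    print(pixels_per_inch(10000, 10))  # 1000.0
```

For instance, `write_pgm("lines.pgm", line_pattern(6000, 6000))` placed at 5x5 inches in a layout would be a 1200 PPI pattern of one-pixel lines; what the print driver then does with it is up to the printer.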
reggilbert wrote: 2) What issues are involved with producing any dot-patterned printed image on a test page?
That is the work of the print driver, and is unmodifiable from our POV. If we don't introduce problems (and by that, I mean adding our own dot pattern to the image BEFORE printing, which will invariably introduce some kind of moire) we will simply be working at whatever the max resolution of the printer is.
reggilbert wrote: I have scanned black and white book photos at 100, 200, 300, and 600 dpi and printed them at 600 dpi and 1200 dpi (in other words, eight test images). All printouts yield an image with a visible dot pattern different from the original.
You scanned them on a flatbed, right? Flatbeds have inherent moire issues. Cameras also have these issues, but much less so.
reggilbert wrote: Bottom line, so far, no way to create an image of a book photo that would make any sense to include in a test page.
Here we are in total disagreement. We can have an original, high-resolution natively digital image for people to look at. Something with fine detail, like, say, a tree. Since they can see the ORIGINAL digital image, AND the printed page, AND the output of their scanner, they can judge for themselves what is being resolved and what isn't, and so can we. Or maybe we are in total agreement - we shouldn't halftone images before the printer does, too many issues with that.
reggilbert wrote: Or maybe the issue is moot only at certain relationships of source / scanning resolution, whether higher, lower, or certain multiples of either.
It is true that certain spatial frequencies will induce moire, and that frequency is dependent on printed DPI, printed dither patterns, camera optics (in three dimensions), sensor density, OLPF frequency, and different debayering algorithms. However, in the hundreds of books that I've scanned, it's never been problem enough to matter on the capture side. If we introduce a halftone pattern into the SOURCE image, it is definitely possible to prematurely introduce moire into the PRINTED image, so we should just Not Do That. I can't see a good reason to do it. It's a fundamental fact that sampling at less than two times the maximum spatial or temporal frequency in the source signal will introduce aliasing. Because the total number of parameters interacting is great, we should strive to simply put a high-resolution image on the page at the maximum resolution of our printing hardware, rather than compounding the problem with frequency-multiplied input of a pre-dot-patterned image.
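The sampling rule mentioned above can be illustrated with a small Python sketch. This is an idealized one-dimensional model only: it ignores screen angle, the OLPF, debayering, and lens blur, all of which matter in a real camera. The function names and the example figures (a 150 lines-per-inch screen, 200 and 400 samples per inch) are hypothetical, chosen just to show the arithmetic.

```python
def nyquist_limit(sample_rate):
    """Highest frequency (cycles per inch) that can be captured
    without aliasing at sample_rate samples per inch."""
    return sample_rate / 2.0


def is_aliased(signal_freq, sample_rate):
    """True when the pattern is finer than the Nyquist limit.
    A pattern exactly at the limit is a borderline case and is
    treated as not aliased here."""
    return signal_freq > nyquist_limit(sample_rate)


def alias_frequency(signal_freq, sample_rate):
    """Apparent frequency after sampling: the distance from the
    signal frequency to the nearest whole multiple of the
    sampling rate."""
    nearest = round(signal_freq / sample_rate) * sample_rate
    return abs(signal_freq - nearest)


if __name__ == "__main__":
    # Hypothetical 150 lpi halftone captured at two sampling rates.
    for rate in (200, 400):
        print(rate, is_aliased(150, rate), alias_frequency(150, rate))
```

In this model a 150 lpi screen sampled at 200 samples per inch shows up as a coarse 50 cycles-per-inch pattern, while at 400 samples per inch it stays at 150 and is simply recorded as it is.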
kasslloyd
Posts: 41
Joined: 19 Dec 2010, 21:25

Re: Let's Make A DIY Book Scanner Test Chart

Post by kasslloyd »

Well, to put it out there, I have about 10 years of professional experience in print media, from high end magazines and brochures to newsprint. At the newspaper, the device that printed the film used to burn the plates for the press was what we called an imagesetter; it was technically a RIP.

I agree with Daniel to a point, but I think 2400 dpi is insane, and 1200 is probably insane too. Any printer any of us could afford can't handle more than 600 dpi, and even those don't usually "truly" support 600 dpi for raster images, maybe for vector. Magazines are printed at 300 dpi, or 600 dpi if it's a very high quality art book; beyond that I've not ever experienced in the industry. An 8x11 1200 dpi image would choke most modern computers, and printers simply don't have enough memory to handle that.
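The memory concern can be put into rough numbers with a short Python sketch. It assumes uncompressed pixels, the 8x11 inch page size as written above, and 24-bit RGB by default; the function name and parameters are illustrative. Real file formats and print drivers compress or process the page in bands, so these figures are only the raw size of the pixel data, not a claim about what any particular computer or printer can handle.

```python
def uncompressed_bytes(width_in, height_in, dpi,
                       channels=3, bits_per_channel=8):
    """Raw size in bytes of an uncompressed raster page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels * bits_per_channel // 8


if __name__ == "__main__":
    for dpi in (300, 600, 1200, 2400):
        size = uncompressed_bytes(8, 11, dpi)
        print("{:>4} dpi: {:.1f} MiB".format(dpi, size / 2**20))
```

At 1200 dpi the page is 9600 x 13200 pixels, which is 380,160,000 bytes of RGB data (about 362.5 MiB); the same page as 1-bit black and white is 15,840,000 bytes.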

The example biology book quick thing I made previously in the other thread was at 300 dpi for the raster art. It's easy to get whatever we want, just find a freely available image on commons.wikimedia.org and so long as it's big enough once resized we're fine. I have Adobe Creative Suite CS5 (I love being a student, got it for $350 with my MacBook.. heh).

Commons has some extremely large images that can be used to generate any DPI image you want... some of them are listed here:

http://commons.wikimedia.org/wiki/Category:Large_images

I have to admit I haven't read every post yet in this thread, but it doesn't sound like we've really decided what to use as a test case.. what were you imagining?

1 single page test case? A set of pages? What all do we need to test?
StevePoling
Posts: 290
Joined: 20 Jun 2009, 12:19
E-book readers owned: SONY PRS-505, Kindle DX
Number of books owned: 9999
Location: Grand Rapids, MI

Re: Let's Make A DIY Book Scanner Test Chart

Post by StevePoling »

daniel_reetz wrote: Or stealing Gideon Bibles from hotel rooms.
If you'd like a Gideon Bible, I know some Gideons and I can get you one.
reggilbert
Posts: 49
Joined: 28 Sep 2010, 19:57
Number of books owned: 3000
Location: Buffalo, New York

Re: Let's Make A DIY Book Scanner Test Chart

Post by reggilbert »

Pasted in below (as degraded JPEG) and attached (as much sharper PDF) is a proposed test page, designed to fit the criteria listed in Daniel's last post, which I understand to be to enable assessment of evenness of lighting and effectiveness of camera settings as applied to the basic mix of types of elements in books and magazines.

This is just a proposal. I am willing to produce whatever emerges as the consensus on the forum.

Features of this test page:

--An arrangement of type and b&w and color images that allows some portion of all three to show up in the space on the cradle likely to be occupied by the item to be scanned, whether that item is large or small or oddly dimensioned.

--The same image in the four corners for comparison of lighting on different parts of the page

--Small-to-large type range, from the 4 point conceivable in some footnotes to the 24 point of some titles, appearing in two places on the page for the same reason

--The type contains all the letters of the English alphabet and many of the symbols found on a QWERTY keyboard. I don't think the issues change for any other kinds of characters, but the type could contain some accented Roman and other characters from other Indo-European languages and some characters from Arabic and Asian languages. I would need some help figuring out how to do the latter.

--Some vertical and horizontal straight lines to judge keystoning and lens distortions
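To get a feel for what the 4-to-24-point type range asks of a camera, here is a minimal Python sketch that converts a point size into pixels of em height at a given capture resolution. It assumes the usual 72 points per inch and treats the resolution as the effective dpi at the page; the 300 dpi figure and the function name are illustrative assumptions. The printed letters themselves are smaller than the em height, so these numbers are generous.

```python
POINTS_PER_INCH = 72  # desktop-publishing point


def em_height_px(point_size, dpi):
    """Height of the type's em square, in pixels, when captured
    at dpi pixels per inch measured at the page."""
    return point_size * dpi / POINTS_PER_INCH


if __name__ == "__main__":
    # Hypothetical 300 dpi capture across the proposed size range.
    for pt in (4, 6, 8, 10, 12, 18, 24):
        print("{:>2} pt -> {:.1f} px".format(pt, em_height_px(pt, 300)))
```

At 300 dpi, 24 point type gets an em height of 100 pixels, while 4 point type gets under 17, which is why the smallest sizes are the useful stress test on the page.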

Notes: the source image for the photo, a daguerreotype of Frederick Douglass, was 2089 x 3000; the source image for the color picture, a painting of John Brown on his way to his execution, was 1632 x 2016. The source type is just the Times New Roman found on my Windows machine, chosen for what I hoped would be relatively wide availability when printed by others. However, the copy of Acrobat on that same machine reported that the font was not available when I tested the document by trying to edit it, so it may be necessary to see if we can force whatever format we use to actually include the font inside itself.

The document was produced in a free page layout program called PagePlus Starter Edition that limits functionality to induce you to upgrade. So it was possible to save the document only in the PagePlus format. However, it was possible to produce something that could be shared by using a PDF printer driver. If we end up wanting a non-PDF format for the test page, once we settle on what the test page needs to look like I am willing to do what is necessary to execute the layout in that other format. Also, somebody with a more standard page layout program could reproduce what we settle on, probably in just a few minutes.
scanner test page draft 1.jpg
scanner test page draft 1.pdf
kasslloyd
Posts: 41
Joined: 19 Dec 2010, 21:25

Re: Let's Make A DIY Book Scanner Test Chart

Post by kasslloyd »

Do we want to test the functionality of, say, content selection algorithms like the one in ST (Scan Tailor) for Mixed mode? If so, then diagrams and other strange things will be necessary. If it's just to test the quality of cameras and resulting books, then pictures like those would be good, but also maybe an actual modern color picture over old bw/paintings. Also a very dark one where black extends to all the borders and a very light one where white extends to all the borders, to see how it handles the two, imo.