Has anyone played with doing really, really high-resolution scanning? Like 2000dpi+? For a 12"x12" scanner, that's 24,000 pixels in each direction, or a... 1.6GB raw image?
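For a rough sense of scale, here is a minimal sketch of that arithmetic in Python. It assumes uncompressed 8-bit RGB (3 bytes per pixel), which is my assumption rather than anything a particular scanner produces; the function name is made up for illustration.

```python
def raw_image_size(width_in, height_in, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, total_bytes) for an uncompressed scan.

    bytes_per_pixel=3 assumes 8-bit RGB; a 16-bit-per-channel scan would be 6.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


w, h, size_bytes = raw_image_size(12, 12, 2000)
size_gib = size_bytes / 2**30  # binary gigabytes
print(f"{w} x {h} px, {size_bytes:,} bytes, {size_gib:.2f} GiB")
```

Under those assumptions a 12"x12" page at 2000dpi comes out at 576 megapixels, which is where the roughly 1.6GB figure comes from.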
Some desktop scanners can do that. Drum scanners can do that and better if you don't mind destroying the book by feeding individual pages and wet-mounting them. I'm interested in non-destructive methods.
If not, how might you go about it? Multiple permanently-mounted cameras and stitching them? That gives you known DPIs and known overlaps. One camera and a mechanism to shift its position and stitch them, doing the whole book in multiple passes? Is there a way to do it without stitching them, but capture a single image?
(No points for suggesting superresolution techniques, but feel free to start another thread about that.)
Not looking for answers for implementation purposes (yet), just starting a discussion.
A few months ago, I donated most of my old and rare books to the Internet Archive, as well as several out-of-print books that had been donated to the public domain by their author, and donated to cover their digitization. They're all online now, if you're curious.
The books ended up being around 350-400dpi, which is fine for casual browsing and reference, and fine if the goal is "just" to preserve the content "somehow." It's fine to read with, and it's fine if you want to load the pages up in InDesign as a reference and re-make the original layout, and it's fine to get a reasonable print of the original page at the original size.
But, a typographer pointed out that for the letterpress sample books, it wasn't high resolution enough to see all the little imperfections in the prints, which are part of what you look at when you study traditional printing. He suggested at least 2000dpi.
So. 2000dpi. Anyone going there?
Really high resolution scanning?
Moderator: peterZ
- daniel_reetz
- Posts: 2812
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
Re: Really high resolution scanning?
Strikes me that this is a problem of intent.
If you're looking for detail usable for letterpress reproduction, you're no longer talking about page images. You're talking about 3D-paper information at the pulp level. Normals, micro-textures, stuff that would be captured by something like GelSight. Because there is no way that the printing techniques, in the hairy, screwy medium of paper pulp, after years of swelling, stretching, etc, accurately reproduce much of any useful information at 2kilo Features Per Inch; but there is also little question that the particular printing technique made an impression that has made SOME kind of detail/error/noise down to that size.
Octavo did extremely high resolution scanning (over 600, perhaps 1200dpi) using "raking light" setups (lights at oblique angles), which exposed printing impressions on the page as well as paper fiber details and other minor marks. They also folded a long time ago. I'm friends with one of their ex-employees.
The IA scans with 5D Mark IIs at a reasonably consistent 400DPI. They capture page appearance rather than text, though they do OCR on the captures.
If you want to capture at that kind of resolution, you must move beyond traditional setups. I mean, a drum scanner may be able to do 24,000PPI, but that is basically microscopy. Let's set 1000DPI as a more reasonable target resolution. The vast majority of output devices (printers, litho, etc) can barely reach that level of resolution in its fullness, regardless of their claims. You could do that with four Canon 5D Mark II cameras using Nikkor 60mm macro lenses pointed at quadrants of the page with more than 20% overlap and fixed focus and apertures 4.5 or greater. You'd stitch those pages together, and there you'd be. Except for camera shake, thermal flex of camera mounts under hot lights, lens distortion, variations in lens "copies" with slightly differently mounted elements, etc.
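To get a feel for how much page a 2x2 camera grid covers at 1000DPI, here is a rough sketch. The 5616x3744 pixel frame size for the 5D Mark II, the portrait camera orientation, and the exact 20% overlap are assumptions on my part, and the function name is hypothetical. Lens distortion and stitching losses are ignored.

```python
def grid_coverage_in(sensor_px, tiles, overlap, dpi):
    """Inches covered along one axis by `tiles` frames of `sensor_px` pixels.

    `overlap` is the fraction of a frame shared with each neighbouring frame.
    """
    covered_px = sensor_px * (tiles - (tiles - 1) * overlap)
    return covered_px / dpi


# Assumed 5D Mark II frame size in pixels (long side, short side).
SENSOR_LONG, SENSOR_SHORT = 5616, 3744

# Cameras in portrait orientation, two across and two down, 20% overlap.
width_in = grid_coverage_in(SENSOR_SHORT, tiles=2, overlap=0.20, dpi=1000)
height_in = grid_coverage_in(SENSOR_LONG, tiles=2, overlap=0.20, dpi=1000)
print(f"covers about {width_in:.1f} x {height_in:.1f} inches at 1000DPI")
```

With those numbers the grid covers a page somewhere under 7 by 10 inches, so a larger page or a higher target DPI would need more cameras or more passes.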
At some point you are going to be limited by the pixel size of the camera. As an example, for the 5DmII the pixel size is 6.4µm (6.4µm = 0.00025197in). Now figure you need at least 4 of those for each unit of color information, minus whatever the cutoff frequency of the OLPF is, and you can see the kind of trouble you'll get into.
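A quick sketch of the pixel-pitch arithmetic, assuming an ideal lens and counting every sensor pixel as one sample (so before any Bayer or OLPF losses). The function names are made up for illustration, and the magnification values are just examples.

```python
UM_PER_INCH = 25_400  # micrometres per inch


def pitch_in_inches(pitch_um):
    """Convert a sensor pixel pitch from micrometres to inches."""
    return pitch_um / UM_PER_INCH


def page_ppi(pitch_um, magnification):
    """Samples per inch on the page for a given optical magnification."""
    return UM_PER_INCH * magnification / pitch_um


def magnification_for(target_ppi, pitch_um):
    """Magnification needed to sample the page at target_ppi."""
    return target_ppi * pitch_um / UM_PER_INCH


print(f"{pitch_in_inches(6.4):.8f} in per pixel")
print(f"{page_ppi(6.4, 1.0):.0f} ppi at 1:1 magnification")
print(f"{magnification_for(2000, 6.4):.3f}x needed for 2000 ppi")
```

The real resolved detail will be lower than these sampling figures once the color filter array and the OLPF are taken into account, which is the trouble described above.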
Another thing that bugs me is that printing tech only really has the capability to get contrast ratios of, say, 100:1 or so (making this up, but it's not a lot more than that). So most of the bit-depth of our cameras is wasted on lighting inhomogeneity rather than on the dynamic range of the print. There just isn't that much info in the contrast domain. Color domain is something else but nothing short of a Foveon sensor is going to be accurate per-pixel in color.
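To put a rough number on that, a contrast ratio can be expressed in stops (doublings of light), which is about how many bits of a linear encoding the range spans. This is only a sketch: the 100:1 figure is the guess from above, the 14-bit raw depth is an assumption about the camera, and the function name is hypothetical.

```python
import math


def stops_for_contrast(ratio):
    """Number of doublings (stops) spanned by a contrast ratio."""
    return math.log2(ratio)


print_stops = stops_for_contrast(100)  # the guessed 100:1 print contrast
raw_bits = 14                          # assumed raw bit depth
print(f"print spans about {print_stops:.1f} stops of a {raw_bits}-bit raw file")
```

So under these assumptions the print itself spans fewer than 7 stops, and the rest of the camera's range is left to absorb uneven lighting.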
Frankly, I think the only way to get the info your letterpress friend is looking for is an actual 3D scan; that's where the GelSight type ideas come into play.
My personal future looking plan is to go the other way, but I enjoy thinking about these kinds of limits-of-tech questions, so fire away.
-
- Posts: 138
- Joined: 30 Oct 2010, 23:56
- Number of books owned: 0
- Location: Austin, Texas, USA
Re: Really high resolution scanning?
daniel_reetz wrote: Strikes me that this is a problem of intent.
Oh, absolutely. My intent was to get the books preserved and made available again in a reasonable format and somewhat expediently. 400dpi JPEGs and OCR text are a good first effort at preserving these materials.
A typographer's intent is to be able to zoom in reaaaaal close to the letters, I suppose. (He also complained about the Octavo scans, incidentally.)
daniel_reetz wrote: You're talking about 3D-paper information at the pulp level.
You know, when he said this exact thing to me, I figured he was blue sky-ing, but, no, here you are with some fascinating links.
That said, I don't believe he cares about physically reproducing the texture, just being able to explore it.
daniel_reetz wrote: You could do that with four Canon 5D Mark II cameras using Nikkor 60mm macro lenses pointed at quadrants of the page with more than 20% overlap and fixed focus and apertures 4.5 or greater. You'd stitch those pages together, and there you'd be. Except for camera shake, thermal flex of camera mounts under hot lights, lens distortion, variations in lens "copies" with slightly differently mounted elements, etc.
The original Matrix film used still cameras for the bullet time sequences, and there was a writeup or video somewhere explaining how each camera was calibrated individually (or perhaps it was just having its imperfections noted so they could be automatically corrected in post) because of just those reasons. So, maybe it's not impossible. I can't find a reference in a quick search, though.
daniel_reetz wrote: Frankly, I think the only way to get the info your letterpress friend is looking for is an actual 3D scan; that's where the GelSight type ideas come into play.
My personal future looking plan is to go the other way, but I enjoy thinking about these kinds of limits-of-tech questions, so fire away.
Do you mean pursuing faster, cheaper, lower-resolution scans?
Thanks!