Feedback for improvement of my scans

Moderator: peterZ

meelash
Posts: 5
Joined: 20 Jul 2016, 07:24
E-book readers owned: 1
Number of books owned: 5000
Country: USA

Feedback for improvement of my scans

Post by meelash »

Hi,

I've completed a self-made Archivist book scanner (the wood one) and am using Canon IXUS-160 cameras. I've scanned a couple of books with acceptable results, but I'm wondering if any of the members with experience in cameras and camera settings could look at some of my raw images and suggest changes that might result in even better quality scans.
022.jpg
033.jpg
Like I said, after processing with scantailor, the results were acceptable, but if possible I'd like to be able to increase the text sharpness and decrease pixelation even at high zoom. Here are the completed scans after processing:
https://archive.org/stream/alfawz_alkab ... 9/mode/2up
https://archive.org/stream/istihbaabudd ... 9/mode/2up

Any suggestions for improvements in future scans?
BruceG
Posts: 99
Joined: 14 May 2014, 23:17
Number of books owned: 500
Country: Australia

Re: Feedback for improvement of my scans

Post by BruceG »

meelash
I use a Nikon S6500 16-megapixel camera and it produces JPEG files of about 7000+ KB. The Canon, I think, is a 20-megapixel camera, so its files should be larger.
The output you have is only about 250 KB, which is large for an OCR page but useless for an image page. Somewhere along the line the file size is being greatly reduced. I don't know ScanTailor, but I would guess there are options for saving the output. You may have to play around to get the quality you require.
I use OmniPage for OCR but it does not do Arabic; I think ABBYY FineReader does.
cday
Posts: 451
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: Feedback for improvement of my scans

Post by cday »

meelash wrote:I've completed a self-made archivist book scanner (the wood one) and using Canon IXUS-160 cameras. I've completed scanning a couple of books with acceptable results, but I'm wondering if any of the members with experience in cameras and camera settings could look at some of my raw images and suggest changes that might result in even better quality scans
The attachment 022.jpg is no longer available
033_Grayscale_Levels.jpg
The top image in your post 022.jpg [apparently no longer available] is the camera JPEG output, and the lower image 033.jpg is the Scantailor output with the settings you are using, straightened but not cropped and output as a 24-bit colour image?

When the lower image 033.jpg is downloaded (note that the image displayed in the post is a preview with reduced pixel dimensions) the file has pixel dimensions of 2972 x 4176 px and a file size of 4317 KB.
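For a rough sense of scale, the figures above can be turned into an uncompressed size and an implied compression ratio. This is only a back-of-envelope sketch: it assumes 3 bytes per pixel (24-bit colour) and 1 KB = 1024 bytes, neither of which is stated in the thread, and the function names are made up for illustration.

```python
# Back-of-envelope arithmetic for the downloaded 033.jpg.
# Assumptions (not from the thread): 24-bit colour = 3 bytes per pixel,
# and 1 KB = 1024 bytes.

def uncompressed_kb(width_px: int, height_px: int, bytes_per_pixel: int = 3) -> float:
    """Size in KB the pixel data would occupy with no compression at all."""
    return width_px * height_px * bytes_per_pixel / 1024


def compression_ratio(width_px: int, height_px: int, file_kb: float,
                      bytes_per_pixel: int = 3) -> float:
    """How many times smaller the file is than the raw pixel data."""
    if file_kb <= 0:
        raise ValueError("file_kb must be positive")
    return uncompressed_kb(width_px, height_px, bytes_per_pixel) / file_kb


if __name__ == "__main__":
    width, height, file_kb = 2972, 4176, 4317  # figures quoted in the post
    print(f"{width * height / 1e6:.1f} megapixels")
    print(f"{uncompressed_kb(width, height):.0f} KB uncompressed")
    print(f"about {compression_ratio(width, height, file_kb):.1f}:1 compression")
```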
meelash wrote:Like I said, after processing with scantailor, the results were acceptable, but if possible I'd like to be able to increase the text sharpness and decrease pixelation even at high zoom. Here are the completed scans after processing:
https://archive.org/stream/alfawz_alkab ... 9/mode/2up
https://archive.org/stream/istihbaabudd ... 9/mode/2up

Any suggestions for improvements in future scans?
The Internet Archive links are to your output images of the same book? The outputs linked to are available for download in many alternative formats, and if they are your images I'm not clear which you are referring to when you say the outputs are pixelated and could be better quality. Could you please point to examples, or better still upload an example file (in a ZIP archive, if necessary).

Edit:

I've just discovered that the above image was saved with a quality of Q=40 (in XnViewMP), as that was the setting I last used in a previous test a while back: if you zoom well in you can see slight compression artefacts (like faint smudges between the characters), but they probably wouldn't normally be noticed. A slightly higher quality setting should eliminate those with little increase in file size...

Having had a quick look, the PDF output seems of quite reasonable quality but does become slightly pixelated on zooming in; the JPEG 2000 output (jp2 file format) is similar, I think, but although the image is saved as a grayscale image it in fact only contains two colours: it is a black and white image saved in an 8-bit grayscale format. The pixelation you see when zooming in is largely the result of converting the original colour image to black and white at some point, possibly due to an unintended Scantailor setting.
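One way to check whether a nominally grayscale page is really black and white is to count how many distinct gray levels it uses. The following is a minimal sketch of that idea, assuming the pixel values have already been read into a flat list of integers from 0 to 255 by whatever image tool is to hand; the function name is hypothetical and this is not necessarily how the check above was done.

```python
from collections import Counter


def two_level_fraction(pixels):
    """Fraction of pixels that fall on the two most common gray levels.

    A result of 1.0 means the image uses at most two levels, i.e. it is
    effectively black and white even if stored as 8-bit grayscale.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    top_two = sum(count for _, count in counts.most_common(2))
    return top_two / total


if __name__ == "__main__":
    bitonal_page = [255] * 90 + [0] * 10      # only white and black
    smooth_page = list(range(256))            # every gray level once
    print(two_level_fraction(bitonal_page))   # all pixels on two levels
    print(two_level_fraction(smooth_page))    # only a tiny share on two levels
```

A true grayscale scan of text normally has many intermediate levels along the character edges, so a value at or very near 1.0 suggests the page was thresholded somewhere in the workflow.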
BruceG wrote:I use a Nikon S6500 16meg camera and it produces jpeg file about 7000+ kb. The Canon I think is a 20meg camera so should be greater.
The output you have is only about 250 kb which is large for a OCR page but useless for a image page. Some where along the line the file size is being greatly reduced. Not knowing ScanTailor, but I would guess there are options in saving the output. You may have to play around to get the quality you require.
It is important to be clear that the file size and quality of a JPEG image are not directly related: the format is designed in a way that exploits the characteristics of the human eye to enable considerable compression to be used with no or minimal perceived loss of image quality. Typically an image saved with quality Q=85 will be much smaller than the same image saved with quality Q=100, yet for practical purposes be indistinguishable when viewed, and sometimes even substantially higher compression can be used, producing an even smaller file that is still of acceptable quality.

The quality issue you are facing, I think, is due to the conversion of your images to black and white, and I can illustrate both that and the appropriate use of JPEG compression with the following image:
033_Grayscale_Levels.jpg
That is the Scantailor output file you posted above, downloaded (after clicking on the displayed preview to bring up the original file), converted to grayscale, and then with a levels adjustment applied to whiten the background. If you download the image (after clicking on the preview to display the full image) you should find that it displays well even when zoomed in, and is also a JPEG image with a file size of 403 KB compared with 4317 KB for the original image... It was only a quick 'proof of concept' test and neither the levels adjustment nor the file compression used was optimised.
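The two steps described above (grayscale conversion, then a levels adjustment) can be sketched per pixel as follows. This is an illustration only: the Rec. 601 luma weights and the black and white points used here are assumptions, since the post does not say which settings were applied, and an image editor would do the same thing over the whole image at once.

```python
def to_gray(r: int, g: int, b: int) -> int:
    """Convert one RGB pixel to a gray level using Rec. 601 luma weights (assumed)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def apply_levels(value: int, black: int = 0, white: int = 255) -> int:
    """Stretch gray levels so `black` maps to 0 and `white` maps to 255.

    Values at or above `white` (e.g. slightly gray paper) become pure white,
    values at or below `black` become pure black, and everything in between
    is scaled linearly, which keeps the soft edges of the characters.
    """
    if white <= black:
        raise ValueError("white point must be above black point")
    scaled = (value - black) * 255 / (white - black)
    return max(0, min(255, round(scaled)))


if __name__ == "__main__":
    # Hypothetical values: off-white paper and dark ink from a camera capture.
    paper = to_gray(215, 210, 200)
    ink = to_gray(40, 38, 35)
    for name, gray in (("paper", paper), ("ink", ink)):
        print(name, gray, "->", apply_levels(gray, black=30, white=200))
```

Unlike a hard black and white threshold, this keeps the intermediate gray levels, which is why the result still looks smooth when zoomed in.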