Hello I am glad to be here!

ouroboros
Posts: 8
Joined: 02 Oct 2012, 10:22
E-book readers owned: Kindle Fire
Number of books owned: 500
Country: US

Hello I am glad to be here!

Post by ouroboros »

I am a chess book scanner. A specific niche, I know. I found this site about a year ago, got busy, and finally got around to becoming a member! I would like to work on a DIY scanner when I can, but finding time can be tough. The more I look at the materials list, the more I realize I have at least half of this stuff lying around! Maybe soon, huh?
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Hello I am glad to be here!

Post by daniel_reetz »

I LOVE NICHE SCANNING.

Tell us more about your books - what are the specific challenges of scanning chess books? And yes, you should start right away. Maybe by just pointing your camera at some books and seeing what you get from Scan Tailor...

Re: Hello I am glad to be here!

Post by ouroboros »

I apologize for not answering a whole lot sooner. (After trying to log in about 10 or so different times, I finally realized I had forgotten all of my login info... :roll: ) Now that I have fixed that, I will be back more, and sooner! :lol:
In the past year I have scanned about 12 books, so at least I have been learning! 12 doesn't sound like much, but finding books no one has scanned before is what keeps that number low.
To answer your question, there are a few issues with scanning chess books. I would say first and foremost is the clarity of the board diagram. For example:
Image

The horizontal lines that indicate the dark squares can be a problem. They don't have to be perfect, but when the page is reprocessed to a smaller size it is hard to keep blotches from forming. This sometimes makes the pieces hard to differentiate, which is the second biggest issue.
Here is an example:
Image

The piece is easily filled in, which is what you don't want, because the pieces become indistinguishable. Is it a black rook or a dirty white one? lol. These were scanned at 300 dpi, by the way.
cday
Posts: 456
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: Hello I am glad to be here!

Post by cday »

At a quick look -- and possibly telling you things you already know -- I think I see signs of three separate issues with the images you posted:

You probably would benefit from scanning at a higher DPI, which of course would take longer and produce larger file sizes, but because your images seem to be black and white they could probably still be quite small if a suitable file format and compression method were used.

As the pages you're scanning are printed, the black areas may be composed of small dots and need to be descreened to obtain optimum image quality; your scanner probably has alternative settings you could try if you haven't already done so, but it may not provide enough control to exactly match the screen frequency used in the printed page.

The image posted seems to show signs of compression artifacts (the smudges between the characters in the close-up view), as if it had been saved as a JPG file at some stage with too much compression, even though the uploaded images are PNGs, which is a lossless format.

If the books you are scanning are black and white, the pages would probably be best saved -- from the point of view of image quality and file size -- as black and white TIFF files with Fax G4 or -- if available -- JBIG2 compression.
rkomar
Posts: 98
Joined: 12 May 2013, 16:36
E-book readers owned: PRS-505, PocketBook 902, PRS-T1, PocketBook 623, PocketBook 840
Number of books owned: 3000
Country: Canada

Re: Hello I am glad to be here!

Post by rkomar »

cday wrote: If the books you are scanning are black and white, the pages would probably be best saved -- from the point of view of image quality and file size -- as black and white TIFF files with Fax G4 or -- if available -- JBIG2 compression.
I would say that this is true if you scan at a high enough resolution (say, 600 dpi). You can then rescale that to lower-resolution grayscale images that still look pretty good. I wouldn't rescale to smaller black and white images, though, as a lot of the detail in the pieces will be lost. The grayscale images should be saved in a lossless format. I agree that JPEG should be avoided at any stage of the process.

Edit: I just tried scanning a chess diagram, and even 600 dpi isn't enough to get all the details in black and white. If you can't go higher in resolution than that, then I'd say that scanning in grayscale is the way to go.
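
A minimal Python sketch of the rescaling idea above, assuming a high-resolution black and white scan is already available as rows of 0/1 pixels (1 = ink). The function name and the tiny 4x4 sample are made up for illustration; real tools use better resampling filters than a plain block average.

```python
def downscale_to_gray(bilevel, factor=2):
    """Average factor x factor blocks of a 0/1 image into 0-255 gray levels.

    bilevel: list of rows, each a list of ints (1 = black ink, 0 = white paper).
    Returns rows of gray values where 0 is black and 255 is white.
    """
    height = len(bilevel) // factor
    width = len(bilevel[0]) // factor
    gray = []
    for y in range(height):
        row = []
        for x in range(width):
            # Count the ink pixels inside this block of the original scan.
            ink = sum(
                bilevel[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            # More ink in the block means a darker output pixel.
            row.append(round(255 * (1 - ink / (factor * factor))))
        gray.append(row)
    return gray


# Hypothetical 4x4 fragment of a 600 dpi scan, reduced to 2x2 at 300 dpi.
page = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(downscale_to_gray(page))  # [[0, 191], [255, 191]]
```

The partly inked blocks come out as intermediate grays instead of being forced to black or white, which is why fine detail in the pieces survives better than in a black and white reduction.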

Re: Hello I am glad to be here!

Post by ouroboros »

Wow! Thank you cday and rkomar for your suggestions!
I'll give more info... in scatological order, as that is my thought process... I wish I were kidding! :?
My Lexmark scanner came with ABBYY 6. I cannot scan as a document because the OCR reads the piece symbols in the text as letters rather than as symbols. For example, the rook is read as two capital I's (usually iIIi).
So I think I am now hitting on my problem... I scan as a color photo. Also, the paper used is VERY thin, so much so that bleed through is a problem. (I mean you can read the next page through the paper!) I was not happy with the outcome after I processed the scans with Scan Tailor.
Recently, I started using Adobe Acrobat XI to scan instead, saving directly as a PDF. I am sure those examples came from those scans. From there I saved as JPEGs (which I shall never do again, lol), or TIFFs or PNGs for that matter.
I will use the suggestions listed so far and see what I can get...
Best Wishes!
ouroboros

Re: Hello I am glad to be here!

Post by cday »

ouroboros wrote: ... the paper used is VERY thin so much so that bleed through is a problem ...
You should be able to avoid bleed through, or at least most bleed through, by placing a sheet of matte black art paper behind the page you are scanning.

rkomar has a point about grayscale, at least in the sense that grayscale (and colour) images can be displayed anti-aliased, which can considerably smooth and enhance the appearance of text, for example. You should be able to scan direct to grayscale and see how it displays your images, but the downside is that file sizes for grayscale or colour will be much larger than for black and white with suitable compression. I'm surprised that 600dpi didn't produce a good quality black and white image.
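
To put rough numbers on that size difference before any compression, here is a small Python sketch. The 6 x 9 inch page size is an assumption chosen purely for illustration, and the function name is made up; row padding and file headers are ignored.

```python
def raw_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a page scanned at the given resolution."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed page size: 6 x 9 inches, scanned at 600 dpi.
for label, bits in [("black and white", 1), ("grayscale", 8), ("colour", 24)]:
    print(f"{label}: {raw_size_bytes(6, 9, 600, bits):,} bytes")
# black and white: 2,430,000 bytes
# grayscale: 19,440,000 bytes
# colour: 58,320,000 bytes
```

Compression changes the final numbers a great deal, but the starting point for grayscale is already eight times that of black and white.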

My point about descreening settings may not apply to pure black and white pages, but would certainly be relevant if there are any gray halftones or black and white photographs on the pages -- formed from black ink on white paper, using variable size dots to produce the gray tones.

Be aware that when you save a scan to PDF the image in the file is normally still stored within the file in one of the standard image formats such as JPG or TIFF, and with an appropriate -- hopefully selectable -- level of compression.
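
If you want a quick hint about which format a PDF is using internally, a crude byte scan for filter names can help. The sketch below is Python; the filter names are the ones defined in the PDF specification, while the function name and the sample bytes are made up for illustration. Note that FlateDecode is also used for non-image data, so seeing it only tells you it is present somewhere in the file.

```python
import re

# Filter names from the PDF specification and the codec each one indicates.
PDF_IMAGE_FILTERS = {
    b"DCTDecode": "JPEG (lossy)",
    b"JPXDecode": "JPEG 2000",
    b"CCITTFaxDecode": "CCITT fax Group 3/4 (bilevel)",
    b"JBIG2Decode": "JBIG2 (bilevel)",
    b"FlateDecode": "Flate/zlib (lossless)",
    b"LZWDecode": "LZW (lossless)",
    b"RunLengthDecode": "run-length (lossless)",
}


def find_pdf_filters(data):
    """Return the codecs whose filter names appear in the raw PDF bytes."""
    found = set()
    for name in re.findall(rb"/(\w+Decode)\b", data):
        if name in PDF_IMAGE_FILTERS:
            found.add(PDF_IMAGE_FILTERS[name])
    return found


# Hypothetical image dictionary, as it might appear inside a scanned PDF.
sample = b"<< /Type /XObject /Subtype /Image /BitsPerComponent 1 /Filter /JBIG2Decode >>"
print(find_pdf_filters(sample))  # {'JBIG2 (bilevel)'}

# For a real file (the name is a placeholder):
# with open("scan.pdf", "rb") as f:
#     print(find_pdf_filters(f.read()))
```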

Re: Hello I am glad to be here!

Post by ouroboros »

I scanned a page at 600 dpi black and white as well as at 600 dpi gray. The black and white seemed cleaner to me... the blacks "popped" off the page. There was not much of a size difference between the two. So, I would choose B&W over gray.
However, I did find a significant size difference for the same page: black and white .tiff @ 600 dpi (114 KB) versus B&W PDF @ 600 dpi (74.5 KB)!
Thank you again cday!

Re: Hello I am glad to be here!

Post by cday »

Scanning to black and white (or converting to black and white after scanning to grayscale or colour) should eliminate or at least largely eliminate the bleed through problem as any bleed through lighter than mid-gray will be displayed as white.
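
As a rough Python sketch of why that works, assuming a simple fixed cutoff at mid-gray (scanner software may well use something more adaptive) and made-up pixel values:

```python
def to_black_and_white(gray_rows, threshold=128):
    """Map 0-255 gray values to 0 (black) or 255 (white) with a fixed cutoff."""
    return [[0 if value < threshold else 255 for value in row] for row in gray_rows]


# Hypothetical values: printed ink (about 30), show-through from the
# reverse side (about 180), and bare paper (about 240).
scan = [[30, 180, 240], [175, 25, 235]]
print(to_black_and_white(scan))  # [[0, 255, 255], [255, 0, 255]]
```

The faint show-through values land on the white side of the cutoff and disappear, while the real ink stays black.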
ouroboros wrote: ... I did find a significant difference on the same page Black and white .tiff @ 600dpi(114 KB) and B&w pdf @600dpi(74.5 KB)
The difference in file sizes is almost certainly because the images are encoded in different file formats and/or using different compression: the PDF format is only really a 'wrapper' around the image(s) contained in the file, and usually only slightly increases the overall file size. The Acrobat user interface generally gives rather less control of compression than a typical scanner interface.

A possible explanation for the difference in file sizes above is that the Acrobat image is compressed using JBIG2, a format Acrobat supports that is less commonly available in scanner drivers. But we don't actually know the compression method used for the TIFF file, or the file format and compression method used in the PDF file -- if you would care to upload the files I can look at them and see if you are using the optimum methods.

Depending on how important quality, scanning time and file size are, you might try a few more test scans to see if 600dpi is actually needed in black and white, and to evaluate different file formats and compression methods.
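
For the TIFF side of that comparison, the compression method is recorded in the file's Compression tag (tag 259 in the TIFF specification). The Python sketch below reads it from the first image directory only; the function name and the synthetic sample are made up for illustration, and it does not handle BigTIFF files.

```python
import struct

# Common values of the TIFF Compression tag (259).
TIFF_COMPRESSION = {
    1: "uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 fax",
    4: "CCITT Group 4 fax",
    5: "LZW",
    7: "JPEG",
    8: "Deflate",
    32773: "PackBits",
}


def tiff_compression(data):
    """Return the compression named in the first IFD of a classic TIFF file."""
    byte_orders = {b"II": "<", b"MM": ">"}
    if data[:2] not in byte_orders:
        raise ValueError("not a TIFF file")
    order = byte_orders[data[:2]]
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        tag, _field_type, _n = struct.unpack(order + "HHI", data[start:start + 8])
        if tag == 259:
            # The tag holds a 16-bit value stored at the start of the value field.
            (value,) = struct.unpack(order + "H", data[start + 8:start + 10])
            return TIFF_COMPRESSION.get(value, f"unknown ({value})")
    return TIFF_COMPRESSION[1]  # the tag defaults to 1 when absent


# Synthetic little-endian TIFF with one directory entry: Compression = 4.
sample = (
    b"II"
    + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 1)
    + struct.pack("<HHIHH", 259, 3, 1, 4, 0)
    + struct.pack("<I", 0)
)
print(tiff_compression(sample))  # CCITT Group 4 fax
```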

Re: Hello I am glad to be here!

Post by rkomar »

I think you should also test by scanning the board layouts in a few books. I did my test on a rather old book (Capablanca's "Chess Fundamentals"), and I found that scanning at 600 dpi was barely good enough to do black and white. I have uploaded the results of the scans at 300 dpi and at 600 dpi (both PNG and JPG) at http://www3.telus.net/rkomar/chess/ if anyone wants to see what I mean. I've displayed the black and white PNG files here, as well.
Image: 600 dpi bw
Image: 300 dpi bw (300_bw.png)