Hello I am glad to be here!
Moderator: peterZ
- ouroboros
- Posts: 8
- Joined: 02 Oct 2012, 10:22
- E-book readers owned: Kindle Fire
- Number of books owned: 500
- Country: US
Hello I am glad to be here!
I am a chess book scanner. A specific niche, I know. I found this site about a year ago, got busy, and finally got around to becoming a member! I would like to work on a DIY scanner when I can, but finding time can be tough. The more I look at the materials list, the more I realize I have at least half of this stuff lying around! Maybe soon, huh?
- daniel_reetz
- Posts: 2812
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
Re: Hello I am glad to be here!
I LOVE NICHE SCANNING.
Tell us more about your books - what are the specific challenges of scanning chess books? And yes, you should start right away. Maybe by just pointing your camera at some books and seeing what you get from Scan Tailor...
- ouroboros
- Posts: 8
- Joined: 02 Oct 2012, 10:22
- E-book readers owned: Kindle Fire
- Number of books owned: 500
- Country: US
Re: Hello I am glad to be here!
I apologize for not answering a whole lot sooner. (After trying to log in about 10 or so different times, I finally realized I had forgotten all of my login info...) Now that I have fixed that, I will be back more, and sooner!
In the past year, I have scanned about 12 books, so at least I have been learning! Twelve doesn't sound like much, but the number is low because it is hard to find books that no one has scanned before.
To answer your question, there are a few issues with scanning chess books. I would say first and foremost is the clarity of the board diagram. For example:
The horizontal lines that indicate the dark squares can be a problem. They don't have to be perfect, but when the page is reprocessed to a smaller size it is hard to keep blotches from forming. This sometimes makes the pieces hard to differentiate, which is the second biggest issue.
Here is an example:
The piece is easily filled in, which is what you don't want, because the pieces become indistinguishable. Is it a black rook or a dirty white one? lol. These were scanned at 300 dpi, by the by.
Re: Hello I am glad to be here!
At a quick look -- and possibly telling you things you already know -- I think I see signs of three separate issues with the images you posted:
You probably would benefit from scanning at a higher DPI, which of course would take longer and produce larger file sizes, but because your images seem to be black and white they could probably still be quite small if a suitable file format and compression method were used.
As the pages you're scanning are printed, the black areas may be composed of small dots and need to be descreened to obtain optimum image quality; your scanner probably has alternative settings you could try if you haven't already done so, but it may not provide enough control to exactly match the screen frequency used in the printed page.
The image posted seems to show signs of compression artifacts (the smudges between the characters in the close-up view), as if it had been saved as a JPG file at some stage with too much compression, although the uploaded images are PNGs, which is a lossless format.
If the books you are scanning are black and white, the pages would probably be best saved -- from the point of view of image quality and file size -- as black and white TIFF files with Fax G4 or -- if available -- JBIG2 compression.
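As a rough illustration of the trade-off between DPI, bit depth and file size mentioned above, here is a minimal Python sketch of the arithmetic for uncompressed images. The 6 x 9 inch page size is an assumption chosen for the example, not a measurement of any book in this thread, and the function name is made up for illustration.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed image size in bytes for a page scanned at the given DPI."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


PAGE = (6.0, 9.0)  # hypothetical page size in inches

for dpi in (300, 600):
    for label, bpp in (("black and white", 1), ("grayscale", 8)):
        size = raw_scan_bytes(*PAGE, dpi, bpp)
        print(f"{dpi} dpi {label}: {size / 1_000_000:.2f} MB uncompressed")
```

Doubling the DPI quadruples the raw size, while a 1-bit black and white image is one eighth the size of an 8-bit grayscale image at the same resolution. Fax G4 or JBIG2 compression would then reduce the black and white figure further, by an amount that depends on the page content.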
- rkomar
- Posts: 98
- Joined: 12 May 2013, 16:36
- E-book readers owned: PRS-505, PocketBook 902, PRS-T1, PocketBook 623, PocketBook 840
- Number of books owned: 3000
- Country: Canada
Re: Hello I am glad to be here!
cday wrote: If books you are scanning are black and white, the pages would probably be best saved -- from the point of view of image quality and file size -- as black and white TIFF files with Fax G4 or -- if available -- JBIG2 compression.
I would say that this is true if you scan at a high enough resolution (say, 600 dpi). You can then rescale that to lower-resolution grayscale images that still look pretty good. I wouldn't rescale to smaller black and white images, though, as a lot of the detail in the pieces will be lost. The grayscale images should be compressed with some lossless format. I agree that JPEG should be avoided at every stage of the process.
Edit: I just tried scanning a chess diagram, and even 600 dpi isn't enough to get all the details in black and white. If you can't go higher in resolution than that, then I'd say that scanning in grayscale is the way to go.
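To make the point about rescaling to grayscale rather than to black and white concrete, here is a small Python sketch that shrinks a black and white bitmap by averaging blocks of pixels. It is only an illustration under simple assumptions (a plain box filter and a tiny hand-made patch); real tools use more sophisticated resampling, and the function name is made up for this example.

```python
def downscale_to_gray(bilevel, factor):
    """Box-filter a 0/1 bitmap (1 = black ink) down to 8-bit grayscale.

    Each factor x factor block becomes one pixel: 255 = white, 0 = black.
    """
    height = len(bilevel) // factor
    width = len(bilevel[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Count the black pixels in this block.
            ink = sum(
                bilevel[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            coverage = ink / (factor * factor)
            row.append(round(255 * (1 - coverage)))
        out.append(row)
    return out


# A tiny 4x4 patch containing a one-pixel-wide diagonal stroke.
patch = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
gray = downscale_to_gray(patch, 2)
print(gray)
```

The half-covered blocks come out as mid-gray, so the thin stroke is still visible in the smaller image. Forcing each of those pixels to pure black or pure white instead would either thicken the stroke or erase it, which is the kind of detail loss described above.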
- ouroboros
- Posts: 8
- Joined: 02 Oct 2012, 10:22
- E-book readers owned: Kindle Fire
- Number of books owned: 500
- Country: US
Re: Hello I am glad to be here!
Wow! Thank you cday and rkomar for your suggestions!
I'll give more info... in scatological order, as that is my thought process... I wish I were kidding!
My Lexmark scanner came with ABBYY 6. I cannot scan as a document because the OCR reads the piece symbols in the text as letters, not as symbols. For example, the rook is read as two capital I's (usually iIIi).
So I think I am now hitting on my problem... I scan as a color photo. Also, the paper used is VERY thin, so much so that bleed-through is a problem. (I mean you can read the next page through the paper!) I was not happy with the outcome after I processed it with Scan Tailor.
Recently, I started using Adobe Acrobat XI to scan instead, saving directly as a PDF. I am sure those examples came from those scans. From there I saved as JPEGs (which I shall never do again, lol), or TIFFs or PNGs for that matter.
I will use the suggestions listed so far and see what I can get...
Best Wishes!
ouroboros
Re: Hello I am glad to be here!
ouroboros wrote: ... the paper used is VERY thin so much so that bleed through is a problem ...
You should be able to avoid bleed through, or at least most bleed through, by placing a sheet of matte black art paper behind the page you are scanning.
rkomar has a point about grayscale, at least in the sense that grayscale (and colour) images can be displayed anti-aliased, which can considerably smooth and enhance the appearance of text, for example. You should be able to scan direct to grayscale and see how it displays your images, but the downside is that file sizes for grayscale or colour will be much larger than for black and white with suitable compression. I'm surprised that 600dpi didn't produce a good quality black and white image.
My point about descreening settings may not apply to pure black and white pages, but would certainly be relevant if there are any gray halftones or black and white photographs on the pages -- formed from black ink on white paper, using variable size dots to produce the gray tones.
Be aware that when you save a scan to PDF the image in the file is normally still stored within the file in one of the standard image formats such as JPG or TIFF, and with an appropriate -- hopefully selectable -- level of compression.
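One rough way to see which image coding a PDF is using is to look for the stream filter names inside the file. The Python sketch below does a plain byte search for the filter names defined by the PDF format. It is a heuristic under stated assumptions, not a PDF parser: FlateDecode is also used for non-image streams, and names stored inside compressed object streams will not be found. The function name and the sample bytes are made up for illustration.

```python
import re

# Stream filter names from the PDF format and the coding each one implies.
PDF_IMAGE_FILTERS = {
    b"DCTDecode": "JPEG (lossy)",
    b"JPXDecode": "JPEG 2000",
    b"CCITTFaxDecode": "CCITT fax (Group 3 or Group 4)",
    b"JBIG2Decode": "JBIG2",
    b"FlateDecode": "Flate/zlib (lossless)",
    b"LZWDecode": "LZW (lossless)",
    b"RunLengthDecode": "run-length (lossless)",
}


def find_pdf_filters(pdf_bytes):
    """Count occurrences of each known stream filter name in raw PDF bytes."""
    counts = {}
    for name, meaning in PDF_IMAGE_FILTERS.items():
        hits = len(re.findall(rb"/" + re.escape(name) + rb"\b", pdf_bytes))
        if hits:
            counts[name.decode("ascii")] = (hits, meaning)
    return counts


# Hand-written fragment standing in for a real file's contents.
sample = b"<< /Type /XObject /Subtype /Image /Filter /CCITTFaxDecode /Width 3600 >>"
print(find_pdf_filters(sample))
```

For a real scan, the bytes would come from reading the PDF in binary mode, for example `open(path, "rb").read()` with `path` pointing at the file. A DCTDecode hit on a black and white scan would suggest the page was stored as a JPEG.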
- ouroboros
- Posts: 8
- Joined: 02 Oct 2012, 10:22
- E-book readers owned: Kindle Fire
- Number of books owned: 500
- Country: US
Re: Hello I am glad to be here!
I scanned a page at 600 dpi black and white as well as 600 dpi gray. The black and white seemed cleaner to me... the blacks "popped" off the page. There was not that much of a size difference between the two. So, I would choose B&W over gray.
However, I did find a significant size difference for the same page: black and white .tiff @ 600 dpi (114 KB) versus B&W PDF @ 600 dpi (74.5 KB)!
Thank you again cday!
Re: Hello I am glad to be here!
Scanning to black and white (or converting to black and white after scanning to grayscale or colour) should eliminate or at least largely eliminate the bleed through problem as any bleed through lighter than mid-gray will be displayed as white.
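The thresholding described above can be sketched in a few lines of Python. The pixel values are hypothetical numbers chosen for the example, and the fixed mid-gray cutoff of 128 is an assumption; scanner software may pick the cutoff differently.

```python
def threshold(gray_rows, cutoff=128):
    """Map 8-bit gray values (0 = black, 255 = white) to 1 = black, 0 = white."""
    return [[1 if value < cutoff else 0 for value in row] for row in gray_rows]


# Hypothetical values: 30 = printed ink, 200 = faint bleed-through, 245 = paper.
scan_row = [[245, 200, 30, 30, 200, 245]]
print(threshold(scan_row))  # [[0, 0, 1, 1, 0, 0]]
```

The bleed-through pixels are lighter than the cutoff, so they end up white along with the paper, and only the printed ink stays black.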
ouroboros wrote: ... I did find a significant difference on the same page Black and white .tiff @ 600dpi(114 KB) and B&w pdf @600dpi(74.5 KB)
The difference in file sizes is almost certainly because the images are encoded in different file formats and/or with different compression: the PDF format is really only a 'wrapper' around the image(s) contained in the file, and usually only slightly increases the overall file size. The Acrobat user interface generally gives rather less control of compression than a typical scanner interface.
A possible explanation for the difference in file sizes above is that the Acrobat image is compressed using JBIG2, a format Acrobat supports that is less commonly available in scanner drivers. But we don't actually know the compression method used for the TIFF file, or the file format and compression method used in the PDF file -- if you would care to upload the files I can look at them to see if you are using the optimum methods.
Depending on how important quality, scanning time and file size are, you might try a few more test scans to see if 600dpi is actually needed in black and white, and to evaluate different file formats and compression methods.
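For anyone who wants to check the compression method of a TIFF file themselves, here is a minimal Python sketch that reads the Compression tag (tag 259) from the first image directory of a classic TIFF. It is an illustration only: it does not handle BigTIFF or multi-page files, the table of values is not exhaustive, and the function name is made up for this example.

```python
import struct

# A few values of TIFF tag 259 (Compression); this list is not exhaustive.
TIFF_COMPRESSION = {
    1: "uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 fax",
    4: "CCITT Group 4 fax",
    5: "LZW",
    7: "JPEG",
    8: "Deflate (Adobe style)",
    32773: "PackBits",
}


def tiff_compression(data):
    """Return the Compression tag value from the first IFD of a TIFF, or None."""
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        tag, field_type, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == 259:
            # A single SHORT value sits in the first two bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return None  # tag absent; TIFF readers then assume 1 (uncompressed)
```

Calling `tiff_compression(open(path, "rb").read())` on a scan, with `path` pointing at the file, and looking the result up in `TIFF_COMPRESSION` would show, for example, whether a black and white page was saved with Group 4 fax compression (value 4) or left uncompressed (value 1).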
- rkomar
- Posts: 98
- Joined: 12 May 2013, 16:36
- E-book readers owned: PRS-505, PocketBook 902, PRS-T1, PocketBook 623, PocketBook 840
- Number of books owned: 3000
- Country: Canada
Re: Hello I am glad to be here!
I think you should also test by scanning the board layouts in a few books. I did my test on a rather old book (Capablanca's "Chess Fundamentals"), and I found that scanning at 600 dpi was barely good enough to do black and white. I have uploaded the results of the scans at 300 dpi and at 600 dpi (both PNG and JPG) at http://www3.telus.net/rkomar/chess/ if anyone wants to see what I mean. I've displayed the black and white PNG files here, as well.