Creating a Digital Library

Book scanning methods that involve taking books apart.

Moderator: peterZ

John_Latta
Posts: 11
Joined: 25 Sep 2013, 16:54
E-book readers owned: iPad
Number of books owned: 10000
Country: US

Re: Creating a Digital Library

Post by John_Latta »

Good points have been raised.

Preserving these books for posterity is not my objective. Most books have a short useful life, so if my great-grandchildren found them they would likely be uninterested. However, I have created archives of family history, including photos (nearly 500GB), on USB drives. Daniel is spot on: make many copies and give them to all potentially interested parties. That is one of the safest means of preservation, but there is no need to do this for my digital library.

The current bookshelf has 3,502 books, which require only 1.42TB, so the amount of space is relatively small. All of this can fit on the SSD of the Lenovo Yoga C940, which is one of the book readers. The Active Directory network here has 200TB of RAID 6 storage, which holds many backup copies of the library. These backups are available to every computer on the network but are very seldom needed. Selected books are also downloaded to an iPhone for reading.
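
For anyone checking the arithmetic, those figures work out to roughly 400MB per book on average. A minimal sketch, assuming decimal units (1TB = 1,000,000MB); the function and variable names are illustrative only:

```python
# Figures quoted in the post; decimal units are an assumption.
BOOKS = 3502
LIBRARY_TB = 1.42
MB_PER_TB = 1_000_000

def average_book_mb(total_tb: float, book_count: int) -> float:
    """Return the mean size of one book in megabytes."""
    return total_tb * MB_PER_TB / book_count

avg_mb = average_book_mb(LIBRARY_TB, BOOKS)
print(f"Average book size: {avg_mb:.0f} MB")
```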

Cloud storage is not used. It is too expensive, too slow and inconsistent with the constantly changing library. Keep it local and protect it extensively: that is my mantra.
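
One way to check that a local backup copy still matches the master library is to compare checksums. The following is only a sketch under stated assumptions: the post does not say how the backups are verified, the function names are hypothetical, and SHA-256 is simply a common choice.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte scans need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def mismatched_files(master: Path, backup: Path) -> list[str]:
    """List files under `master` that are missing or different in `backup`."""
    problems = []
    for source in sorted(master.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(master)
        copy = backup / relative
        if not copy.is_file() or sha256_of(source) != sha256_of(copy):
            problems.append(relative.as_posix())
    return problems
```

An empty list from `mismatched_files` means every file in the master folder has an identical copy in the backup folder; it does not report extra files that exist only in the backup.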
John_Latta

Re: Creating a Digital Library

Post by John_Latta »

Changed on Cloud Storage

For other reasons I had a motivation to evaluate cloud storage. My experiences with the various services have been mixed, a key reason being the subscription charges. pCloud offers 2TB for a fixed fee of $350. Another factor is that it is fast. There are apps for smartphones.

Books in the digital library are large: typically 100MB or larger, with some as large as 3GB. I want large files to retain high photo quality, and the higher page resolution implies better OCR quality.

One 847MB book was deposited on pCloud. It took a while to download to the iPhone using the pCloud app, but the reading experience within the app was excellent.

This capability allows me to put books of current interest on the iPhone. The experience is very similar to the ebook offerings, but this is MY digital library.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Creating a Digital Library

Post by daniel_reetz »

Right. Very interesting, John. At the end of the day "The Cloud" just means "somebody else's computer". I don't trust the cloud at all. Companies and services come and go. And who knows what happens to all those computers that somebody else once owned?

In the olden days, when I was young ;) we ran our own servers and had our own files on them. And I think for a personal library this is a great idea. When I worked at the Internet Archive, they were testing the idea of a digital locker: you could have your own library there, for yourself only, but using their reader. I loved this idea, but have no idea where it ended up.

One principle:
Just like the personal scanners, personal libraries need to serve their users.

Another principle:
What might seem worthless to you could be gold to others, and vice versa. Archiving would be super-tractable if each book only needed to be scanned once, instead of being scanned per-user.
iam2sam
Posts: 12
Joined: 11 Apr 2015, 07:26
Number of books owned: 0
Country: US

Re: Creating a Digital Library

Post by iam2sam »

John,
You hinted at some of the answers to these questions in your reply to Daniel R., but I'm curious about the specific motivation. Did you scan books for which good quality digital copies were freely available? If so, why? Are you satisfied with the results of your investment in time and resources in duplicating those particular digital editions? This is of particular interest for me, as I intend to create a digital copy of my library. It is only a bit more than 10% the size of yours in volumes (~600) but, based on your experience, that would still require a formidable effort. My tentative plan is to identify and procure digital copies for those books I can, subject to scanned quality (including indexing on those books where I think that is important) and cost, and digitize and index the remaining volumes myself. I'm hoping that your answer will help me decide if I have missed anything important in formulating that plan.
BillGill
Posts: 139
Joined: 18 Dec 2016, 17:13
E-book readers owned: Calibre, FBReader
Number of books owned: 7000
Country: USA

Re: Creating a Digital Library

Post by BillGill »

I haven't been replying to this thread before, but now I think I might be able to help you. I have been doing just about what you are talking about. I have a large personal library and want to recreate a good portion of it as a digital library. So far I have been downloading the books where I can from Gutenberg (http://www.gutenberg.org/), or buying them where Gutenberg doesn't have them. However, there are quite a few books in my library that are not available in digital form, so I have been scanning them and converting them to my preferred format, which is EPUB; your preference may be different.

I do not plan to distribute any books that are in copyright. These books are strictly for my own personal use.

So far I have scanned over 100 books. It has taken me about 4 years, so doing this is not a trivial task. It generally takes me about 1 to 2 weeks to finish a book, working on it several hours a day. I find that the biggest part of the task is editing to correct errors that the OCR software generates. I generally go through the text 3 times in a text editor of some sort, and then once more in the Calibre (https://calibre-ebook.com/) EPUB editor.
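
To put that pace in perspective, here is a rough projection. It is a sketch only: it assumes the same rate of about 100 books in 4 years holds, and that every volume has to be scanned and corrected by hand, which overstates the work if many titles can be downloaded or bought. The 600-volume figure is the library size mentioned earlier in the thread; the function name is illustrative.

```python
# Figures from the thread: about 100 books scanned in about 4 years.
BOOKS_DONE = 100
YEARS_SPENT = 4

def years_to_scan(book_count: int,
                  books_done: int = BOOKS_DONE,
                  years_spent: float = YEARS_SPENT) -> float:
    """Project how long `book_count` books would take at the observed rate."""
    books_per_year = books_done / years_spent
    return book_count / books_per_year

print(years_to_scan(600))  # 24.0
```

The per-book figure is consistent with the total: 100 books at 1 to 2 weeks each is 100 to 200 weeks, or roughly 2 to 4 years.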

By the way, when I find that books I have scanned in have become available for sale I do buy the official copy. I have done this several times.

Bill
iam2sam

Re: Creating a Digital Library

Post by iam2sam »

BillGill wrote: 18 Dec 2020, 14:53 I haven't been replying to this thread before, but now I think I might be able to help you. I have been doing just about what you are talking about...
Good information to know, thanks. I am aware of the Project Gutenberg library, but have spent more time looking for book copies at archive.org. In your experience, what are the salient differences between the two sites for this purpose?
BillGill

Re: Creating a Digital Library

Post by BillGill »

I'm afraid I can't answer that one. I haven't ever tried archive.org. Let me look at it. Well, it seems to have a lot in it. I may have to check it out. At first glance it is kind of confusing.

Bill
iam2sam

Re: Creating a Digital Library

Post by iam2sam »

BillGill wrote: 19 Dec 2020, 12:18 I'm afraid I can't answer that one. I haven't ever tried archive.org. Let me look at it. Well, it seems to have a lot in it. I may have to check it out. At first glance it is kind of confusing.

Bill
I can understand that. It doesn't exhibit a tremendous amount of structure. To me it is geared more toward searching for a specific document, or a document on a specific subject, than toward browsing. My process to this point is to search IA for a book that I have in my physical library and, if it is available, evaluate the quality. If it is adequate, I will download their copy rather than digitize mine. I am primarily using PDF as it is imo more of a "common denominator" than ePub. However, I have had difficulties in adapting scanned PDF material to my eReader. I'm going to experiment to see if either ScanTailor Advanced or k2pdfopt can produce acceptable results on my device; if not, I may need to take another look at ePub...
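
The lookup step in that process could be scripted. The following is a minimal sketch, assuming the Internet Archive's advancedsearch.php endpoint and its q, fl[], rows and output parameters; the helper name and the example title are hypothetical, and the code only builds the search URL without fetching anything.

```python
from urllib.parse import urlencode

# Assumed endpoint: the Internet Archive's advanced search interface.
IA_SEARCH = "https://archive.org/advancedsearch.php"

def ia_search_url(title: str, rows: int = 10) -> str:
    """Build a search URL for scanned texts whose title matches `title`."""
    query = f'title:("{title}") AND mediatype:(texts)'
    params = [
        ("q", query),
        ("fl[]", "identifier"),  # item id, as used in archive.org/details/<id>
        ("fl[]", "title"),
        ("rows", rows),
        ("output", "json"),
    ]
    return f"{IA_SEARCH}?{urlencode(params)}"

print(ia_search_url("Walden"))
```

Judging whether a given scan is adequate still has to be done by eye; the search only narrows down the candidates.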
BillGill

Re: Creating a Digital Library

Post by BillGill »

I use EPUB because it is the most universal format for Ebooks. It was developed specifically for Ebooks, and a number of Ebook readers are available as free downloads. If you are not familiar with it, it is based on HTML: it was designed to wrap up a bunch of HTML files into one big file. It works on most devices. Calibre (https://calibre-ebook.com/), which is described as an Ebook management system, can convert most digital book formats into EPUB and then edit them.
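
The "one big file" is a ZIP archive, so its contents can be inspected with ordinary tools. Here is a minimal sketch using Python's standard zipfile module. The toy archive it builds is hypothetical and is not a complete, valid EPUB (it has no package document); it only illustrates the idea of HTML files wrapped in a container.

```python
import io
import zipfile

def build_toy_epub() -> bytes:
    """Create an in-memory, EPUB-like ZIP archive for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # EPUB stores an uncompressed `mimetype` entry first in the archive.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("OEBPS/chapter1.xhtml", "<html><body>One</body></html>")
        archive.writestr("OEBPS/chapter2.xhtml", "<html><body>Two</body></html>")
    return buffer.getvalue()

def describe_epub(source) -> dict:
    """Return the declared mimetype and the HTML files inside an EPUB."""
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        mimetype = archive.read("mimetype").decode("ascii").strip()
    html_files = [n for n in names if n.endswith((".xhtml", ".html"))]
    return {"mimetype": mimetype, "html_files": html_files}

info = describe_epub(io.BytesIO(build_toy_epub()))
print(info["mimetype"])
print(info["html_files"])
```

`describe_epub` also accepts a path to a real .epub file in place of the in-memory buffer.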

Bill
dpc
Posts: 379
Joined: 01 Apr 2011, 18:05
Number of books owned: 0
Location: Issaquah, WA

Re: Creating a Digital Library

Post by dpc »

iam2sam wrote: 19 Dec 2020, 15:36 ...
I am primarily using PDF as it is imo more of a "common denominator" than ePub. However, I have had difficulties in adapting PDF scanned material to my eReader.
...
What sort of difficulties are you having?

I work exclusively with PDFs in digitizing books. I tried ePub a decade ago when I began to look at book scanning, and it took too much hand-holding to get a file without conversion/formatting errors. I realize PDF isn't for everyone, and I applaud Bill's desire to convert his entire library to ePub; it was just too much work for me. ePub also doesn't capture the highlighting and notes/corrections that I've written in the margins of some of my old textbooks.

In any case, I've found that the PDF viewers found on some tablet devices don't do a good job of displaying my PDFs. If I use the Adobe Acrobat reader on these devices the pages look great, so you might want to try different PDF viewers to see if that helps. I use Adobe Acrobat to produce the PDFs and found that it helps to produce page images that are the same resolution as my target tablet device. I also create a searchable PDF and use Adobe's ClearScan. This does create larger file sizes, but storage is cheap (and only getting cheaper as time goes on) so that isn't a concern for me.
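
One reading of "the same resolution as my target tablet device" is that a full page should map onto the screen pixel for pixel. A minimal sketch of that arithmetic follows; the 6 x 9 inch page and the 1536 x 2048 pixel screen are hypothetical example values, and the function names are illustrative.

```python
def matching_dpi(page_w_in: float, page_h_in: float,
                 screen_w_px: int, screen_h_px: int) -> float:
    """Largest DPI at which the whole page still fits the screen 1:1."""
    return min(screen_w_px / page_w_in, screen_h_px / page_h_in)

def page_pixels(page_w_in: float, page_h_in: float, dpi: float) -> tuple[int, int]:
    """Pixel dimensions of a page image rendered at `dpi`."""
    return round(page_w_in * dpi), round(page_h_in * dpi)

# Hypothetical example: a 6 x 9 inch book on a 1536 x 2048 pixel tablet.
dpi = matching_dpi(6, 9, 1536, 2048)
print(f"{dpi:.1f} DPI ->", page_pixels(6, 9, dpi))
```

Scanning at a higher resolution and downsampling to this size for the reading copy keeps the option of retaining the full-resolution scan as an archive copy.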