Distributed Digital Library - Ideas

Would you contribute to the development of a digital library?

Yes - I could contribute ideas. (6 votes, 25%)
Yes - I could contribute images. (6 votes, 25%)
Yes - I could code. (3 votes, 13%)
Yes - I could do OCR. (3 votes, 13%)
Yes - I could do quality control. (3 votes, 13%)
Yes - I could help administer and manage. (3 votes, 13%)
No - I do not have the time but I think it is a good idea. (0 votes)
No way - how can ordinary people do what Google does? (0 votes)

Total votes: 24

daniel_reetz
Posts: 2797
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Distributed Digital Library - Ideas

Post by daniel_reetz » 22 Mar 2010, 17:23

the first instructable is not gospel, and dumpster diving was never necessary, just fun. but since that comes up over and over again, check my New Standard Scanner thread for fresh, near-complete plans.

just a note, would you consider blockquoting the copy/paste from other sites so we can clearly see what is your writing and what is not? I know you're formulating an argument here but it is important to quote the minimum necessary and be clear about attribution. thanks :)

sanjayayogi
Posts: 19
Joined: 04 Mar 2014, 00:52

Re: Distributed Digital Library - Ideas

Post by sanjayayogi » 22 Mar 2010, 17:34

daniel_reetz wrote: the first instructable is not gospel, and dumpster diving was never necessary, just fun. but since that comes up over and over again, check my New Standard Scanner thread for fresh, near-complete plans.

just a note, would you consider blockquoting the copy/paste from other sites so we can clearly see what is your writing and what is not? I know you're formulating an argument here but it is important to quote the minimum necessary and be clear about attribution. thanks :)

I am updating my posts to make it clear what is linked and what is copy/pasted, and to credit the source. Generally, I give the link, say that the following text is from that site, and then quote specifics from within the copy/paste to draw further attention to them.

Daniel, by the way, I was just joking about the dumpster diving; it is part of what caught my interest. I like and admire your style!

wels
Posts: 21
Joined: 04 Mar 2014, 00:52

Re: Distributed Digital Library - Ideas

Post by wels » 03 Aug 2010, 07:32

After over half a year of inactivity I passed by, looking up unread articles - here I am ;)

I like the idea.

A similar idea crossed my mind last year: it would be useful to have a distributed global directory of scanned books (esp. those scanned by private individuals) - just a collection/database of metadata about those books, no content at this stage. So, before scanning (and OCRing) a book, you can check whether someone has already done that one. Instead of doing the whole work again, one could contact the other individual and, e.g., sort things out on a peer-to-peer basis.
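
A minimal sketch of such a metadata-only directory, assuming books are looked up by ISBN. The `ScanRecord` fields and the `ScanDirectory` class are hypothetical names chosen for illustration; nothing here stores book content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRecord:
    """Metadata about one scan; field names are illustrative assumptions."""
    isbn: str        # lookup key (assumed; other identifiers could be used)
    title: str
    scanned_by: str  # contact handle of the person who holds the scan
    has_ocr: bool


def normalize_isbn(isbn: str) -> str:
    """Drop hyphens and spaces so differently formatted ISBNs compare equal."""
    return "".join(ch for ch in isbn if ch.isalnum()).upper()


class ScanDirectory:
    """In-memory directory of scan metadata, keyed by normalized ISBN."""

    def __init__(self):
        self._by_isbn = {}

    def register(self, record: ScanRecord) -> None:
        key = normalize_isbn(record.isbn)
        self._by_isbn.setdefault(key, []).append(record)

    def lookup(self, isbn: str) -> list:
        """Return all known scans of this book (empty list if none)."""
        return list(self._by_isbn.get(normalize_isbn(isbn), []))
```

Before scanning, one would call `lookup()` with the book's ISBN and, if the list is not empty, contact the person named in `scanned_by` instead of repeating the work.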

... and I wouldn't stick to file names so much. From my pov, it's a detail. Future developments are heading towards meta-object systems on top of file systems anyway (an additional abstraction layer). So, you'll have some kind of descriptor 'file' for each data file, for each directory, for each group of data objects ... like embedded metadata in MP3s or EXIF in JPEGs.
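
One way to picture the descriptor idea is a sidecar file stored next to each data file. This is only a sketch under assumptions: the `.meta.json` suffix, the JSON encoding, and the function names are invented for illustration and are not an established convention.

```python
import json
from pathlib import Path


def descriptor_path(data_file: Path) -> Path:
    """Return the sidecar path, e.g. page_0001.tif -> page_0001.tif.meta.json."""
    return data_file.with_name(data_file.name + ".meta.json")


def write_descriptor(data_file: Path, metadata: dict) -> Path:
    """Store metadata beside the data file without touching the file itself."""
    path = descriptor_path(data_file)
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return path


def read_descriptor(data_file: Path) -> dict:
    """Load the sidecar metadata, or an empty dict if no descriptor exists."""
    path = descriptor_path(data_file)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

Because the descriptor is derived from the name of the thing it describes, the same functions work for a directory or any other named object, not only for image files.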

sanjayayogi
Posts: 19
Joined: 04 Mar 2014, 00:52

Re: Distributed Digital Library - Ideas

Post by sanjayayogi » 03 Aug 2010, 10:31

I like your idea of sharing metadata only, describing the scans. If we only share metadata about the scans, it would be difficult to accuse us of illegally scanning copyrighted material. The idea of not re-scanning books that have already been scanned is smart. Time is valuable.

I have been playing with CouchDB, a database designed for easy replication and distribution. Each person could maintain a local database of metadata about scanned books.

The other very nice thing about CouchDB is that it is a non-relational database, so additional information can be added to a record very simply. After that, replication can be automated (1.0.0) if desired. It would make a great back-end for this sort of project. I set up a Linux VPS that I have been using to test some ideas over the last few months, and it seems that all of the open-source pieces needed to make a project like this possible already exist.
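
As a rough sketch of what this could look like: CouchDB stores JSON documents identified by an `_id` field, and replication is requested by POSTing a JSON body to the server's `/_replicate` endpoint. The field names in the book document, the database name `scans`, and the server address below are assumptions for illustration. The code only builds the request and does not contact any server.

```python
import json
import urllib.request

# Assumption: a local CouchDB instance listening on its default port.
COUCH_URL = "http://localhost:5984"


def book_document(doc_id: str, title: str, **extra) -> dict:
    """Build a metadata document; extra fields can be added freely (no schema)."""
    doc = {"_id": doc_id, "title": title}
    doc.update(extra)
    return doc


def replication_request(source: str, target: str, continuous: bool = False,
                        server: str = COUCH_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source, "target": target, "continuous": continuous}
    return urllib.request.Request(
        url=server + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen()` would actually start the replication; `continuous=True` asks the server to keep the two databases in sync rather than copying once.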

My idea of embedding metadata in the name of the scan is so that the metadata can be "stripped" from the name itself, allowing it to be machine-read during post-processing of the scanned materials. I agree it is a detail, albeit a necessary one: it saves having to rename files later, and it also makes it easy to tell what a file contains just by reading its name.
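
To illustrate "stripping" metadata from a name, here is a minimal sketch. The naming pattern `<author>_<title>_<year>_p<page>.<ext>` is a hypothetical example made up for this snippet, not the scheme proposed in this thread.

```python
import re
from typing import Optional

# Hypothetical pattern: <author>_<title>_<year>_p<page>.<ext>
SCAN_NAME = re.compile(
    r"^(?P<author>[^_]+)_(?P<title>[^_]+)_(?P<year>\d{4})"
    r"_p(?P<page>\d+)\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_scan_name(filename: str) -> Optional[dict]:
    """Extract metadata from a scan file name, or return None if it does not match."""
    match = SCAN_NAME.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric parts so later processing can sort and compare them.
    fields["year"] = int(fields["year"])
    fields["page"] = int(fields["page"])
    return fields
```

A post-processing script could run this over a directory of scans and skip (or flag for renaming) any file for which it returns `None`.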

In case people are interested enough in this project, I will post my email here so that you may contact me directly. In the past, when I started an open-source project of this nature, I usually started it on Google Groups, so that we could have a focused development forum and not fill up a venue like this with development discussion that may not be the originating forum's primary focus. Then I would directly invite an initial hand-picked group that would bring ideas to the table, later opening it to all contributors (as a moderated group). I did this a while ago, and in 4 months we had over 50 developers and testers working together globally.

Anyone interested?

Email me:

sanjaya.yogi at gmail.com
