Search found 6 matches
- 17 Sep 2010, 21:27
- Forum: Chat
- Topic: What's broken about eBooks? What would un-break them?
- Replies: 43
- Views: 42848
Re: What's broken about eBooks? What would un-break them?
If, as Wikipedia ( http://en.wikipedia.org/wiki/Extensible_Metadata_Platform ) says, XMP is most commonly serialized and stored using a subset of the W3C Resource Description Framework (RDF), which is in turn expressed in XML, then this could be relevant to the Semantic Web, as RDF http://www.w3.org/...
- 16 Sep 2010, 23:56
- Forum: Chat
- Topic: What's broken about eBooks? What would un-break them?
- Replies: 43
- Views: 42848
Re: What's broken about eBooks? What would un-break them?
Well, I hate replying to myself, but I meant to add that there's an ambitious project to build a Semantic Web ( http://en.wikipedia.org/wiki/Semantic_Web ) that would help us understand the context and content of digital material. An important prerequisite to the Semantic Web is to make digital content and...
- 16 Sep 2010, 23:31
- Forum: Chat
- Topic: What's broken about eBooks? What would un-break them?
- Replies: 43
- Views: 42848
Re: What's broken about eBooks? What would un-break them?
Metadata! To archive and distribute books, good metadata will become more essential. Agreed. Metadata will help us sift through book collections. Some electronic documents are rife with metadata, for instance, emails. There's the usual stuff: sender, recipients, subject, and body, but emails can co...
- 14 Sep 2010, 23:21
- Forum: Chat
- Topic: What's broken about eBooks? What would un-break them?
- Replies: 43
- Views: 42848
Re: What's broken about eBooks? What would un-break them?
@ univershul -- Not all PDFs are generated the same way. With Apple and now with some desktop Linuxes like Ubuntu, you can Print to PDF. It just means that you get an image of the content -- in most cases it's another rendering of the content to a bitmap and then to a page, which can be far away from it...
- 11 Sep 2010, 22:44
- Forum: Show and Tell / Book Projects
- Topic: DIY scanner and Scan Tailor processed books on Google Books
- Replies: 16
- Views: 75245
Tesseract without compilation
There are three decent (depending on your needs and skills) options for open source OCR right now: Tesseract, Ocropus, and Cuneiform. Tesseract - http://code.google.com/p/tesseract-ocr/ - the development version has to be built from source in order to get page layout analysis. See http://code.goo...
- 11 Sep 2010, 16:41
- Forum: Introductions and connections
- Topic: Post something about yourself here (The Hello Thread)
- Replies: 441
- Views: 657915
Re: Post something about yourself here (The Hello Thread)
Hello Thread -- Thirty years ago I had a student job at the Georgia Newspapers Project ( http://www.libs.uga.edu/gnp/ ) where I prepared old and crumbling newspapers for the microfilm cameras by taping up tears and flattening folds and creases with a hot iron. I wondered why they couldn't be OCR-ed,...