E-book standard format


Kirtai
Posts: 10
Joined: 27 Jul 2009, 10:49
E-book readers owned: PRS-505
Number of books owned: 3000
Location: Scotland

Re: E-book standard format

Post by Kirtai »

rob wrote:Just one other comment: I think reflowable layout will eventually die. They are here for now because (a) most ebook readers aren't large enough right now to handle a real page, and (b) web pages used to have to fit on small screens.
What do you mean when you say "real page"?
rob wrote:Sure, there can be several different editions of a book, all with a different layout, but that still doesn't change the underlying text which, remember, should be searchable and can be put on a separate layer. There shouldn't be a need, though, to reformat a book on the fly. Your ebook reader should be able to handle an appropriate edition.
One real reason to reformat on the fly is for people who need to resize the text due to poor eyesight (that'll be us as we get older btw). Another reason is when the publisher uses hideous fonts in an ebook. (I bought the Dresden Files books and they use a font that might be quite nice on paper but is absolutely awful to read at the standard zoom level on the Sony.)

Do you think publishers will provide normal & large print versions in every size?
ceeann1
Posts: 106
Joined: 17 Nov 2010, 20:00
E-book readers owned: Several Palm PDA's
Number of books owned: 700
Location: Albuquerque, New Mexico

Re: E-book standard format

Post by ceeann1 »

I thought I would sum up what has been discussed so far and see if there is some common ground.

We Said:
1. I want an image, and underneath, text that can be searched, tagged with the location of the text in the image, and that's what searchable PDF does.
2. As a teacher, PDFs are slightly preferable to DjVu since everyone knows how to handle a PDF.
3. DjVu is an uncommon standard compared to PDF.
4. There is certainly a place for reflowable text, but that is at the publisher's level, not the reader's level.
5. The searchable PDF/DjVu option is good. I used it for a while, until I realized how much time I saved by turning off OCR in djvubind.
6. I agree with recent comments that a format that preserves the layout is preferred.
7. Knowing of course that some of us would prefer to use a different format, we could just have a set of format changers as a group if we had somewhere to start from.
8. If we agree to the requirement of having format changers for the group, I recommend that the format require lossless compression so that a secondary format can be made with the same quality of input.
9. ddjvu can output to pdf, tiff, pbm, pgm, ppm, pnm, or rle.
10. In the future, books are going to contain more graphical content.
11. A book is in fact not the product merely of the author, but also of the whole team who (ideally) work together to produce the entire look and feel of the book
12. One real reason to reformat on the fly is for people who need to resize the text due to poor eyesight (that'll be us as we get older btw). Another reason is when the publisher uses hideous fonts in an ebook.

I have chosen the above clips from the text of our comments to try to compress what we said to the essentials. I have in places restated the clip to make it more readable for the stated purpose. I have no doubt missed something important so if that is the case please state what should be added. From the above I want to say what we are thinking in general terms.
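
As a rough illustration of the "format changer" idea in items 7 to 9, here is a minimal Python sketch that builds a ddjvu command line for one of the output formats listed in item 9. The function name and the file names are made up for the example, and the `-format=` option is taken from the DjVuLibre ddjvu manual as I understand it, so check your installed version before relying on it.

```python
import shlex

# Output formats listed in item 9 of the summary.
DDJVU_FORMATS = ("pdf", "tiff", "pbm", "pgm", "ppm", "pnm", "rle")


def ddjvu_command(src, dst, fmt):
    """Build the argument list for converting a DjVu file with ddjvu.

    ddjvu_command is a hypothetical helper, not part of DjVuLibre.
    """
    if fmt not in DDJVU_FORMATS:
        raise ValueError(f"unsupported ddjvu output format: {fmt}")
    return ["ddjvu", f"-format={fmt}", src, dst]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = ddjvu_command("book.djvu", "book.pdf", "pdf")
    print(shlex.join(cmd))
    # To actually run the conversion: subprocess.run(cmd, check=True)
```

The sketch only builds the command, so it can be tried without ddjvu installed. A group "format changer" could wrap calls like this one for each target format.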

An E-book standard format should:
1. Preserve the format of the book on a page by page basis.
2. Have searchable text with the location of the text available.
3. Have lossless compression to allow e-book format changes of the same quality.
4. Allow for expansion of graphic content in the future.
5. Be as fast as possible to use (speed is an issue).
6. Allow for use by those who are visually handicapped.
7. Allow for formats of differing purposes since there are real reasons for using alternative formats.

I believe that summarizes what we have been saying insofar as we have been speaking of a standard e-book format. I have not presented these in any particular order. I have no doubt inserted my own bias although I have tried to keep that as minimal as possible. I believe there is more to say. I really did not think of many of these objectives and I think they are really very insightful and important!!
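
To make requirements 1 and 2 a bit more concrete, here is a small Python sketch of what "a page image plus searchable text with locations" could look like as a data structure. All the names (Word, Page, search) and the sample data are invented for illustration. This is not a proposal for the actual format, just one way to picture it.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    text: str
    # Bounding box in image pixels: (left, top, right, bottom).
    box: tuple


@dataclass
class Page:
    number: int
    image_path: str  # the page scan, assumed to be stored losslessly
    words: list = field(default_factory=list)


def search(pages, term):
    """Return (page number, bounding box) for every word equal to term."""
    needle = term.lower()
    return [
        (page.number, word.box)
        for page in pages
        for word in page.words
        if word.text.lower() == needle
    ]


# Hypothetical sample data: two scanned pages with a few OCR'd words.
book = [
    Page(1, "page0001.png", [Word("The", (10, 10, 50, 30)),
                             Word("elephant", (60, 10, 160, 30))]),
    Page(2, "page0002.png", [Word("Elephant", (10, 40, 110, 60))]),
]
print(search(book, "elephant"))
```

A reader program would show the image for layout and use the word boxes to highlight search hits on top of it.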

Please ask those people whose opinion you value for input on what an e-book should be. I think it is important to get as broad a set of ideas as possible.

Ceeann
StevePoling
Posts: 290
Joined: 20 Jun 2009, 12:19
E-book readers owned: SONY PRS-505, Kindle DX
Number of books owned: 9999
Location: Grand Rapids, MI
Contact:

Re: E-book standard format

Post by StevePoling »

I think any time someone says that they want to identify the "best" anything, everyone goes into one of those blind-men-and-elephant deals where they identify their particular interest and say that the solution must optimize for "snakeness" or "ropeness" or "wallness" or some other attribute of the elephant. That is, every question of "best" e-book format will beg the question of "best for what?"

Before I bought my SONY eReader, I thought PDF was everything I'd ever need. Then I found it illegible on the tiny screen (unless I created the PDF myself and designed it for the dimensions of the reader). A few months later, I bought the Kindle DX with a nice, big screen and paid through the nose for luxury. PDF looked better, but still not good enough for comfortable reading.

The appeal of vector-oriented formats such as Topaz is that books can be easily scanned with high apparent accuracy without bothering about OCR. I like that, but I want something that runs on both my Sony and my Kindle. However, without the OCR there's no searching, etc.

Reflowable text-oriented formats such as ePub or Mobi are my favorites, because the text reflows to nicely fit different-sized screens. However, I've not yet been satisfied by OCR accuracy. But that's probably my fault.

Having a vector-oriented display that's backed up by OCR for searching sounds good. If I only depend upon OCR for searching, I can tolerate less than 100% OCR accuracy. Does anything like that run on both the Kindle and the SONY?
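
Here is a minimal Python sketch of why imperfect OCR can still be good enough for searching: match the search term approximately instead of exactly. It uses difflib from the standard library. The function name, the 0.8 threshold, and the sample words are assumptions chosen for the example, not tuned values.

```python
from difflib import SequenceMatcher


def fuzzy_find(words, term, threshold=0.8):
    """Return indexes of words whose similarity to term is at least threshold."""
    needle = term.lower()
    hits = []
    for i, word in enumerate(words):
        score = SequenceMatcher(None, needle, word.lower()).ratio()
        if score >= threshold:
            hits.append(i)
    return hits


# Hypothetical OCR output where "elephant" was misread as "e1ephant".
ocr_words = ["the", "e1ephant", "and", "the", "rope"]
print(fuzzy_find(ocr_words, "elephant"))
```

An exact search for "elephant" would miss the misread word, while the approximate match still finds it. The price is the occasional false hit, which matters less when the page image is what you actually read.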

For my purposes, ePub works on my SONY and my Android phone and the Nook I'm tempted to buy. And Calibre does a great job of converting ePub to Mobi format for my Kindle, but I'd rather have Kindle-resident ePub reading firmware.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: E-book standard format

Post by daniel_reetz »

StevePoling wrote:I think any time someone says that they want to identify the "best" anything, everyone goes into one of those blind-men-and-elephant deals where they identify their particular interest and say that the solution must optimize for "snakeness" or "ropeness" or "wallness" or some other attribute of the elephant. That is, every question of "best" e-book format will beg the question of "best for what?"
:) :) :)

Optimizing for "optimumness" here
ceeann1
Posts: 106
Joined: 17 Nov 2010, 20:00
E-book readers owned: Several Palm PDA's
Number of books owned: 700
Location: Albuquerque, New Mexico

Re: E-book standard format

Post by ceeann1 »

Optimum
most favorable conditions or greatest degree or amount possible under given circumstances

Quite, Daniel...

Please note I did not ask for the best nor the worst standard, and I did not ask for a possible or an impossible standard. In fact I suggested that the more basic the standard the better. A good standard would get as much as we can with the current tech base and still do it quickly, reliably, and in a way that enables cross-platform use... you know, sort of like Linux can be on many hardware platforms yet still use the same basic kernel. It would be great if this e-book format could translate to other popular standards (e.g. Kindle/Mobi). I would love to use my current PDA (Palm Tungsten E2) with a book I scanned in our own format... dreams are nice and sometimes we make 'em come true. Mobipocket is one program I own that can display e-books on the portable hardware I own. So what? I would rather make a difference and adapt to whatever that will be than stay where I am. Probably why I was a teacher, and why I used to do southern blots in a research lab... as well as many aspects of my personal life. Yeah, I know it sounds egalitarian to have a standard format, because it is.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: E-book standard format

Post by daniel_reetz »

Kirtai wrote:One real reason to reformat on the fly is for people who need to resize the text due to poor eyesight (that'll be us as we get older btw). Another reason is when the publisher uses hideous fonts in an ebook. (I bought the Dresden Files books and they use a font that might be quite nice on paper but is absolutely awful to read at the standard zoom level on the Sony.)

Do you think publishers will provide normal & large print versions in every size?
That's an excellent point, and probably the best argument for reflow.

Aside:

In support of the "resolution will always increase" argument I posited earlier, here's a just-announced 6" reader with 1024x768 (213 ppi) resolution:

http://www.engadget.com/2011/01/07/iriv ... -hands-on/

The Kindle, which has the same-size screen, is 800x600 on a 3.6 in (91 mm) × 4.8 in (122 mm) display, which works out to 167 ppi.

About a 28% improvement in pixels per inch, not bad.
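
For anyone who wants to check the arithmetic, here is a short Python sketch that derives pixels per inch from the pixel dimensions and the diagonal screen size. The function name is made up, and it assumes both screens are exactly 6 inches on the diagonal. From the raw pixel counts the gain works out to about 28%.

```python
import math


def ppi(width_px, height_px, diagonal_in):
    """Pixels per inch from pixel dimensions and diagonal screen size."""
    return math.hypot(width_px, height_px) / diagonal_in


# Both readers are assumed to have exactly 6-inch diagonals.
kindle = ppi(800, 600, 6.0)
new_reader = ppi(1024, 768, 6.0)
improvement = (new_reader / kindle - 1) * 100
print(f"{kindle:.0f} ppi -> {new_reader:.0f} ppi, {improvement:.0f}% higher")
```

The same function works for phone screens, given the pixel dimensions and the diagonal in inches.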
kasslloyd
Posts: 41
Joined: 19 Dec 2010, 21:25

Re: E-book standard format

Post by kasslloyd »

I think the technology is going to move more toward that of the iPad: touch-screen interfaces where, instead of enlarging and reflowing the text, you can quickly zoom and pan around a page. There is something to be said for page layout and design; reflowing throws away hundreds of years of that technology as an easy way out for limited devices like the Kindle. Magazines and graphic novels are not the only kinds of print media that benefit from well-laid-out pages and design.
dickda1

Re: E-book standard format

Post by dickda1 »

I am a long-time Acrobat and FineReader OCR user. PDF is a poor intermediate format for book readers.

My big gripe about generating PDFs is that it is very difficult to then automatically generate another ebook format (epub, mobi, etc.) if the text contains images. Reflow is easy for text only, but usually requires manual adjustment of images.
Mangan
Posts: 17
Joined: 19 Jan 2012, 14:33
E-book readers owned: Sony Xperia Arc
Number of books owned: 1000

Re: E-book standard format

Post by Mangan »

Don't you think more and more people will use their iPhones/Androids to read books? I know I do; I don't have a reader.

Using Xperia Arc for reading books, 480x854 pixels at 4.2". I'm going to use djvu with OCR for my first project. :-)