My FLOSS Manual On E-Books

JDSimmons

My FLOSS Manual On E-Books

Post by JDSimmons »

I have been writing a FLOSS Manual on e-books: finding, reading, creating, converting from printed books, and publishing. I ended up designing and building my own book scanner, adapting ideas from this site. If you want to check out the book it can be found here:

http://en.flossmanuals.net/ReadingandSugar/Introduction

I have used the techniques from this site to create books for donation to the Internet Archive and Project Gutenberg. My IA donations can be checked out at the following URLs:

http://www.archive.org/details/BigAviationBookForBoys

http://www.archive.org/details/AncientM ... ierreLouys

http://www.archive.org/details/OrpheusMythsOfTheWorld

I didn't use Scan Tailor for the submissions because IA likes the books to look like the originals, and Scan Tailor cleans things up and reformats the pages. I did use Scan Tailor pages for my submission of "Ancient Manners" to Distributed Proofreaders, and got great results.

If you check out my FLOSS Manual I'd appreciate any feedback you might have.

James Simmons
spamsickle
Posts: 596
Joined: 06 Jun 2009, 23:57

Re: My FLOSS Manual On E-Books

Post by spamsickle »

I thought there was a lot of good information in the manual, but I didn't see a way to download it.
Tim

Re: My FLOSS Manual On E-Books

Post by Tim »

spamsickle wrote:I thought there was a lot of good information in the manual, but I didn't see a way to download it.
In the upper left under the heading "READING AND SUGAR" there is a "Make PDF" link that will do so for you. Not extremely intuitive, but it worked.
Tim

Re: My FLOSS Manual On E-Books

Post by Tim »

Definitely good stuff. I thought I'd point out a couple of things in the Scan Tailor section. The e-book says: "In the Scan Tailor method you would run Scan Tailor on left and right pages separately, then combine them together using the method described previously." It doesn't have to be done that way, though. The Fix Orientation stage in Scan Tailor has a variety of options, such as applying to every other page or every other selected page, that make handling right and left pages together easy. You do have to combine the pages first, but then Scan Tailor is one step for both the right and left pages.
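
For anyone wondering what "combine the pages first" amounts to, here is a minimal Python sketch. It assumes each camera's shots are already listed in shooting order; the function names and file names are made up for illustration and are not part of Scan Tailor or the manual.

```python
from itertools import zip_longest


def interleave_pages(first_side, second_side):
    """Merge two per-camera file lists into one reading-order list.

    first_side holds whichever side comes first in each opening
    (an assumption; swap the arguments if your book starts the other way).
    """
    merged = []
    for a, b in zip_longest(first_side, second_side):
        # zip_longest pads the shorter list with None, so an odd
        # page count does not drop the final page.
        if a is not None:
            merged.append(a)
        if b is not None:
            merged.append(b)
    return merged


def rename_plan(pages, prefix="page"):
    """Pair each source file with a zero-padded sequential name."""
    return [(name, f"{prefix}_{i:04d}.jpg") for i, name in enumerate(pages, start=1)]


if __name__ == "__main__":
    combined = interleave_pages(["r1.jpg", "r2.jpg", "r3.jpg"], ["l1.jpg", "l2.jpg"])
    for old, new in rename_plan(combined):
        print(old, "->", new)
```

Once the files are in one sequence like this, the "every other page" options can rotate the left and right pages in a single pass.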

Also, the section on the Page Layout stage in Scan Tailor should be adjusted a little. The Page Layout stage shows you which pages have a larger content area with the dashed lines (newer versions allow sorting too), and if you have one or more pages whose content goes way out to the edge of the page, you can compensate by making the global margin addition smaller. So you can still keep those pages in if you want to; just make accommodations for them. You can see more about the Page Layout stage in the documentation: http://sourceforge.net/apps/mediawiki/s ... age_Layout which I see needs to be updated a little again.
JDSimmons

Re: My FLOSS Manual On E-Books

Post by JDSimmons »

Tim wrote:
spamsickle wrote:I thought there was a lot of good information in the manual, but I didn't see a way to download it.
In the upper left under the heading "READING AND SUGAR" there is a "Make PDF" link that will do so for you. Not extremely intuitive, but it worked.
A future version of the FLOSS Manuals software, called Booki, will allow you to download the book in many formats, including PDF, EPUB, HTML, and OpenOffice format. Right now there is a site called http://objavi.flossmanuals.net/ which will create a nicer PDF than you get from the "Make PDF" link. This PDF, plus another one with a cover design, is submitted by the site to lulu.com, a print-on-demand service that lets you order bound and printed books. I'd definitely recommend poking around the FLOSS Manuals site and Booki (http://www.booki.cc/booki-user-guide/ho ... use-booki/) to anyone interested in e-books.
JDSimmons

Re: My FLOSS Manual On E-Books

Post by JDSimmons »

Tim wrote:Definitely good stuff. I thought I'd point out a couple of things in the Scan Tailor section. The e-book says: "In the Scan Tailor method you would run Scan Tailor on left and right pages separately, then combine them together using the method described previously." It doesn't have to be done that way, though. The Fix Orientation stage in Scan Tailor has a variety of options, such as applying to every other page or every other selected page, that make handling right and left pages together easy. You do have to combine the pages first, but then Scan Tailor is one step for both the right and left pages.

Also, the section on the Page Layout stage in Scan Tailor should be adjusted a little. The Page Layout stage shows you which pages have a larger content area with the dashed lines (newer versions allow sorting too), and if you have one or more pages whose content goes way out to the edge of the page, you can compensate by making the global margin addition smaller. So you can still keep those pages in if you want to; just make accommodations for them. You can see more about the Page Layout stage in the documentation: http://sourceforge.net/apps/mediawiki/s ... age_Layout which I see needs to be updated a little again.
Thanks for the information. I'll have to check out that documentation a little more. I have a high opinion of Scan Tailor, and if I wasn't trying to make e-books to donate to the Internet Archive (where they like to have the page images look like the originals) I wouldn't use any other method of creating e-books.
Tim

Re: My FLOSS Manual On E-Books

Post by Tim »

JDSimmons wrote:Thanks for the information. I'll have to check out that documentation a little more. I have a high opinion of Scan Tailor, and if I wasn't trying to make e-books to donate to the Internet Archive (where they like to have the page images look like the originals) I wouldn't use any other method of creating e-books.
What part of the Scan Tailor cleanup process do they object to? It might be something you can skip or compensate for in Scan Tailor. Or maybe IA could be persuaded to contribute code, or knows of someone who could, so that Scan Tailor produces output the way they like it.
Misty
Posts: 481
Joined: 06 Nov 2009, 12:20
Number of books owned: 0
Location: Frozen Wasteland

Re: My FLOSS Manual On E-Books

Post by Misty »

From his description, it sounds like they prefer the images to look pretty much exactly like the originals. The goal of Scan Tailor is to process book pages to make them look like something other than the original pages - that's pretty much incompatible with the Internet Archive's goal.
The opinions expressed in this post are my own and do not necessarily represent those of the Canadian Museum for Human Rights.
JDSimmons

Re: My FLOSS Manual On E-Books

Post by JDSimmons »

Tim wrote:
JDSimmons wrote:Thanks for the information. I'll have to check out that documentation a little more. I have a high opinion of Scan Tailor, and if I wasn't trying to make e-books to donate to the Internet Archive (where they like to have the page images look like the originals) I wouldn't use any other method of creating e-books.
What part of the Scan Tailor cleanup process do they object to? It might be something you can skip or compensate for in Scan Tailor. Or maybe IA could be persuaded to contribute code, or knows of someone who could, so that Scan Tailor produces output the way they like it.
Actually, the Internet Archive will take anything you give them as long as it's in the public domain or properly licensed. Submissions from people like me are called "Community Texts". I'm trying to make my submissions look as much as possible like the books that IA scans themselves with their famous Scribe workstations. Nobody asked me to do that; it's just something I feel the need to do. This means that I need to crop my pictures to get page images, and I have to deskew each picture before cropping, etc.

Scan Tailor does not deskew or crop pages. Instead, it figures out where the content is on the page, deskews just that rectangular area, and sticks it on a brand new white page with new margins. The end result is very attractive, but does not look like the original page any more. The color of the page and the margins are different from the original book. Most of the time this is a great improvement, but it just isn't what the original page looked like.
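
To make the difference concrete, here is a rough Python sketch of the "find the content, put it on a fresh white page" idea. It is only an illustration under my own assumptions, not Scan Tailor's actual code: a page is a plain list of rows of grayscale values (0 is black, 255 is white), the threshold and margin values are arbitrary, and the deskew step is left out entirely.

```python
WHITE = 255


def content_box(page, threshold=128):
    """Return (top, left, bottom, right) around pixels darker than threshold.

    bottom and right are exclusive. Returns None for a blank page.
    """
    rows = [y for y, row in enumerate(page) if any(p < threshold for p in row)]
    cols = [x for x in range(len(page[0])) if any(row[x] < threshold for row in page)]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def relayout(page, margin=1, threshold=128):
    """Copy only the content box onto a new white page with fresh margins."""
    box = content_box(page, threshold)
    if box is None:
        return [row[:] for row in page]
    top, left, bottom, right = box
    width = (right - left) + 2 * margin
    out = [[WHITE] * width for _ in range(margin)]
    for row in page[top:bottom]:
        # Paper-coloured pixels inside the box are replaced with pure white,
        # which is why the result no longer matches the original page colour.
        cleaned = [p if p < threshold else WHITE for p in row[left:right]]
        out.append([WHITE] * margin + cleaned + [WHITE] * margin)
    out.extend([WHITE] * width for _ in range(margin))
    return out


if __name__ == "__main__":
    dingy = [
        [200, 200, 200, 200, 200],
        [200, 0, 10, 200, 200],
        [200, 200, 0, 200, 200],
        [200, 200, 200, 200, 200],
    ]
    for row in relayout(dingy):
        print(row)
```

A plain crop, by contrast, would keep the original 200-valued paper colour and the original margins, which is what I want for the IA submissions.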

If you check out the links to my three donated books (you can read them online without downloading them) you'll get a better idea of what I'm talking about.

Now when I submitted "Ancient Manners" to Distributed Proofreaders the look of the original pages was never an issue, because they're going to create Plain Text files and HTML. They just needed the cleanest pages and illustrations I could give them, and Scan Tailor is right on the money for that purpose.

Internet Archive also contains books digitized by Google and Microsoft, and the quality of their scans is noticeably worse than what IA does itself. My donations aren't up to IA standards, but I'm working on it.
JDSimmons

Re: My FLOSS Manual On E-Books

Post by JDSimmons »

Actually, the Internet Archive does not have any requirement that Community Texts look like the original book. Their own scans strive to do this, and I was trying to follow their example. However, not every book needs or deserves this treatment. I used Scan Tailor to produce this submission, and I think it is more readable than a version made without Scan Tailor would be:

http://www.archive.org/details/ThirteenWomen

I may submit Scan Tailored versions of other books where the manual method just produced dingy pages.