HELP - Scan Tailor Project --> .pdf
Moderator: peterZ
HELP - Scan Tailor Project --> .pdf
I've just finished building my scanner, and I've even got the cameras up and running with SDM...everything working beautifully. Now I'm ready to test the post processing so I loaded Scan Tailor, watched the video tutorials and even processed a test-project of about 10 pages. But I'm at a loss now as to how to get the STProject to .pdf in order to view it on my computer and mobile device. What's the most common procedure? I'm using a MacBook Pro, with Windows XP installed as well. Thanks.
-
- Posts: 496
- Joined: 04 Mar 2014, 00:53
Re: HELP - Scan Tailor Project --> .pdf
You need PDF binding/building software. There is freeware, shareware and flagship software which does this.
First, locate the output TIFFs that Scan Tailor produced.
OS X already includes the app "Preview", which can convert a series of TIFFs from Scan Tailor into a PDF. Simply open the cover image TIFF in Preview and drag the other images onto the opened TIFF. Preview should combine the TIFFs, and you can then save the collected images as a single PDF. In the ColorSync Utility app (built into OS X), you can also define custom compression settings for your PDFs.
When you want to OCR your TIFFs, I personally recommend OmniPage Pro X. It performs OCR before it compresses and converts the image to PDF. There are several other apps as well, such as ABBYY FineReader Express, Adobe Acrobat, and Readiris.
There are also plans for some interesting PDF & DjVu software for the community, written by DIY members here, so you should also explore djvubind (http://www.diybookscanner.org/forum/vie ... ?f=3&t=521) and look for an upcoming PDF builder app as well.
But try to stay with PDFs for a while to ensure application compatibility, and don't delete your master processed images; you need to determine the best compression settings and which format will work best for you. This takes time and trial and error.
-
- Posts: 596
- Joined: 06 Jun 2009, 23:57
Re: HELP - Scan Tailor Project --> .pdf
Since you say you have Windows XP installed, here's what I'm doing.
A product called ImageMagick converts from Scan Tailor's TIF images to pdf, with one console command:
mogrify -format pdf *.tif
Once I have PDF versions of all the pages, I use a second tool, pdftk, to put them together with the command
pdftk p*.pdf cat output mybook.pdf
The only thing to be careful of here is not to get into an infinite loop by accidentally mixing your output with your input. All my separate pages are named either p0001.pdf or simply 0001.pdf, so my input specification is either p*.pdf or 0*.pdf, and I make sure my output name doesn't begin with either "p" or "0".
For most books now, I'm also going through a third step, using Adobe Acrobat to OCR and output a ClearScan version of the PDF. This is a commercial product, though, unlike ImageMagick and pdftk.
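The warning above about mixing output with input can be turned into a small guard. What follows is only a sketch, not part of the original workflow: the helper names safe_output and join_pages are made up for illustration, and it assumes pdftk is on the PATH and that the page PDFs sit in the current directory.

```shell
#!/bin/bash
# Sketch only: safe_output and join_pages are hypothetical helper names.

# Succeed only if the output file name does NOT match the input glob pattern.
safe_output() {
    local pattern=$1 output=$2
    case "$output" in
        $pattern) return 1 ;;   # output would be swept up as an input page
        *)        return 0 ;;
    esac
}

# Join the pages matching a pattern into one PDF, refusing unsafe output names.
join_pages() {
    local pattern=$1 output=$2
    if ! safe_output "$pattern" "$output"; then
        echo "refusing: $output matches input pattern $pattern" >&2
        return 1
    fi
    # $pattern is deliberately unquoted so the shell expands it to the page files.
    pdftk $pattern cat output "$output"
}
```

Called as join_pages 'p*.pdf' mybook.pdf, this runs the same pdftk command as above; called with an output name such as p_book.pdf, it stops before pdftk is run.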
- dingodog
- Posts: 110
- Joined: 22 Jul 2010, 18:19
- Number of books owned: 1000
- Country: on the net
- Location: on the net
- Contact:
Re: HELP - Scan Tailor Project --> .pdf
spamsickle wrote: Since you say you have Windows XP installed, here's what I'm doing.
mogrify -format pdf *.tif
I use sam2p (http://pts.szit.bme.hu/sam2p/) with this script:
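#!/bin/bash
# Convert every .tiff file in the current directory to a single-page PDF with sam2p.
# Adjust the pattern (for example to *.tif) if your files use a different extension.
for file in ./*.tiff
do
    # If nothing matches, the pattern stays literal; skip it.
    [ -e "$file" ] || continue
    # Quote the names so paths containing spaces survive.
    sam2p "$file" "${file%.tiff}.pdf"
done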
Then I also use pdftk.
spamsickle wrote: Once I have PDF versions of all the pages, I use a second tool, pdftk, to put them together with the command
pdftk p*.pdf cat output mybook.pdf
One further refinement is important: after joining the single PDFs, the XREF table must be rebuilt:
pdftk *.pdf cat output mybook.pdf ; pdftk mybook.pdf output fixed.pdf ; mv fixed.pdf mybook.pdf
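The conversion loop earlier in this post matches only *.tiff, while the mogrify command quoted from spamsickle uses *.tif. A variant that accepts either extension might look like the sketch below. The helper names pdf_name and convert_pages are hypothetical, and the sam2p call simply mirrors the input/output form used in the script above.

```shell
#!/bin/bash
# Sketch only: pdf_name and convert_pages are hypothetical helper names.

# Derive the PDF name by replacing the final extension, whatever it is.
pdf_name() {
    printf '%s\n' "${1%.*}.pdf"
}

# Convert every .tif and .tiff page in the given directory (default: current).
convert_pages() {
    local dir=${1:-.} file
    for file in "$dir"/*.tif "$dir"/*.tiff; do
        [ -e "$file" ] || continue      # an unmatched glob stays literal; skip it
        sam2p "$file" "$(pdf_name "$file")"
    done
}
```

Running convert_pages with no argument works on the current directory; passing a path such as the Scan Tailor output folder converts the pages there instead.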
Re: HELP - Scan Tailor Project --> .pdf
Apologies that I've been taking so long with my PDF maker. I've been busy on non-scanning projects for the past few months, which has kept me away from it, and I originally set it aside when I ran into a problem with ImageMagick. I'm still aiming to get it finished in the relatively near future, and I have most of the technical issues sorted out now.
The opinions expressed in this post are my own and do not necessarily represent those of the Canadian Museum for Human Rights.
Re: HELP - Scan Tailor Project --> .pdf
univurshul wrote: You need PDF binding/building software. There is freeware, shareware and flagship software which does this.
Thanks for the information. I'm trying to shorten the learning curve with Scan Tailor, and as soon as I become adept at formatting the images/pages, I'll use this information and some of the things from the other replies to put everything in a .pdf. It would be great if Scan Tailor had this built in - sort of an all-in-one post-processing program. Thanks again.
-
- Posts: 496
- Joined: 04 Mar 2014, 00:53
Re: HELP - Scan Tailor Project --> .pdf
clemd973 wrote: It would be great if Scan Tailor would have this built in - sort of an all-in-one post processing program. Thanks again.
Scan Tailor just works with the images and cleans them for later ebook construction. That alone makes it worth using separate apps before and after it. However, we do have a member spearheading the all-in-one route: http://www.diybookscanner.org/forum/vie ... ?f=3&t=302
I haven't had a chance to test it myself.
I'm actually busy testing software tools that focus on preparing images pre-Scan Tailor. I'll have a discussion posted about Adobe Lightroom 3 soon.
Re: HELP - Scan Tailor Project --> .pdf
Misty wrote: Apologies I've been taking so long with my PDF maker. I've been busy on non-scanning projects for the past few months, which has kept me away from it, and I originally left it off when I ran into a problem with ImageMagick. I'm still aiming to get it finished in the relatively near future, and I have most of the technical issues sorted through now.
Can't wait to see it. How will we know when it's up and running? As for me, please PM me when it's ready... I'd love to beta-test it if you're planning on going that route!
Philip
Re: HELP - Scan Tailor Project --> .pdf
univurshul wrote: I'm actually busy testing software tools that focus on preparing images pre-Scan Tailor. I'll have a discussion posted about Adobe Lightroom 3 soon.
For Mac users, here's a good alternative route for pre-Scan Tailor processing: http://www.diybookscanner.org/forum/vie ... ?f=3&t=527. Please let us know about the Adobe Lightroom 3 discussion.
-
- Posts: 596
- Joined: 06 Jun 2009, 23:57
Re: HELP - Scan Tailor Project --> .pdf
I see that sam2p has Windows binaries as well as Linux. The author claims that it's better than ImageMagick for creating PDFs, and the reasons he gives seem reasonable.
I'll give it a try. Just doing a straight no-fiddling conversion of a single TIF file from an old scan, the sam2p version was quite a bit smaller (308K vs 465K), and I can't see any difference between them. That's not necessarily a big deal if I'm going to use Acrobat's ClearScan option after the PDF has been built, but it does appear to confirm the author's claim of smaller files. He also claims faster creation and finer control. I still need to learn more about the PDF format.
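To put a number on the size difference quoted above, a few lines of shell arithmetic are enough. This is only an illustrative sketch: percent_smaller and file_size are hypothetical helper names, the result is truncated to a whole percent, and the two file names in the closing comment are placeholders.

```shell
#!/bin/bash
# Sketch only: percent_smaller and file_size are hypothetical helper names.

# Integer percentage by which "small" is smaller than "big" (truncated).
percent_smaller() {
    local big=$1 small=$2
    echo $(( (big - small) * 100 / big ))
}

# Portable byte count of a file; tr strips the padding some wc versions add.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# With the figures from the post (465K for ImageMagick, 308K for sam2p):
percent_smaller 465 308   # prints 33

# With real files (placeholder names), the same comparison would be:
#   percent_smaller "$(file_size page_magick.pdf)" "$(file_size page_sam2p.pdf)"
```

So the sam2p file in that single test was roughly a third smaller than the ImageMagick one.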