Export color tiff from scantailor in manual correction mode

Johannes Baiter's Spreads and SpreadPi are the latest control systems and postprocessors for DIY scanning. http://spreads.readthedocs.org

Moderator: peterZ

knjigor
Posts: 4
Joined: 15 Nov 2014, 05:00
Number of books owned: 98000
Country: Serbia

Export color tiff from scantailor in manual correction mode

Post by knjigor »

Hi, I know this question involves ScanTailor, but it concerns Spreads rather than ST itself. Is it possible to export TIFFs from ScanTailor in color/grayscale when using manual correction mode, or is this a bug? When I use standalone ST, it exports them as I define, but when I run it from the Spreads web or GUI interface (with "skip manual correction" unchecked), it always exports them in B/W mode.
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: Export color tiff from scantailor in manual correction m

Post by duerig »

Sounds like it is a bug in ScanTailor's CLI mode. There are a number of these, and I found a few myself when I was trying it out. Unfortunately, I am not sure what to do about it because the official maintainer for ScanTailor isn't around much. You can check whether a bug report has already been posted on GitHub.

For me, I have found it most useful to use Spreads just for the capture part and then process the resulting images using ST or other tools directly.
knjigor
Posts: 4
Joined: 15 Nov 2014, 05:00
Number of books owned: 98000
Country: Serbia

Re: Export color tiff from scantailor in manual correction m

Post by knjigor »

Currently I use ST alone for postprocessing images, but it seemed to me it would be much more convenient to get a color PDF and then process it in ABBYY (I have a legal copy on a Windows machine), do OCR, and additionally crop if necessary (particularly when using web mode, where I can pre-crop pages). I'm currently working on a prototype of a DIY book scanner (I hope to finish it soon and post some pictures), and I have an MSI U160 netbook with Lubuntu and Spreads installed for processing and control; the output is a raw PDF that goes to post-postprocessing in ABBYY. We have used this workflow with the http://www.qidenus.com/product/mastered/ book scanner at the university I work at, because its postprocessing software is slow and buggy, and ABBYY is extra fast.
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: Export color tiff from scantailor in manual correction m

Post by duerig »

Hmm. It has been a while since I used Spreads as a post processor. But maybe what you want to do is disable the ST plugin entirely and just use the autorotate and pdfbeads plugins. That should give you color PDFs which you can use in ABBYY.

Does ABBYY do binarization and deskewing as well as OCR?
knjigor
Posts: 4
Joined: 15 Nov 2014, 05:00
Number of books owned: 98000
Country: Serbia

Re: Export color tiff from scantailor in manual correction m

Post by knjigor »

Actually, that was the first thing I tried, but for some reason pdfbeads won't create a PDF from JPG, only from TIFF (I was googling for a solution but still haven't found one), so I have to use ST.

Here is the output from the terminal with autorotate, gui, web and pdfbeads enabled:


Workflow: Starting postprocessing...
Workflow: Running 'process' hooks
spreadsplug.autorotate: Rotating images
bagit: Adding path /home/igor/scans/004/data/done to payload
bagit: Adding path /home/igor/scans/004/bag-info.txt to payload
bagit: Adding path /home/igor/scans/004/pagemeta.json to payload
Workflow: Done with postprocessing!
Workflow: Generating output files...
Workflow: Running 'output' hooks
spreadsplug.pdfbeads: Assembling PDF.
spreadsplug.pdfbeads: Running /usr/local/bin/pdfbeads -d -M /tmp/tmpo6vxvH/metadata.txt /tmp/tmpo6vxvH/000_rotated.jpg /tmp/tmpo6vxvH/001_rotated.jpg /tmp/tmpo6vxvH/002_rotated.jpg /tmp/tmpo6vxvH/003_rotated.jpg -o /home/igor/scans/004/data/out/book.pdf
spreadsplug.pdfbeads: pdfbeads stdout:

spreadsplug.pdfbeads: pdfbeads stderr:
/usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': iconv will be deprecated in the future, use String#encode instead.
pdfbeads: no pages to process

bagit: Path /home/igor/scans/004/data/out is an empty directory , will be skipped.
bagit: Adding path /home/igor/scans/004/bag-info.txt to payload
Workflow: Done generating output files!
If I use ST, I get B/W TIFFs and a B/W PDF, but that is not usable for me because we want to create realistic, photographic digitized books (we are working with old and rare books that we have in our library).
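
One possible workaround for the "no pages to process" error in the log above would be to convert the rotated JPEGs to TIFF before handing them to pdfbeads. The following is only a rough sketch and has not been tested against Spreads or pdfbeads. It assumes ImageMagick is installed and provides the `convert` command (on ImageMagick 7 the command is `magick`); the function names and directory arguments are made up for illustration, and whether pdfbeads keeps the color of such TIFFs is not confirmed anywhere in this thread.

```python
from pathlib import Path
import shutil
import subprocess


def tiff_commands(src_dir, dst_dir):
    """Build one ImageMagick command per JPEG in src_dir, sorted by file name.

    Hypothetical helper: it only builds the command lists and runs nothing.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    jpegs = sorted(
        p for p in src.iterdir() if p.suffix.lower() in {".jpg", ".jpeg"}
    )
    # "convert in.jpg -compress LZW out.tif" writes a losslessly compressed TIFF.
    return [
        ["convert", str(p), "-compress", "LZW", str(dst / (p.stem + ".tif"))]
        for p in jpegs
    ]


def convert_all(src_dir, dst_dir):
    """Run the conversions; assumes ImageMagick's 'convert' is on PATH."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' not found on PATH")
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for cmd in tiff_commands(src_dir, dst_dir):
        subprocess.run(cmd, check=True)
```

Calling `convert_all("/path/to/jpegs", "/path/to/tiffs")` would write one `.tif` per JPEG into the second directory. Since Spreads passes its own temporary files to pdfbeads, pdfbeads would presumably have to be run by hand on the converted files.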

As for ABBYY, it is a really great piece of software, and it isn't really expensive (a professional licence is about 140 euros with dealer costs, so if you buy it directly it could be even less). It is professional OCR software for Windows and it works excellently. It has a really solid postprocessing part where you can work with images (from PDF, TIFF, PNG, JPG...); it autodetects pictures, text and tables in a document, processes them with separate algorithms, and combines them into the output file (you can export to EPUB, DjVu, PDF, PDF/A, DOC, DOCX...). I personally use it mostly for processing books in PDFs that were digitized on a flatbed scanner or book scanner, and very little for OCR.

We have a book scanner at the university, and since I work at a faculty that is part of it, I can use it. But since it isn't in my building/office, it isn't always convenient to carry books there, and I can't work on it 20-30 hours a week (I have to share usage). So I'm working on a book scanner that I could use all the time, that would be in my office, and that I could teach others to use.
cday
Posts: 456
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: Export color tiff from scantailor in manual correction m

Post by cday »

duerig wrote: Does ABBYY do binarization and deskewing as well as OCR?
Yes, and much more...

You can download the FineReader 12 User's Guide here: http://finereader.abbyy.com/guide/

It's a very capable program but understanding all the available options so as to get the best out of it requires some study.