Version of ABBYY?
Moderator: peterZ
Is there a particular version of ABBYY that is suited for assembling my JPGS to PDF and doing OCR? There seem to be a lot of versions of their software.
- Posts: 18
- Joined: 29 Dec 2012, 21:50
- E-book readers owned: 10x iRex DR1000, 15x iRex DR800
- Number of books owned: 10000
- Country: Spain
Re: Version of ABBYY?
I used ABBYY 7 for ages, because it was the only one that did a decent OCR job, even if it left insane amounts of post-processing to the user (such as de-hyphenating words split at line breaks) for which I wrote custom scripts anyway. Every other OCR I tested yielded inferior results, and every version after 7 was bloated, much slower and not significantly better in any way that was meaningful to me.
All of that changed with ABBYY 11. It's still heavier on the machine, but it is finally what I would call 'suitable for human consumption', and now I don't need most of my scripts. I'm not sure what you mean by 'assembling my JPGS to PDF and doing OCR', though. I do that, but I am peculiar, so I'm not sure you mean exactly that. I currently assemble PDFs from images with IrfanView, but I'm writing more custom software to eventually phase it out and do it from the command line.
HTH.
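As an illustration of the de-hyphenation post-processing mentioned above, here is a minimal Python sketch. It is not the poster's actual script: the function name `dehyphenate` and the "lowercase continuation" heuristic are assumptions made for this example.

```python
import re

# A word fragment ending in "-" at the end of a line, followed by a
# lowercase continuation at the start of the next line. Requiring a
# lowercase continuation is an assumption of this sketch; it leaves
# breaks such as "Anglo-\nSaxon" untouched.
_HYPHEN_BREAK = re.compile(r"(\w+)-\n[ \t]*([a-z]\w*)[ \t]*")


def dehyphenate(text):
    """Join words that OCR output left split across a line break."""
    # The rejoined word stays on the first line; the line break moves after it.
    return _HYPHEN_BREAK.sub(r"\1\2\n", text)


if __name__ == "__main__":
    print(dehyphenate("OCR recog-\nnition works"))
```

Note that this heuristic also joins genuine compounds that happen to break at their hyphen, so the result still needs a proofread or a dictionary check.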
- Posts: 18
- Joined: 22 Dec 2011, 20:00
- E-book readers owned: kindle
- Number of books owned: 4000
- Location: Nr. London, UK
Re: Version of ABBYY?
FineReader allows you to load a series of JPEGs (or other image files), OCR them, and then save to various file formats - PDF with the text embedded is one of them. There are various settings for the PDFs, so you can play around with them until you hit the output size that suits.
Acrobat X will allow you to batch process image files so that you produce text-embedded PDFs, but it will only produce one PDF per image, so if you want a folder of JPEGs ending up as one multi-page PDF you will have to assemble them afterwards - easy enough to do in Acrobat, but another step.
For OCR I would always let the software work from the original image files, as converting to any other format before performing OCR may add extra levels of compression and artefacts and reduce the quality of the OCR.
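The same concern applies when assembling the PDF itself. A PDF can store JPEG data unchanged through its `/DCTDecode` stream filter, so the pages do not have to be recompressed. The following is a rough, standard-library-only Python sketch of that idea, not a description of what FineReader, Acrobat or IrfanView do internally. The function names `jpeg_info` and `jpegs_to_pdf` and the 300 dpi default are assumptions of this example; it handles only 8-bit greyscale and RGB JPEGs and adds no OCR text layer.

```python
import struct

# SOF markers carry the image size (0xC4, 0xC8 and 0xCC are not frame headers).
_SOF = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_info(data):
    """Return (width, height, components, precision) from the JPEG frame header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing JPEG SOI marker")
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            raise ValueError("corrupt JPEG segment")
        marker = data[i + 1]
        if marker == 0xFF:  # fill byte before a marker
            i += 1
            continue
        (length,) = struct.unpack(">H", data[i + 2:i + 4])
        if marker in _SOF:
            prec, height, width, comps = struct.unpack(">BHHB", data[i + 4:i + 10])
            return width, height, comps, prec
        i += 2 + length  # skip to the next segment
    raise ValueError("no SOF marker found")


def jpegs_to_pdf(jpegs, dpi=300.0):
    """Build a multi-page PDF (as bytes) with one untouched JPEG per page."""
    objs = [b"<< /Type /Catalog /Pages 2 0 R >>", b""]  # object 2 is filled in below
    kids = []
    for data in jpegs:
        w, h, comps, prec = jpeg_info(data)
        if prec != 8 or comps not in (1, 3):
            raise ValueError("sketch supports 8-bit greyscale/RGB only")
        space = "/DeviceGray" if comps == 1 else "/DeviceRGB"
        pw, ph = w * 72.0 / dpi, h * 72.0 / dpi  # page size in points
        img_no = len(objs) + 1
        objs.append(
            f"<< /Type /XObject /Subtype /Image /Width {w} /Height {h} "
            f"/ColorSpace {space} /BitsPerComponent 8 /Filter /DCTDecode "
            f"/Length {len(data)} >>\nstream\n".encode() + data + b"\nendstream")
        # Content stream: scale the unit square to the page and paint the image.
        content = f"q {pw:.2f} 0 0 {ph:.2f} 0 0 cm /Im0 Do Q".encode()
        objs.append(b"<< /Length %d >>\nstream\n" % len(content)
                    + content + b"\nendstream")
        page_no = len(objs) + 1
        objs.append(
            f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {pw:.2f} {ph:.2f}] "
            f"/Resources << /XObject << /Im0 {img_no} 0 R >> >> "
            f"/Contents {img_no + 1} 0 R >>".encode())
        kids.append(f"{page_no} 0 R")
    objs[1] = f"<< /Type /Pages /Count {len(kids)} /Kids [{' '.join(kids)}] >>".encode()

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for n, body in enumerate(objs, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % n + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objs) + 1)
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += (b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
            % (len(objs) + 1, xref_pos))
    return bytes(out)
```

A typical call would read the files in sorted order, for example `jpegs_to_pdf([p.read_bytes() for p in sorted(Path("scans").glob("*.jpg"))])` with `pathlib.Path`, and write the returned bytes to a `.pdf` file; the `scans` folder name is hypothetical.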