PROOF READING software of OCR PDF (mac)

Convert page images into searchable text. Talk about software, techniques, and new developments here.

Moderator: peterZ

seasalt

PROOF READING software of OCR PDF (mac)

Post by seasalt »

Hello. Does anyone know the name of, or a link to, proofreading software for OCR-scanned images on the Mac?
rob

Re: PROOF READING software of OCR PDF (mac)

Post by rob »

I'm not sure any such software exists. One approach, which Distributed Proofreaders uses, is to run the text through a dictionary and flag the words that are not in it. It also applies some common corrections for "scannos" (such as tli -> th). As for grammar checking, I don't know of anything.
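To make the dictionary idea concrete, here is a minimal Python sketch. It is not the Distributed Proofreaders code; the names `SCANNOS`, `correct_word` and `proof` are made up for illustration, and the only scanno pair taken from this thread is tli -> th. A substitution is applied only when the word is not in the dictionary and the substituted form is, so a real word such as "outline" is left alone.

```python
import re

# Hypothetical scanno table. Only "tli" -> "th" comes from the post above;
# a real table would be much longer.
SCANNOS = {"tli": "th"}

WORD_RE = re.compile(r"[A-Za-z']+")


def correct_word(word, dictionary, scannos=SCANNOS):
    """Return a scanno-corrected word if the correction is a dictionary word."""
    if word.lower() in dictionary:
        return word  # already a known word, leave it alone
    for bad, good in scannos.items():
        candidate = word.replace(bad, good)
        if candidate != word and candidate.lower() in dictionary:
            return candidate
    return word  # no safe correction found


def proof(text, dictionary, scannos=SCANNOS):
    """Return (corrected_text, words_still_not_in_dictionary)."""
    unknown = []

    def repl(match):
        fixed = correct_word(match.group(0), dictionary, scannos)
        if fixed.lower() not in dictionary:
            unknown.append(fixed)  # flag for a human proofreader
        return fixed

    return WORD_RE.sub(repl, text), unknown


if __name__ == "__main__":
    # Tiny stand-in dictionary; a real run would load a full word list.
    words = {"the", "outline", "of", "book"}
    print(proof("tlie outline of tlie bock", words))
```

For real use the dictionary would be a full word list loaded into a set; on Mac OS X, /usr/share/dict/words is one candidate, though I have not checked how well it suits OCR text.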
seasalt

Re: PROOF READING software of OCR PDF (mac)

Post by seasalt »

Great, thanks for the link. guiguts is a Windows program with a GUI. It directed me to gutcheck (the script that guiguts uses), and I will see if I can get that to work on Mac OS X 10.6.x (I am not very technical).
seasalt

Re: PROOF READING software of OCR PDF (mac)

Post by seasalt »

Thanks to rob's reply, I found a Python script that looks ideal for proofreading OCR text (one page at a time, and it removes scannos that guiguts/guiprep often miss). It is called proofer.py and can be downloaded from http://git.sugarlabs.org/e-book-making- ... es/master

Apparently proofer.py works beautifully on Windows, but I am not able to use it on the Mac (PyGTK is missing).
If anyone knows how I could get this Python script working on the Mac, I would love to know!

Information sourced from the E-Book Enlightenment PDF, available on the web:

The proofer.py utility requires PyGTK. While there is a PyGTK download for Windows, there is none for the Macintosh. PyGTK is included with every Linux distribution.

To download and install PyGTK for Windows you'll need to follow the instructions here:
http://www.pygtk.org/downloads.html

On Windows a version of GTK+ is included with The GIMP install, but it is not adequate for running PyGTK. You'll need to uninstall it, install the new GTK+ bundle, and replace the PATH entry for GTK to point to the new one. If that sounds like a lot more work than you normally go through to install a Windows program, it is. You may find running proofer.py on Windows more trouble than it's worth. The other Python programs should still be useful on Windows.
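Before fighting with an install on any platform, it may help to confirm whether the Python you are running can see PyGTK at all. The sketch below is only a quick check, not part of proofer.py; `has_module` is a made-up helper name, and `pygtk` and `gtk` are the module names PyGTK normally provides.

```python
import importlib


def has_module(name):
    """Return True if the named module can be imported by this Python."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # "pygtk" and "gtk" are the modules a PyGTK install normally provides.
    for name in ("pygtk", "gtk"):
        status = "found" if has_module(name) else "missing"
        print("%s: %s" % (name, status))
```

If both come back as missing, proofer.py will most likely fail as soon as it tries to import them, whatever the operating system.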

The Python programs themselves can be downloaded here:
http://git.sugarlabs.org/e-book-making- ... ees/master

The trick to downloading them is to click on the program name on this page, which will give you a formatted listing of the code. When you get that, look to the upper right of the listing for a link named Raw blob data. Click on that to download the program.