Unpaper - extract text and images from scanned pages

jumpjack
Posts: 21
Joined: 04 Mar 2014, 00:53

Unpaper - extract text and images from scanned pages

Post by jumpjack »

Do you know this software?
http://unpaper.berlios.de/unpaper.html

It looks quite cool... but unfortunately it's Linux-only. Does anybody know how to use it in a Linux virtual machine running on Windows? I'm not skilled at all in Linux, so I'd need a pre-compiled binary, but I don't even know if one exists...
How can I use Unpaper in Ubuntu, for example?
Or does any similar program exist for Windows?
Anonymous1

Re: Unpaper - extract text and images from scanned pages

Post by Anonymous1 »

Isn't this what Scan Tailor does, except with a GUI and a Windows build? I would like the despeckling, but that's already been refined quite well in many other applications.
strider1551
Posts: 126
Joined: 01 Mar 2010, 11:39
Number of books owned: 0
Location: Ohio, USA

Re: Unpaper - extract text and images from scanned pages

Post by strider1551 »

Anonymous wrote: Isn't this what Scan Tailor does, except with a GUI and a Windows build?
Unpaper certainly fills the same niche as scantailor, but they do have slightly different feature sets. For one thing, unpaper is run from the command line and could be worked into a more automated software flow than could be done with scantailor. Unfortunately, in my experience unpaper has been very reliable in giving me very poor results, especially compared to scantailor. Of course, I've never invested much time in working with it either...
jumpjack wrote: How can I use Unpaper in Ubuntu, for example?
Get yourself to a command line (probably Applications -> Accessories -> Terminal or the like). To install unpaper, run "sudo apt-get install unpaper". To use it, move to the directory that contains your images (e.g. "cd ~/directory_with_scans") and then run "unpaper input_file output_file".

The documentation you linked to should help you understand the available options. You can also read the manual with "man unpaper".
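
For anyone who wants the "more automated software flow" mentioned above, here is a minimal sketch of a batch wrapper around the same "unpaper input_file output_file" call. It is only an illustration, not an official unpaper tool: the function names build_commands and run_batch are made up for this example, and it assumes the scans are PNM files (.pbm/.pgm/.ppm), which is the format older unpaper releases read.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: scans are PNM images, the format older unpaper releases read.
PNM_SUFFIXES = {".pbm", ".pgm", ".ppm"}


def build_commands(scan_dir, out_dir):
    """Return one ["unpaper", input, output] command per scan, sorted by name.

    Hypothetical helper: it only builds the command lines, it runs nothing.
    """
    scan_dir, out_dir = Path(scan_dir), Path(out_dir)
    commands = []
    for scan in sorted(scan_dir.iterdir()):
        if scan.is_file() and scan.suffix.lower() in PNM_SUFFIXES:
            # Same call as typed by hand: unpaper input_file output_file
            commands.append(["unpaper", str(scan), str(out_dir / scan.name)])
    return commands


def run_batch(scan_dir, out_dir):
    """Run unpaper on every scan; raises if unpaper is missing or a run fails."""
    if shutil.which("unpaper") is None:
        raise RuntimeError("unpaper not found; try: sudo apt-get install unpaper")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(scan_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Calling run_batch(Path.home() / "directory_with_scans", Path.home() / "cleaned") would process every scan with the default settings; any options from the unpaper documentation would have to be added to the command list in build_commands.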
Anonymous1

Re: Unpaper - extract text and images from scanned pages

Post by Anonymous1 »

I was thinking the exact same thing. I am working on a scanner design which downloads the images directly from the camera after each shot. Unpaper could be useful for processing them before Scan Tailor (ST gets confused when I have tons of edges). I might use this too.
E^3
Posts: 41
Joined: 12 Jul 2010, 21:06

Re: Unpaper - extract text and images from scanned pages

Post by E^3 »

Hi Folks,

Wow, nice tips. Image processing should be easy for me now.

Thanks!

E^3
steve1066d
Posts: 296
Joined: 27 Nov 2010, 02:26
E-book readers owned: PRS-505
Number of books owned: 1250
Location: Minneapolis, MN

Re: Unpaper - extract text and images from scanned pages

Post by steve1066d »

If you are looking for unpaper compiled for Windows, grab this software:

http://pdfread.sourceforge.net/

It includes unpaper.exe.
Steve Devore
BookScanWizard, a flexible book post-processor.