daniel_reetz wrote: Agreed, but as noted here Tulon is still pretty much a one-man show. It would be great to get him some development support, especially as the Scan Tailor userbase continues to expand.
Absolutely. If we had dev hours, that's where we'd put them. Hopefully we will have some of those in the near future. I'd certainly love to see some of the page de-warping algorithms straighten out my page images.
Introducing djvubind for djvu file creation
Moderator: peterZ
Re: Introducing djvubind for djvu file creation
- strider1551
- Posts: 126
- Joined: 01 Mar 2010, 11:39
- Number of books owned: 0
- Location: Ohio, USA
Re: Introducing djvubind for djvu file creation
Back from retreat.
Making it onto the diybookscanner news blog... life goal, check!
daniel_reetz wrote: anyone interested in making a Mac or Windows port?
I just want to note that I think it may be completely functional in a Mac environment without any porting, so long as the dependencies are available. Obviously, without access to a Mac (or the interest in Macs) I'm not going to claim Mac compatibility as a feature, and I definitely don't know how to package software for Mac.
Re: Introducing djvubind for djvu file creation
strider1551 wrote: I just want to note that I think it may be completely functional in a Mac environment without any porting, so long as the dependencies are available. Obviously, without access to a Mac (or the interest in Macs) I'm not going to claim Mac compatibility as a feature, and I definitely don't know how to package software for Mac.
Unfortunately, of the dependencies you mentioned, minidjvu isn't available in macports and doesn't build cleanly for me on its own on a Snow Leopard (10.6) system. Of the others, here are the versions that are available in macports:
python 3.1.2
djvulibre 3.5.22
imagemagick 6.6.3-9
tesseract 2.04
So it's just one dependency that seems to be a sticking point. I'm willing to test anything if you have any ideas. Or maybe n9yty can have a go at getting minidjvu installed on a mac, he seems pretty good with mac development.
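For anyone else testing on a Mac, one quick way to see which tools are missing is to probe the PATH. This is only a sketch: missing_commands is a hypothetical helper (not part of djvubind), and the command names passed to it are assumptions about what each package installs and what djvubind actually calls.

```shell
# Print every named command that cannot be found on PATH, one per line.
# Hypothetical helper for illustration; not part of djvubind.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Assumed command names: python3 (python), djvm (djvulibre),
# convert (imagemagick), tesseract, minidjvu.
missing_commands python3 djvm convert tesseract minidjvu
```

If the last line prints only minidjvu, the situation matches the one described above: a single dependency is the sticking point.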
- strider1551
Re: Introducing djvubind for djvu file creation
I just put up the latest version (0.2.0). I make note of it here because djvubind now supports the cuneiform ocr engine as requested. The source is available now, and an updated ebuild and debian package will hopefully be up tomorrow evening.
The default ocr engine is still tesseract, mainly because I encountered two annoying issues with cuneiform. The first is that it returns a non-successful status if given a blank image. The second, and far more important, issue is that I encounter frequent crashes due to buffer overflows. A year-old bug report suggests that it may be due to gentoo cflags, but I also encountered it in an Ubuntu virtual machine. On my current book, cuneiform crashes on 32% of the pages. The number of errors increases without the --singlecolumn option, though I did not make records. If you select cuneiform as the ocr engine and it crashes, djvubind will run the image through tesseract.
Also, there are presently two semi-undocumented options --tesseract-options and --cuneiform-options. These will allow you to pass arguments along to the ocr engine, which may be especially relevant if the text is not English (see the man pages of your respective engines). They may change in the future since I haven't convinced myself that this is the best way to go about things, but I wanted that capability available.
I prefer that bugs and suggestions be given on the project issue tracker rather than here. I don't intend to post on this thread again until I'm closer to porting to Mac (albeit without minidjvu), but keep an eye on the project page if you use djvubind for other upgrades.
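The fallback described above (try cuneiform, then hand the page to tesseract if it fails) can be sketched in shell. This is not djvubind's actual code, which is written in Python. The names ocr_with_fallback, run_cuneiform and run_tesseract are hypothetical, and the engine arguments are assumptions based on each tool's basic command-line usage.

```shell
# ocr_with_fallback IMAGE PRIMARY FALLBACK
# PRIMARY and FALLBACK are names of commands or shell functions that take
# the image path as their only argument. FALLBACK runs only if PRIMARY
# exits with a non-zero status (for example, after a crash).
ocr_with_fallback() {
    image=$1
    primary=$2
    fallback=$3
    if "$primary" "$image"; then
        return 0
    fi
    printf '%s failed on %s, retrying with %s\n' "$primary" "$image" "$fallback" >&2
    "$fallback" "$image"
}

# Hypothetical wrappers; the arguments djvubind really passes are not
# stated in the post. Each writes text next to the image (page.tif -> page.txt).
run_cuneiform() { cuneiform -o "${1%.*}.txt" "$1"; }
run_tesseract() { tesseract "$1" "${1%.*}"; }

# Example (requires both engines to be installed):
#   ocr_with_fallback page_0001.tif run_cuneiform run_tesseract
```

Because cuneiform reports failure on a blank image, a blank page would also trigger the fallback in this sketch, not only a crash.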
- strider1551
Re: Introducing djvubind for djvu file creation
Now available for Macs.
A kind benefactor (raphael.straub@web.de) has submitted djvubind to Macports and has informed me that a limited test ran fine. If you don't use Macports, wait for the next version so that I can adjust a few things that his Macport script takes care of already. If you do find bugs, Mac related or not, let me know on the issue tracker.
The latest version is also multi-threaded and fixes some ocr issues, so I recommend upgrading. For what it's worth, there is also a mailing list now (djvubind-discuss@googlegroups.com) that mainly just acts as a feed of what's happening on the issue tracker.
- univurshul
- Posts: 496
- Joined: 04 Mar 2014, 00:53
Re: Introducing djvubind for djvu file creation
Awesome.... uhhh, for software idiots such as myself, what does this potentially mean if minidjvu isn't yet available on OSX?
djvubind requires minidjvu, correct? The 2 create a suite of processing workflow from what I understand
- strider1551
Re: Introducing djvubind for djvu file creation
Oh, I should have mentioned that. Macports apparently has a port for minidjvu as well. If you're not going the macports route, the latest code in the repository will automatically use cjb2 (from djvulibre) if minidjvu isn't available... so either pull from the repo or wait for the next version.
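The "use cjb2 if minidjvu isn't available" behaviour can be illustrated with a small shell sketch. It is not taken from the djvubind repository; first_available is a hypothetical helper, and the binary name minidjvu is an assumption about how the package installs.

```shell
# Print the first of the named commands that exists on PATH.
# Returns non-zero if none of them is found.
# Hypothetical helper for illustration; not djvubind's own code.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Prefer minidjvu, fall back to cjb2 (from djvulibre).
encoder=$(first_available minidjvu cjb2) || encoder=""
echo "bitonal encoder: ${encoder:-none found}"
```

Listing the candidates in order of preference keeps the choice in one place, so adding another encoder later would only mean adding another name to the call.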
- strider1551
Re: Introducing djvubind for djvu file creation
I released version 1.0.0 today. Some pretty big improvements that may interest some here:
- Mac is fully supported whether you have minidjvu or not.
- Windows is supported thanks to Darko, who sent me a patch that took care of pretty much everything.
- Works with tesseract-3.0.0, which was released about a month ago.
- Configuration file support that lets you set options to every encoder and ocr program, amongst other things.
I don't think most Windows users will find djvubind very intuitive because it is a command line program. Plus, I can't make it into a familiar .exe until py2exe catches up with python3. BUT, I've tested it on a Windows XP machine and everything seems to work smoothly. The command can be horrifically long (C:\Python31\python.exe "C:\Program Files\djvubind\bin\djvubind"), but it works, gosh darn it!
As always, bug reports, comments, feature requests are welcome.
- univurshul
Re: Introducing djvubind for djvu file creation
strider1551 wrote: - Mac is fully supported whether you have minidjvu or not.
...bug reports, comments, feature requests are welcome.
Strider1551,
Not seeing any bootable Mac app in the package. Is there a builder app I need to run this? Keep in mind I'm very illiterate with all things software to the extent of .dmg and .app. (Many thanks for your hard work)
bkrpr wrote: ... I'd certainly love to see some of the page de-warping algorithms straighten out my page images.
Wait, are you using a Book Liberator when scanning? Does the Book Liberator not flatten your pages? If so, I'm assuming you're referring to warping at the gutter?
- strider1551
Re: Introducing djvubind for djvu file creation
univurshul wrote: Not seeing any bootable Mac app in the package. Is there a builder app I need to run this? Keep in mind I'm very illiterate with all things software to the extent of .dmg and .app. (Many thanks for your hard work)
I am terribly ignorant of how Mac packages applications (apart from recognizing the words .dmg and .app). And since I don't have a Mac, I won't be learning anytime soon. So unless someone else volunteers, your options are to run it from the unpacked source archive or install it manually.
From source:
Let's suppose you have everything unpacked at ~/djvubind-1.0.0 . Open a terminal, get to the directory with your images, and use the full path back to djvubind with whatever options:
Code:
cd ~/current_book
~/djvubind-1.0.0/bin/djvubind --verbose
Installing it to the system lets you simply call "djvubind". Installation needs to be done as a user with administrative access (does mac use sudo or su or something else?); after that, any user can run it.
Code:
cd ~/djvubind-1.0.0
# To see where it will put things...
./setup.py install --dry-run
# To actually do it
./setup.py install
# And then to use it...
cd ~/current_book
djvubind --verbose
Actually, if someone wants to write up a thing on how to install and use for mac/windows from the end-user perspective, I'd be happy to include it in the README.