Bitonal conversion using Photoshop (Gimp & Imagemagic...)


Moderator: peterZ

xorpt
Posts: 42
Joined: 24 Feb 2012, 01:37
E-book readers owned: Sony PRS-T1
Number of books owned: 2000

Re: Bitonal conversion using Photoshop (Gimp & Imagemagic...

Post by xorpt »

See here for how to run a script from the GIMP command line:

http://www.gimp.org/tutorials/Basic_Batch/
Benedictus
Posts: 15
Joined: 14 Jan 2013, 00:38
E-book readers owned: None
Number of books owned: 2000
Country: Spain


Post by Benedictus »

xorpt wrote: Yes, this single command replicates the steps of the Gimp script as closely as possible:
convert %%1 -colorspace Gray -level 20%%,80%%,.67 ^( +clone -bias 50%% -morphology convolve DoG:1,0,1 -level 0%%,50%%,.45 ^) -compose dissolve -define compose:args=70 -composite -threshold 80%% "%%~ndp1_dog.png"
-colorspace Gray : changes the colorspace to Grayscale
-level 20%%,80%%,.67 : applies level (black point 20%, white point 80%, gamma 0.67)
( +clone ) : duplicates the current layer
-bias 50%% -morphology convolve DoG:1,0,1 -level 0%%,50%%,.45 : applies Difference of Gaussians on the duplicate layer
-compose dissolve -define compose:args=70 -composite : merges the two layers together
-threshold 80%% : applies a threshold on the resulting image to keep only two colors
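
For anyone driving this from a script rather than a .bat file, here is a minimal Python sketch of the same pipeline as an argument list. It assumes the doubled %% and the ^( ^) carets in the quoted command are cmd.exe batch escapes, which are not needed when the arguments are passed as a list. The function name build_convert_args and its parameter names are made up for illustration; the option values are the ones from the command above.

```python
def build_convert_args(src, dst, black=20, white=80, gamma=0.67,
                       dog="1,0,1", opacity=70, threshold=80):
    """Return an argument list mirroring the quoted ImageMagick command.

    Hypothetical helper: names and defaults are illustrative, the
    options themselves are copied from the command in the post.
    """
    return [
        "convert", src,
        "-colorspace", "Gray",                      # grayscale
        "-level", f"{black}%,{white}%,{gamma}",     # black point, white point, gamma
        "(", "+clone",                              # duplicate the current layer
        "-bias", "50%",
        "-morphology", "Convolve", f"DoG:{dog}",    # Difference of Gaussians
        "-level", "0%,50%,.45",
        ")",
        "-compose", "dissolve",
        "-define", f"compose:args={opacity}",       # merge the two layers
        "-composite",
        "-threshold", f"{threshold}%",              # keep only two colors
        dst,
    ]


args = build_convert_args("page.tif", "page_dog.png")
print(" ".join(args))
```

Passing the list to subprocess.run(args, check=True) should run it, assuming ImageMagick's convert is on the PATH (on ImageMagick 7 the executable is usually magick).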

The results cannot be exactly the same because the implementation of similar functions in Gimp and Imagemagick is not exactly the same: they are different programs :) but still, it's close, so I think you should test both. So I don't think that you can use Gimp as a preview tool for IM...
That's it. With this and the IM documentation, I have a good starting point. I'll dig into it as soon as I have some spare time. Thanks.

Just a few questions. Is the "70" in "-compose dissolve -define compose:args=70 -composite" the IM counterpart of the layer opacity slider in the GIMP script?
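
The thread does not answer this directly. As a rough illustration only: going by the ImageMagick documentation, "dissolve" with a single argument scales the opacity of the overlay image before placing it over the background, which for fully opaque images should work out to a plain weighted average, the same arithmetic as a 70% layer opacity. The sketch below assumes that reading, and assumes the cloned DoG layer is the overlay and the leveled image is the background. The function name dissolve is hypothetical.

```python
def dissolve(overlay, background, percent=70):
    """Blend two opaque grayscale values (0..255) as a weighted average.

    Assumption: this models `-compose dissolve -define compose:args=70`
    for fully opaque pixels; it is not ImageMagick's actual code.
    """
    weight = percent / 100.0
    return weight * overlay + (1.0 - weight) * background


# A white overlay pixel over a black background pixel at 70%:
print(dissolve(255, 0))
# Identical pixels stay (almost exactly) unchanged:
print(dissolve(100, 100))
```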

And what about the "First threshold" and "Second threshold" in the GIMP script? I use them a lot. Looking at the GIMP script code, I gather that "-threshold 80%%" is the "Second threshold", but the "First threshold" seems to be applied to the DoG layer. Is it one of the "50%" values in "-bias 50%% -morphology convolve DoG:1,0,1 -level 0%%,50%%,.45"?
xorpt wrote: Maybe what you could try is to use a Gimp script-fu without the GUI interface, only the command line. I haven't tried it, but I suppose it's possible. Probably the speed would be better, since it would not have to display the result in real-time.
But, as I understand it, the overhead of displaying the results applies only to the single bitonal conversion script. The only thing displayed by the batch bitonal conversion script is a progress bar. Is it right to assume that script-fu without the GUI would be roughly as fast (or as slow) as the Batch bitonal converter?
Plustek OpticBook A300
xorpt
Posts: 42
Joined: 24 Feb 2012, 01:37
E-book readers owned: Sony PRS-T1
Number of books owned: 2000


Post by xorpt »

You are probably right about the display, but I guess it's worth testing.

Also, did you try activating the multi-core feature of Gimp? I think it's deactivated by default. You should activate it in the gimprc file and in the Preferences.

Regarding the DoG in IM, I'm sorry but I don't remember precisely why I used bias and levels with the DoG filter and not thresholding. I probably found an example in the Gimp help and found it gave better results...
Benedictus
Posts: 15
Joined: 14 Jan 2013, 00:38
E-book readers owned: None
Number of books owned: 2000
Country: Spain


Post by Benedictus »

xorpt wrote:Also, did you try activating the multi-core feature of Gimp? I think it's deactivated by default. You should activate it in the gimprc file and in the Preferences
It is enabled.

I've been playing with the IM commands. The results are every bit as good as GIMP's, and IM ended up being marginally faster: GIMP took 1 hour and 9 minutes to process the 432 pages of the book I was working on, while IM took exactly an hour. These numbers should be taken with a grain of salt, as I was doing other things on the computer during both runs. Besides, my IM command used a much lower radius in the DoG operation, which is the most time-consuming step, so it is possible that IM is actually slower than GIMP.
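
To put those timings in per-page terms, here is a small worked calculation. It uses only the figures quoted above (432 pages, 69 minutes for GIMP, 60 minutes for IM); the variable and function names are just for illustration, and the same caveats about the measurement apply.

```python
PAGES = 432


def seconds_per_page(total_minutes, pages=PAGES):
    """Average processing time per page, in seconds."""
    return total_minutes * 60 / pages


gimp = seconds_per_page(69)   # about 9.58 s per page
im = seconds_per_page(60)     # about 8.33 s per page
saving_percent = (gimp - im) / gimp * 100   # about 13% less time with IM

print(f"GIMP: {gimp:.2f} s/page, IM: {im:.2f} s/page, saving: {saving_percent:.1f}%")
```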

IM, with these commands, has, as I see it now, two advantages over GIMP. The first one, which can also be regarded as a disadvantage, is that it relies on fewer parameters than GIMP to produce the same results. In the process of getting my book to look right, I ended up tweaking just the two levels and the DoG parameters. There is not much to get lost in, so the process is very straightforward. I've also been looking into more sophisticated stuff with IM (as in this script), but it looked like it needed some work and I was happy enough with the results of your commands. The second advantage is that IM, being a command-line tool, lets you do much more in a single step, so the whole workflow with the ST output becomes much faster.

There are two drawbacks as well. The first one is the lack of a GUI. I relied a lot on the Ctrl+Z/Ctrl+Y keyboard shortcuts in GIMP to compare the bitonal conversion with the original scan, and I missed them badly when working with IM. I ended up using the "Compare" feature of XnView, which is a good replacement.

Maybe the worst thing about IM is that it tends to produce larger files than GIMP, Photoshop and IrfanView, even when using tricks like "-strip" and "-define tiff:rows-per-strip=x". In this book, the final result showed a difference of 2.5 MB between Photoshop and IM (90.5 vs 93 MB) after thresholding (I potrace the files to remove speckles and beautify the characters a bit, and threshold them afterwards). I still have to test whether it makes a difference once the files are converted to PDF. I guess it won't with compressed PDFs, but I always keep a compressed and a "lossless" ("Quality loss not allowed" in ABBYY FineReader) copy of my books, and I wonder whether the extra MB added by IM will be present in the latter case.
Plustek OpticBook A300
murgen
Posts: 19
Joined: 22 Sep 2012, 03:45
E-book readers owned: Kindle
Number of books owned: 1000
Country: Belgium


Post by murgen »

Has anybody managed to run bitonal-converter-batch as a GIMP batch job without starting the GUI?

I am trying this:
:\app\ocr\GIMPPortable\App\gimp\bin\gimp-console-2.8.exe -i -f -d --verbose -b (bitonal-converter-batch 'C:\data\perma\tmp4\ok1' 'jpg' 'tif' 20 220 0.67 6 1 1 1 33 240 120) -b "(gimp-quit 0)"

This is the end of the log; nothing happens:


Starting extension: 'extension-script-fu'
No batch interpreter specified, using the default 'plug-in-script-fu-eval'.
batch command executed successfully
EXIT: gimp_exit
EXIT: gimp_real_exit
EXIT: app_exit_after_callback
I also tried writing the path as 'C:\\data\\perma\\tmp4\\ok1' and replacing the single quotes with double quotes, to no avail.
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America


Post by duerig »

Murgen, I think I see the problem. I would bet that each argument to a '-b' option has to be quoted. So you need a command line like:

c:\app\ocr\GIMPPortable\App\gimp\bin\gimp-console-2.8.exe -i -f -d --verbose -b "(bitonal-converter-batch 'C:\\data\\perma\\tmp4\\ok1' 'jpg' 'tif' 20 220 0.67 6 1 1 1 33 240 120)" -b "(gimp-quit 0)"

Best of luck.
murgen
Posts: 19
Joined: 22 Sep 2012, 03:45
E-book readers owned: Kindle
Number of books owned: 1000
Country: Belgium


Post by murgen »

This is very frustrating:
Function name without quotes, all strings in quotes, numerics without quotes:



C:\data\perma\tmp4\ok1>c:\app\ocr\GIMPPortable\App\gimp\bin\gimp-console-2.8.exe -i -f -d --verbose -b "(bitonal-converter-batch 'C:\\data\\perma\\tmp4\\ok1' 'jpg' 'tif' 20 220 0.6 7 6 1 1 1 33 240 120)" -b "(gimp-quit 0)"
INIT: gimp_load_config
Parsing 'C:\Documents and Settings\P0957\.gimp-2.8\unitrc'
Parsing 'c:\app\ocr\GIMPPortable\App\gimp\etc\gimp\2.0\gimprc'
.
.
No batch interpreter specified, using the default 'plug-in-script-fu-eval'.
batch command experienced an execution error:
Error: (<unknown> : 19216122) eval: unbound variable: bitonal-converter-batch

EXIT: gimp_exit
EXIT: gimp_real_exit
If I enclose the function name in quotes:


c:\app\ocr\GIMPPortable\App\gimp\bin\gimp-console-2.8.exe -i -f -d --verbose -b "('bitonal-converter-batch' 'C:\\data\\perma\\tmp4\\ok1' 'jpg' 'tif' 20 220 0.67 6 1 1 1 33 240 120)" -b "(gimp-quit 0)"
.
.
No batch interpreter specified, using the default 'plug-in-script-fu-eval'.
batch command experienced an execution error:
Error: (<unknown> : 19216124) illegal function

EXIT: gimp_exit
EXIT: gimp_real_exit
To check that the function is registered correctly and is not corrupted, I start the GUI, open an image, then:

--> FILTER --> DIY Book Scanning --> batch bitonal convert V.0.4 --> select input jpg --> validate

It produces an _<image>.tif in the expected directory for each <image>.jpg.

I must have hit some kind of bug.
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America


Post by duerig »

I don't think you have hit a bug. Rather, I think you are in shell-quote purgatory. Let me give you one more thing to try:

c:\app\ocr\GIMPPortable\App\gimp\bin\gimp-console-2.8.exe -i -f -d --verbose -b "(bitonal-converter-batch 'C:\\\\data\\\\perma\\\\tmp4\\\\ok1' 'jpg' 'tif' 20 220 0.67 6 1 1 1 33 240 120)" -b "(gimp-quit 0)"

If that doesn't work, then let me know and I'll tinker around with it a bit myself to see if I can make it run.
murgen
Posts: 19
Joined: 22 Sep 2012, 03:45
E-book readers owned: Kindle
Number of books owned: 1000
Country: Belgium


Post by murgen »

Double quotes around the parentheses result in 'illegal function'.
Single quotes and quadruple backslashes result in 'batch command executed successfully', but nothing is produced:
Writing 'C:\Documents and Settings\P0957\.gimp-2.8\pluginrc'
Starting extension: 'extension-script-fu'
No batch interpreter specified, using the default 'plug-in-script-fu-eval'.
batch command executed successfully
EXIT: gimp_exit
EXIT: gimp_real_exit
Writing 'C:\Documents and Settings\P0957\.gimp-2.8\templaterc'
Writing 'C:\Documents and Settings\P0957\.gimp-2.8\parasiterc'
Writing 'C:\Documents and Settings\P0957\.gimp-2.8\unitrc'
EXIT: app_exit_after_callback
It is not a function issue. I removed the script from the scripts directory, refreshed GIMP's Script-Fu so that the DIY converter was gone, and re-ran the bat file. I got the same result, whereas I should have received an error message stating that the function 'bitonal-convert' does not exist.
I assume that my GIMP install is bad and that the initial command is correct; I will keep searching until I can run a simple batch test from the command line.
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America


Post by duerig »

Murgen, I've installed GIMP and this script myself and have now run it successfully. Here is the command line you should use on Windows:


"C:\Program Files\GIMP 2\bin\gimp-console-2.8.exe" -f -i -d -b "(bitonal-converter-batch \"c:\\path\\to\\image\\folder\" 1 0 20 220 0.67 6 1 1 1 33 240 120)" -b "(gimp-quit 0)"
Important things:

The whole argument to -b must be in quotes. Quotes within that are quoted using \" and backslashes are quoted using \\.

The path must be to a folder containing the proper type of input file. If it is not a folder, then the script will silently not process anything. Don't point it to an individual file.

The inType and outType are numbers and not strings. Here is their correspondence:

0 - tif
1 - jpg
2 - bmp
3 - png

The endings of the actual files may be capitalized or not, but must be exactly those letters. .JPG and .jpg are both fine but not .jpeg.
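
Since the script reportedly processes nothing, without any message, when the folder or the file endings are wrong, a quick pre-flight check can save a silent no-op run. The following is a minimal Python sketch under the rules stated above. The names TYPE_CODES and matching_files are hypothetical, and it assumes that only all-lowercase and all-uppercase endings are accepted.

```python
from pathlib import Path

# Type codes as listed in the post (inType / outType).
TYPE_CODES = {"tif": 0, "jpg": 1, "bmp": 2, "png": 3}


def matching_files(folder, in_type):
    """List the files in `folder` that the batch script should pick up.

    Assumption: the ending must be exactly `.jpg` or `.JPG` (and so on),
    so `.jpeg` is rejected.
    """
    if in_type not in TYPE_CODES:
        raise ValueError(f"unknown type {in_type!r}, expected one of {sorted(TYPE_CODES)}")
    folder = Path(folder)
    if not folder.is_dir():
        # Pointing the script at a single file does nothing, so fail loudly here.
        raise NotADirectoryError(str(folder))
    accepted = {"." + in_type, "." + in_type.upper()}
    return sorted(p for p in folder.iterdir() if p.is_file() and p.suffix in accepted)
```

If matching_files(folder, "jpg") comes back empty, the batch run would presumably produce no output either.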

The output files are put in the same directory as the input files.
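
One way to sidestep the shell-quoting layers altogether is to build the command in Python and pass it as an argument list, so that only the Script-Fu string escaping has to be done by hand. This is a minimal sketch, not a tested replacement for the command above: the helper names scheme_string and build_gimp_args are made up, the ten numeric parameters are copied from this thread without interpreting them, and the executable name is whatever your install uses.

```python
import subprocess


def scheme_string(text):
    """Quote `text` as a Script-Fu string literal (escape backslashes and quotes)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_gimp_args(gimp_exe, folder, in_type, out_type, params):
    """Build the argument list for a headless bitonal-converter-batch run."""
    call = "(bitonal-converter-batch {} {} {} {})".format(
        scheme_string(folder), in_type, out_type,
        " ".join(str(p) for p in params),
    )
    # Same flags as in the working command: -f -i -d, then two -b batch commands.
    return [gimp_exe, "-f", "-i", "-d", "-b", call, "-b", "(gimp-quit 0)"]


args = build_gimp_args(
    "gimp-console-2.8.exe",            # adjust to your install path
    r"c:\path\to\image\folder",        # a folder, not a single file
    1, 0,                              # inType 1 = jpg, outType 0 = tif
    (20, 220, 0.67, 6, 1, 1, 1, 33, 240, 120),
)
print(args[5])
# subprocess.run(args, check=True)    # uncomment to actually run GIMP
```

On Windows, subprocess turns the list into a command line by itself, adding the \" escapes around the path, which should match the hand-written command above.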