I'm not a programmer, and I never expected even to use the command line until I got into processing scans, so my proof-of-concept script should be viewed in that light.
The attached ZIP file Punch_hole_removal.zip contains:
The punch hole removal script Convert_script.bat
A copy of the NConvert utility nconvert.exe, required to run the script (actually not the current version)
Two JPEG files, lh_erase.jpg and rh_erase.jpg, which are the white rectangles placed over the punch holes
A folder Convert
A folder Test pages, containing the two example JPEGs in the first post of the thread
To use the script, place the files to be processed in the Convert folder and then double-click on the script. A Windows command window should open and text will flash by; when it stops, the files in the Convert folder should have been processed. To close the command window, press any key.
Download the ZIP archive, extract it to any convenient location, and check that the above files are all present. If the nconvert.exe file is missing, it will likely have been removed by your security software, and it will be necessary to download a copy from XnView.com.
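For anyone who would rather not check the extracted folder by eye, here is a minimal sketch of the same check in Python. Python is not part of the ZIP and is only my assumption here; the function name missing_items is made up for illustration, and the file names are the ones listed above, so adjust them if your extracted copy differs.

```python
from pathlib import Path

# Names taken from the list in this post; adjust if your copy differs.
EXPECTED = [
    "Convert_script.bat",
    "nconvert.exe",
    "lh_erase.jpg",
    "rh_erase.jpg",
    "Convert",
    "Test pages",
]


def missing_items(folder):
    """Return the expected files and folders that are absent from `folder`."""
    root = Path(folder)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical usage: run this from inside the extracted folder.
    missing = missing_items(".")
    print("All present" if not missing else "Missing: " + ", ".join(missing))
```

If nconvert.exe shows up as missing, that matches the security software case described above.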
If you encounter any security warnings when double-clicking on the script, you can assume that they are standard Windows alerts for unknown executables, and that it is safe to proceed.
The script uses relative addressing for convenience, to simplify the code and allow it to be run anywhere, so the files and folders in the extracted folder must remain in exactly the same relative positions unless the script is edited.
The script uses the NConvert watermark code to place a copy of the small white rectangle lh_erase over each of the punch hole positions on the left of the page, and a copy of the tall narrow white rectangle rh_erase over the punch hole positions on the right of the page. All of the white rectangles are placed on every page, to avoid the need to determine whether a page is a left or a right page.
The sizes and positions of the white rectangles placed on the pages are set for the two example pages, and they are positioned using pixel coordinates relative to an origin at the top left of the page. Pages that are significantly different cannot be processed successfully using this basic method, although there would be some scope for adjusting the size of the white rectangles and their exact positions to increase tolerance of slightly different page characteristics, if an increased risk of clipping or obscuring any text in the margins is accepted. The present code assumes there will be no text in the right margin, so that a single white rectangle can be used.
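To illustrate the arithmetic behind "adjusting the size and position to increase tolerance", here is a rough sketch in Python (my assumption, not something the script uses). Growing a rectangle by a margin on every side means moving its top-left corner up and left by that margin and making it wider and taller by twice the margin. The function name grow_mask and the 60 x 60 pixel mask size are made up for the example; only the position 20, 148 comes from the script.

```python
def grow_mask(x, y, width, height, margin):
    """Grow a mask rectangle by `margin` pixels on every side.

    (x, y) is the top-left corner in pixels, measured from the top left
    of the page. Returns the new (x, y, width, height). The corner is
    clamped at 0 so it never moves off the top or left of the page.
    """
    new_x = max(0, x - margin)
    new_y = max(0, y - margin)
    # Keep the right and bottom edges where the full growth would put them.
    new_width = (x + width + margin) - new_x
    new_height = (y + height + margin) - new_y
    return new_x, new_y, new_width, new_height


# Position (20, 148) is from the script; the 60 x 60 size is hypothetical.
print(grow_mask(20, 148, 60, 60, 10))  # prints (10, 138, 80, 80)
```

The new x and y would go after -wmpos, and the new width and height would be the size of a replacement masking JPEG. A larger margin tolerates more variation between pages but also covers more of the margin, which is the trade-off mentioned above.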
The downloaded script runs this code:
nconvert -wmfile lh_erase.jpg -wmpos 20 148 -o "Convert\%%.jpg" -overwrite Convert\*.jpg
nconvert -wmfile lh_erase.jpg -wmpos 20 622 -o "Convert\%%.jpg" -overwrite Convert\*.jpg
nconvert -wmfile lh_erase.jpg -wmpos 20 1104 -o "Convert\%%.jpg" -overwrite Convert\*.jpg
nconvert -wmfile lh_erase.jpg -wmpos 20 1582 -o "Convert\%%.jpg" -overwrite Convert\*.jpg
nconvert -wmfile rh_erase.jpg -wmpos 1130 135 -o "Convert\%%.jpg" -overwrite Convert\*.jpg
The positions of the white rectangles can be tweaked by editing the two numbers after -wmpos on each line (horizontal position first, then vertical), and masking JPEGs of a different size could be substituted for those downloaded.
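As a sketch of how the positions could be kept in one editable list, the following Python snippet (again my assumption, Python is not in the ZIP) only builds and prints the same five commands; it does not run NConvert. The names LEFT_POSITIONS, RIGHT_POSITIONS and build_commands are made up for illustration. Note that the doubled %% in the script is how a batch file writes a single %, so a single % appears here.

```python
# Top-left corners of the masks, copied from the script above.
LEFT_POSITIONS = [(20, 148), (20, 622), (20, 1104), (20, 1582)]
RIGHT_POSITIONS = [(1130, 135)]


def build_commands(folder="Convert"):
    """Return one nconvert argument list per mask position."""
    commands = []
    for mask, positions in (("lh_erase.jpg", LEFT_POSITIONS),
                            ("rh_erase.jpg", RIGHT_POSITIONS)):
        for x, y in positions:
            commands.append([
                "nconvert", "-wmfile", mask, "-wmpos", str(x), str(y),
                "-o", folder + "\\%.jpg", "-overwrite", folder + "\\*.jpg",
            ])
    return commands


if __name__ == "__main__":
    # Print the commands for inspection rather than executing them.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Changing a pair of numbers in one of the lists changes the matching command, which makes it easier to try out a different page layout without retyping each line.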
The fill colour used is pure white, taken from a small area of one of the original example images, which were black and white, I believe. If necessary, the fill colour could be adjusted to match another page background colour by creating new masking JPEGs. It might be possible to automate production of matching masking rectangles, but that would take more development.
The script works for me on my computer for the two test pages included, but the command line is very unforgiving and I can't guarantee that it will work straight off for someone else, although I do think it should be safe to run...