
script to auto-split two scanned pages in single pages

dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

script to auto-split two scanned pages in single pages

Post by dingodog » 20 Feb 2011, 18:48

This works with both graphicsmagick and imagemagick (for imagemagick, just remove gm from the commands). I started writing this script to automate some repetitive tasks.

Assuming you have scanned two pages at a time and you want to split the scans into single pages,

and that your first image is named 0-000.jpg:

you can split all your scans into single pages by reading the width and height of the first image and dividing its width by two. That gives the crop dimensions and the crop coordinates for the even and odd pages.

This is my first attempt at a splitting script, so many things could be improved.
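
To make the arithmetic concrete, here is a minimal sketch of the geometry calculation on its own, without calling graphicsmagick or imagemagick. The helper name crop_geometry and the 3000x2000 size are assumptions for illustration only. As far as I know, a crop that runs past the right edge is clipped by the tool, which is why the scripts in this thread can pass the full width for the right half; the sketch computes the remaining width explicitly instead.

```shell
#!/bin/sh
# Sketch only: crop_geometry is a hypothetical helper, not part of the posted script.
# Given the width and height of a two-page scan, print the crop geometry
# (WIDTHxHEIGHT+XOFFSET+YOFFSET) for the left and the right half.
crop_geometry() {
  width=$1
  height=$2
  halfwidth=$(( width / 2 ))   # integer division, rounds down
  echo "left:  ${halfwidth}x${height}+0+0"
  echo "right: $(( width - halfwidth ))x${height}+${halfwidth}+0"
}

# assumed example size: a 3000x2000 pixel scan
crop_geometry 3000 2000
```

For an odd width such as 3001, the left half gets 1500 pixels and the right half 1501.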

for graphicsmagick

Code: Select all

if [ ! -e even-odd ]; then mkdir even-odd; fi
first="`ls -1 *.jpg | head -n1`"
let "halfwidth=`gm identify -format '%w \n' "$first"`/2" 
width="`gm identify -format '%w \n' "$first"`"
height="`gm identify -format '%h \n' "$first"`"
quality="$(gm identify -verbose "$first" | grep Quality | cut -d: -f2)"
# each half is written straight into even-odd/, so no file has to be matched by name and moved afterwards
for FILE in *.jpg ; do
  gm convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.jpg}-A.jpg"
  gm convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.jpg}-B.jpg"
done
for imagemagick

Code: Select all

if [ ! -e even-odd ]; then mkdir even-odd; fi
first="`ls -1 *.jpg | head -n1`"
let "halfwidth=` identify -format '%w \n' "$first"`/2" 
width="`identify -format '%w \n' "$first"`"
height="`identify -format '%h \n' "$first"`"
quality="$(identify -verbose "$first" | grep Quality | cut -d: -f2)"
# each half is written straight into even-odd/, so no file has to be matched by name and moved afterwards
for FILE in *.jpg ; do
  convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.jpg}-A.jpg"
  convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.jpg}-B.jpg"
done
2nd step: joining all images into a single pdf

I use sam2p and pdftk

Code: Select all

#!/bin/bash

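directory=`pwd`

# convert every jpg in the current directory to a one-page pdf
# (quoted so that paths containing spaces survive)
for file in "$directory"/*.jpg
do
   filename=${file%.jpg}
   sam2p "$filename.jpg" "$filename.pdf"
done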

Code: Select all

pdftk *.pdf cat output out.pdf && pdftk out.pdf output fixed.pdf && mv fixed.pdf out.pdf
Last edited by dingodog on 08 Dec 2011, 20:14, edited 1 time in total.

L0g1cM0del
Posts: 17
Joined: 04 Mar 2014, 00:53

Re: script to auto-split two scanned pages in single pages

Post by L0g1cM0del » 21 Feb 2011, 20:24

I'm not sure how to use this. Is this something that you enter from the command line or do you have to put this somewhere in graphicsmagick/imagemagick? I didn't see a way to include it in either of the magicks displays.

Sorry, I'm still trying to learn and understand about programming things.

dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

Re: script to auto-split two scanned pages in single pages

Post by dingodog » 21 Feb 2011, 20:38

Simply select all the text of the script and copy it (it is better to copy the 2nd version of the script). You need to have graphicsmagick or imagemagick installed in order to use it.

working for graphicsmagick

Code: Select all

#!/bin/sh
if [ ! -e even-odd ]; then mkdir even-odd; fi
first="`ls -1 *.jpg | head -n1`"
let "halfwidth=`gm identify -format '%w \n' "$first"`/2"
width="`gm identify -format '%w \n' "$first"`"
height="`gm identify -format '%h \n' "$first"`"
# each half is written straight into even-odd/, so no file has to be matched by name and moved afterwards
for FILE in *.jpg ; do
  gm convert -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/$FILE-A.jpg"
  gm convert -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/$FILE-B.jpg"
done
working for imagemagick

Code: Select all

#!/bin/sh
if [ ! -e even-odd ]; then mkdir even-odd; fi
first="`ls -1 *.jpg | head -n1`"
let "halfwidth=` identify -format '%w \n' "$first"`/2"
width="` identify -format '%w \n' "$first"`"
height="` identify -format '%h \n' "$first"`"
# each half is written straight into even-odd/, so no file has to be matched by name and moved afterwards
for FILE in *.jpg ; do
  convert -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/$FILE-A.jpg"
  convert -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/$FILE-B.jpg"
done
Then paste it into a new text file, save the file as splitpages (without extension) and give it execute permission.

Now you can copy or move the script into a directory like /usr/bin or /usr/local/bin

cp splitpages /usr/bin

so that you can invoke it simply by opening a shell in the directory that holds your two-pages-per-image scans and typing

splitpages

The script automatically takes the width and height of the images (from the first image of the set), creates a new directory named even-odd, splits the facing-page scans in half to get single pages, and finally puts them in the created directory (even-odd) with suffixes A for even and B for odd, so that the images sort in the right order.
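
The install steps above could be sketched like this. The helper name install_script and the $HOME/bin target are assumptions, not something from this thread; copying into /usr/bin normally requires root, so a directory in your home is used here instead.

```shell
#!/bin/sh
# Sketch only: install_script is a hypothetical helper.
# It makes a script executable and copies it into a target directory.
install_script() {
  src=$1
  destdir=$2
  chmod +x "$src" || return 1      # give permission to execute
  mkdir -p "$destdir" || return 1  # create the target directory if missing
  cp "$src" "$destdir/"
}

# example with assumed paths:
#   install_script ./splitpages "$HOME/bin"
#   PATH="$HOME/bin:$PATH"          # only needed if $HOME/bin is not on PATH yet
```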

L0g1cM0del
Posts: 17
Joined: 04 Mar 2014, 00:53

Re: script to auto-split two scanned pages in single pages

Post by L0g1cM0del » 21 Feb 2011, 21:23

Thanks for the clarifications. I have them both installed on Windows right now, but I will try it out on linux later on. It does seem very useful. Thanks again.

thepapermen

Re: script to auto-split two scanned pages in single pages

Post by thepapermen » 24 Nov 2011, 13:50

Thanks for a great script. I've recently used it and it works flawlessly!

However, I've got to make one important clarification: if you see a "let: not found" error while running it, your system uses a different shell to execute sh scripts, and you need to explicitly call /bin/bash instead of /bin/sh.

Thanks for the great work!
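
A possible alternative to switching shells, sketched here as an assumption rather than as what the posted script does: the halving can be written with POSIX arithmetic expansion, which does not need the bash-only let. The helper name half_of and the example width are made up for illustration.

```shell
#!/bin/sh
# Sketch only: half_of is a hypothetical helper.
# $(( )) is POSIX arithmetic expansion, so no "let" is required.
half_of() {
  echo $(( $1 / 2 ))
}

# assumed example width of 3000 pixels
halfwidth=$(half_of 3000)
echo "$halfwidth"
```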

dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

Re: script to auto-split two scanned pages in single pages

Post by dingodog » 24 Nov 2011, 23:36

improved version of the script (it strips the extension from the file name and appends -A.jpg and -B.jpg, so the extension is no longer duplicated)
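
For anyone wondering how the extension gets stripped: it is plain shell parameter expansion. A minimal sketch, with a hypothetical helper name page_names and the example file name 0-000.jpg:

```shell
#!/bin/sh
# Sketch only: page_names is a hypothetical helper.
# ${file%.jpg} removes a trailing ".jpg" from the value, if there is one.
page_names() {
  file=$1
  base=${file%.jpg}
  echo "${base}-A.jpg ${base}-B.jpg"
}

page_names 0-000.jpg
```

The older version appended the suffix to the whole name, which is how names like 0-000.jpg-A.jpg came about.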

for graphicsmagick

Code: Select all

if [ ! -e even-odd ]; then mkdir even-odd; fi
first="`ls -1 *.jpg | head -n1`"
let "halfwidth=`gm identify -format '%w \n' "$first"`/2" 
width="`gm identify -format '%w \n' "$first"`"
height="`gm identify -format '%h \n' "$first"`"
quality="$(gm identify -verbose "$first" | grep Quality | cut -d: -f2)"
# each half is written straight into even-odd/, so no file has to be matched by name and moved afterwards
for FILE in *.jpg ; do
  gm convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.jpg}-A.jpg"
  gm convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.jpg}-B.jpg"
done
for imagemagick

Code: Select all

if [ ! -e even-odd ]; then mkdir even-odd; fi
first="`ls -1 *.jpg | head -n1`"
let "halfwidth=` identify -format '%w \n' "$first"`/2" 
width="`identify -format '%w \n' "$first"`"
height="`identify -format '%h \n' "$first"`"
quality="$(identify -verbose "$first" | grep Quality | cut -d: -f2)"
# each half is written straight into even-odd/, so no file has to be matched by name and moved afterwards
for FILE in *.jpg ; do
  convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.jpg}-A.jpg"
  convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.jpg}-B.jpg"
done

sumphotons
Posts: 1
Joined: 05 Jan 2012, 18:33
E-book readers owned: Kindle, iPad
Number of books owned: 0

Re: script to auto-split two scanned pages in single pages

Post by sumphotons » 12 Jan 2012, 18:05

Thanks for the script! It works flawlessly but I have a suggestion to make it more efficient. Your script invokes 'identify' 4 times to calculate height, width and quality and 'convert' twice. All of that can be done by one invocation of 'convert'. Use the following form:

Code: Select all

convert *.jpg -crop 2x1@ Page%d.jpg
This will split all the input files in half with the same quality setting as the original and create files with names Page0.jpg, Page1.jpg, Page2.jpg, etc. where the even numbered files are the left hand pages and odds are the right hand ones.

One issue with using wild cards for specifying input files is the memory used by 'convert' - it seems to be proportional to the number of files that match the wild card. So if you have a lot of files, use a 'for' loop and use the above command on individual files. Something like this ...

Code: Select all

i=0
for FILE in *.jpg
do
    convert "$FILE" -crop 2x1@ temp%d.jpg
    mv temp0.jpg Page${i}.jpg
    i=`expr $i + 1`
    mv temp1.jpg Page${i}.jpg
    i=`expr $i + 1`
done
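
One thing to watch with names like Page2.jpg and Page10.jpg: a plain alphabetical listing puts Page10.jpg before Page2.jpg. A possible variation, sketched here as an assumption and not part of the loop above, is to zero-pad the counter with printf. The helper name page_name and the width of four digits are made up for illustration.

```shell
#!/bin/sh
# Sketch only: page_name is a hypothetical helper.
# It turns a page counter into a zero-padded file name so that
# alphabetical order matches page order.
page_name() {
  printf 'Page%04d.jpg\n' "$1"
}

page_name 2
page_name 10
```

In the loop this would replace Page${i}.jpg with "$(page_name $i)".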

dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

Re: script to auto-split two scanned pages in single pages

Post by dingodog » 13 Jan 2012, 17:21

Thanks for the remarks. For now I'm planning to add other features, and eventually I'll rewrite the script with speed and efficiency in mind.

Meanwhile I improved the script so that you can pass the extension (without the dot) of the images you want to split in half.

Assuming the script is saved in a file named splithalfimgs, the syntax is:

for png images

Code: Select all

splithalfimgs png
for jpg images

Code: Select all

splithalfimgs jpg
and so on...
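
Since the script now depends on its first argument, a small check at the top could catch a missing extension or one typed with the dot. This is only a sketch: check_extension is a hypothetical helper and the messages are made up; the posted scripts do not contain it.

```shell
#!/bin/sh
# Sketch only: check_extension is a hypothetical helper.
# It rejects an empty argument and an extension given with a leading dot.
check_extension() {
  case $1 in
    "") echo "usage: splithalfimgs extension (without dot)" >&2; return 1 ;;
    .*) echo "give the extension without the dot, for instance png" >&2; return 1 ;;
  esac
  return 0
}

# intended use near the top of the script:
#   check_extension "$1" || exit 1
```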

for graphicsmagick

Code: Select all

#!/bin/bash
#
#script to split an image in its two halves
# 
#usage:
#
#splithalfimgs image extension (without dot)
#
#for instance: (for png):
#
#splithalfimgs png
#
#(for jpg)
#
#splithalfimgs jpg
#
#and so on...
if [ ! -e even-odd ]; then mkdir even-odd; fi
filetype=$1
first="`ls -1 *.$filetype | head -n1`"
let "halfwidth=`gm identify -format '%w \n' "$first"`/2" 
width="`gm identify -format '%w \n' "$first"`"
height="`gm identify -format '%h \n' "$first"`"
if [ "$filetype" = "jpg" ]
then
  # jpg: reuse the quality setting of the first image
  quality="$(gm identify -verbose "$first" | grep Quality | cut -d: -f2)"
  # each half is written straight into even-odd/, so no file has to be moved afterwards
  for FILE in *."$filetype" ; do
    gm convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    gm convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
else
  # any other format: plain crop
  for FILE in *."$filetype" ; do
    gm convert -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    gm convert -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
fi
exit 0
 
for imagemagick

Code: Select all

#!/bin/bash
#
#script to split an image in its two halves
# 
#usage:
#
#splithalfimgs image extension (without dot)
#
#for instance: (for png):
#
#splithalfimgs png
#
#(for jpg)
#
#splithalfimgs jpg
#
#and so on...
if [ ! -e even-odd ]; then mkdir even-odd; fi
filetype=$1
first="`ls -1 *.$filetype | head -n1`"
let "halfwidth=` identify -format '%w \n' "$first"`/2" 
width="` identify -format '%w \n' "$first"`"
height="` identify -format '%h \n' "$first"`"
if [ "$filetype" = "jpg" ]
then
  # jpg: reuse the quality setting of the first image
  quality="$(identify -verbose "$first" | grep Quality | cut -d: -f2)"
  # each half is written straight into even-odd/, so no file has to be moved afterwards
  for FILE in *."$filetype" ; do
    convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
else
  # any other format: plain crop
  for FILE in *."$filetype" ; do
    convert -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    convert -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
fi
exit 0
 

dingodog
Posts: 106
Joined: 22 Jul 2010, 18:19
Number of books owned: 1000
Country: on the net
Location: on the net

Re: script to auto-split two scanned pages in single pages

Post by dingodog » 24 Sep 2012, 20:07

Woof Woof!
script update

splithalfimgs v. 0.3 (added the ability to split book scans stored as tiff files that show 2 pages at once)

The script detects which compression the tiff image uses and re-applies the same compression when cutting.
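
The detection is just text parsing of the verbose identify output. A minimal sketch of that step on its own, fed with a made-up sample instead of a real image: get_compression is a hypothetical helper, and the two sample lines are assumed for illustration, not copied from actual tool output.

```shell
#!/bin/sh
# Sketch only: get_compression is a hypothetical helper.
# It reads "identify -verbose" style text on stdin and prints the value
# of the Compression line; xargs trims the surrounding whitespace.
get_compression() {
  grep Compression | cut -d: -f2 | xargs
}

# assumed sample of the relevant lines, for illustration only
printf '  Geometry: 3000x2000\n  Compression: LZW\n' | get_compression
```

The printed value is what the script then hands to -compress.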

for graphicsmagick

Code: Select all

#!/bin/bash
#
#script to split an image in its two halves
# 
#usage:
#
#splithalfimgs image extension (without dot)
#
#for instance: (for png):
#
#splithalfimgs png
#
#(for jpg)
#
#splithalfimgs jpg
#
#and so on...
if [ ! -e even-odd ]; then mkdir even-odd; fi
filetype=$1
first="`ls -1 *.$filetype | head -n1`"
let "halfwidth=`gm identify -format '%w \n' "$first"`/2" 
width="`gm identify -format '%w \n' "$first"`"
height="`gm identify -format '%h \n' "$first"`"
if [ "$filetype" = "jpg" ]
then
  # jpg: reuse the quality setting of the first image
  quality="$(gm identify -verbose "$first" | grep Quality | cut -d: -f2)"
  # each half is written straight into even-odd/, so no file has to be moved afterwards
  for FILE in *."$filetype" ; do
    gm convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    gm convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
elif [ "$filetype" = "tiff" ]
then
  # tiff: reuse the compression of the first image
  compression="$(gm identify -verbose "$first" | grep Compression | cut -d: -f2 | xargs)"
  for FILE in *."$filetype" ; do
    gm convert -compress $compression -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    gm convert -compress $compression -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
else
  # any other format: plain crop
  for FILE in *."$filetype" ; do
    gm convert -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    gm convert -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
fi
exit 0
for imagemagick

Code: Select all

#!/bin/bash
#
#script to split an image in its two halves
# 
#usage:
#
#splithalfimgs image extension (without dot)
#
#for instance: (for png):
#
#splithalfimgs png
#
#(for jpg)
#
#splithalfimgs jpg
#
#and so on...
if [ ! -e even-odd ]; then mkdir even-odd; fi
filetype=$1
first="`ls -1 *.$filetype | head -n1`"
let "halfwidth=` identify -format '%w \n' "$first"`/2" 
width="` identify -format '%w \n' "$first"`"
height="` identify -format '%h \n' "$first"`"
if [ "$filetype" = "jpg" ]
then
  # jpg: reuse the quality setting of the first image
  quality="$(identify -verbose "$first" | grep Quality | cut -d: -f2)"
  # each half is written straight into even-odd/, so no file has to be moved afterwards
  for FILE in *."$filetype" ; do
    convert -quality $quality -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    convert -quality $quality -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
elif [ "$filetype" = "tiff" ]
then
  # tiff: reuse the compression of the first image
  compression="$(identify -verbose "$first" | grep Compression | cut -d: -f2 | xargs)"
  for FILE in *."$filetype" ; do
    convert -compress $compression -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    convert -compress $compression -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
else
  # any other format: plain crop
  for FILE in *."$filetype" ; do
    convert -crop "$halfwidth"x"$height"+0+0 "$FILE" "even-odd/${FILE%%.$filetype}-A.$filetype"
    convert -crop "$width"x"$height"+"$halfwidth"+0 "$FILE" "even-odd/${FILE%%.$filetype}-B.$filetype"
  done
fi
exit 0

markvdb
Posts: 90
Joined: 28 Dec 2010, 18:45
Number of books owned: 0
Country: Belgium

Re: script to auto-split two scanned pages in single pages

Post by markvdb » 25 Sep 2012, 05:56

dingodog wrote:Woof Woof!
script update

splithalfimgs v. 0.3
Want to make this even more useful and visible? Then you might want to:
* add a clear copyright header to this script (gplv3 for example, see http://www.gnu.org/licenses/gpl-howto.html for instructions)
* fork the official diybookscanner git repository at https://github.com/markvdb/diybookscanner and let me pull in your work?

I'll gladly help you with any questions you might have about this.

Mark
Mark
http://diybookscanner.eu - official EU diybookscanner kits - subscribe to our newsletter
