Automatically Orient Scanned Photos With OpenCV

Most pictures taken these days are digital and include Exif tags that record the correct orientation. For those pictures you can simply run exifautotran and you’re done. Easy peasy.
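For instance, here’s a minimal sketch of checking whether a photo carries the Exif orientation tag at all (assuming the Python Imaging Library is installed; photo.jpg is a placeholder filename):

import Image  # PIL; on newer installs use "from PIL import Image"

ORIENTATION_TAG = 274  # the standard Exif tag number for orientation

img = Image.open('photo.jpg')  # placeholder filename
exif = img._getexif() if hasattr(img, '_getexif') else None
if exif and exif.get(ORIENTATION_TAG):
    print "Exif orientation flag:", exif.get(ORIENTATION_TAG)
else:
    print "No orientation info here -- a job for the script below"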

What about pictures that don’t have Exif info? Maybe you’ve got an old/cheap camera, scanned images, or another source that doesn’t include the orientation information. I’m in the middle of scanning about 5000 pictures, and I didn’t want to manually rotate all the images…so I started looking for solutions.

OpenCV: Open Source Computer Vision

I figured if I could detect a face in a photo, then it was right side up. I started searching for solutions and the Internet seemed to agree that the best free solution for facial detection was OpenCV, but I couldn’t find a script that did just what I wanted.

OpenCV “is a library of programming functions for real time computer vision.” It can be used in robotics, for doing cool stuff with webcams, motion detection, and more. I was mainly interested in it because of its facial detection features.

Then I found this script by Jo Vermeulen from way back in 2008. It’s made for use with webcams, and that particular script is just a toy. He has another script which interacts with DBus to provide present/away status changes for your IM client. With just a little bit of work I was able to repurpose his script to do what I needed.

Whatsup: An OpenCV Python Script To Detect Correct Photo Orientation


Prerequisites:

  • Install libcv2.1
  • Install python-opencv
  • Download one or more Haar Cascades and save them to /usr/local/share/ (more on this below)

The script I made is called whatsup.

Usage: whatsup [--debug] path_to_file

whatsup returns the number of degrees it thinks your photo needs to be rotated to be right side up. You can then use that number of degrees in whatever program you want, be it imagemagick, jpegtran, gd, or whatever, as shown below. whatsup assumes that your image is squared up (not square) and only returns multiples of 90 (0, 90, 180, 270).

Using the --debug option will show the image with a box around the feature that gave whatsup its final result.
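As an example of wiring it up from another program, here’s a minimal sketch in Python (assuming Python 2.7 for subprocess.check_output, PIL for the rotation, and a hypothetical scan001.jpg):

import subprocess
import Image  # PIL

degrees = int(subprocess.check_output(['whatsup', 'scan001.jpg']))
if degrees:
    img = Image.open('scan001.jpg')
    # PIL's rotate() is counter-clockwise and whatsup reports clockwise, so negate.
    # Note this re-encodes the JPEG (lossy), unlike jpegtran.
    img.rotate(-degrees, expand=True).save('scan001.jpg')

The full whatsup script itself follows: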

#!/usr/bin/env python

# This script reads in a file and tries to determine which orientation is correct 
# by looking for faces in the photos
# It starts with the existing orientation, then rotates it 90 degrees at a time until
# it has either tried all 4 directions or until it finds a face

# INSTALL: Put the xml files in /usr/local/share, or change the script. Put whatsup somewhere in your path

# Usage: whatsup [--debug] filename
# Returns the number of degrees it should be rotated clockwise to orient the faces correctly

# Some code came from Jo Vermeulen's webcam script mentioned above.
# The rest was cobbled together by me from the OpenCV documentation and from snippets and samples found via Google.

import sys
import os
import cv

def detectFaces(small_img, loadedCascade):
    tries = 0 # 4 shots at getting faces, one per 90-degree rotation

    while tries < 4:
        faces = cv.HaarDetectObjects(small_img, loadedCascade, cv.CreateMemStorage(0), scale_factor=1.2, min_neighbors=2, flags=cv.CV_HAAR_DO_CANNY_PRUNING)
        if len(faces) > 0:
            if sys.argv[1] == '--debug':
                for i in faces:
                    cv.Rectangle(small_img, (i[0][0],i[0][1]), (i[0][0] + i[0][2],i[0][1] + i[0][3]), cv.RGB(255,255,255), 3, 8, 0)
                cv.ShowImage('whatsup', small_img) # show the feature that decided it
                cv.WaitKey(0)
            return tries * 90

        # The rotation routine:
        tmp_mat = cv.GetMat(small_img)
        # Create Mats that are rotated 90 degrees in size (3x4 becomes 4x3)
        tmp_dst_mat = cv.CreateMat(tmp_mat.cols, tmp_mat.rows, cv.CV_8UC1)
        dst_mat = cv.CreateMat(tmp_mat.cols, tmp_mat.rows, cv.CV_8UC1)

        # To rotate 90 degrees clockwise, we transpose, then flip on the Y axis
        cv.Transpose(small_img, tmp_dst_mat)
        cv.Flip(tmp_dst_mat, dst_mat, flipMode=1)

        # Put it back in small_img so we can try to detect faces again
        small_img = cv.GetImage(dst_mat)
        tries = tries + 1
    return False

# Detect which side of the photo is brightest. Hopefully it will be the sky.
def detectBrightest(image):
    image_scale = 4 # This scale factor doesn't matter much. It just gives us less pixels to iterate over later
    newsize = (cv.Round(image.width/image_scale), cv.Round(image.height/image_scale)) # find new size
    small_img = cv.CreateImage(newsize, 8, 1) 
    cv.Resize( image, small_img, cv.CV_INTER_LINEAR )

    # Take the top 1/3, right 1/3, etc. to compare for brightness
    width = small_img.width
    height = small_img.height
    top = small_img[0:height/3,0:width]  
    right = small_img[0:height,(width/3*2):width]
    left = small_img[0:height,0:width/3]
    bottom = small_img[(height/3*2):height,0:width]

    sides = {'top':top,'left':left,'bottom':bottom,'right':right}

    # Find the brightest side
    greatest = 0
    winning = 'top'
    for name in sides:
        sidelum = 0
        side = sides[name]
        for x in range(side.rows):
            for y in range(side.cols):
                sidelum = sidelum + side[x,y]
        sidelum = sidelum/(side.rows*side.cols)
        if sidelum > greatest:
            greatest = sidelum # remember the brightest average so far, or the last side checked always wins
            winning = name

    if sys.argv[1] == '--debug':
        if winning == 'top':
            first = (0,0)
            second = (width,height/3)
        elif winning == 'left':
            first = (0,0)
            second = (width/3,height)
        elif winning == 'bottom':
            first = (0,(height/3*2))
            second = (width,height)
        elif winning == 'right':
            first = ((width/3*2),0)
            second = (width,height)
        cv.Rectangle(small_img, first, second, cv.RGB(255,255,255), 3, 8, 0) # outline the winning side
        cv.ShowImage('whatsup', small_img)
        cv.WaitKey(0)
        print "The " + winning + " side was the brightest!"

    # Rotating the brightest side to the top: left needs 90 clockwise, bottom 180, right 270
    returns = {'top':0,'left':90,'bottom':180,'right':270}
    return returns[winning]

# Try a couple different detection methods
def trydetect():
    # Load some things that we'll use during each loop so we don't keep re-creating them
    grayscale = cv.LoadImageM(os.path.abspath(sys.argv[-1]),cv.CV_LOAD_IMAGE_GRAYSCALE) # the image itself

    # Get more cascades from OpenCV's data/haarcascades directory.
    # These are standard OpenCV cascade filenames; adjust the list to suit your photos.
    cascades = ( # Listed in order most likely to appear in a photo
        '/usr/local/share/haarcascade_frontalface_alt.xml',
        '/usr/local/share/haarcascade_profileface.xml',
        '/usr/local/share/haarcascade_fullbody.xml',
    )

    for cascade in cascades:
        loadedCascade = cv.Load(cascade)
        image_scale = 4
        while image_scale > 0: # Try 4 different sizes of our photo
            newsize = (cv.Round(grayscale.width/image_scale), cv.Round(grayscale.height/image_scale)) # find new size
            small_img = cv.CreateImage(newsize, 8, 1)
            cv.Resize(grayscale, small_img, cv.CV_INTER_LINEAR)
            returnme = detectFaces(small_img, loadedCascade)
            if returnme is not False:
                return returnme
            image_scale = image_scale - 1
    return detectBrightest(grayscale) # no faces found, use the brightest side for orientation instead

# Usage Check
if ((len(sys.argv) != 2 and len(sys.argv) != 3) or (len(sys.argv) == 3 and sys.argv[1] != '--debug')):
    print "USAGE: whatsup [--debug] filename"
    sys.exit(1)

# Sanity check
if not os.path.isfile(sys.argv[-1]):
    print "File '" + sys.argv[-1] + "' does not exist"
    sys.exit(1)

# Make it happen
print str(trydetect()),

About OpenCV Feature Detection

In order to detect features, like faces, OpenCV needs to be trained. You then use the training file, called a Haar Cascade, to define the detection. OpenCV ships with lots of different ready-made training files in its data/haarcascades directory.

To use whatsup you’ll need to download one or more haarcascade*.xml files and put them in /usr/local/share (or edit whatsup to point at wherever you decide to save them).

What you need to be aware of is that detection works best at the resolution the training file was created for. So if you’re using haarcascade_frontalface_default.xml, which was trained on 24×24 pixel windows, you want to be giving the detection roughly 24×24 pixel faces.

To make this happen, whatsup scales the image to several different sizes and tries detecting faces at each one. My scans are roughly 1200×800, so I start with a scaling factor of 4, making the first image tried 1/4 the size of the original. If your images are larger, you probably need to start with a larger scaling factor.
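As a rough back-of-the-envelope sketch (the numbers here are illustrative, not taken from the script): measure a typical face in your scans and pick a starting scale factor that shrinks those faces toward the trained window size.

TRAINED_SIZE = 24       # window size haarcascade_frontalface_default.xml was trained on
typical_face_px = 100   # measure a few faces in your own scans (hypothetical value)

# Shrinking ~4x puts a ~100 px face near the 24x24 trained size
start_scale = max(1, int(round(float(typical_face_px) / TRAINED_SIZE)))
print "Start with image_scale =", start_scale  # prints 4 for these numbers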

What If No Faces/Features Are Detected?

If no features are detected in any of the image sizes, then whatsup determines which side of the photo is brightest, and returns the number of degrees needed to rotate the brightest side upwards.

The assumption is that if no people are found, then maybe it’s a landscape photo and the sky should go at the top.

There are lots of cases where this is incorrect (e.g. a lit ski hill at night), but for my photos it will be true more often than not.

How Accurate Is It / Improving Accuracy

I am getting better than 80% accuracy, but probably not 90%.

The better you know your photos, the smarter you can make the script for your use. If you choose Haar Cascade files that are more applicable to your photo set, you are less likely to get false positives.

If you know what sizes your faces typically are, you can choose appropriate scales or order the image scaling to happen in the most likely order. You could even make your own haarcascade.xml files if you have certain features you want to look for.

Whatsup in Daily Use

I have actually incorporated whatsup into a script that gets run every time I scan something on my scanner, but this is the bash script I used for testing and developing it.

Make sure whatsup is in your path, and that you have jpegtran and jpegexiforient installed. jpegtran does lossless JPEG rotations; jpegexiforient sets the Exif orientation flag that lets programs know which way to display a photo.

Save this script in the same directory as the jpegs you want to test it on and run it.

for i in saved/*; do
    echo -n "Processing $i : "
    degrees=`whatsup $i`
    if [ $degrees -gt 0 ]; then
        echo $degrees
        cp $i /tmp/tmp.jpg
        jpegtran -rotate $degrees /tmp/tmp.jpg > $i
        jpegexiforient -1 $i
        sleep 1
    else
        echo ""
    fi
done


While whatsup doesn’t modify your photos, any program you would use it with does, including the bash script above. Please make responsible use of backups and testing, as I disclaim any liability for any lost data.

I’m not a pro python coder, so the script could probably be optimized somehow.


2 Responses to Automatically Orient Scanned Photos With OpenCV

  1. Hugo says:

    Thanks for this!

    I found a couple of bugs. First, this miscalculates the size:

    bottom = small_img[(height/3*2):height,0:height]

    Should be:

    bottom = small_img[(height/3*2):height,0:width]

    Second, greatest = sidelum is missing inside the “if sidelum > greatest:” block, so it always reports the last-checked side as the winner.

    Also, like you point out, different things can be tuned to detect your own images better. I’ve found the face detection nearly always eventually finds false positives at 1:1 scale, and only 2/300 of my photos even got to the sky-detection step, so I might skip face detection at that scale, or try switching around the Haar cascades. Also, the brightness detection works slightly better (for my test photos, when skipping face detection) with just the blue channel rather than grayscale. Converting to HSL colour space should allow even better blue detection.

    Checking some papers on “automatic image orientation” shows there’s lots of other things that can be done to improve detection, but a lot require image training first or complex mathematical models and don’t share any source code, and your approach works fairly well and is much simpler!

    I’ll make some more changes to my modified code and let you know when I upload it somewhere.

    • stuporglue says:

      Thanks! I’d be happy to link to it. This was my first step into python, open-cv and this sort of image manipulation so there are probably more things that could be done too.
