Import Photos (and Videos) Into Date Based Directory Structure

In Linux, Shotwell organizes my photos into Year/Month/Day directories, and that’s the structure I like. I use rsync to copy all my photos to my server where they get backed up.

On Windows 7 I couldn’t seem to make the native Windows import or Picasa import them in the same directory structure. Since I’ve been trying to learn Python, I have been using it for little projects here and there, and used it for this project.

You need Python 2.5+. I’m using 2.7.

Features

It imports all photos and videos (identified by MIME type, which is guessed from the file extension) into date-based directories. For JPEGs, it uses the EXIF DateTimeOriginal tag if one is present. As a fallback, and for all other files, it uses os.stat(file)[9] (st_ctime), which is the creation time on Windows but the last metadata change time on Linux.
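To illustrate the extension-based detection described above, here is a minimal sketch. The helper name classify and its return values are made up for this example and are not part of the script; the type lists mirror the jpegTypes and supportedTypes settings in the code further down.

```python
import mimetypes

def classify(filename):
    """Hypothetical helper: return 'jpeg', 'media', or None.

    The decision is based only on the file extension, because
    mimetypes.guess_type() never opens the file.
    """
    mimetype, _encoding = mimetypes.guess_type(filename)
    if mimetype is None:
        return None  # unknown or missing extension
    category, subtype = mimetype.split('/')
    if subtype in ('jpeg', 'pjpeg'):
        return 'jpeg'   # candidates for the EXIF date lookup
    if category in ('image', 'video'):
        return 'media'  # dated by st_ctime only
    return None
```

A file with a wrong or missing extension is skipped by this kind of check, even if its contents are a valid photo.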

If duplicate file names are encountered, it compares MD5 sums and either skips the copy (if the sums match) or creates numbered versions (DSC0001.JPG, DSC0001_1.JPG, DSC0001_2.JPG).
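The duplicate handling can be sketched on its own, roughly like this. The names md5_of and resolve_destination are hypothetical and chosen for the example; the script below does the same job inside copyFile. The sketch assumes the destination directory already exists.

```python
import hashlib
import os

def md5_of(path):
    """Hash a file in 8 KB chunks so large videos are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            digest.update(chunk)
    return digest.hexdigest()

def resolve_destination(src, destdir):
    """Return the path src should be copied to, or None if an identical copy exists."""
    stem, ext = os.path.splitext(os.path.basename(src))
    candidate = os.path.join(destdir, stem + ext)
    counter = 1
    src_hash = None
    while os.path.exists(candidate):
        if src_hash is None:
            src_hash = md5_of(src)  # only hash the source when a name clash occurs
        if md5_of(candidate) == src_hash:
            return None  # same content already imported
        candidate = os.path.join(destdir, "%s_%d%s" % (stem, counter, ext))
        counter += 1
    return candidate
```

Hashing only on a name clash keeps a first-time import fast, since most files never collide.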

If you are looking for a different directory structure, it can probably accommodate your needs. The directory names are made with strftime, so any variable strftime understands can be used.
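As a rough illustration of how the strftime-based layout can be changed, the sketch below builds two different directory names from the same timestamp. The helper dated_dir, the by_month format, and the sample date are assumptions made for this example; only the Year/Month/Day format matches the script's dirformat setting.

```python
import os
import time

def dated_dir(destdir, dirformat, timestamp):
    """Hypothetical helper: expand a strftime pattern under destdir."""
    return os.path.join(destdir, time.strftime(dirformat, timestamp))

# A sample timestamp in the "YYYY:MM:DD HH:MM:SS" form that EXIF uses
taken = time.strptime("2011:07:04 15:30:00", "%Y:%m:%d %H:%M:%S")

by_day = os.path.join("%Y", "%m", "%d")   # Year/Month/Day, as in the script
by_month = os.path.join("%Y", "%Y-%m")    # e.g. a 2011/2011-07 style layout

print(dated_dir("Photos", by_day, taken))
print(dated_dir("Photos", by_month, taken))
```

Changing the layout in the script should only require editing the dirformat line in its settings section.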

Usage

import_photos.py SRC_DIR DEST_DIR

For example, G: is my memory card and E:\Photos is the FAT32 partition my Linux and Windows installs share.

import_photos.py G:\ E:\Photos

You can put that into a .bat file for the convenience of single-click imports if your memory card is always in the same place.

Code

I should still be considered a Python newbie, so this code may not be the most Python-ish way to do things. It does seem to work just fine though.

You’ll need exif-py, which provides the EXIF module imported below.

#!/usr/bin/env python

import os
import sys
import signal
import mimetypes
import EXIF
import time
import shutil
import hashlib

# SETTINGS

# These types will be copied using their EXIF data if possible (falls back to their c_time)
jpegTypes = ['jpeg','pjpeg']

# These types will be copied using their c_time
supportedTypes = ['image','video']

# This is the directory structure that the pictures will be copied into
dirformat = "%Y" + os.sep + "%m" + os.sep + "%d"

# CODE

# Handle Ctrl-c without doing a backtrace
def signal_handler(signum, frame):
    sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)

# Handle incorrect parameters
if len(sys.argv) != 3:
    print "USAGE: import_photos.py srcpath destpath"
    print ""
    print r"Example: import_photos.py G:\ E:\Photos"
    sys.exit(1)

# Grab parameters
srcdir = sys.argv[1]
destdir = sys.argv[2]

# Get the md5 hash of a file
def md5file(filename):
    f = open(filename,mode='rb')
    d = hashlib.md5()
    while True:
        data = f.read(8192)
        if not data:
            break
        d.update(data)
    f.close()
    return d.hexdigest()

# Try to copy a file. Handle duplicate names
def copyFile(timestamp,filename):
    destpath =  destdir + os.sep + time.strftime(dirformat,timestamp) + os.sep + os.path.basename(filename)

    if not os.path.exists(os.path.dirname(destpath)):
        os.makedirs(os.path.dirname(destpath))

    # keep modifying destpath with incrementing numbers until we find
    # an unused number or find a duplicate file. If the file is a
    # duplicate, return instead of copying
    if os.path.exists(destpath):
        basename = os.path.basename(filename)
        filenamepart,extension = os.path.splitext(basename)
        counter = 1
        srchash = md5file(filename)

        while os.path.exists(destpath):
            if md5file(destpath) == srchash:
                print "Not re-copying existing file " + filename
                return
            destpath = destdir + os.sep + time.strftime(dirformat,timestamp) + os.sep + filenamepart + "_" + str(counter) + extension
            counter += 1

    print "Copying " + filename + " to " + destpath
    shutil.copy2(filename,destpath)

# Check all files and send appropriate files off to get copied
def probeFile(filename):
    # guess_type() returns (mime type, encoding); maintype holds the full "category/subtype" string
    maintype,encoding = mimetypes.guess_type(filename)
    if maintype is not None:
        category,subtype = maintype.split('/')

        if subtype in jpegTypes:

            # Try to get exif taken date
            jpg = open(filename,'rb')
            tags = EXIF.process_file(jpg,details=False,stop_tag="EXIF DateTimeOriginal")
            jpg.close()

            if "EXIF DateTimeOriginal" in tags:
                origTime = tags["EXIF DateTimeOriginal"]
                timestamp = time.strptime(str(origTime),"%Y:%m:%d %H:%M:%S")
            else:
                create_date = os.stat(filename)[9]
                timestamp = time.gmtime(create_date)

            copyFile(timestamp,filename)
        elif category in supportedTypes:
            create_date = os.stat(filename)[9] # [9] is st_ctime
            copyFile(time.gmtime(create_date),filename)
        else:
            print "I didn't copy " + filename + " because it's not on our list of supported types (" + maintype + ")"

# Walk the directory and handle media files
for (root, subFolders, files) in os.walk(srcdir):
    for file in files:
        probeFile(root + os.sep + file)
