Tag Archives: forking

Bulk DVD Creation With Tovid

Today I’m going to show you how I automated the DVD creation part of the VHS home video conversion process. I’m assuming you have already captured the video to your computer somehow, and just want to make DVDs with your captured video.

Note: All of these tools are Linux command line tools. Less convenient than a GUI, but much more appropriate for bulk operations and scripting.

Preparing the Videos

Tovid can take any video format and make a DVD from it by calling ffmpeg and other libraries when it needs to. In practice, I find that tovid and ffmpeg fight like siblings: when they get along it’s true love, but when they fight things get broken.

Encoding is the slowest part of making a DVD, so we’ll do it separately. This way we can tweak the DVD quickly without re-encoding each time. FFmpeg can do it with the command:

ffmpeg -i SourceFile.whatever -target ntsc-dvd outputfile.mpg

This will produce mpeg2 files which don’t need to be further encoded in order to be put on a DVD.

Splitting Videos

If the resulting outputfile.mpg is too big for a DVD, you will need to split it. FFmpeg can split your file using a starting second (-ss) and a duration in seconds (-t). With this syntax you can break your file into as many pieces as you need to fit them on a DVD. Each piece should start exactly where the previous one ended, so that no footage is skipped.

ffmpeg -i SourceFile.whatever -ss 0 -t 7200 -target ntsc-dvd outputfile-part1.mpg
ffmpeg -i SourceFile.whatever -ss 7200 -target ntsc-dvd outputfile-part2.mpg
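
For longer captures, the start and duration values can be computed rather than typed by hand. The following is a minimal sketch, not part of the original workflow: splitCommands() is a hypothetical helper, the total length in seconds is assumed to be known already, and the output naming simply mirrors the part1/part2 pattern above.

```php
<?php
// Hypothetical helper: build the ffmpeg commands that cut one source
// video into pieces of at most $chunkSeconds each.
function splitCommands(string $source, int $totalSeconds, int $chunkSeconds): array
{
    if ($chunkSeconds <= 0) {
        throw new InvalidArgumentException('chunkSeconds must be positive');
    }
    $base = pathinfo($source, PATHINFO_FILENAME);
    $commands = [];
    $part = 1;
    for ($start = 0; $start < $totalSeconds; $start += $chunkSeconds) {
        // The last piece may be shorter than a full chunk.
        $length = min($chunkSeconds, $totalSeconds - $start);
        $commands[] = sprintf(
            'ffmpeg -i %s -ss %d -t %d -target ntsc-dvd %s',
            escapeshellarg($source),
            $start,
            $length,
            escapeshellarg("$base-part$part.mpg")
        );
        $part++;
    }
    return $commands;
}

// Example: a 3 hour capture (10800 seconds) cut into 2 hour pieces.
foreach (splitCommands('SourceFile.avi', 10800, 7200) as $command) {
    print $command . "\n";
}
```

Each piece starts at the second where the previous one ended, so the pieces neither overlap nor leave a gap.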

Creating the Metadata

Tovid will want a few pieces of metadata, depending on what menu options you choose: main titles, submenu titles and a disc title.

Since I was digitizing family videos, I wanted the disc title to be the years spanned. So I created a CSV file with 4 columns. The year column becomes the disc title.

filename,year,long description,short description
SourceFile.whatever,1990,The family goes sledding at Ice Hill,Sledding
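
To make the column layout concrete, here is a small sketch that parses rows in this format into named fields. It is only an illustration: parseMetadata() and the key names (file, year, long, short) are made up for this example, and the wrapper script further down reads the columns by position instead.

```php
<?php
// Hypothetical helper: turn the CSV text into one associative array per
// video. The key names are labels chosen for this sketch only.
function parseMetadata(string $csvText): array
{
    $lines = preg_split('/\R/', trim($csvText));
    array_shift($lines); // drop the header row
    $rows = [];
    foreach ($lines as $line) {
        if ($line === '') {
            continue; // ignore blank lines
        }
        $fields = str_getcsv($line, ',', '"', '\\');
        if (count($fields) !== 4) {
            throw new RuntimeException("Expected 4 columns in: $line");
        }
        $rows[] = array_combine(['file', 'year', 'long', 'short'], $fields);
    }
    return $rows;
}

$csv = <<<CSV
filename,year,long description,short description
SourceFile.whatever,1990,The family goes sledding at Ice Hill,Sledding
CSV;

foreach (parseMetadata($csv) as $row) {
    print "{$row['year']}: {$row['short']} ({$row['file']})\n";
}
```

A description that contains a comma has to be wrapped in double quotes, otherwise it is read as two columns.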

The Tovid Wrapper Script

I know Tovid is already a wrapper for dvdauthor, ffmpeg and whatever other tools it uses behind the scenes, but what’s another layer, right?

I put all of my videos, the CSV file and this script in a folder:

#!/usr/bin/env php
<?php

/*
 * @brief Make as few DVDs as needed to fit all of the videos listed in a CSV file
 *
 * Copyright Michael Moore <stuporglue@gmail.com>
 * This script assumes that
 *      * All of the videos have already been encoded into mpeg2 format for DVD
 *      * All of the videos paths are relative to INPUTDIR
 *
 * As many videos are fit into one DVD as possible. Videos are added in the order
 * listed in the CSV file. If a video is too big for a DVD the script will tell
 * you, and then exit. DVDs are named either "year", "year1 - year2" or
 * "year disc n", using the year from the CSV.
 *
 * Requires php-cli with the pcntl extension so that we can fork processes.
 *
 * Our metadata.csv file format:
 * pathtofile,year,looooooooooooong title here,short title
 *
 * We assume the file has a header row.
 */

//
//         Settings
//
define('DVDSIZE',8500000000); // Our estimate of the number of bytes available on a DVD+R DL
define('MENUOVERHEAD',45*1024*1024); // Generous leeway for the menu system
define('INPUTDIR',"/home/myuser/videos/input/"); // starting point for all file paths. Use / for abs. paths
define('TMPDIR','/tmp/tovid/'); // Where do you want the tmp files?
define('OUTDIR',"/home/myuser/videos/DVDs/"); // Destination?
define('MAXCHILDREN',5); // How many tovids to run simultaneously? Probably 1-2 less than the # of cores available?
$fh = fopen(INPUTDIR.'/metadata.csv','r');

//
//         No Lifeguard on duty!
//
@mkdir(TMPDIR);
chdir(TMPDIR); // Tovid likes to dump in the cwd. Chdir so that tovid dumps its temp files somewhere usefulish

define('MAXVOB',1073709056); // The max size of a single VOB
global $pids;
$pids = Array();

$currentdvdsize = MENUOVERHEAD;
$currentdvdfiles = Array();
fgetcsv($fh); // remove header from csv file
while($file = fgetcsv($fh)){

    // Calculate how much space this video will take on the disc
    // dvdauthor seems to:
    // 1) Never mix videos in the same VOB
    // 2) Make all VOBs except the last one come out to MAXVOB size (the last one can be whatever size smaller than MAXVOB)
    $videosize = MAXVOB * ceil(filesize(INPUTDIR .'/'. $file[0])/MAXVOB);

    // Won't fit on the current disc? Make the disc we have and start a new one
    if(count($currentdvdfiles) > 0 && ($videosize + $currentdvdsize) > DVDSIZE){
        makeDVD($currentdvdfiles); // Make a DVD!

        $currentdvdfiles = Array(); // Reset!
        $currentdvdsize = MENUOVERHEAD;
    }

    $currentdvdfiles[] = $file;
    $currentdvdsize += $videosize;

    // Still too big, even on an otherwise empty disc? Then it can never fit
    if($currentdvdsize > DVDSIZE){
        die("It looks like {$file[0]} is too big to fit on a DVD!\n");
    }
}

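// Make any remnants
if(count($currentdvdfiles) > 0){
    makeDVD($currentdvdfiles);
}

// Wait for the tovids that are still running so their failures get reported too
while(count($pids) > 0){
    $status = NULL;
    $exited_pid = pcntl_wait($status);
    if($exited_pid <= 0){
        break; // no children left to wait for
    }
    if(pcntl_wexitstatus($status) != 0){
        print "FAILURE IN " . TMPDIR . "/tovid.$exited_pid!!!\n{$pids[$exited_pid]}\n";
    }
    unset($pids[$exited_pid]);
}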

function makeDVD($files){
    global $pids;

    // Make title
    $last = count($files) - 1;
    if($files[0][1] != $files[$last][1]){
        $title = "{$files[0][1]} - {$files[$last][1]}";
    }else{
        $title = "{$files[0][1]}";
    }

    $count = 2;
    $origtitle = $title;
    while(file_exists(OUTDIR . "/$title")){
        $title = "$origtitle disc $count";
        $count++;
    }

    // List of files
    $input = Array();
    $shorttitles = Array();
    $fulltitles = Array();
    foreach($files as $file){
        $input[] = INPUTDIR . "/{$file[0]}";
        $fulltitles[] = $file[2];
        $shorttitles[] = $file[3];
    }

    $cmd = "tovid disc
    -files  " . implode(" ",array_map('escapeshellarg',$input)) . "
    -titles " . implode(' ',array_map('escapeshellarg',$shorttitles)) . "
    -menu-title " . escapeshellarg($title) . "
    -menu-fontsize 18
    -title-color '#ff7700'
    -title-stroke black
    -titles-fontsize 18
    -titles-color '#ff7700'
    -showcase-titles-align east
    -rotate 5
    -wave default
    -submenus
    -submenu-titles " . implode(' ',array_map('escapeshellarg',$fulltitles)) . "
    -submenu-title-color '#ff7700'
    -submenu-stroke black
    -loop 0
    -submenu-length 20
    -noask
    -out " . escapeshellarg(OUTDIR . "/$title");

    // Now wait for our turn...
    while(count($pids) >= MAXCHILDREN){
        $status = NULL;
        $exited_pid = pcntl_wait($status);
        if(pcntl_wexitstatus($status) != 0){
            print "FAILURE IN " . TMPDIR . "/tovid.$exited_pid!!!\n{$pids[$exited_pid]}\n";
        }
        unset($pids[$exited_pid]);
    }

    $cmd = str_replace("\n",' ',$cmd);

    print "LAUNCHING!!!\n$cmd\n";

    $pid = pcntl_fork();
    if($pid == -1){
        die("COULDNT FORK!");
    }else if($pid){
        $pids[$pid] = $cmd;
        sleep(3); // give tovid a chance to claim its temp directories
    }else{
        $cmd .= " > " . TMPDIR . "/tovid." . getmypid() . " 2>&1";
        exec($cmd,$output,$exitcode);
        exit($exitcode); // pass tovid's exit status up so the parent can report failures
    }
}
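
The size estimate in the script rounds every video up to a whole number of VOBs before adding it to the disc total. As a rough sketch of that arithmetic on its own (estimateDiscBytes() is a hypothetical helper, and the two constants simply repeat the values used in the script), you can check a planned disc before running anything:

```php
<?php
// Values copied from the script above.
const MAXVOB = 1073709056;             // max size of a single VOB, in bytes
const MENUOVERHEAD = 45 * 1024 * 1024; // leeway for the menu system

// Hypothetical helper: estimate the bytes a set of videos needs on disc,
// assuming each video is padded out to whole VOBs as the script does.
function estimateDiscBytes(array $fileSizes): int
{
    $total = MENUOVERHEAD;
    foreach ($fileSizes as $bytes) {
        $total += MAXVOB * (int) ceil($bytes / MAXVOB);
    }
    return $total;
}

// Example: one 3,000,000,000 byte video and one 500,000,000 byte video.
$needed = estimateDiscBytes([3000000000, 500000000]);
print "Estimated bytes needed: $needed\n";
print ($needed <= 8500000000 ? "Fits" : "Does not fit") . " on one DVD+R DL\n";
```

Because of the rounding, many small videos cost more disc space than their file sizes suggest, which is worth keeping in mind when choosing DVDSIZE.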

Using the Script

Save the PHP script above to a file named tovidBatch.php and make it executable.

Edit the defined constants to fit your directories and output media (DVDSIZE). If desired, edit the tovid disc command to build your DVDs the way you want them.

Finally, run (on a command line)

php tovidBatch.php

Your videos are on their way!

Sample DVD Menu

Writing a daemon with PHP

PHP isn’t used to write daemons very often, and other languages (like Perl or C) might be more suited to your typical daemon. There are times when PHP is the right choice though, for instance if the rest of your project is a PHP website and you want to keep the whole project in one language. Everything here is available elsewhere online, but I couldn’t find a page that brought it all together neatly (fork, exec, waitpid, signal handling), so here it is.

In this case, I wanted to be able to use the same DB connection file and config file as the rest of the project.

The commands below which start with pcntl (Process CoNTRol) will only work from command line PHP, not from PHP run as an Apache module. Basically you can’t launch a daemon directly from PHP run through Apache.

Forking: Basics of a daemon

The basic thing a daemon must do is get detached from whatever process launched it. Every process except init has a parent process. When a daemon disconnects from its parent process, init will adopt it. We can use the pstree command to see the process hierarchy.

Detaching from the parent process is done by forking. Forking makes two running copies of your program. They will both be running the same code at the same place. The new branch (the child) will have a new PID, the original branch (the parent) will continue to have its original PID. The pcntl_fork() call returns different values to the parent and the child: the parent gets the child’s PID and the child gets 0. If the fork fails, the parent gets -1 and no child is created.

The following snippet will take advantage of this return value difference to make the parent exit.

#!/usr/bin/php
<?php
// Daemonize
$pid = pcntl_fork(); // parent gets the child PID, child gets 0
if($pid){ // 0 is false in PHP
    // Only the parent will know the PID. Kids aren't self-aware
    // Parent says goodbye!
    print "Parent : " . getmypid() . " exiting\n";
    exit();
}
print "Child : " . getmypid() . "\n";

If you put this in a script and run it, you will see the following output:

$ ./daemonize.php
Child : 10248
Parent : 10246 exiting

Notice that they have different PIDs. If we could’ve seen the parent PIDs, we would’ve seen that the child process was now a child of init (PID 1). We’ll be able to see that with the next code sample.

A Full-Featured PHP Daemon

Really, the above script daemonizes so we could call it good and be done, but it’s not that useful yet. We’ll add a few more features and end up with something we can actually use.

Daemon Feature : Loop Forever

Most daemons keep running and either keep doing something, or wait for a signal of some sort and then do something. In the code below we’ll add a while(TRUE) loop so we’ll never exit.

Daemon Feature : Close handles and Change directory

We want to close any files we don’t need so that we don’t get in anyone’s way. By default the directory you’re in when you run a process is kept open by the program as the CWD (current working directory). If you wanted to unmount the drive that path is on, umount might complain about open files.

Closing STDIN, STDOUT and STDERR is a good idea, because they’re going to disappear when you close that terminal anyways. Remember once you’ve closed them, don’t print anything — it’ll throw an error and your daemon will die.
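
The daemon code at the end of this post does not include this step, so here is a minimal sketch of what it could look like. It assumes command line PHP on Linux. detachFromTerminal() is a hypothetical helper name, and the $closeStdio switch is only there so that print keeps working while you debug.

```php
<?php
// Hypothetical helper: meant to be called in the child right after the
// daemonizing fork.
function detachFromTerminal(bool $closeStdio = true): void
{
    // Stop holding the launch directory open, so the drive it lives on
    // can still be unmounted while the daemon runs.
    chdir('/');

    if ($closeStdio) {
        // STDIN, STDOUT and STDERR are constants provided by command line PHP.
        fclose(STDIN);
        fclose(STDOUT);
        fclose(STDERR);
    }
}

// Usage, in the child branch after pcntl_fork():
//     detachFromTerminal();       // normal operation
//     detachFromTerminal(false);  // while debugging, keep print usable
```

Once the handles are closed, every print in the daemon has to go away or be replaced by writes to a log file.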

Daemon Feature : Exec Worker Processes

It is often convenient to have a daemon be a simple controlling program which launches worker processes to do the heavy lifting. The daemon I wrote watched a user table in a database. When certain user criteria were met, the daemon launched a worker process to do some batch processing for that user. This is represented in the code below by the “if(count($pids) < 6)”, that’s our condition here.

When the condition is met, we fork again. This time, the parent process sticks around (the ‘else’ part of the if(!$pid) statement) while the child process execs a worker process (worker.php). Exec replaces the currently running process, so in theory we don’t need to worry about exiting the child. We keep the exit() around though in case launching the worker process fails for some reason.

Daemon Feature : Good Parenting (waitpid)

Every time we fork and exec we get another child. Those child worker processes won’t run forever and we want to avoid Zombie processes, so we need to reap them when they’re done. There are two ways to handle this situation. The simplest way is to tell our daemon to ignore SIGCHLD signals like so:

pcntl_signal(SIGCHLD, SIG_IGN);

If we are ignoring SIGCHLD, the child processes will be reaped automatically upon completion.

The other option is to use waitpid to reap children that have finished. In the example below we use pcntl_waitpid with the -1 and WNOHANG arguments. -1 tells pcntl_waitpid to reap any child which has exited. WNOHANG tells it to return immediately even if no child has exited yet. If a child has exited, pcntl_waitpid returns the PID of that process. If none has, it returns 0.

We collect our child PIDs as we fork and save them in $pids. The pcntl_waitpid loop removes pids that have exited, so $pids should always have a list of child PIDs that haven’t exited yet.

Daemon Feature : Shutting Down Cleanly

We’ll use the signal handling discussed yesterday so we can quit on demand instead of needing to be killed. One thing we’ll do is iterate over any remaining child PIDs and send them the same signal we were sent. This allows them to also shut down cleanly. Once each of the child processes has closed then we exit the daemon.

Bringing it all together : PHP Daemon Code

The code here along with the code from the signal handling post should give you everything you need to get going. Have fun!

#!/usr/bin/php -q
<?php
ini_set('display_errors',0);
print "Parent : ". getmypid() . "\n";

global $pids;
$pids = Array();

// Daemonize
$pid = pcntl_fork();
if($pid){
 // Only the parent will know the PID. Kids aren't self-aware
 // Parent says goodbye!
 print "\tParent : " . getmypid() . " exiting\n";
 exit();
}

print "Child : " . getmypid() . "\n";

// Handle signals so we can exit nicely
declare(ticks = 1);
function sig_handler($signo){
    global $pids;
    if ($signo == SIGTERM || $signo == SIGHUP || $signo == SIGINT){
        // If we are being restarted or killed, quit all children

        // Send the same signal to the children which we received
        foreach($pids as $p){ posix_kill($p,$signo); }

        // Women and Children first (let them exit)
        foreach($pids as $p){ pcntl_waitpid($p,$status); }
        print "Parent : "
            .  getmypid()
            . " all my kids should be gone now. Exiting.\n";
        exit();
    }else if($signo == SIGUSR1){
        print "I currently have " . count($pids) . " children\n";
    }
}
// setup signal handlers to actually catch and direct the signals
pcntl_signal(SIGTERM, "sig_handler");
pcntl_signal(SIGHUP,  "sig_handler");
pcntl_signal(SIGINT, "sig_handler");
pcntl_signal(SIGUSR1, "sig_handler");

// All the daemon setup work is done now. Now do the actual tasks at hand

// The program to launch
$program = "worker.php";
$arguments = Array("");

while(TRUE){
    // In a real world scenario we would do some sort of conditional launch.
    // Maybe a condition in a DB is met, or whatever, here we're going to
    // cap the number of concurrent grandchildren
    if(count($pids) < 6){
        $pid=pcntl_fork();
        if($pid == -1){
            // Fork failed. Don't record -1 as a child PID (posix_kill(-1,...)
            // would signal far more than our kids). Try again on the next pass.
        } else if(!$pid){
            pcntl_exec($program,$arguments); // takes an array of arguments
            exit();
        } else {
            // We add pids to a global array, so that when we get a kill signal
            // we tell the kids to flush and exit.
            $pids[] = $pid;
        }
    }

    // Collect any children which have exited on their own. pcntl_waitpid will
    // return the PID that exited or 0 or ERROR
    // WNOHANG means we won't sit here waiting if there's not a child ready
    // for us to reap immediately
    // -1 means any child
    $dead_and_gone = pcntl_waitpid(-1,$status,WNOHANG);
    while($dead_and_gone > 0){
        // Remove the gone pid from the array
        $key = array_search($dead_and_gone,$pids);
        if($key !== FALSE){ unset($pids[$key]); }

        // Look for another one
        $dead_and_gone = pcntl_waitpid(-1,$status,WNOHANG);
    }

    // Sleep for 1 second
    sleep(1);
}

Warnings / Notes:

If you daemonize, close your terminal, and then your daemon tries to print, you’re going to have problems. Your daemon will still try to print to STDOUT, which will now be closed. This will likely cause it to fail in an unpleasant manner.
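
One way around this is to write messages to a log file instead of printing them. The sketch below is an assumption on top of the post, not part of the daemon above: daemonLog() is a hypothetical helper and the log location is only an example.

```php
<?php
// Hypothetical helper: append one timestamped line to a log file.
function daemonLog(string $logFile, string $message): void
{
    $line = sprintf("[%s] [%d] %s\n", date('c'), getmypid(), $message);
    // FILE_APPEND adds to the end of the file. LOCK_EX keeps lines from
    // several processes from being written on top of each other.
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}

$logFile = sys_get_temp_dir() . '/mydaemon.log'; // assumed location
daemonLog($logFile, 'daemon started');
```

Swapping each print in the daemon for a call like this keeps the messages available after the terminal is gone.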
