Code Review 2000

2013-06-20

A few nights ago, I had the sudden desire to look at some of my old code. Of course the oldest stuff is either lost to time or bound on an obsolete magnetic format, and I am not sure how much can be gained from Apple II assembly from when I was 12. I only really got serious about full backups in the mid-2000′s, so a lot of small programs have vanished.

But I wrote a CD ripper + CDDB + MP3 encoder frontend called Choad in the year 2000. It was a Perl script that wrapped cdparanoia (an excellent error-correcting ripper of the time), LAME (still the standard in MP3 encoding), and a few Perl libraries to provide a complete ripping solution. Insert a CD, receive correctly named and tagged MP3s.

Years ago I lost the Choad source in a hard drive failure, or server disappearance, or something. I searched for it on the web yesterday, and found nothing at first. But when I looked a little deeper, I found a fork named Chood that a fellow named Adam Gurno put together a couple years later. Chood adapted Choad for Ogg Vorbis output. And Adam Gurno distributed Chood as both a full distribution based on Choad 0.822, and as a patch to Choad 0.822.

Thanks Adam! One patch -N later and I had my final version of Choad.

So what better to do with code I wrote in the year 2000, twenty-one years old and living in a rave warehouse, than rip it apart? Time for a code review.

I’ve elided big sections of code here. If you’d like to see the whole thing in context, I put it on GitHub. Pull requests welcome! Yeah, sure.

Disclaimer: I wrote this in Perl. People did that then. You will be exposed to a crazy alternate universe of sigils and coercions.

#!/usr/pkg/bin/perl
#
# choad
#
# a command-line front end to cdparanoia and lame
#
# copyright 2001  pete gamache  gamache@ftso.org
#                       ,
# all rights reversed  (K)

Wait, what? I let this out the door with /usr/pkg/bin/perl as the executable? What an asshole. That was the path on my NetBSD box. Each of the several package systems Choad ended up on had to patch this. I am sorry I did this.

require 5;
use strict;

Perl noobz: use strict; is a pragma that forces the programmer not to be a total prick. Declare your variables, quote your strings, etc. Not using it is bad and wrong. I am pleased I was using it at this point.

Of course, as my friend Maitland points out, use warnings; is conspicuously absent, so I shouldn’t smile too broadly.

### constants #######################################################
my $true = 22;
my $false = 0;
my $choad = $0; $choad =~ s/.*\/([^\/]+)$/$1/;
my $choad_version = "0.822";

I guess I really wanted to express the concepts of truth and falsity without resorting to Perl idioms directly. That implies I hadn’t read enough Perl yet, which is true. I thought there was a code smell when in fact, I was the smell.

And uncommented regexes, even short ones, are why so many people hate Perl and Perl programmers. Yup, I agree. Comment, fucko.

### defaults ########################################################
my $cdparanoia = `which cdparanoia`; $cdparanoia =~ s/\n//;
my $cdpflags = $cdpflags_default;
my $lame = `which lame`; $lame =~ s/\n//;
my $lameflags = $lameflags_default;
 
my $cddb_host = "freedb.freedb.org";
my $cddb_port = 888;
 
my $auto = $false;
my $nocddb = $false;
my $we_have_names = $false;  # if we have artist/title/track list
 
my $maketracks = $true;
my $encode = $true;
my $submit = $false;         # new to 0.8
 
 
my $trackname = "";
 
my $strip_the = $true;
 
my $from = 1;
my $artist = "artist-$$";
my $album = "album-$$";
my $genre = "";
my @tracks = ();
my @onlyv = ();
my $how_many_tracks = 0;

Duuuude, come on here. A few of these are fine. But holy globals! It is not OK just because you declare it with my! Perl programmers can really piss me off. It’s like being a kid who swears a lot, because he’s so used to getting his mouth washed out with soap that he just doesn’t care anymore.

And precisely what the fuck do all these do? I really shit the bed here. Comments are awesome and this is a barren comment wasteland. It’s an especially dick move considering I am using the $$ special variable, which I am so brain-damaged as to remember means the PID of the running process.

### main routine ####################################################
parse_config_file ("/etc/choadrc");
parse_config_file ("$HOME/.choadrc");
parse_command_line (@ARGV);
 
my $MSF_info = query_CD();
get_CDDB_info() unless $nocddb == $true;
 
if ($we_have_names == $true) {
    print make_tracks_list();
    submit_CD() unless ($submit == $false);
} else {
    print "$how_many_tracks tracks on disc.\n";
}
 
$trackname = $trackname_default unless ($trackname ne "");

Well, there I go declaring a variable $MSF_info inline with the rest of the code. It gets used in a subroutine elsewhere, and I had no idea where the hell it came from because I didn’t even have the goddamn common courtesy to put it with the rest of the obscurely-named globals. Shithead.

I’m at least glad to see I broke out the config file and command line parsing into their own subroutines. Didn’t drop that ball. Let’s take a peek at parse_config_file:

sub parse_config_file {
    my $CHOADRC = $_[0];
    my $i=0;
    open RC, $CHOADRC or return;
    while (<RC>) {
        $i++;
        if (! (/^[\#;]/ || /^\s*$/)) {    # ignore comments and blank lines
            /^\s*(\S*)\s*=?\s*(.*)\s*$/;  # grab useful bits from line
            my @line = ("-$1", $2);
            parse_command_line (@line);
        }
    }
}

Nice! parse_config_file just calls parse_command_line a bunch of times. The config file format is just a stream of command line options. I made a good call here, even if I left an unused loop variable $i hanging around.

Of course, parse_command_line just sticks a bunch of shit in globals, so whatever. Back to the main routine:

# now do the encoding
if ($encode == $true) {
    if (scalar @onlyv > 0) {
        while (<@onlyv>) {
            my $i = shift @onlyv;
            my $n = shift @onlyv;
 
            if ($n eq "auto") {$n = $trackname;}
            make_appropriate_dirs ($n);
            rip_track_with_name ($i, parse_format_string($n, $i));
        }
    } else {  #batch mode
        my $curdir = make_appropriate_dirs ($trackname);
        if ($maketracks == $true) {
            my $tracksfile = ">" . $curdir . "tracks";
            open TRACKS, $tracksfile or die $!;
            print TRACKS "# autogenerated by $choad $choad_version\n";
            print TRACKS make_tracks_list();
            close TRACKS;
        }
 
        for (my $i=$from; $i <= $how_many_tracks; $i++) {
            rip_track_with_name ($i, parse_format_string($trackname, $i));
        }
    }
} # and that's all

I am guessing I named @onlyv that to indicate that it is a vector. That one gets me half an hour in the dunk tank. I’d like to ask 2000-me why this is a better choice than @onlys. Dipshit.

This mostly looks sane otherwise. Kind of.

And we’ve just fallen off the end of the main program.

Let’s look into get_CDDB_info. At this point, the program has received multiple possible matches for a CD, because the CDDB format sucks. Choad must ask the user which is the real album.

my @discs = $cddb->get_discs($cddb_id, $track_offsets, $total_seconds);
my $discchoice = 1;
my $i=0;
if (scalar @discs > 1) {
    if ($auto == $false) {
        foreach my $disc (@discs) {
            $i++;
            ($genre, my $cddb_id, my $title) = @{$disc};
            print "$i  $title\n";
        }
        ASK: print "\nWhich disc? ";
 
        while ((my $c .= getc STDIN) ne "\n") {
            $discchoice = $c;
        }
 
        if ($discchoice > scalar @discs) {
            print "Too high; try again.";
            goto ASK;
        }
    }
    $we_have_names = $true;
} elsif (scalar @discs == 0) {
    print STDERR "No CDDB entry for this disc (id=0x$cddb_id)\n";
    $we_have_names = $false;
} else {
    $we_have_names = $true;
}

What the shit. I couldn’t figure out a way to express this any better, apparently, because that is very clearly goto and a label in there. Gah.

Wanna punch me yet? You will.

sub query_CD {
    # returns tab-delimited string list of:
    # <track offset-minutes offset-seconds offset-frames>
    # also sets $how_many_tracks
 
    my $toc = "";
    my $command = "$cdparanoia $cdpflags -Q 2>/tmp/choad-$$";
 
    `$command`;
    open QUERY, "/tmp/choad-$$" or die $!;
    while (<QUERY>) {
        print $_;
        if (/\s*([0-9]+)\.\s+[0-9]+\s+\[[0-9]+:[0-9]+\.[0-9]+\]\s+[0-9]+\s+\[([0-9]+):([0-9]+)\.([0-9]+)\]/) {  # that's a mouthful
            $toc .= "$1 $2 $3 $4/";
            $how_many_tracks = $1;
        }
        elsif (/^TOTAL\s+[0-9]+\s+\[([0-9]+):([0-9]+)\.([0-9]+)\]/) {
            $toc .= "999 $1 $2 $3";  # lead-out track
        }
    }
    close QUERY; #print "made it past query\n";
    `/bin/rm /tmp/choad-$$`;
 
    return $toc;
}

You might notice, if you take the time to scroll all the way over there, that that big long turd of regex ends with # that's a mouthful. This bonehead can take a mouthful of curb. Regular expressions get commented, period. There’s not even sample output from cdparanoia. Fuck this guy.

At least it ends with thorough documentation, in the form of a few subroutines like usage and longhelp.

sub usage {
    return sprintf  <<END
usage: $choad [options] [-only n1 file1 n2 file2 ...]
 
options:
  -choadrc <filename>        read <filename> as choadrc configuration file
  -cdp /path/to/cdparanoia   specify path to cdparanoia
                             [default=$cdparanoia]
  -cdpflags "flags"          pass "flags" to cdparanoia
                             [default="$cdpflags_default"]
  -lame /path/to/lame        specify path to lame
                             [default=$lame]
  -lameflags "flags"         pass "flags" to lame
                             [default="$lameflags_default"]
  -server <host>             use <host> as CDDB server
                             [default=$cddb_host]
  -port <port>               use <port> as CDDB server port
                             [default=$cddb_port]
  -nocddb                    do not attempt CDDB lookup
  -tracks <filename>         use <filename> for artist/title/track list
                             instead of CDDB lookup (implies -nocddb)
  -nomaketracks              do not make "tracks" file when encoding CD
  -submit                    submit track list to CDDB (use with -tracks)
  -noencode                  do not encode CD
  -auto                      ask user no questions (Do The Right Thing)
  -trackname "string"        use "string" as track name format string
                             [default="$trackname_default"]
  -the                       do not strip leading "the" from artist name
  -from <n>                  start encoding at track <n>
  -h                         print help message
  -V                         print version
  -longhelp                  print long help message
 
If "-only <n> <file> ..." is present, $choad will encode each track <n>,
using <file> as the format string for the filename.  If <file> == "auto",
then the default format string is used.
 
If "-only ..." is not present, $choad will encode the entire CD.
 
END
    ;
}

Not so bad! But I started to put that stuff near the top of scripts like this over time. It’s not just because it’s helpful to the uninitiated. I like to put the docs up top because that is the mission statement of the code. Once I get the docs right, the code usually falls out of me a lot faster.

So what do I see overall? I see BASIC residue, in abuse of globals. I see someone who hadn’t coded in a group setting before, in the shitty or nonexistent comments. I see someone who hadn’t read a lot of code yet, in the ignorance of idioms. I see a dearth of modularity.

But on the bright side, I scratched my itch, ripping my CD library on a handful of cheapo unix PCs. And I shipped code. It’s my first open source project and it worked. People sent patches and everything. Felt great.

Epilogue:

Not long after I finished Choad, Mac OS X — and iTunes — arrived, and Choad went by the wayside. Eventually, around 2006, I wanted to rip my CDs for the very last time, to FLAC and MP3, with many machines working together. And maybe some of these machines don’t have monitors, or CD-ROM drives, or whatever. Nothing existed that did what I wanted.

Sourceforge declined to host a project named Choad, so there was born Cretin: the CD Ripper, Encoder, and Tagger with an Inoffensive Name. And I haven’t written one since.