Code Review 2000
A few nights ago, I had the sudden desire to look at some of my old code. Of course the oldest stuff is either lost to time or bound on an obsolete magnetic format, and I am not sure how much can be gained from Apple II assembly from when I was 12. I only really got serious about full backups in the mid-2000′s, so a lot of small programs have vanished.
But I wrote a CD ripper + CDDB + MP3 encoder frontend called Choad in the year 2000. It was a Perl script that wrapped cdparanoia (an excellent error-correcting ripper of the time), LAME (still the standard in MP3 encoding), and a few Perl libraries to provide a complete ripping solution. Insert a CD, receive correctly named and tagged MP3s.
Years ago I lost the Choad source in a hard drive failure, or server disappearance, or something. I searched for it on the web yesterday, and found nothing at first. But when I looked a little deeper, I found a fork named Chood that a fellow named Adam Gurno put together a couple years later. Chood adapted Choad for Ogg Vorbis output. And Adam Gurno distributed Chood as both a full distribution based on Choad 0.822, and as a patch to Choad 0.822.
Thanks Adam! One patch -N later and I had my final version of Choad.
So what better to do with code I wrote in the year 2000, twenty-one years old and living in a rave warehouse, than rip it apart? Time for a code review.
I’ve elided big sections of code here. If you’d like to see the whole thing in context, I put it on GitHub. Pull requests welcome! Yeah, sure.
Disclaimer: I wrote this in Perl. People did that then. You will be exposed to a crazy alternate universe of sigils and coercions.
#!/usr/pkg/bin/perl
#
# choad
#
# a command-line front end to cdparanoia and lame
#
# copyright 2001 pete gamache gamache@ftso.org
# ,
# all rights reversed (K)
Wait, what? I let this out the door with /usr/pkg/bin/perl
as the executable? What an asshole. That was the path on my NetBSD box. Each of the several package systems Choad ended up on had to patch this. I am sorry I did this.
require 5;
use strict;
Perl noobz: use strict;
is a pragma that forces the programmer not to be a total prick. Declare your variables, quote your strings, etc. Not using it is bad and wrong. I am pleased I was using it at this point.
Of course, as my friend
Maitland points out, use warnings;
is conspicuously absent, so I shouldn’t smile too broadly.
### constants #######################################################
my $true = 22;
my $false = 0;
my $choad = $0; $choad =~ s/.*\/([^\/]+)$/$1/;
my $choad_version = "0.822";
I guess I really wanted to express the concepts of truth and falsity without resorting to Perl idioms directly. That implies I hadn’t read enough Perl yet, which is true. I thought there was a code smell when in fact, I was the smell.
And uncommented regexes, even short ones, are why so many people hate Perl and Perl programmers. Yup, I agree. Comment, fucko.
### defaults ########################################################
my $cdparanoia = `which cdparanoia`; $cdparanoia =~ s/\n//;
my $cdpflags = $cdpflags_default;
my $lame = `which lame`; $lame =~ s/\n//;
my $lameflags = $lameflags_default;
my $cddb_host = "freedb.freedb.org";
my $cddb_port = 888;
my $auto = $false;
my $nocddb = $false;
my $we_have_names = $false; # if we have artist/title/track list
my $maketracks = $true;
my $encode = $true;
my $submit = $false; # new to 0.8
my $trackname = "";
my $strip_the = $true;
my $from = 1;
my $artist = "artist-$$";
my $album = "album-$$";
my $genre = "";
my @tracks = ();
my @onlyv = ();
my $how_many_tracks = 0;
Duuuude, come on here. A few of these are fine. But holy globals! It is not OK just because you declare it with my! Perl programmers can really piss me off. It’s like being a kid who swears a lot, because he’s so used to getting his mouth washed out with soap that he just doesn’t care anymore.
And precisely what the fuck do all these do? I really shit the bed here. Comments are awesome and this is a barren comment wasteland. It’s an especially dick move considering I am using the $$
special variable, which I am so brain-damaged as to remember means the PID of the running process.
### main routine ####################################################
parse_config_file ("/etc/choadrc");
parse_config_file ("$HOME/.choadrc");
parse_command_line (@ARGV);
my $MSF_info = query_CD();
get_CDDB_info() unless $nocddb == $true;
if ($we_have_names == $true) {
print make_tracks_list();
submit_CD() unless ($submit == $false);
} else {
print "$how_many_tracks tracks on disc.\n";
}
$trackname = $trackname_default unless ($trackname ne "");
Well, there I go declaring a variable $MSF_info
inline with the rest of the code. It gets used in a subroutine elsewhere, and I had no idea where the hell it came from because I didn’t even have the goddamn common courtesy to put it with the rest of the obscurely-named globals. Shithead.
I’m at least glad to see I broke out the config file and command line parsing into their own subroutines. Didn’t drop that ball. Let’s take a peek at parse_config_file
:
sub parse_config_file {
my $CHOADRC = $_[0];
my $i=0;
open RC, $CHOADRC or return;
while (<RC>) {
$i++;
if (! (/^[\#;]/ || /^\s*$/)) { # ignore comments and blank lines
/^\s*(\S*)\s*=?\s*(.*)\s*$/; # grab useful bits from line
my @line = ("-$1", $2);
parse_command_line (@line);
}
}
}
Nice! parse_config_file
just calls parse_command_line
a bunch of times. The config file format is just a stream of command line options. I made a good call here, even if I left an unused loop variable $i
hanging around.
Of course, parse_command_line
just sticks a bunch of shit in globals, so whatever. Back to the main routine:
# now do the encoding
if ($encode == $true) {
if (scalar @onlyv > 0) {
while (<@onlyv>) {
my $i = shift @onlyv;
my $n = shift @onlyv;
if ($n eq "auto") {$n = $trackname;}
make_appropriate_dirs ($n);
rip_track_with_name ($i, parse_format_string($n, $i));
}
} else { #batch mode
my $curdir = make_appropriate_dirs ($trackname);
if ($maketracks == $true) {
my $tracksfile = ">" . $curdir . "tracks";
open TRACKS, $tracksfile or die $!;
print TRACKS "# autogenerated by $choad $choad_version\n";
print TRACKS make_tracks_list();
close TRACKS;
}
for (my $i=$from; $i <= $how_many_tracks; $i++) {
rip_track_with_name ($i, parse_format_string($trackname, $i));
}
}
} # and that's all
I am guessing I named @onlyv
that to indicate that it is a vector. That one gets me half an hour in the dunk tank. I’d like to ask 2000-me why this is a better choice than @onlys
. Dipshit.
This mostly looks sane otherwise. Kind of.
And we’ve just fallen off the end of the main program.
Let’s look into get_CDDB_info
. At this point, the program has received multiple possible matches for a CD, because the CDDB format sucks. Choad must ask the user which is the real album.
my @discs = $cddb->get_discs($cddb_id, $track_offsets, $total_seconds);
my $discchoice = 1;
my $i=0;
if (scalar @discs > 1) {
if ($auto == $false) {
foreach my $disc (@discs) {
$i++;
($genre, my $cddb_id, my $title) = @{$disc};
print "$i $title\n";
}
ASK: print "\nWhich disc? ";
while ((my $c .= getc STDIN) ne "\n") {
$discchoice = $c;
}
if ($discchoice > scalar @discs) {
print "Too high; try again.";
goto ASK;
}
}
$we_have_names = $true;
} elsif (scalar @discs == 0) {
print STDERR "No CDDB entry for this disc (id=0x$cddb_id)\n";
$we_have_names = $false;
} else {
$we_have_names = $true;
}
What the shit. I couldn’t figure out a way to express this any better, apparently, because that is very clearly goto and a label in there. Gah.
Wanna punch me yet? You will.
sub query_CD {
# returns tab-delimited string list of:
# <track offset-minutes offset-seconds offset-frames>
# also sets $how_many_tracks
my $toc = "";
my $command = "$cdparanoia $cdpflags -Q 2>/tmp/choad-$$";
`$command`;
open QUERY, "/tmp/choad-$$" or die $!;
while (<QUERY>) {
print $_;
if (/\s*([0-9]+)\.\s+[0-9]+\s+\[[0-9]+:[0-9]+\.[0-9]+\]\s+[0-9]+\s+\[([0-9]+):([0-9]+)\.([0-9]+)\]/) { # that's a mouthful
$toc .= "$1 $2 $3 $4/";
$how_many_tracks = $1;
}
elsif (/^TOTAL\s+[0-9]+\s+\[([0-9]+):([0-9]+)\.([0-9]+)\]/) {
$toc .= "999 $1 $2 $3"; # lead-out track
}
}
close QUERY; #print "made it past query\n";
`/bin/rm /tmp/choad-$$`;
return $toc;
}
You might notice, if you take the time to scroll all the way over there, that that big long turd of regex ends with # that's a mouthful
. This bonehead can take a mouthful of curb. Regular expressions get commented, period. There’s not even sample output from cdparanoia. Fuck this guy.
At least it ends with thorough documentation, in the form of a few subroutines like usage and longhelp.
sub usage {
return sprintf <<END
usage: $choad [options] [-only n1 file1 n2 file2 ...]
options:
-choadrc <filename> read <filename> as choadrc configuration file
-cdp /path/to/cdparanoia specify path to cdparanoia
[default=$cdparanoia]
-cdpflags "flags" pass "flags" to cdparanoia
[default="$cdpflags_default"]
-lame /path/to/lame specify path to lame
[default=$lame]
-lameflags "flags" pass "flags" to lame
[default="$lameflags_default"]
-server <host> use <host> as CDDB server
[default=$cddb_host]
-port <port> use <port> as CDDB server port
[default=$cddb_port]
-nocddb do not attempt CDDB lookup
-tracks <filename> use <filename> for artist/title/track list
instead of CDDB lookup (implies -nocddb)
-nomaketracks do not make "tracks" file when encoding CD
-submit submit track list to CDDB (use with -tracks)
-noencode do not encode CD
-auto ask user no questions (Do The Right Thing)
-trackname "string" use "string" as track name format string
[default="$trackname_default"]
-the do not strip leading "the" from artist name
-from <n> start encoding at track <n>
-h print help message
-V print version
-longhelp print long help message
If "-only <n> <file> ..." is present, $choad will encode each track <n>,
using <file> as the format string for the filename. If <file> == "auto",
then the default format string is used.
If "-only ..." is not present, $choad will encode the entire CD.
END
;
}
Not so bad! But I started to put that stuff near the top of scripts like this over time. It’s not just because it’s helpful to the uninitiated. I like to put the docs up top because that is the mission statement of the code. Once I get the docs right, the code usually falls out of me a lot faster.
So what do I see overall? I see BASIC residue, in abuse of globals. I see someone who hadn’t coded in a group setting before, in the shitty or nonexistent comments. I see someone who hadn’t read a lot of code yet, in the ignorance of idioms. I see a dearth of modularity.
But on the bright side, I scratched my itch, ripping my CD library on a handful of cheapo unix PCs. And I shipped code. It’s my first open source project and it worked. People sent patches and everything. Felt great.
Epilogue:
Not long after I finished Choad, Mac OS X — and iTunes — arrived, and Choad went by the wayside. Eventually, around 2006, I wanted to rip my CDs for the very last time, to FLAC and MP3, with many machines working together. And maybe some of these machines don’t have monitors, or CD-ROM drives, or whatever. Nothing existed that did what I wanted.
Sourceforge declined to host a project named Choad, so there was born Cretin: the CD Ripper, Encoder, and Tagger with an Inoffensive Name. And I haven’t written one since.