Friday, October 04, 2013

A Lot Can Happen In Ten Years

I write this five days short of this blog's 10th anniversary. From humble beginnings, a lot can happen in 10 years: 836 blog posts, three Notre Dame head football coaches, the same number of popes, and a few near misses by Cleveland sports teams.

A lot can happen in an individual's life too: meet a girl, get married, buy a house, have a kid, prepare for a second. Priorities can change in a big way over 10 years. So with that, it's time for this blogger to hang up the keyboard.

There have been some great moments. The "Who Will Be The Next Pope?" post that got linked by the top Google result and blew up blog traffic (and yes, the current Pope Francis was mentioned on that list). This site linking to EDSBS so early in their existence that Spencer actually took the time to write a personal email of thanks. Sabermetric Bracketology. Lively message boards, fantasy leagues, and Hall of Fame voting. The list goes on, so it's worth getting lost in the archives.

I may not have time to bring quality discussion to the table, but plenty of sites like Her Loyal Sons and Let's Go Tribe, among countless others, still do. So I'll still be active in the blog world as a consumer, if not a producer.

Thanks everyone for making this possible. I couldn't have done it without everyone reading, emailing, posting on the message board, and keeping me going all these years. It's been a fun ride.
Go Irish!
Go Tribe!
Go Browns!
Go Cavs!

Saturday, March 09, 2013

Baseball Hack 22: translate.pl and rosters.pl

I'm reviving the blog when it applies to my 35 for 35. Yes, I'd love to blog here more, but life has brought along many more important things over these past few years. That being said, here's some Windows-compatible code for hack 22 in the book Baseball Hacks. First, translate.pl. Run this from the directory where your zipped Retrosheet event files are stored (the script also expects all_hdr.txt and BEVENT.EXE to be there), and it will unzip them and concatenate all the play-by-play data into pbp.csv. This code requires the Perl module Archive::Extract, and also relies on a bare readdir in a while loop setting $_, which is only available in Perl 5.12 or later.
#!/usr/bin/perl
use Archive::Extract;

$outfile = '"C:\Users\John\Desktop\Baseball Hacks\retrosheet\pbp.csv"';
print `type all_hdr.txt > $outfile`;

opendir RSDIR, "." or die "can't open directory .: $!\n";
while (readdir RSDIR) {
 if ( $_ =~ /(\d\d\d\deve)\.zip$/ ) {
   print "Unzipping $_\n";
   my $ae = Archive::Extract->new( archive => $_ );
   my $ok = $ae->extract( to => '.\\' . substr($_, 0, -4) );
   opendir YRDIR, substr($_, 0, -4) or die "can't open directory extracted from $_: $!\n";
   chdir(substr($_, 0, -4)) or die "can't change to directory extracted from $_: $!\n";
   while (readdir YRDIR) {
    if ( $_ =~ /(\d\d)(\d\d)(\w\w\w)\.EV[AN]$/ ) {
     $century = $1; $year = $2; $team = $3;
     print `..\\BEVENT.EXE -y $century$year -f 0-96 $_ >> $outfile`;
    }
   }
   chdir("..") or die "can't change to directory ..: $!\n";
   closedir YRDIR;
 }
}
closedir RSDIR;
print "done\n";
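If you want to check the file-name match before running BEVENT over everything, here is a minimal sketch that pulls the same pattern out into a helper. The name parse_event_filename and the sample file names are my own, for illustration only; they are not part of translate.pl or the book.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of translate.pl): split an event file name
# into season and team code using the same pattern translate.pl matches.
sub parse_event_filename {
    my ($name) = @_;
    if ( $name =~ /(\d\d)(\d\d)(\w\w\w)\.EV[AN]$/ ) {
        return { season => "$1$2", team => $3 };
    }
    return;    # not an event file
}

# Illustrative names only; the last two should be skipped.
for my $name ( '2012CLE.EVA', '1998NYN.EVN', 'TEAM2012', 'CLE2012.ROS' ) {
    my $info = parse_event_filename($name);
    if ($info) {
        print "$name: season $info->{season}, team $info->{team}\n";
    }
    else {
        print "$name: skipped\n";
    }
}
```

Anything that does not end in .EVA or .EVN comes back undefined, which is why translate.pl silently ignores the other files in each unzipped directory.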
Next is rosters.pl, which loops through the unzipped event directories and concatenates the roster files for all years into a single file. The script writes to standard output, so you must redirect it to a file on the command line, e.g. ./rosters.pl > rosters.csv
#!/usr/bin/perl

print "year,retroID,lastName,firstName,bats,throws,team,pos\n";

opendir RSDIR, "." or die "can't open directory .: $!\n";
while (readdir RSDIR) {
 if ( $_ =~ /(\d\d\d\d)eve$/ ) {
   opendir YRDIR, $_ or die "can't open directory $_: $!\n";
   chdir($_) or die "can't change to directory $_: $!\n";
   while (readdir YRDIR) {
    if ( $_ =~ /(\w{3})(\d{4})\.ROS$/ ) {
     $team = $1;
     $year = $2;
     open FILE, "<$_" or die "can't open $_: $!\n";
     while (<FILE>) {
      s/\n//;
      s/\cM//;
      s/\"//g;
      if (/[a-z]{5}\d{3}/) {
       print "$year,$_\n";
      }
     }
     close FILE;
    }
   }
   chdir("..") or die "can't change to directory ..: $!\n";
   closedir YRDIR;
 }
}
closedir RSDIR;
# Report to STDERR so "done" does not end up inside the redirected CSV.
print STDERR "done\n";
Once you've run this Perl code to create pbp.csv and rosters.csv, you can add them to your SQL database using the instructions in the book.
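Before loading rosters.csv, it may help to see the per-line clean-up on its own. This is a rough sketch, assuming roster lines look like the made-up sample below. The helper name clean_roster_line and the sample player are hypothetical, and it strips line endings with a single substitution rather than the two used in rosters.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper mirroring the clean-up inside rosters.pl: strip line
# endings and double quotes, then keep the line only if it contains a
# Retrosheet-style ID (five lowercase letters followed by three digits).
sub clean_roster_line {
    my ( $year, $line ) = @_;
    $line =~ s/[\r\n]+$//;    # handles both Unix and Windows line endings
    $line =~ s/"//g;
    return unless $line =~ /[a-z]{5}\d{3}/;
    return "$year,$line";
}

# Made-up sample input, for illustration only.
my @sample = (
    "\"smitj001\",\"Smith\",\"John\",R,R,CLE,P\r\n",
    "\r\n",
);
for my $raw (@sample) {
    my $row = clean_roster_line( 2012, $raw );
    print defined $row ? "$row\n" : "(skipped)\n";
}
```

Blank lines and anything without an ID come back undefined, so they never reach the output file.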