Mike Grundy @mikegrundy - Tumblr Blog

putting your code out to the world is kind of like hanging your ass out the sunroof on I5 during rush hour. You feel exposed. I offer this one little piece of advice: If they flay your code, at least they are reading it.

Me, here, a long time ago

You're firing up an ec2 instance one morning while the coffee is brewing. It fails with a

Client.VolumeLimitExceeded: Volume limit exceeded error.

Wtf? Crap. Too many volumes sitting around. Double crap, nobody tagged them sensibly. This little bit of Python code will tell you which vols aren't attached to an instance and weren't created from a snapshot. If you have too many to sort through manually, write a quick and dirty script that attaches each volume, runs dumpe2fs, and gets the last mount time for more data points.

import boto

ec2 = boto.connect_ec2()
vols = ec2.get_all_volumes()
for vol in vols:
    # attachment_state() is None when the volume isn't attached to anything
    if vol.attachment_state() is None:
        # an empty snapshot_id means the volume wasn't created from a snapshot
        if not vol.snapshot_id:
            print("%s %s %s %s" % (vol.create_time, vol.size, vol.status, vol.id))
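For the "last mount time" data point, here is a minimal sketch of the parsing half only. It assumes you have already attached the volume and captured the text output of `dumpe2fs -h` on it, and that the output carries a `Last mount time:` line. The function name `last_mount_time` and the sample text are made up for illustration.

```python
import re

def last_mount_time(dumpe2fs_output):
    """Return the 'Last mount time' value from dumpe2fs -h text, or None."""
    # Assumption: the header has a line of the form "Last mount time:   <value>"
    match = re.search(r"^Last mount time:\s*(.+)$", dumpe2fs_output, re.MULTILINE)
    return match.group(1).strip() if match else None

# Made-up sample resembling a dumpe2fs header, for illustration only
sample = (
    "Filesystem volume name:   <none>\n"
    "Last mount time:          Tue Jan  3 10:00:00 2012\n"
    "Last write time:          Tue Jan  3 11:00:00 2012\n"
)
print(last_mount_time(sample))
```

Sorting the orphaned volumes by that value gives you a rough "nobody has touched this in ages" list to start deleting from.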

A couple of handy dandy tools

I was helping someone out with MongoDB's geospatial indexes on the mailing list today, and realized I should mention some tools I use to help the debugging process along. They have no direct relationship with MongoDB, but I've turned to these tools frequently to help analyze things:

JSON Formatter and Validator: Paste in JSON that has been through the mailer wringer, validate it, and make it legible!

GeoJSONLint: Paste in GeoJSON objects, validate them, and draw them on the map! Very helpful for visualizing multiple objects, intersections, and self-intersections.

Oh, Bonus tool I found today: Convert Geographic Units: Can convert between UTM grid coordinates and lon/lat. Total fun for the geoNerd.
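If you would rather not paste things into a web page, the validate-and-make-legible step can be roughed out locally. This is only a sketch using Python's standard `json` module, not what any of the tools above do internally; `format_json` is a name I made up.

```python
import json

def format_json(text):
    """Validate a JSON string and return it pretty-printed.

    Raises ValueError (json.JSONDecodeError) if the text isn't valid JSON.
    """
    parsed = json.loads(text)
    return json.dumps(parsed, indent=2)

# A small GeoJSON-style point, squashed onto one line
mangled = '{"type":"Point","coordinates":[-122.33,47.61]}'
print(format_json(mangled))
```

This only checks that the text is well-formed JSON. It knows nothing about GeoJSON rules like closed rings or self-intersections, which is where GeoJSONLint earns its keep.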

The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' but 'That's funny...'

Isaac Asimov (via adafruit)

Route 53 DNS updates

So, I've been trying to work out a simpler way to update Route 53 with instances I've created for tests. Looking through the boto docs, it seemed I would have to grab everything, parse everything, update, commit, hat dance etc. As I was going through, I didn't know what the properties of something were, so I dug up the code and found a much easier way to handle the updates: the Zone object. It's late, I'm tired, here's code:

from boto.route53 import connection

def dnsUpdate(hostDict, dnsName):
    # Put yer key stuffs in here
    key = 'BBBBBBBBBBBBBB'
    access = 'nnnnnnnnnnnnnnnnnnnnnnnnn'
    # Connect
    route53 = connection.Route53Connection(key, access)
    # Get the Zone obj for our domain
    dnsZone = route53.get_zone(dnsName)
    # aname is our alias, hname is the real host name
    for aname, hname in hostDict.items():
        record = dnsZone.get_cname(aname)
        if record is not None:
            # sure hope it's just one record
            if record.resource_records[0] != hname:
                dnsZone.update_cname(aname, hname)
        else:
            dnsZone.add_cname(aname, hname)

Q.E.D

This. This is how much we can fit on four bikes. Yeah, we were kinda amazed too

http://rainmakersny.tumblr.com

Perl runs as fast as C, right?

Had a colleague ask why things were running so slow in a Perl program (of course, the immediate answer from another colleague was "PROFILE IT"). But he was confused about how fast things had the potential to run compared to C. He figured basic operations would run just as fast, or that CPUs are so fast the difference wouldn't matter. The truth is NO. He didn't understand that Perl carries a lot of overhead, the overhead that handles all the magic. So we boiled it down to counting. (As my other colleague said, "Consider just counting to a million, or a billion. It's like what in C, 4 instructions?") Basic incrementing and comparison in C is a small number of operations, like the whole loop in a few cycles. The counterpart in Perl is hundreds of ops per loop. We think he got it in the end (and was going to go profile his code). But then I thought it would be nice to have an obvious example. This was run in Cygwin under WinXP on my laptop:

me@windersbox:~$ cat fst.c
int main(void){
  unsigned i = 0;
  while ( ++i < 4294967295);
}
me@windersbox:~$ gcc fst.c
fst.c: In function `main':
fst.c:3: warning: this decimal constant is unsigned only in ISO C90
me@windersbox:~$ time ./a.exe

real    0m14.813s
user    0m14.702s
sys     0m0.109s
me@windersbox:~$ time perl -e '$i=0; while(++$i < 4294967295){};'

real    5m28.984s
user    5m26.577s
sys     0m0.140s

Both got a solid 25% of the CPU while running. Yes, they use Windows XP at work. It's like doing brain surgery with stone tools. That probably won't be a problem when I start my new job (Yay!)
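To put numbers on it, here is a quick back-of-the-envelope sketch in Python that uses only the `user` times from the run above. The helper name `to_seconds` is mine, and the per-iteration figures are just the measured time divided by the loop bound, so treat them as rough.

```python
def to_seconds(minutes, seconds):
    """Convert the m/s pair printed by `time` into plain seconds."""
    return minutes * 60 + seconds

ITERATIONS = 4294967295  # loop bound used by both the C and the Perl version

c_user = to_seconds(0, 14.702)     # user time of ./a.exe
perl_user = to_seconds(5, 26.577)  # user time of the perl one-liner

slowdown = perl_user / c_user
c_ns = c_user / ITERATIONS * 1e9        # nanoseconds per iteration in C
perl_ns = perl_user / ITERATIONS * 1e9  # nanoseconds per iteration in Perl

print("Perl/C slowdown: %.1fx" % slowdown)
print("C: %.2f ns per iteration, Perl: %.2f ns per iteration" % (c_ns, perl_ns))
```

That works out to roughly a 22x slowdown on this particular laptop, for the most trivial loop you can write.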

Verifying public/private key pairs

Found this question about verifying public/private key pairs. Of course I had to riff on the awesome answer a little:

openssl rsa -pubin -in key.pub -modulus -noout | md5sum
openssl rsa -in key.priv -modulus -noout | md5sum

If the md5 sums match, the keys are a pair and you're golden.
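Why this works: an RSA public key is (modulus, public exponent) and the private key carries the same modulus, so hashing the modulus from each file compares the one value they must share. Below is a toy sketch in Python with textbook-sized numbers. The tuples, `modulus_digest` and `keys_match` are illustrative names of mine, not anything openssl exposes, and real keys are of course far larger.

```python
import hashlib

def modulus_digest(modulus):
    """md5 of a 'Modulus=<HEX>' line, shaped like what -modulus prints."""
    line = "Modulus=%X\n" % modulus
    return hashlib.md5(line.encode("ascii")).hexdigest()

def keys_match(public_key, private_key):
    """Keys are (modulus, exponent) tuples; a pair shares the modulus."""
    return modulus_digest(public_key[0]) == modulus_digest(private_key[0])

# Toy textbook key: p=61, q=53 -> n=3233, e=17, d=2753
public = (3233, 17)
private = (3233, 2753)
stranger = (3127, 2753)  # private key from some other modulus

print(keys_match(public, private))   # same modulus
print(keys_match(public, stranger))  # different modulus
```

The md5 is only there to make a long modulus easy to eyeball; comparing the modulus values directly would tell you the same thing.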

Some Code Coverage Basics

Some notes about setting up code coverage collection and reporting.

Code coverage? Why? What? Well, say you have a bunch of test cases and you'd like to see if they are actually testing different parts of the code. Compiling (in the case of C/C++ code) with some coverage options will instrument the code. Then, when the instrumented program runs, it writes out files with the coverage data.

When you build, you'll need to modify the compilation and linking flags. On my last project we decided it was easier to pass these flags in when we were doing an instrumented build (we used automake, so it understood what to do with the following parameters):

CFLAGS="-fprofile-arcs -ftest-coverage" LIBS="-lgcov" make all

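If you are just compiling away at something in a single gcc invocation, you can add the CFLAGS options and be fine. But once you split compile and link, you need to specify -lgcov at link time. The instrumented objects call into the gcov runtime, and gcc only pulls that library in on its own when it sees the coverage flags on the link command.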

You should be able to easily adapt that to your build system. We used lcov (http://ltp.sourceforge.net/coverage/lcov.php) to generate the detailed source analysis and publish it as html. The BUILD_TAG and JOB_NAME variables are set by Jenkins (Continuous Integration system) and place the output into the existing coverage directory:

lcov -t $BUILD_TAG -o ${JOB_NAME}.info -c -d .
genhtml -t $BUILD_TAG -o ./coverage/ ${JOB_NAME}.info

Since we were managing our builds with Jenkins, we used a tool from Sandia to get the coverage results in XML format. We would then pick up the file as a post build action (Publish Cobertura Coverage Report). More info at: https://software.sandia.gov/trac/fast/wiki/gcovr . The command we run is:

gcovr -d -x -e "/usr*" -e "mailer/t/*" -o results.xml

The -e options are to specifically exclude results from system directories and our testcases. In lcov we would also run the following commands to exclude them from the html coverage reports:

lcov --remove ${JOB_NAME}.info "/usr*" -o ${JOB_NAME}.info
lcov --remove ${JOB_NAME}.info "mailer/t/*" -o ${JOB_NAME}.info
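If you want a single line-coverage number out of the .info file without opening the HTML report, the tracefile is plain text and easy to total up. This is a rough sketch, assuming the usual lcov tracefile records where each source file contributes an `LF:` (lines found) and `LH:` (lines hit) entry. The function name and the sample data are made up, and this was not part of our build.

```python
def line_coverage(tracefile_text):
    """Sum the LH (lines hit) and LF (lines found) records of an lcov tracefile."""
    hit = found = 0
    for line in tracefile_text.splitlines():
        if line.startswith("LH:"):
            hit += int(line[3:])
        elif line.startswith("LF:"):
            found += int(line[3:])
    return hit, found

# Made-up tracefile with two source files, for illustration only
sample_info = """TN:
SF:/src/a.c
DA:1,1
DA:2,0
LF:2
LH:1
end_of_record
SF:/src/b.c
LF:3
LH:3
end_of_record
"""

hit, found = line_coverage(sample_info)
print("lines hit: %d of %d (%.1f%%)" % (hit, found, 100.0 * hit / found))
```

Run it against the .info file after the `lcov --remove` steps so the system headers and testcases don't skew the total.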

Here are some good links on gcov and lcov:

Using gcov and lcov (presentation)
Use gcov and lcov to know your test coverage (Super Awesome)
Is there a way to focus lcov on a directory

and a little from the gcovr folks:

gcovr update info

Convert DKIM DNS entry into standard pubkey file

In one line, of course:

dig 20120113._domainkey.google.com TXT | grep -v "^;" | sed -r -e 's/.*p=(.*)/\1/;s/(.{65})/\1\n/g' \
  -e '1i-----BEGIN PUBLIC KEY-----' \
  -e '$a-----END PUBLIC KEY-----' \
  -e '/^$/d'

OK, so not one line when I make it readable here. The first part of what would normally be the hostname is the DKIM selector. I stuck the grep into the pipeline, instead of doing it in sed, because I was too lazy to figure out why the semicolon wasn't escaping properly.

I used a variant of this to verify the DNS files we generated. Then I wanted to write it up and added dig to the mix. Yay!
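For anyone who would rather not decode the sed, here is the same idea as a short Python sketch. It is not a port of the one-liner: it assumes you already have the TXT record value as a string (possibly split into several quoted chunks), it wraps at 64 characters, and `dkim_txt_to_pem` is a name I made up. The sample record is fake filler, not a real key.

```python
import re

def dkim_txt_to_pem(txt_value):
    """Pull the p= tag out of a DKIM TXT value and wrap it as a PEM public key."""
    # TXT data can arrive as several quoted chunks; drop quotes and whitespace
    joined = re.sub(r'["\s]', "", txt_value)
    match = re.search(r"(?:^|;)p=([A-Za-z0-9+/=]+)", joined)
    if match is None:
        raise ValueError("no p= tag found in DKIM record")
    b64 = match.group(1)
    # Break the base64 text into 64-character lines
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join(
        ["-----BEGIN PUBLIC KEY-----"] + lines + ["-----END PUBLIC KEY-----"]
    ) + "\n"

# Fake record with filler base64, split into two quoted chunks
sample_txt = '"v=DKIM1; k=rsa; p=' + "A" * 100 + '" "' + "B" * 30 + '"'
print(dkim_txt_to_pem(sample_txt))
```

Feed the result to `openssl rsa -pubin` if you want to confirm it parses as a real key.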

Where did that lost commit go?

This is a handy little ditty to find all the commits related to a file, then get the most recent changes from each commit:

for commit in $(git log --all --format=%H FILENAME); do git --no-pager log -p -1 $commit FILENAME; done

You could also add a

git branch --contains $commit

so you have the branch names handy. Something like this:

for commit in $(git log --all --format=%H FILENAME); do echo -e \\nContaining Branches:; git branch --contains $commit; git --no-pager log -p -1 $commit FILENAME; done

Really handy if there is a commit to a branch that got deleted that you're trying to find.

Create a simple histogram in perl

It's "Oh crap! I haven't posted anything useful in a long time" time! So, I'll take a little snip from one of those quick hack programs I wrote that evolved into a test verification tool (sigh).

This function creates a basic histogram from values passed in an array. It is specifically set up to take timestamps, group them by hour, and print a histogram next to the values:

# Prints messages sent by wall clock hour
sub hourlyDistribution {
    my @rawtimes = @_;
    my %time_distrib;
    my $factor = 1;
    my $maxlen = 0;
    my $wchar  = screenwidth();

    foreach (@rawtimes) {
        #this works, but in my case the times are in GMT
        #my $bucket = int ($_ / (60 * 60)) % 24;
        # this does the distribution in local time
        my $bucket = (localtime($_))[2];
        $time_distrib{$bucket}++;
        $maxlen = $time_distrib{$bucket} unless $maxlen >= $time_distrib{$bucket};
    }

    # ok the 16 is a totally hardcoded "this much room for stuff before the histogram"
    # but the rest of the logic is solid, if there are more elements than the screen can
    # hold, create a scaling $factor so the graph will fit. Not very granular, uses integer
    # math, but hey, we're printing asterisks.
    if ( ( $wchar - 16 ) < $maxlen ) {
        $factor = $maxlen / ( $wchar - 16 );
    }

    print "\nHour\tCount\tGraph\n";
    print "----\t-----\t-----\n";
    foreach ( sort { $a <=> $b } keys %time_distrib ) {
        print "$_\t$time_distrib{$_}\t" . '*' x ( $time_distrib{$_} / $factor ) . "\n";
    }
}

So, screenwidth gets, wait for it, the screen width in columns.

# Set a var if we can't load Term::ReadKey
# Helpful on a server you can't load modules on to.
my $noreadkey = 0;
eval "use Term::ReadKey; 1" or $noreadkey = 1;

# hardwidth gets set to your default. 40, 80, whatever.
my $hardwidth = 80;

sub screenwidth () {
    my ( $wchar, $hchar, $wpixels, $hpixels );
    if ($noreadkey) {
        return $hardwidth;
    }
    else {
        # Yes, you could just return (GetTerminalSize())[0]
        # But then it would be all mysterious.
        ( $wchar, $hchar, $wpixels, $hpixels ) = GetTerminalSize();
        return $wchar;
    }
}
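For comparison, here is the same bucket-by-hour and scale-to-fit idea as a Python sketch. It is not a port of the tool: it returns the lines instead of printing them, and it buckets in UTC by default so the example is reproducible (pass `tz=None` for local time, which is what the Perl does). The function name and parameters are my own choices.

```python
from collections import Counter
from datetime import datetime, timezone
import shutil

def hourly_histogram(timestamps, width=None, label_room=16, tz=timezone.utc):
    """Group epoch timestamps by hour and return histogram lines."""
    if width is None:
        # Fall back to 80 columns when the terminal size can't be determined
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    counts = Counter(datetime.fromtimestamp(ts, tz).hour for ts in timestamps)
    room = max(width - label_room, 1)  # columns left for the asterisks
    biggest = max(counts.values(), default=0)
    # Scale down only when the longest bar would not fit
    factor = biggest / room if biggest > room else 1
    lines = ["Hour\tCount\tGraph", "----\t-----\t-----"]
    for hour in sorted(counts):
        bar = "*" * int(counts[hour] / factor)
        lines.append("%d\t%d\t%s" % (hour, counts[hour], bar))
    return lines

# Six made-up timestamps in the first three hours of the epoch
for line in hourly_histogram([0, 60, 3600, 7200, 7260, 7300], width=80):
    print(line)
```

As in the Perl version, the 16 columns reserved for the hour and count are a hardcoded guess, and the scaling is coarse since bars are whole asterisks.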

Monsieur Boulet nails it in a panel from this comic

Follow the hood ornament cause the rear view mirror is just for checking how good you look

David Lee Roth

Color study. Will do a few more. Then order some temp tats.

This is not my dog

Few companies that installed computers to reduce the employment of clerks have realized their expectations... They now need more, and more expensive clerks even though they call them 'operators' or 'programmers.'

Peter Drucker
