NerdVana

The glass is 2x larger than it needs to be

procmon: pidstat redefined

Written by Eric Schwimmer

I love pidstat (part of the sysstat package). It's highly useful when you want to monitor the performance of a single process (or a small subset of processes). However, its output is a little ... verbose, and not terribly configurable. So I wrote a small Bash wrapper for it that prints stats for one or more processes (as determined by the regular expression that you pass to the script) in a more compact format, à la dstat:

#!/bin/bash

fatal() { echo "$1"; exit 1; }
[[ $# == 1 ]] || fatal "Usage: procmon <proc name>"
[[ $(which pidstat) == "" ]] && fatal "Missing 'pidstat' binary"

# pidstat -p wants a comma-separated PID list, so have pgrep build one
pids=$(pgrep -d, "$1")
[[ $pids == "" ]] && fatal "No processes with that name found"

numNoHeaderLines=20
while [[ 1 ]]
do
    i=0
    printf '%8.8s|%5.5s|%6.6s|%10.10s|%6.6s|%10.10s|%10.10s\n' \
        'Time' 'PID' '%CPU' 'CSW/s' '%Mem' 'Kb read/s' 'Kb write/s'
    while [[ "$i" -lt "$numNoHeaderLines" ]]
    do
        # Keep only the data lines (the ones that start with an epoch
        # timestamp) so every matched process gets a row
        pidstat -p $pids -d -r -u -h 5 1 | \
            awk '$1 ~ /^[0-9]+$/ {"date +%T -d @"$1 | getline date;
            printf "%8.8s|%5.5s|%6.6s|%10.0f|%6.6s|%10.0f|%10.0f\n",
            date, $3, $7, $17+$18, $13, $14, $15}' || exit
        ((++i))
    done
done

I'm thinking about cranking out a script that would do most of what pidstat does, but add network traffic data, as well as the ability to print individual columns (e.g. %CPU utilization without having to print VSZ/RSS). Eh, much /proc wrangling would definitely be required.
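For the morbidly curious, that flavor of /proc wrangling would look roughly like the sketch below. This is illustrative only, not a pidstat replacement: it samples CPU and disk I/O counters for a single PID, and it punts entirely on the per-process network half of the problem (which would mean mapping socket inodes under /proc/<pid>/fd back to /proc/net/*).

# Rough sketch of a pidstat-ish sampler built on raw /proc reads
# (illustrative only)
import os, time

CLK_TCK = os.sysconf('SC_CLK_TCK')   # clock ticks (jiffies) per second

def cpu_ticks(pid):
    # utime + stime are fields 14 and 15 of /proc/<pid>/stat; split
    # after the ')' so a space in the comm field can't throw us off
    with open('/proc/%d/stat' % pid) as f:
        fields = f.read().rpartition(')')[2].split()
    return int(fields[11]) + int(fields[12])

def io_bytes(pid):
    # /proc/<pid>/io holds cumulative read_bytes/write_bytes counters
    # (reading another user's process generally requires root)
    counters = {}
    with open('/proc/%d/io' % pid) as f:
        for line in f:
            key, _, val = line.partition(':')
            counters[key.strip()] = int(val)
    return counters['read_bytes'], counters['write_bytes']

def sample(pid, interval=5):
    # Take two snapshots and turn the deltas into per-second rates
    c0, (r0, w0) = cpu_ticks(pid), io_bytes(pid)
    time.sleep(interval)
    c1, (r1, w1) = cpu_ticks(pid), io_bytes(pid)
    return {'%CPU':       100.0 * (c1 - c0) / CLK_TCK / interval,
            'kB_read/s':  (r1 - r0) / 1024.0 / interval,
            'kB_write/s': (w1 - w0) / 1024.0 / interval}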

smtprouted: an SMTP rewrite/dedupe proxy

Written by Eric Schwimmer

At work we send a lot of mail. Like A LOT of mail. So we need processes in place to test large mail loads and their impact on various internal processes. Of course, when we send these large test batches of email, we can't have them delivered to the users they are addressed to (that would defeat the whole 'test' thing). So instead, all of the boxes sending test email relay their messages through a catchall box configured with postfix, dovecot and Roundcube, so devs can view the results of their test runs.

However, problems arose when devs started adding Cc: and Bcc: lines to their test mails. Our catchall box accepts any message forwarded to any address and drops it into a single test account on the box, so a message with multiple Cc: and Bcc: recipients gets duplicated in that account a large number of times. And since the catchall box was already getting 200k unique messages a day, the inbox filled up very quickly.

So I wrote a little Python SMTP daemon that accepts incoming email and de-duplicates it where necessary. I also gave it the ability to forward messages to arbitrary endpoints, based on regular expression matching of individual email addresses, and the ability to rewrite destination addresses on the fly.

You can find it on github, here
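
For the curious, the core idea looks something like the sketch below. This is NOT the actual smtprouted code, just a toy illustration built on Python's old smtpd module: accept mail, drop duplicate copies of the same message body, rewrite recipient addresses, and relay each unique message to a backend chosen by regex match. All of the patterns, hostnames and ports below are made-up placeholders.

import asyncore
import hashlib
import re
import smtpd        # legacy stdlib module, but fine for a sketch
import smtplib

# (pattern, replacement) pairs applied to every recipient address
REWRITES = [(re.compile(r'.*@loadtest\.example\.com$'), 'catchall@example.com')]

# (pattern, relay host) pairs; first match wins
ROUTES = [(re.compile(r'.*@example\.com$'), 'catchall.example.com'),
          (re.compile(r'.*'),               'localhost')]

class DedupeProxy(smtpd.SMTPServer):
    seen = set()    # hashes of message bodies we have already relayed

    def process_message(self, peer, mailfrom, rcpttos, data, **kwargs):
        # De-dupe: identical bodies arriving for multiple recipients
        # only get relayed once
        body = data if isinstance(data, bytes) else data.encode()
        digest = hashlib.sha1(body).hexdigest()
        if digest in self.seen:
            return                      # duplicate -- drop it silently
        self.seen.add(digest)

        # Rewrite destination addresses on the fly
        for pattern, replacement in REWRITES:
            rcpttos = [pattern.sub(replacement, r) for r in rcpttos]

        # Relay a single copy to the backend matching the first recipient
        for pattern, relay in ROUTES:
            if pattern.match(rcpttos[0]):
                server = smtplib.SMTP(relay)
                server.sendmail(mailfrom, rcpttos, data)
                server.quit()
                return

if __name__ == '__main__':
    DedupeProxy(('0.0.0.0', 2525), None)
    asyncore.loop()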

Deleting very large S3 buckets

Written by Eric Schwimmer

Surprisingly, there is no way to just delete a non-empty S3 bucket. You must first delete every object contained within the bucket, and then delete the bucket itself. Normally this is just a minor inconvenience, but when the bucket in question contains hundreds of millions of objects, you run into a serious problem.

The tools that Amazon provides are simply not capable of deleting objects at this scale. We had a bucket with 400 million+ objects (including old versions) that we had to delete. First we tried deleting them through the AWS console/web UI, but that just hung our browser indefinitely (and didn't delete any of the objects). Next we tried the aws CLI tool, but given the rate at which it was deleting objects, it would have taken six months to finish (during which time we would, of course, still be paying for their storage).

There were some other tools out there (written in Ruby, sadly) that looked promising but failed to perform. The best of them only reduced the overall delete time to just over a month, and didn't support versioned objects (which means we would have been left to delete the old versions by hand).

So I wrote my own: s3wipe
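
Not s3wipe itself, but the gist of what makes this approach fast is sketched below with boto3: page through the bucket's object versions and hand batched DeleteObjects calls (1,000 keys apiece) to a pool of worker threads. The bucket name and thread count below are placeholders.

import boto3
from concurrent.futures import ThreadPoolExecutor

BUCKET = 'my-huge-bucket'       # placeholder
THREADS = 32

s3 = boto3.client('s3')

def delete_batch(objects):
    # One DeleteObjects request removes up to 1,000 keys/versions
    s3.delete_objects(Bucket=BUCKET,
                      Delete={'Objects': objects, 'Quiet': True})

def wipe():
    paginator = s3.get_paginator('list_object_versions')
    with ThreadPoolExecutor(max_workers=THREADS) as pool:
        for page in paginator.paginate(Bucket=BUCKET):
            doomed = [{'Key': o['Key'], 'VersionId': o['VersionId']}
                      for o in page.get('Versions', [])
                             + page.get('DeleteMarkers', [])]
            # Submit deletions in chunks of 1,000, the API maximum
            for i in range(0, len(doomed), 1000):
                pool.submit(delete_batch, doomed[i:i + 1000])
    # Only once the bucket is empty can the bucket itself be deleted
    s3.delete_bucket(Bucket=BUCKET)

if __name__ == '__main__':
    wipe()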

Fast Parallel Network Copy

Written by Eric Schwimmer

For when you need to copy a bunch of files between two network hosts in a hurry. This script is especially useful if both hosts are configured with LACP-bonded interfaces, as these interfaces only load-balance on a per-stream basis (i.e. a single network file copy, via scp for example, will only ever go as fast as the slave interface it gets bound to). Parallel rsync is one possible solution, but it is not well suited to an initial/one-time copy (checking for the existence of remote files imposes non-negligible overhead, and SSH also slows things down quite a bit if you are using a relatively heavyweight cipher like aes128-cbc, the default SSHv2 cipher on many systems).

Enter ptarpipe! It leverages tar (file bundling), netcat (network transfer), screen (process control), lzop (fast [de]compression), pipe viewer (transfer rates) and dstat (net+disk I/O) in a way that is guaranteed to flat-line your network:

#!/bin/bash

[[ $# != 2 && $# != 3 ]] \
    && echo "Usage: ptarpipe <local-path> <remote-host> [remote-path]" \
    && exit 1

LOCAL_PATH=$1
REMOTE_HOST=$2
REMOTE_PATH=$3
NUM_THREADS=4
LISTEN_PORT=10000

[[ "$REMOTE_PATH" == "" ]] && REMOTE_PATH=$LOCAL_PATH

# Create a working directory
WORK_DIR=/tmp/ptarpipe.$$
mkdir -p $WORK_DIR
cd $LOCAL_PATH

# Create a list of all the files that need to be copied. Randomize
# the file list, so that no one thread ends up with an unfair number of
# large files
find . \( -type f -o \( -type d -empty \) \) | sort -R > $WORK_DIR/files

# Create one file per thread, containing a list of files
# to be copied by that thread
NUM_FILES=$(wc -l $WORK_DIR/files | awk '{print $1}')
(( FILES_PER_THREAD = (NUM_FILES + NUM_THREADS - 1) / NUM_THREADS ))
cd $WORK_DIR
split -d -l $FILES_PER_THREAD files

# Now create our screen config that will launch all of the sending
# and receiving processes
SCREEN_RC=$WORK_DIR/screenrc
echo "startup_message off"  > $SCREEN_RC
i=0
while [[ $i -lt $NUM_THREADS ]]
do
    cat >> $SCREEN_RC << EOT
screen -t remote-$i ssh $REMOTE_HOST 'mkdir -p $REMOTE_PATH && \
nc -ld $LISTEN_PORT | lzop -d | tar xvp -C $REMOTE_PATH'
screen -t local-$i sh -c 'sleep 5; cd $LOCAL_PATH; \
tar cfp - -T $WORK_DIR/x$(printf '%-2.2d' $i) | \
pv -c -N DISK | lzop | pv -c -N NET | nc $REMOTE_HOST $LISTEN_PORT'
EOT
    (( LISTEN_PORT++ ))
    (( i++ ))
done
echo "screen -t dstat dstat 5" >> $SCREEN_RC

# Start the copy, and then clean up when we are done
screen -c $SCREEN_RC
rm -rf $WORK_DIR
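
To copy /data/big from the local box to the same path on another host, you would just run ./ptarpipe /data/big otherhost (the remote path defaults to the local one), then keep an eye on the pv and dstat windows in the resulting screen session until the transfer threads go quiet.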

Python: Rapid Reverse Readline

Written by Eric Schwimmer

There are many solutions out there with which to beat this particular ex-equus, but none of them were quite performant enough for me. The particular software I was working on was reading multi-terabyte files, starting from the end, so loading the entire file into memory was obviously a non-starter. And given the ginormity of the files I was parsing, the code had to be as tight as possible. Here is what I came up with:

# revReadline: take a read-mode filehandle and return a generator
# that yields the file's lines in reverse order, reading backwards
# one buffer at a time (so the file never has to fit in memory)
def revReadline(fh, bufSize=4096):
    # Works with handles opened in text ('r') or binary ('rb') mode;
    # for very large files, binary mode is the safer choice
    fh.seek(0, 2)                     # jump to the end of the file
    filePos = fh.tell()
    fragment = fh.read(0)             # empty str or bytes, to match fh
    newline = "\n" if isinstance(fragment, str) else b"\n"

    while filePos > 0:
        # Step back one buffer (or whatever is left at the front)
        readSize = min(bufSize, filePos)
        filePos -= readSize
        fh.seek(filePos)
        buf = fh.read(readSize)

        # Everything before this buffer's first newline belongs to a
        # line that starts in an earlier buffer; save it as the new
        # fragment. Everything after its last newline, plus the old
        # fragment, forms a complete line.
        lines = (buf + fragment).split(newline)
        fragment = lines.pop(0)

        # Yield the complete lines, last one first
        while lines:
            yield lines.pop()

    # Whatever is left over is the first line of the file
    yield fragment
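
For example, to print the last ten lines of an arbitrarily large log file (the path here is just a placeholder):

# Print the last 10 lines of a huge log file without reading all of it
with open('/var/log/huge.log', 'rb') as fh:
    for n, line in enumerate(revReadline(fh)):
        if n == 10:
            break
        print(line.decode())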

Like a phoenix from the ashes...

Written by Eric Schwimmer

Nerdvana rises again! All hail Nerdvana!