NerdVana

Nerd? We prefer the term 'Intellectual Badass'

freq with me

Written by Eric Schwimmer

I often need to count how frequently events occur in log files and whatnot, and I could never find a clean way to do it in real time, so this Perl script was born:

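#!/usr/bin/perl

use strict;
use warnings;
use IO::Select;
use Fcntl;

my (%counts, @lines);
my $buffer = '';

my $select = IO::Select->new();
$select->add(\*STDIN);

# Put STDIN into non-blocking mode. F_GETFL hands back the current flags
# as fcntl's return value, not through its third argument.
my $flags = fcntl(STDIN, F_GETFL, 0) or die "fcntl F_GETFL failed: $!";
fcntl(STDIN, F_SETFL, $flags | O_NONBLOCK) or die "fcntl F_SETFL failed: $!";
my $start = time;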

while (1)
{
    sleep(1);

    # Drain whatever is waiting on STDIN without blocking
    if ($select->can_read(0))
    {
        my @lines = <STDIN>;
        if (@lines)
        {
            # Prepend any partial line held over from the previous read,
            # then hold back a trailing partial line until the rest arrives
            $lines[0] = ($buffer // '') . $lines[0];
            $buffer = ($lines[-1] =~ /\n$/) ? '' : pop @lines;
        }

        for my $line (@lines)
        {
            chomp $line;
            $counts{$line}->[0]++;
        }
    }

    # Clear the screen, home the cursor, and redraw every second, even
    # when nothing arrived, so idle counters decay instead of freezing
    print "\033[2J\033[H";
    for my $line (sort keys %counts)
    {
        # Slots: [0] this second, [1] last 60 samples,
        # [2] last 300 samples, [3] running total
        my $rates = $counts{$line};
        my $sec_rate = $rates->[0] // 0;
        $rates->[3] += $sec_rate;
        my $total = $rates->[3];

        push @{$rates->[1]}, $sec_rate;
        shift @{$rates->[1]} if scalar @{$rates->[1]} > 60;
        push @{$rates->[2]}, $sec_rate;
        shift @{$rates->[2]} if scalar @{$rates->[2]} > 300;
        my $min1_rate = avg($rates->[1]);
        my $min5_rate = avg($rates->[2]);

        printf "%s -- total: %s, 1 sec: %s, 1 min: %.2f/s, 5 min: %.2f/s\n",
            $line, $total, $sec_rate, $min1_rate, $min5_rate;

        $rates->[0] = 0;
    }
}

sub avg 
{
    my ($list) = @_;
    my $total = 0;
    $total += $_ for (@$list);
    return $total / scalar @$list;
}
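
The 1 min and 5 min figures the script prints are rolling averages over the last 60 and 300 one-second samples. As a rough illustration of that windowing (a sketch only; roll_avg is a made-up helper name, not part of the script above), the same calculation in shell looks something like this:

```shell
#!/bin/bash

# roll_avg WINDOW SAMPLE... (hypothetical helper for illustration)
# Prints the mean of the newest WINDOW samples to two decimal places.
roll_avg() {
    local window="$1"
    shift

    # Nothing to average yet
    [ "$#" -eq 0 ] && { echo "0.00"; return; }

    # tail keeps only the newest WINDOW samples; awk averages them
    printf '%s\n' "$@" |
        tail -n "$window" |
        awk '{ total += $1 } END { printf "%.2f\n", total / NR }'
}

# Four per-second counts with a window of three: averages 20, 30 and 40
roll_avg 3 10 20 30 40
```

Until the window fills up, the average is taken over however many samples exist so far, which is why the 1 min and 5 min columns agree for the first minute.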

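Assuming you have jq installed and your logs are in JSON format, you can do something nifty like this (--unbuffered makes jq flush each result right away instead of batching its output when writing to a pipe):

tail -F /data/nginx/access_log | jq --unbuffered .status | freq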

And it will spit out a display that looks like this, refreshed every second:

"200" -- total: 45226, 1 sec: 676, 1 min: 674.75/s, 5 min: 675.01/s
"204" -- total: 133, 1 sec: 2, 1 min: 2.00/s, 5 min: 1.99/s        
"304" -- total: 70, 1 sec: 1, 1 min: 1.03/s, 5 min: 1.04/s         
"403" -- total: 3, 1 sec: 0, 1 min: 0.08/s, 5 min: 0.08/s          
"404" -- total: 253, 1 sec: 3, 1 min: 4.02/s, 5 min: 3.89/s        
"500" -- total: 52, 1 sec: 0, 1 min: 0.80/s, 5 min: 0.78/s         
"502" -- total: 1, 1 sec: 0, 1 min: 0.05/s, 5 min: 0.05/s          

RBL Nagios script

Written by Eric Schwimmer

We send a lot of email, and consequently we have a lot of email servers. Occasionally, one of them will be accidentally listed as a spam source on one of the various DNS real-time blacklists (RBLs), which can cause email delivery problems for us. This Nagios check will tell you if the host it is run on is listed on any of the major RBL servers:

#!/bin/bash

RBL_SERVERS=(
    'cbl.abuseat.org'
    'dnsbl.cyberlogic.net'
    'bl.deadbeef.com'
    'spamtrap.drbl.drand.net'
    'spamsources.fabel.dk'
    '0spam.fusionzero.com'
    'mail-abuse.blacklist.jippg.org'
    'korea.services.net'
    'spamguard.leadmon.net'
    'ix.dnsbl.manitu.net'
    'relays.nether.net'
    'no-more-funn.moensted.dk'
    'psbl.surriel.com'
    'dyna.spamrats.com'
    'noptr.spamrats.com'
    'spam.spamrats.com'
    'dnsbl.sorbs.net'
    'spam.dnsbl.sorbs.net'
    'bl.spamcannibal.org'
    'bl.spamcop.net'
    'pbl.spamhaus.org'
    'sbl.spamhaus.org'
    'xbl.spamhaus.org'
    'dnsbl-1.uceprotect.net'
    'dnsbl-2.uceprotect.net'
    'dnsbl-3.uceprotect.net'
    'db.wpbl.info'
    'access.redhawk.org'
    'blacklist.sci.kun.nl'
    'dnsbl.kempt.net'
    'dul.ru'
    'forbidden.icm.edu.pl'
    'hil.habeas.com'
    'rbl.schulte.org'
    'sbl-xbl.spamhaus.org'
)

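# Reverse the octets of an IPv4 address: 1.2.3.4 -> 4.3.2.1
RevIP() { local IFS=.; set -- $1; echo "$4.$3.$2.$1"; }

MY_IP=$(curl -4 -s ifconfig.co)
# Bail out with UNKNOWN (3) if the lookup failed or did not return IPv4
[[ $MY_IP =~ ^[0-9]+(\.[0-9]+){3}$ ]] \
    || { echo "UNKNOWN: could not determine public IPv4 address"; exit 3; }
REV_IP=$(RevIP "$MY_IP")

# Query every RBL in parallel; any answer line means we are listed there
HITS=$(
    printf '%s\n' "${RBL_SERVERS[@]}" |
    xargs -P0 -I{} dig +nocmd "$REV_IP.{}" a +noall +answer |
    sed "s/^$REV_IP\.\(\S\+\)\.\s.\+/\\1/"
)
[[ -z $HITS ]] && echo "OK: $MY_IP not listed in any RBL blacklists" && exit 0
echo -e "ERROR: $MY_IP found in one or more blacklists\n$HITS"
exit 2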
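
The lookup itself is just a DNS query: reverse the octets of the address, tack on the RBL's zone, and ask for an A record. A minimal sketch of the name construction (rbl_query_name and the rbl.example.org zone are made up for illustration; 192.0.2.10 is a documentation address):

```shell
#!/bin/bash

# rbl_query_name IP ZONE (hypothetical helper for illustration)
# Builds the DNS name to look up: reversed octets, then the RBL zone.
rbl_query_name() {
    local ip="$1" zone="$2"
    local IFS=.
    # Unquoted on purpose: IFS=. splits the address into its four octets
    set -- $ip
    echo "$4.$3.$2.$1.$zone"
}

# Prints 10.2.0.192.rbl.example.org
rbl_query_name 192.0.2.10 rbl.example.org
```

If the name resolves, the address is listed; if the answer is NXDOMAIN, it is not. As far as I know, RFC 5782 asks IPv4 lists to carry a permanent test entry for 127.0.0.2, so looking up 2.0.0.127 under a given zone should be a handy way to confirm that a list is still answering at all.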

fstrim cron job

Written by Eric Schwimmer

This little guy will trim every mount on your system that supports it, skipping rotational drives (which, bizarrely, sometimes advertise trim support) and NVMe drives (which usually support trim, but whose vendors often recommend against running it). It will pick apart mdadm arrays to check that their member devices are trimmable, and it skips the whole deal if run on a VM:

#!/bin/bash

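# Quick function to check if a device is trimmable.
# Returns 0 (true) if it is, 1 (false) if it should be skipped.
trimmable() {
    local dev=$1
    local basedev=${dev%%[0-9]*}

    # Skip NVMe and rotational devices
    [[ $dev = *nvme* ]] && return 1
    grep -sqx 1 "/sys/block/$basedev/queue/rotational" && return 1

    # A discard_max_bytes of 0 (or a missing file) means no trim support
    grep -sqvx 0 "/sys/block/$basedev/queue/discard_max_bytes"
}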

# Exit early if we are in a VM
grep -q '^flags.* hypervisor ' /proc/cpuinfo && exit 0

# Iterate over all of our mounts
findmnt -sen -o SOURCE,TARGET | while read dev_path mount; do
    dev=${dev_path#/*/}

    # If this is a mdadm array, look at its members individually
    if [[ "$dev" = md[0-9]* ]]; then
        for member_path in /sys/block/$dev/md/dev-*; do
            md_dev=${member_path##*-}
            trimmable "$md_dev" || continue 2
        done

    # Otherwise check if the base device is trimmable
    else
        trimmable $dev || continue
    fi

    fstrim -v $mount
done
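
Before letting this loose from cron, it can help to see what the kernel reports for each disk. The sketch below reads the same two sysfs files the script relies on (queue/rotational and queue/discard_max_bytes) and prints a verdict per device. The trim_status name and its optional second argument (an alternate sysfs root, handy for testing) are made up for illustration:

```shell
#!/bin/bash

# trim_status BASEDEV [SYSFS_ROOT] (hypothetical helper for illustration)
# Prints whether a block device would be trimmed, and if not, why.
trim_status() {
    local basedev="$1"
    local queue="${2:-/sys/block}/$basedev/queue"
    local max

    case $basedev in
        *nvme*) echo "skip: nvme"; return ;;
    esac

    if [ "$(cat "$queue/rotational" 2>/dev/null)" = 1 ]; then
        echo "skip: rotational"
        return
    fi

    # discard_max_bytes is 0 when the device cannot discard
    max=$(cat "$queue/discard_max_bytes" 2>/dev/null)
    if [ -z "$max" ] || [ "$max" = 0 ]; then
        echo "skip: no discard support"
        return
    fi

    echo "trim"
}

# Report on every block device the kernel knows about
for path in /sys/block/*; do
    [ -e "$path" ] || continue
    printf '%s: %s\n' "${path##*/}" "$(trim_status "${path##*/}")"
done
```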

s3 bucket usage script

Written by Eric Schwimmer

We needed a summation of all the S3 bucket sizes in our default region (accountants, amirite?). So this script was born:

#!/bin/bash
end=$(date +%s)
start=$((end - 86400))
while read bucket; do
    printf "%-30.30s : " "$bucket"
    aws cloudwatch get-metric-statistics \
        --namespace AWS/S3 \
        --start-time $start \
        --end-time $end \
        --period 86400 \
        --statistics Average \
        --metric-name BucketSizeBytes \
        --dimensions Name=BucketName,Value="$bucket" \
              Name=StorageType,Value=StandardStorage \
        --unit "Bytes" \
    | jq -r '.Datapoints[0].Average' \
    | sed 's/null/0/' 
done < <(aws s3 ls | awk '{print $NF}') \
| sort -rn -k 3 \
| numfmt --to=si --round=nearest --padding -1 --field 3
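
The script prints one line per bucket; if what you are really after is a single grand total, the raw "bucket : bytes" lines (before numfmt makes them human readable) can be added up with a little awk. A minimal sketch, with sum_bucket_bytes as a made-up helper name and made-up sample data:

```shell
#!/bin/bash

# sum_bucket_bytes (hypothetical helper for illustration)
# Reads "bucket : bytes" lines on stdin and prints the total byte count.
sum_bucket_bytes() {
    awk '{ total += $NF } END { printf "%.0f\n", total }'
}

# Made-up sample data in the same "name : bytes" shape; prints 4000
printf '%s\n' 'logs-bucket : 1500' 'images-bucket : 2500' | sum_bucket_bytes
```

To use it with the real output, feed it the lines as they come out of the while loop, before the numfmt stage rewrites the byte counts.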

Mobbage: http(/2) stress tester/benchmark

Written by Eric Schwimmer

I did another less-big thing. We used siege quite extensively at Redfin to do performance testing of various endpoints. However, due to a few limitations, I ended up having to code a replacement that we've been using in-house for a while, and which we subsequently open sourced:

https://github.com/redfin/mobbage

Dirpy: dynamic image modification

Written by Eric Schwimmer

I wrote a big thing: a standalone Python/uwsgi daemon named Dirpy, which can dynamically modify/resize local or remote images based on a complex (yet easily understood) command hierarchy encoded in the query string.

We use it quite a bit at Redfin, and it's reaped us some big savings. It cut the disk utilization on our image servers in half (almost exactly), as we no longer needed to store pre-rendered derivative images for all of the various viewport sizes that we support (iPhone, iPad, desktop, etc). It also reduced the amount of developer time required to generate these derivative images from hours (spent mucking about in Java code) to mere seconds (adding 2 lines to an Nginx config). And... it provided noticeably higher quality images than the previous Java-based resizing code (as Dirpy leverages Pillow/PIL, which supports tri-lobed Lanczos resampling filters). So, Win Win Win.

It also scales pretty well. A single 40 core image server is capable of re-sizing over 1000 images/second, using Dirpy (assuming a 1024x768 base image size). Not too shabby!

Check it out on the Redfin github page: https://github.com/redfin/dirpy DERP!