StatsD and You
The easy way to get application-level metrics
http://cleverdomain.org/ppw-statsd/
Who am I?
Who are you?
- Developers
- Who maintain apps that people depend on
- Who are “on-call” to some degree
What’s this about?
- StatsD
- And Graphite, kind of.
- StatsD works with other things
- I’ll be showing Graphite
More sources here and here
StatsD is a metrics aggregation system originated by Flickr and implemented by Etsy a few years ago
What can I do with StatsD?
- Find out
- when things happen
- how often things happen
- how long things take
- what’s happening right now
- when things changed
Real-time and historical views are both useful when debugging a problem
What can I do with StatsD?
- Find out
- how your app is performing
This is a graph of how long some service call takes. Point out the percentiles.
What can I do with StatsD?
- Find out
- how your app is performing
This is a graph of the number of hits to some service. It shows how busy the service is and how the load breaks down among different sources.
StatsD features
- Accepts measurements from all over your network with UDP
- Counters
- Timers
- Sets (count uniques)
- Gauges
- Aggregates them into 10-second chunks
- Sends them off to somewhere that will store the data
- Counters: how many times does some event happen
- Timers: how long does something take to do?
- Sets: register a pageview with each username and find out how many different users viewed pages
- Gauges: for global, slow stats like resource usage or % completion of a process
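To make the four metric types concrete, here is a minimal sketch of what a client puts on the wire, assuming the plain-text StatsD datagram format (`name:value|type`) sent over UDP. The `format_metric` and `send_metric` helpers and the metric names are hypothetical, made up for illustration; real clients do this for you.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Type codes used in the StatsD text protocol.
my %TYPE_CODE = (
    counter => 'c',
    timer   => 'ms',   # timers are reported in milliseconds
    set     => 's',
    gauge   => 'g',
);

# Build one datagram payload, e.g. "myapp.logins:1|c".
# (Hypothetical helper, not part of any module.)
sub format_metric {
    my ($name, $value, $type) = @_;
    my $code = $TYPE_CODE{$type} or die "unknown metric type: $type";
    return "$name:$value|$code";
}

# Fire-and-forget over UDP: no reply is expected, and a failure to
# send only costs one data point. (Hypothetical helper.)
sub send_metric {
    my ($host, $port, $payload) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'udp',
    ) or return;       # never let metrics break the app
    $sock->send($payload);
    return 1;
}

print format_metric("myapp.logins", 1, "counter"), "\n";
print format_metric("myapp.get_something.elapsed", 320, "timer"), "\n";
print format_metric("myapp.viewers", "alice", "set"), "\n";
print format_metric("myapp.queue_depth", 42, "gauge"), "\n";

# To actually send one (8125 is the usual StatsD port):
# send_metric("stats.example.com", 8125, format_metric("myapp.logins", 1, "counter"));
```

Because the payload is a single short line of text, any language that can open a UDP socket can report metrics.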
Installing StatsD
npm install -g statsd
statsd exampleConfig.js
Installing Graphite
- Carbon
- Whisper
- Django
- mod_wsgi
- Graphite-Web
- No distribution packages for most of it
- Sorry :(
Installing Graphite
- I ended up using Chef
- You can get it for most systems
- You can use Chef-solo without any servers
- It will install Graphite and all the requirements for you
- Cookbook docs
Docker
docker build git://github.com/dotcloud/collectd-graphite
or
docker build git://github.com/silas/docker-graphite
How does it integrate with my app?
Clients for every language means all your apps can send data to the same place, even if you use lots of languages. Net::Statsd::Client is my module, created with help from Shutterstock.
Net::Statsd vanilla
use Net::Statsd;
$Net::Statsd::HOST = "stats.example.com";
sub login {
    Net::Statsd::increment("myapp.logins");
}
Global variables are meh :(
Net::Statsd::Client
use Net::Statsd::Client;
my $stats = Net::Statsd::Client->new(
    server => "stats.example.com",
    prefix => "myapp.",
);
sub login {
    $stats->increment("logins");
}
Prefix for DRY, namespacing, and the possibility of loading config from Dancer config or similar
Net::Statsd timing
use Net::Statsd;
$Net::Statsd::HOST = "stats.example.com";
use Time::HiRes qw(tv_interval gettimeofday);
sub get_something {
    my $start = [gettimeofday];
    my $client = Service::Client->new;
    my $data = $client->get("some stuff");
    my $elapsed = tv_interval($start);    # seconds, as a float
    # StatsD timers are in milliseconds, hence the factor of 1000
    Net::Statsd::timing("myapp.get_something.elapsed",
        int($elapsed * 1000));
    return $data;
}
Lots of repeated boilerplate around timing. What’s the right multiplication factor? StatsD expects milliseconds, so seconds get multiplied by 1000.
Net::Statsd::Client timing
use Net::Statsd::Client;
my $stats = Net::Statsd::Client->new(
    server => "stats.example.com",
    prefix => "myapp.",
);
sub get_something {
    my $timer = $stats->timer("get_something.elapsed");
    my $client = Service::Client->new;
    my $data = $client->get("some stuff");
    $timer->finish;
    return $data;
}
The timer object abstracts away the timing and warns if ->finish or ->cancel is never called, because that indicates an uncovered code path
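For the curious, here is a rough sketch of how a timer object with that "warn if forgotten" behaviour could be built using a destructor. This is not the actual Net::Statsd::Client implementation: the `Sketch::Timer` package and its `report` callback are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

{
    package Sketch::Timer;    # hypothetical class, for illustration only
    use Time::HiRes qw(gettimeofday tv_interval);

    sub new {
        my ($class, %args) = @_;
        return bless {
            name    => $args{name},
            report  => $args{report},    # coderef called as $report->($name, $ms)
            start   => [gettimeofday],
            pending => 1,
        }, $class;
    }

    # Report the elapsed time in whole milliseconds.
    sub finish {
        my ($self) = @_;
        my $ms = int(tv_interval($self->{start}) * 1000);
        $self->{pending} = 0;
        $self->{report}->($self->{name}, $ms);
        return $ms;
    }

    # Deliberately throw the measurement away.
    sub cancel {
        my ($self) = @_;
        $self->{pending} = 0;
        return;
    }

    # Runs when the last reference to the timer goes away.
    sub DESTROY {
        my ($self) = @_;
        warn "timer '$self->{name}' went out of scope without finish or cancel\n"
            if $self->{pending};
    }
}

# Collect what would be sent instead of talking to a real server.
my @sent;
my $report = sub { my ($name, $ms) = @_; push @sent, "$name:$ms|ms" };

{
    my $timer = Sketch::Timer->new(
        name   => "get_something.elapsed",
        report => $report,
    );
    # ... the work being timed goes here ...
    $timer->finish;
}
print "$_\n" for @sent;
```

The point of the destructor is that an early `return` or an exception between creating the timer and calling `finish` shows up as a warning instead of silently producing no data.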
Sampling
Do you have that many events? Okay.
use Net::Statsd::Client;
my $stats = Net::Statsd::Client->new(
    server => "stats.example.com",
    prefix => "myapp.",
    sample_rate => 0.01,
);
Override per-event if you like.
$stats->increment("lines_of_code_executed", 0.0001);
For counters, StatsD automatically scales the count up by the sample rate, so the reported totals stay consistent
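The arithmetic behind that scaling can be sketched as follows, assuming the usual `|@rate` suffix on sampled counter datagrams. The client drops most events and tags the ones it sends; the server treats each received datagram as standing for 1/rate events. Both helper functions are hypothetical, and the random draw is passed in so the behaviour is repeatable.

```perl
use strict;
use warnings;

# Client side: send only a fraction of events, and tag the datagram with
# the rate so the server knows how much was dropped.
# $rand defaults to a fresh random draw; pass one in for repeatable results.
sub sampled_counter {
    my ($name, $rate, $rand) = @_;
    $rand = rand() unless defined $rand;
    return if $rand >= $rate;                  # skipped: nothing is sent
    return $rate < 1 ? "$name:1|c|\@$rate" : "$name:1|c";
}

# Server side: each counter datagram that arrives stands for 1/rate events.
sub scaled_count {
    my ($datagram) = @_;
    my ($value, $rate) = $datagram =~ /^[^:]+:([^|]+)\|c(?:\|\@([\d.]+))?$/
        or die "not a counter datagram: $datagram";
    return $value / ($rate // 1);
}

# Force a "hit" by supplying a draw below the rate.
my $datagram = sampled_counter("myapp.logins", 0.01, 0.005);
print "$datagram\n";
print scaled_count($datagram), "\n";
```

With a rate of 0.01, roughly one event in a hundred is sent, and each one that arrives is counted as a hundred, so the totals come out about right on average while using about 1% of the packets.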
Planned Features
- More awesome timing with checkpoints.
Graphite Tricks
- Use drawAsInfinite to mark events like deploys with a vertical line.
- Use drawAsInfinite, removeBelowValue, an area graph, and transparency to highlight out-of-range values.
- Use {highest,lowest}{Average,Current} to show nodes that stand out from the crowd.
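As a rough illustration of the three tricks, the sketch below builds Graphite render URLs from target expressions. The host name, metric paths, and the threshold of 500 are all made up; `alpha` is used here as one way to get transparency and `areaMode` as one way to get an area graph, which may not match how the graphs in the talk were actually built.

```perl
use strict;
use warnings;

my $BASE = "http://graphite.example.com/render";    # hypothetical host

# Minimal percent-encoding for query-string values.
sub escape {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $s;
}

# Build a render URL from a list of targets plus extra parameters.
sub render_url {
    my ($targets, %params) = @_;
    my @query;
    push @query, "target=" . escape($_) for @$targets;
    push @query, "$_=" . escape($params{$_}) for sort keys %params;
    return $BASE . "?" . join("&", @query);
}

# 1. Deploy markers: every point above zero becomes a vertical line.
my $deploys = 'drawAsInfinite(stats.myapp.deploys)';

# 2. Out-of-range highlight: drop points below the threshold, stretch
#    what is left to full height, and make it translucent.
my $slow = 'alpha(drawAsInfinite(removeBelowValue('
         . 'stats.timers.myapp.get_something.elapsed.upper_90,500)),0.3)';

# 3. Outliers: the three series with the highest average in the window.
my $busiest = 'highestAverage(stats.myapp.*.logins,3)';

print render_url([ 'stats.myapp.logins', $deploys ], from => '-24h'), "\n";
print render_url(
    [ $slow, 'stats.timers.myapp.get_something.elapsed.upper_90' ],
    from     => '-6h',
    areaMode => 'first',    # fill only the first (highlight) target
), "\n";
print render_url([ $busiest ], from => '-1h'), "\n";
```

The same target expressions can be pasted directly into the Graphite composer instead of a URL.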
Thanks!