Bunyan Logging in
Production at Joyent

Trent Mick (@trentmick)
Joyent

Vancouver Node.js Meetup, 7 Aug 2014

Bunyan Logging ...

  • Blow through usage and some features.

... in production at Joyent

  • Show a few (hopefully) interesting examples from production experiences.
  • Discuss some motivations.

Bunyan is

① A node.js lib for structured service (and CLI) logging

$ npm install bunyan
$ cat foo.js

var bunyan = require('bunyan');
var log = bunyan.createLogger({
    name: 'foo',
    // ...
});

log.info('hi');

var err = new Error('ack');
log.error(err, 'boom!');

Bunyan is

② A (lightweight) JSON log format

  • one line of JSON per record, small set of minimal fields:
    {"name":"foo","hostname":"grape.local","pid":68313,"level":30,
        "msg":"hi","time":"2014-08-07T06:18:53.057Z","v":0}
  • The primary audience of logs is machine processing.

Bunyan is

③ A bunyan CLI for viewing and filtering logs

Reading JSON sucks, but ANSI codes in files + custom parsing code suck more: Bunyan CLI example


bunyan foo.log [bar.log ...]
bunyan -l warn foo.log
bunyan -c 'this.user=="bob"' foo.log

Bunyan features

  • standard levels: trace, debug, info, warn, error, fatal
  • extensible streams for directing logs to stderr, stdout, files, rotating files, custom endpoints
  • specialization of Loggers with log.child
  • custom rendering of logged objects with serializers
  • runtime log snooping via DTrace support

See README for more.

  • ~40 service/agent/tool log types (~90% bunyan)
  • 1-5 service instances, one agent instance per server
  • ~12 DCs (counting test and staging DCs)

Motivation: Log more details

  • Make it convenient in code to log more and appropriate details.
  • Painful, limited (traditional):
    log.info('%s %s %s %s', req.method, req.url,
        res.statusCode, res.latency)
  • Easier to code (bunyan):
    log.info({req: req}, 'received request');
    log.debug({res: res}, 'responded');
    log.error({err: err, path: path}, 'save failed');

Log more details: render nicely

Make having the extra details not a burden when browsing logs by rendering them nicely.

HTTP request

HTTP response

HTTP response + internal error details

Error + underlying error cause

Maintain error cause chains with node-verror. Bunyan will render these.

Log more details: filtering

Motivate logging of more structured details by making filtering convenient and compelling.


# ERROR level and above
$ bunyan foo.log -l error

# Watch incoming requests.
$ tail -f foo.log | bunyan -c 'this.req'

# Server errors in HTTP responses.
$ tail -f foo.log | bunyan -c 'this.res && this.res.statusCode >= 500'

Log more details: serializers

  • Problem: Dumping full objects -> extraneous goop in the logs.
  • Define how an obj is logged with serializer functions for well-defined log field names.
  • Bunyan defines for err, req, and res. Use is explicit, e.g.:
    var log = bunyan.createLogger({
        //...
        serializers: {
            err: bunyan.stdSerializers.err,
            // custom
            user: function (u) { if (u) return {login: u.login, name: u.name}; }
        }
    });
    
    log.info({user: theUser}, 'logged in');
    log.error({err: e, user: theUser}, 'logged failed');

[2014-08-07T17:25:55.126Z]  INFO: imgapi/74427 on e540e25b-6afc-41ee-a940-a1da39628676: handled: 200 (req_id=dd8d4ad0-1e57-11e4-8499-df7725ba1cdc, route=createimage, action=create-from-vm, audit=true, remoteAddress=172.25.1.39, remotePort=38440, err=false, latency=6354, _audit=true, req.version=~1)
    POST /images?action=create-from-vm&vm_uuid=c691c60d-8f65-6eb2-81d6-d163d65ea421&incremental=true&account=551a64ca-3e08-c0de-cfd2-da81641890ec HTTP/1.1
    x-request-id: dd8d4ad0-1e57-11e4-8499-df7725ba1cdc
    accept: application/json
    content-type: application/json
    user-agent: restify/2.6.1 (ia32-sunos; v8/3.14.5.9; OpenSSL/1.0.1h) node/0.10.30
    date: Thu, 07 Aug 2014 17:25:48 GMT
    content-length: 65
    content-md5: d02bUF89wkeSJT+jGTEENg==
    host: imgapi.nightly-1.joyent.us
    connection: keep-alive

    {
      "name": "1cfeb76f-d951-c1e3-b9e7-97825c7c4efc",
      "version": "1.0.0",
      "owner": "551a64ca-3e08-c0de-cfd2-da81641890ec",
      "uuid": "1c7a2b2d-f5da-eed2-d37c-e3a01de9a8ee",
      "requirements": {
        "min_platform": {
          "7.0": "20140807T022845Z"
        }
      }
    }
    --
    HTTP/1.1 200 OK
    workflow-api: http://workflow.nightly-1.joyent.us
    content-type: application/json
    content-length: 112
    date: Thu, 07 Aug 2014 17:25:55 GMT
    server: imgapi/2.0.0
    x-request-id: dd8d4ad0-1e57-11e4-8499-df7725ba1cdc
    x-response-time: 6353
    x-server-name: e540e25b-6afc-41ee-a940-a1da39628676

    {
      "image_uuid": "1c7a2b2d-f5da-eed2-d37c-e3a01de9a8ee",
      "job_uuid": "071c5d60-1e28-488a-bdaa-0810c902a3da"
    }
    --
    req.timers: {
      "handler-0": 9,
      "bunyan": 47,
      "parseQueryString": 79,
      "reqNoAuth": 4,
      "reqChannel": 14,
      "resume": 90,
      "readBody": 840,
      "parseBody": 260,
      "apiCreateImage": 250,
      "apiCreateImageFromVm": 6350993
    }
    

Necessity: balancing volume

  • Serializers help limit the size of a logged object.
  • Obviously, set runtime log level to control log volume.
  • Not enough.
  • Two interesting examples: RequestCaptureStream and DTrace.

Balancing volume: RequestCaptureStream

  • Provided by the restify web service library.
  • Store low level log records in in-memory ring buffer, per request id.
  • If there is a log.warn or above, then all low level records for that request id are logged.
  • Used in Joyent's public web APIs.

Code for a RequestCaptureStream

var log = bunyan.createLogger({
    name: 'cloudapi',
    serializers: restify.bunyan.serializers,
    streams: [
        {
            level: 'info',
            stream: process.stderr
        },
        {
            type: 'raw',
            stream: new RequestCaptureStream({
                level: bunyan.WARN,
                maxRecords: 100,
                maxRequestIds: 100,
                streams: [ {
                    stream: process.stderr
                } ]
            })
        }
    ]
});
        

Balancing volume: DTrace

  • Typically a prod service log level is INFO, or DEBUG at best.
  • How to get TRACE logging details?
  • Option 1: Update app config and restart.
  • Option 2: DTrace bunyan provider. No restart.
    $ bunyan -p $PID
    $ bunyan -p $NAME    # search processes by name via `pgrep` or `ps`
    $ bunyan -p '*'      # instrument *all* processes!
    $ bunyan -p '*' -l DEBUG
    $ bunyan -p '*' -c 'this.name === "cloudapi"'

`bunyan -p` demo

Motivation: Tie things together

  • Service logs can be a jumble of simultaneous operations.
  • How to meaningfully separate?

Tie things together: module-scope loggers don't help

lib/main.js:    var log = log4j.getLogger('foo')
lib/db.js:      var log = log4j.getLogger('foo.db')
lib/model.js:   var log = log4j.getLogger('foo.model')

Common with log4j-inspired libraries.
Groups on module, not on related operations.

Tie things together: always log context -> PITA

log.info('user=%s start blerging', user)
log.info('user=%s frob the frib', user);
log.info('bliging bloop');    // oops, forgot to log `user`
  • Easy to forget to include relevant context.
  • Labourious to add, so gets skipped.

Tie things together: log.child

req.log = app.log.child({req_id: req.getId()});
// ...
req.log.info({route: route}, 'routed request');
// ...
req.log.debug(err, 'validation failed');
  • Group related logging via logger specialization with log.child.
  • Annoyance: Now need to pass a logger around code.
  • Attach to relevant context object, if available.
    E.g. restify creates a <request>.log for each HTTP request.

Tie things together: req_id

$ bunyan foo.log \
    -c "this.req_id === 'd6740300-1e58-11e4-8f85-2fef7c7c36dd'"

Now we can see all logging for a particular request.

Tie things together: req_id++

How about tracking a request through multiple services?

Example: provision a VM in SmartDataCenter

  • VM API: coordinates creation of VM
  • Server API (CNAPI): handles talking to server agents
  • Network API (NAPI): network provisioning
  • Firewall API (FWAPI): firewall management for VMs

Tie things together: req_id++

Demo tracking VM creation in SmartDataCenter via req_id.

Questions?

Trent Mick (@trentmick)

github.com/trentm/node-bunyan
bunyan-logging@googlegroups.com

Slides: trentm.com/talk-bunyan-in-prod/
and on GitHub: github.com/trentm/talk-bunyan-in-prod