#OSCON – Logging as Event Streams

  • Presenter: Brandon Philips, Software Engineer @ Rackspace
  • Background – Brandon works on Rackspace’s Cloud Monitoring.
  • Application and server logging, not OS level.
  • A log isn’t just something on a disk that you check after an incident.  It’s an event emitter.
  • Why structured logging?  Many producers, many consumers, many programming languages.
  • Rackspace uses JSON, which is easy to read and write from any programming language.  Each event is terminated by a newline.
  • Traditional formats such as the combined log format don’t handle new fields well, i.e. a script that monitors the file with a regex will have problems if a developer adds a new field.  Whereas decoding JSON -> no problem.
  • At Rackspace, a trace ID is assigned to each action at the front-end of their service, and it’s passed down through their stack to the backend services.
    • Trace ID contains: random + hostname + counter for the process + timestamp + git version hash
    • Twitter does something similar using Zipkin
  • Consolidating logs
    • Always write to local disk!  Then…  ship.  Try to be real time-ish.
    • Rackspace:  Webapp -> svlogd, then to file and over the network to scribe
    • Possible tools: Scribe, Apache Flume, syslog
  • Scribe setup
    • Local scribe routes to data centre scribes, which then consolidate to a central/main scribe
  • Graylog2
    • Many inputs: Syslog, Scribe, RabbitMQ
    • Indexes into ElasticSearch
    • Makes it easy to search by trace ID and determine what happened during a particular transaction
    • Concept of permalink to group events, for example, a customer ID
  • Audit logs for API
    • Rackspace exposes audit logs to customers through their API
    • JSON
  • Slides posted here
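
To make the newline-delimited JSON and trace ID ideas from the notes concrete, here is a minimal Python sketch. It is not Rackspace's implementation: the function names (`make_trace_id`, `emit`, `events_for_trace`), the field names (`trace_id`, `message`, `timestamp`) and the `-` separator are all assumptions made for illustration. Only the list of trace ID ingredients comes from the talk.

```python
import io
import json
import socket
import time
import uuid
from itertools import count

# Per-process counter, one of the trace ID ingredients listed in the notes.
_counter = count(1)


def make_trace_id(git_version="unknown"):
    """Hypothetical trace ID: random + hostname + counter + timestamp + git hash.

    The layout and separator are assumptions; the talk only listed the parts.
    """
    parts = [
        uuid.uuid4().hex[:8],          # random component
        socket.gethostname(),          # host that assigned the ID
        str(next(_counter)),           # counter for this process
        str(int(time.time() * 1000)),  # timestamp in milliseconds
        git_version,                   # version of the code that handled it
    ]
    return "-".join(parts)


def emit(stream, trace_id, message, **fields):
    """Write one event as a single JSON object terminated by a newline."""
    event = {"timestamp": time.time(), "trace_id": trace_id, "message": message}
    event.update(fields)  # extra fields can be added without breaking readers
    stream.write(json.dumps(event) + "\n")


def events_for_trace(stream, trace_id):
    """Yield every event in a newline-delimited JSON stream for one trace ID."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Unknown keys are simply carried along, unlike a fixed regex.
        if event.get("trace_id") == trace_id:
            yield event


if __name__ == "__main__":
    log = io.StringIO()  # stands in for a local log file
    tid = make_trace_id("abc1234")
    emit(log, tid, "request received", path="/v1/checks")
    emit(log, make_trace_id("abc1234"), "unrelated request")
    emit(log, tid, "backend call finished", duration_ms=12)
    log.seek(0)
    for event in events_for_trace(log, tid):
        print(event["message"])
```

The consumer only looks at the keys it cares about, so a developer adding `duration_ms` or any other field does not break it. That is the property the notes contrast with regex-based parsing of the combined log format.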

July 18, 2012

OSCON 2012

