Setting up TICK Stack on Docker

One of the main components of the MALT stack is metrics and monitoring, and the TICK stack is one of the most popular stacks that provide this functionality.  The TICK stack comes from InfluxData as the metrics portion of their architectural platform.  The T is for Telegraf, their agent/collector used to gather metrics from a variety of sources.  MALT embraces the collector paradigm heavily as a standard way of consuming and pushing data.  The base of every collector platform is its inputs and outputs.  Telegraf supports a wide variety of inputs, from OS-level system metrics to databases to general-purpose REST endpoints.  Once collected, Telegraf can push the data almost anywhere, including both Kafka and InfluxData's own data store InfluxDB, the I in the stack.

The key component of MALT is the centralized stream, such as Kafka.  Telegraf makes this extremely easy: metrics may be collected from every service and pushed to Kafka, and a central Telegraf agent can then consume from Kafka and push to InfluxDB.  I'll show how to set this up in the sections below.

The final two components of the TICK stack are C and K, which stand for Chronograf and Kapacitor, respectively.  Chronograf is the main administrative and visualization dashboard, used in the absence of (or in addition to) something like Grafana.  Kapacitor is a real-time streaming engine that allows for processing alerts and other custom functionality.  Alerting is another central aspect of the MALT stack, which we will cover in a future post.

With the understanding out of the way, how can we quickly set up the TICK stack on Docker?  The following instructions provide the basics.


Step 1 : Setup InfluxDB

The first step is getting InfluxDB set up so that it can persist and retrieve metrics.  InfluxDB is a fast, efficient, purpose-built time-series data store.

InfluxData does a great job of providing a suitable Docker image.  For more information, see their excellent documentation.

Run the following commands to set up a persistent local volume and run the database.  Note that the volume mount must use the full path to the directory just created; a bare name would create a Docker named volume instead.

mkdir influxdb
docker run -p 8086:8086 -p 2003:2003 -v `pwd`/influxdb:/var/lib/influxdb -e INFLUXDB_GRAPHITE_ENABLED=true influxdb:alpine

This will start the InfluxDB REST service on port 8086 and the Graphite protocol on port 2003.  This post won't dig into using the Graphite protocol, but a future post will show how to leverage logs to collect metrics into Graphite.
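Although we won't dig into it here, you can still exercise the Graphite listener with a quick write over its plaintext protocol.  This assumes nc (netcat) is installed, and the metric name test.metric is just an example:

```shell
# Graphite plaintext protocol is "<metric.path> <value> <unix-timestamp>"
line="test.metric 42 $(date +%s)"
# -w 1 makes nc give up after one second so the command does not hang
echo "$line" | nc -w 1 localhost 2003
```
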

Once started, you can validate the database is running by using the REST service:


curl -X POST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE metrics"

This will create a database named metrics which will be used later.
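You can confirm the database was created by listing all databases; the grep check is just a quick sanity test on the JSON response:

```shell
# List all databases and check that "metrics" appears in the response
resp=$(curl -sG http://localhost:8086/query --data-urlencode "q=SHOW DATABASES")
echo "$resp" | grep -q '"metrics"' && echo "metrics database exists"
```
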

To set up InfluxDB within Docker compose, use the following service definition (the environment value is quoted so YAML treats it as a string rather than a boolean):

  influxdb:
    image: influxdb:alpine
    container_name: influxdb
    ports:
      - "8086:8086"
      - "2003:2003"
    environment:
      INFLUXDB_GRAPHITE_ENABLED: "true"
    volumes:
      - ./influxdb:/var/lib/influxdb:rw

Step 2 : Setup Kafka

As the central cornerstone of the MALT stack, Kafka is a required component for moving data around.  Kafka also requires Zookeeper.  In most cases you will want to run Kafka and Zookeeper on at least 3 hosts for better scalability and resiliency.

To get started locally, spin up a single Zookeeper instance using the official Zookeeper Docker image.

docker run -p 2181:2181 -e ZOO_MY_ID=1 -e ZOO_SERVERS=server.1=localhost:2888:3888 zookeeper:latest

To set up a 3-node cluster locally, use the following Docker compose:

  zoo1:
    image: "zookeeper:latest"
    ports:
      - "2181:2181"
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888

  zoo2:
    image: "zookeeper:latest"
    ports:
      - "2182:2181"
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888

  zoo3:
    image: "zookeeper:latest"
    ports:
      - "2183:2181"
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888

To run Kafka in Docker, the wurstmeister/kafka image is the best I've come across so far.  Before running this image standalone, you must locate the IP address of the running Zookeeper container so Kafka can reach it.  If you run within Docker compose instead, just ensure the hostnames are set accordingly.

docker run -p 9092:9092 -e KAFKA_ZOOKEEPER_CONNECT=(IPADDR):2181 wurstmeister/kafka:latest

To run this in distributed mode within Docker compose, the following configuration may be used along with the above Zookeeper configuration.


  kafka1:
    image: "wurstmeister/kafka:latest"
    depends_on:
      - zoo1
      - zoo2
      - zoo3
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ADVERTISED_HOST_NAME: kafka1
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181

  kafka2:
    image: "wurstmeister/kafka:latest"
    depends_on:
      - zoo1
      - zoo2
      - zoo3
    ports:
      - "9093:9092"
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ADVERTISED_HOST_NAME: kafka2
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181

  kafka3:
    image: "wurstmeister/kafka:latest"
    depends_on:
      - zoo1
      - zoo2
      - zoo3
    ports:
      - "9094:9092"
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_ADVERTISED_HOST_NAME: kafka3
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181

This will result in 3 Zookeeper nodes and 3 Kafka brokers running.
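To sanity-check the Zookeeper ensemble, the four-letter-word commands are handy.  This assumes nc (netcat) is installed; each healthy node replies imok to ruok:

```shell
# Ask each Zookeeper node whether it is healthy; a healthy node replies "imok"
for port in 2181 2182 2183; do
  reply=$(echo ruok | nc -w 1 localhost "$port")
  echo "zookeeper on port $port replied: $reply"
done
```
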

To test this Kafka cluster, use the Kafka command line tools.  On macOS, you can install them with brew install kafka.

kafka-topics --create --zookeeper localhost:2181,localhost:2182,localhost:2183 --topic metrics --partitions 3 --replication-factor 2

This creates a topic named metrics with 3 partitions spread across the 3 Kafka nodes, each partition replicated to 2 brokers.  The actual number of partitions and the replication factor should depend on how many applications and systems are pushing metrics.
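You can verify the topic and do a quick round-trip with the console tools; the message body below is arbitrary:

```shell
# Show partition and replica assignments for the new topic
kafka-topics --describe --zookeeper localhost:2181 --topic metrics

# Produce one test message, then read it back (Ctrl-C to stop the consumer)
echo "hello-metrics" | kafka-console-producer --broker-list localhost:9092 --topic metrics
kafka-console-consumer --bootstrap-server localhost:9092 --topic metrics --from-beginning
```
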

Step 3 : Setup Telegraf Publisher

The Telegraf publisher will be used to consume metrics off Kafka and push them into InfluxDB.  This allows all metrics to be centralized.  The Telegraf publisher is just one consumer; we could easily add separate consumers for archiving or for sending to alternative data stores.  In fact, without changing a single application, we could switch Telegraf from publishing into InfluxDB to publishing to OpenTSDB instead.
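For example, swapping the destination would be a matter of replacing the influxdb output section with Telegraf's opentsdb output plugin.  This is only a sketch; the hostname is a placeholder and the exact settings should be checked against the plugin documentation for your Telegraf version:

```toml
[[outputs.opentsdb]]
    prefix = "metrics."
    host = "opentsdb.example.com"
    port = 4242
```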

InfluxData has a great working Telegraf Docker image that we can easily use.  Before we create the Docker container, we must create our configuration file for Telegraf to use.  This will use a single input plugin for Kafka and a single output plugin for InfluxDB.

[agent]
    interval = "10s"
    round_interval = true
    metric_batch_size = 1000
    metric_buffer_limit = 10000
    collection_jitter = "0s"
    flush_interval = "10s"
    flush_jitter = "0s"
    precision = ""
    debug = false
    quiet = false
    hostname = ""
    omit_hostname = false

[[outputs.influxdb]]
    database = "metrics"
    retention_policy = ""
    write_consistency = "any"
    timeout = "5s"

[[inputs.kafka_consumer]]
    topics = ["metrics"]
    zookeeper_peers = ["zoo1:2181", "zoo2:2181", "zoo3:2181"]
    zookeeper_chroot = ""
    consumer_group = "telegraf_metrics_consumers"
    offset = "oldest"
    data_format = "influx"

With this telegraf.conf file, we can start an example agent with Docker.

docker run -v `pwd`/telegraf.conf:/etc/telegraf/telegraf.conf telegraf:alpine

Note that this will fail, since the telegraf.conf file above references the Zookeeper hosts by their Docker compose names.  To resolve this, change those names to point at the address where ports 2181-2183 are exposed (for example, the IP address of the Docker host or of the Zookeeper containers).

To run within Docker compose, use the following configuration:

  telegraf_publisher:
    image: telegraf:alpine
    container_name: telegraf_publisher
    depends_on:
      - influxdb
      - zoo1
      - zoo2
      - zoo3
    volumes:
      - ./telegraf-publisher.conf:/etc/telegraf/telegraf.conf:ro
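With the publisher running, you can smoke-test the pipeline by writing a point in InfluxDB line protocol directly to the Kafka topic and then querying it back.  The measurement name smoke_test is just an example:

```shell
# Write one point in InfluxDB line protocol to the metrics topic
echo "smoke_test,host=local value=1" | kafka-console-producer --broker-list localhost:9092 --topic metrics

# After the next Telegraf flush interval (10s), the point should be queryable
sleep 15
curl -sG http://localhost:8086/query --data-urlencode "db=metrics" --data-urlencode "q=SELECT * FROM smoke_test"
```
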

Step 4 : Setup Telegraf Agent

This step will set up a simple Telegraf agent that captures OS-level/system-level metrics.  To capture application-level metrics, see a later post on using Dropwizard (formerly Codahale) Metrics and sending metrics to a TCP input plugin on Telegraf.  This agent will configure the system inputs and use a single Kafka output.

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  hostname = ""
  omit_hostname = false

[[outputs.kafka]]
  brokers = ["kafka1:9092","kafka2:9092","kafka3:9092"]
  topic = "metrics"
  routing_tag = "host"
  compression_codec = 1
  required_acks = -1
  max_retry = 3
  data_format = "influx"

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  fielddrop = ["time_*"]

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs"]

[[inputs.diskio]]
  # no configuration

[[inputs.kernel]]
  # no configuration

[[inputs.mem]]
  # no configuration

[[inputs.processes]]
  # no configuration

[[inputs.swap]]
  # no configuration

[[inputs.system]]
  # no configuration

To add this agent to the Docker compose, use the following configuration:

  telegraf_consumer:
    image: telegraf:alpine
    container_name: telegraf_consumer
    depends_on:
      - kafka1
      - kafka2
      - kafka3
    volumes:
      - ./telegraf-consumer.conf:/etc/telegraf/telegraf.conf:ro

Step 5 : Setup Grafana

Now that metrics are being collected from our system and routed through Kafka into InfluxDB, we can set up a Grafana dashboard to showcase the data.

Grafana provides an official Docker image that may be used to easily create a Grafana instance.

mkdir grafana
docker run -p 3000:3000 -v `pwd`/grafana:/var/lib/grafana grafana/grafana:latest

This will create a Grafana instance running on port 3000 at http://localhost:3000.  The default login is admin/admin.
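You can also confirm Grafana is up from the command line.  Recent Grafana versions expose a simple health endpoint; on older versions, fetching the login page and checking for a 200 status works just as well:

```shell
# Health endpoint returns JSON including "database": "ok" when healthy
curl -s http://localhost:3000/api/health
```
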

To plug this into Docker compose use the following setup:

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    depends_on:
      - influxdb
    volumes:
     - ./grafana:/var/lib/grafana:rw

Step 6 : Put It All Together

Now that we have all our required components in Docker Compose, simply run docker-compose up and the entire stack should start.  For more information, see the GitHub repository: https://github.com/malt-stack/tick-stack-with-docker

Once running, open your browser to http://localhost:3000 and login with admin/admin.  Then click on Add Data Source:

  • Set the name to InfluxDB.
  • Set the type to InfluxDB.
  • Set the URL to http://influxdb:8086.
  • Set the database to metrics.
  • Click on Add.

This should validate the connection and result in the data source being added.
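The same data source can be created without the UI via Grafana's HTTP API.  This assumes the default admin/admin credentials and that InfluxDB is reachable from the Grafana container under its compose service name, influxdb:

```shell
# Create the InfluxDB data source via the Grafana HTTP API (default admin/admin)
curl -s -X POST http://admin:admin@localhost:3000/api/datasources \
  -H "Content-Type: application/json" \
  -d '{"name":"InfluxDB","type":"influxdb","url":"http://influxdb:8086","database":"metrics","access":"proxy"}'
```
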

Now go back to the main page and click on Add Dashboard:
  • Click on Graph to create a new graph.
  • Click on Panel Title and click Edit.
  • Set the Panel Data Source to InfluxDB.
  • Set the Measurement to cpu.
  • Click on WHERE to add the criteria cpu = cpu-total.
  • Click on Value and set the Select field to usage_system.
  • Click on time in Group By and select Remove.
This should now show the system CPU as a graph.
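The graph above corresponds to an InfluxQL query you can also run directly against InfluxDB; the measurement, field, and tag names come from Telegraf's cpu input:

```shell
# Mean system CPU usage for the total-CPU series over the last hour, in 1m buckets
curl -sG http://localhost:8086/query \
  --data-urlencode "db=metrics" \
  --data-urlencode "q=SELECT mean(\"usage_system\") FROM \"cpu\" WHERE \"cpu\" = 'cpu-total' AND time > now() - 1h GROUP BY time(1m)"
```
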

You now have a full TICK stack running under the principles of the MALT stack!

Stay tuned for future posts on plugging in Application Metrics and using other components of the MALT stack including logging and alerting!

