Using dstat with scripts and external modules


Here we look at the basic scripting options for dstat as well as an overview of its external modules.

All-powerful

Well, not that. But it can feel a little like that when you fiddle with some more advanced aspects of dstat. Because it can do an awful lot.

In the first article of this series we looked at basic usage of dstat - what it does and how to customize its reporting. Now we’ll look at making a log of dstat results, along with some details about using external modules.

Script usage

For the most part dstat isn’t really designed with scripts in mind - its focus is more on presenting the relevant information for human eyes. If you want to use dstat in a script you can send its output to a text log or to a CSV file (or both).

If you just want a (slightly cluttered) text log of dstat results you can pass it some options to cut down on the output a little bit.

The “--noheaders” option tells dstat not to repeat the column headers periodically in its results. The headers are still shown once, when dstat starts.

The “--noupdate” option tells dstat not to update its averages every second between reports. It only matters when the delay between reports is longer than one second.

The “--nocolor” option tells dstat not to use colors in its output. Not strictly necessary when logging dstat’s output, but certainly useful if you run the command and don’t like the colored text. Using “--nocolor” also applies “--noupdate”.

The “-t” option adds a column reporting the current date and time.

Quick and dirty text logging

If you want to just run dstat in the background while it records text data over a period of time you can run something like:

dstat -ts --top-io --top-bio --noheaders --nocolor 5 >> /tmp/dstat.txt &

That would take measurements every 5 seconds and write them to the file “/tmp/dstat.txt”. The “&” at the end tells the process to run in the background. To end the dstat task you can either kill its process (using the pid it reported when it was backgrounded) or type “fg” to bring dstat to the foreground, and then hit “control-C” to quit it.

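Or you can just log out. The process will usually be terminated when your shell session ends, though that depends on your shell’s settings, so check that it’s really gone if it matters.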
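If you'd rather not hunt for the pid later, one option is to save it when dstat starts. What follows is a minimal sketch, not something dstat provides: the function names and the “/tmp/dstat.pid” location are made up for illustration, and the dstat command is the same one used above.

```shell
#!/bin/sh
# Sketch only: LOG and PIDFILE are hypothetical locations; adjust to taste.
LOG=${LOG:-/tmp/dstat.txt}
PIDFILE=${PIDFILE:-/tmp/dstat.pid}

# Start dstat in the background and remember its pid.
start_dstat_log() {
    dstat -ts --top-io --top-bio --noheaders --nocolor 5 >> "$LOG" &
    echo "$!" > "$PIDFILE"
}

# Stop the dstat started above, then remove the pid file.
stop_dstat_log() {
    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
}
```

Source the file in your shell, run “start_dstat_log” to begin logging, and run “stop_dstat_log” when you've collected enough.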

CSV output

You can also tell dstat to write the data it collects to a CSV file. Among its many possible uses, a CSV file can be loaded into a spreadsheet program so you can analyze the results as a graph.

The “--output filename.csv” option tells dstat to write CSV data to a file (replace “filename.csv” with whatever file name you’d like it to write to). Note that dstat will still output human-readable information normally when it writes to the CSV file.

When the “--output” option is used the data is appended to the end of the specified file. It won’t overwrite what’s already there.

If you wanted to use the quick-and-dirty logging command above, but only wanted CSV output, you could run that command as:

dstat -ts --top-io --top-bio --noheaders --nocolor --output /tmp/dstat.csv 5 > /dev/null &

The main output is redirected to /dev/null, which means it won’t get written anywhere. Instead you’ll get a CSV file at /tmp/dstat.csv.
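Because dstat appends to the CSV file rather than replacing it, repeated runs pile up in one place. If you'd prefer a separate file per day, a small helper can build the file name. This is only a sketch: the “csv_path” function and the “dstat-YYYY-MM-DD.csv” naming are assumptions, not anything dstat requires.

```shell
#!/bin/sh
# Sketch: print a per-day CSV path inside the given directory (default /tmp).
# The "dstat-YYYY-MM-DD.csv" naming scheme is an assumption for illustration.
csv_path() {
    printf '%s/dstat-%s.csv\n' "${1:-/tmp}" "$(date +%Y-%m-%d)"
}

# Example use (commented out so the sketch has no side effects):
# dstat -ts --top-io --top-bio --noheaders --nocolor \
#     --output "$(csv_path /tmp)" 5 > /dev/null &
```

A dstat process that runs past midnight keeps writing to the file it opened, so the per-day split only applies to runs started on different days.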

A sample cron script

You can use cron task scheduling to get dstat to run every minute and log its reports to a file. Open this file for editing (with sudo):

/etc/cron.d/dstat

And put the following into that file if you want a text log:

    #  Run dstat and log the reports
    
    * * * * * root dstat -ts --top-io --top-bio --noheaders --nocolor 5 3 >> /var/log/dstat.txt

If you prefer a log of the data in CSV format, put this into the file instead:

    #  Run dstat and log the reports
    
    * * * * * root dstat -ts --top-io --top-bio --noheaders --nocolor --output /var/log/dstat.csv 5 3 > /dev/null

Save the file. That cron.d entry will cause dstat to run every minute, logging its reports to “/var/log/dstat.txt” or “/var/log/dstat.csv”, depending on the format you chose. It takes three samples when it runs (five seconds apart) and reports on swap usage and the top I/O-using processes.

After a minute take a look in /var/log. If the script ran as expected you’ll see a dstat log there.

Tweak the command to your liking. If you want a different number of samples per minute change the second numerical value in the command.
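One thing to watch when tweaking: cron starts a new dstat every minute, so a run that lasts longer than a minute will overlap with the next one and both will write to the same log. A rough check, assuming a run takes at most about delay times count seconds (the function name is hypothetical):

```shell
#!/bin/sh
# Sketch: succeed only if a dstat run with this delay and count should
# finish before cron starts the next one. Treats delay * count as a rough
# upper bound on run time, which is an assumption.
fits_in_minute() {
    delay=$1
    count=$2
    [ $((delay * count)) -lt 60 ]
}

# Example: the values from the cron entry (5-second delay, 3 samples).
if fits_in_minute 5 3; then
    echo "5 x 3 fits within a minute"
fi
```

With the values from the cron entry that's 5 x 3 = 15 seconds, which leaves plenty of room.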

Rotate the created log

If you plan to leave the dstat logger running (and aren’t just going to run it for a day or so to check on a burst of disk activity) you’ll want to rotate its log occasionally to keep it from getting too big.

To that end, create a file for editing at “/etc/logrotate.d/dstat” (again using sudo).

Put the following into the file:

/var/log/dstat.txt {
  rotate 5
  weekly
  compress
  missingok
  notifempty
}

Replace the filename if necessary - if you used the CSV script above, for example, you’d change it to “/var/log/dstat.csv”.

For more details on what that logrotate config will do you can look through our articles on logrotate. You might change the frequency of rotations from “weekly” to “daily” if you change the script to log more reports, then increase the number of archived logs it keeps accordingly.
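Before trusting a new logrotate config it can be worth a dry run. This sketch assumes the config path used above, and the function name is hypothetical. logrotate's “-d” flag turns on debug mode, which prints what would be done without changing any logs.

```shell
#!/bin/sh
# Sketch: dry-run a logrotate config. Defaults to the path used in this
# article; pass a different file as the first argument if needed.
check_dstat_rotation() {
    # -d = debug mode: report what would be rotated, change nothing
    logrotate -d "${1:-/etc/logrotate.d/dstat}"
}
```

Run “check_dstat_rotation” with sudo rights (the log lives in /var/log) and read through the output for complaints about the config.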

Module files

One distinction dstat makes regarding its options is between “internal” and “external” modules. The main thing to know about the difference between those types is that “external” modules are defined by files stored in the directory:

/usr/share/dstat

The external modules can sometimes take some extra effort to enable.

Inside a module

To look at a specific example, the file in that directory that describes the module that reports MySQL I/O for version 5 is:

dstat_mysql5_io.py

Looking inside that file will reveal the Python code defining the module. This is of most use to users who can read Python code, of course.
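The file names also hint at the option that enables each module. Going by the pattern that “dstat_mysql5_io.py” and the “--mysql5-io” option suggest (drop the “dstat_” prefix and the “.py” suffix, then swap underscores for hyphens), a small helper can do the translation. Treat the pattern as an assumption, and the function name as hypothetical.

```shell
#!/bin/sh
# Sketch: turn a module file name into the dstat option that likely enables it.
# Assumes the dstat_<name>.py -> --<name-with-hyphens> naming pattern.
module_option() {
    name=$(basename "$1" .py)     # e.g. dstat_mysql5_io
    name=${name#dstat_}           # e.g. mysql5_io
    printf '%s\n' "--$(printf '%s' "$name" | tr '_' '-')"
}

# Example use (commented out): list an option for every module file.
# for f in /usr/share/dstat/dstat_*.py; do module_option "$f"; done
```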

Python modules

If you try running dstat with the mysql5-io module after a fresh install you’ll get an error message:

$ dstat --mysql5-io
Module dstat_mysql5_io failed to load. (No module named MySQLdb)
None of the stats you selected are available.

That means Python needs a module installed before dstat can use the mysql5-io module. When you run into that problem you can often find the package you need by running a search for the module name. Pick the command for your distro:

aptitude search mysqldb
yum search mysqldb
emerge --search mysqldb
pacman -Ss mysqldb

Fortunately that’s a pretty common module so you should see a package in the results that has “python” in the name, like “python-mysqldb”. Install that.
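You can check whether Python can see a module without going through dstat at all. This sketch assumes the “python” command on your path is the same interpreter dstat runs under, which may not be true on every system; set PYTHON to point elsewhere if not. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if the given Python module can be imported.
# PYTHON defaults to "python"; override it if dstat uses another interpreter.
PYTHON=${PYTHON:-python}

python_has_module() {
    "$PYTHON" -c "import $1" > /dev/null 2>&1
}

# Example use (commented out):
# python_has_module MySQLdb && echo "MySQLdb is available"
```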

Environment variables

Let’s run that dstat command again:

$ dstat --mysql5-io
Module dstat_mysql5_io failed to load. (Cannot interface with MySQL server)
None of the stats you selected are available.

Still more to do! In this case, the problem is that dstat doesn’t know a username and password it can use to access MySQL to gather the stats it wants. To get the username and password the module checks for two environment variables:

DSTAT_MYSQL_USER
DSTAT_MYSQL_PWD

Pretty straightforward. The first one is the username, the second is the password. Trouble is, you probably don’t want to define the MySQL password in your bashrc or profile config, and defining those variables every time you want to run the check would be a pain.

(You can discover the necessary environment variables from those “os.getenv” calls at the beginning of the module definition file, for the curious.)
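If you don't feel like reading the whole file, grep can pull those names out for you. A minimal sketch, assuming the module passes quoted names directly to getenv calls; the function name is hypothetical, and it lists every name it finds, so you may see a few that aren't specific to dstat.

```shell
#!/bin/sh
# Sketch: list the environment variable names a module file reads via getenv.
# Only catches calls written with a literal quoted name, e.g. getenv('NAME').
module_env_vars() {
    grep -oE "getenv\(['\"][A-Za-z_][A-Za-z0-9_]*['\"]" "$1" |
        sed -E "s/getenv\(['\"]//; s/['\"]\$//" |
        sort -u
}

# Example use (commented out):
# module_env_vars /usr/share/dstat/dstat_mysql5_io.py
```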

Wrapping it up

The solution is to use a shell script “wrapper” that will define those variables before running dstat. At its simplest that would look like:

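#!/bin/bash

# Export the credentials so the dstat process can see them
export DSTAT_MYSQL_USER=username
export DSTAT_MYSQL_PWD=password

# Pass any arguments given to this script through to dstat
dstat --mysql5-io "$@"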

Put that in a file with a descriptive name (like “mysql5-io.sh”), then set its permissions so no one else can peek inside to see the password:

chmod 700 mysql5-io.sh

And now you can run the script when you want to see the MySQL I/O stats:

$ ./mysql5-io.sh
-mysql5-io-
 recv  sent
1.32B 44.6B
5936k  195M
5936k  195M

The “$@” in the script refers to any arguments passed to the script, which means that any arguments you give the script will be passed as options to dstat. So to add swap use information to the output of the script, run:

$ ./mysql5-io.sh -s
-mysql5-io- ----swap---
 recv  sent| used  free
1.32B 44.6B|2680k 1021M
5946k  195M|2680k 1021M
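A wrapper like this only works if dstat, which runs as a child process of the script, can actually see the two variables. In a shell script a child inherits only the variables that have been exported. The sketch below shows the difference using two throwaway variable names (both made up for the demonstration):

```shell
#!/bin/sh
# Sketch: show that only exported variables reach a child process.
# UNEXPORTED_DEMO and EXPORTED_DEMO are throwaway names for illustration.
show_child_env() {
    UNEXPORTED_DEMO=hidden          # plain assignment: stays in this shell
    export EXPORTED_DEMO=visible    # exported: copied into child environments
    # The child shell prints "unset" for any variable it cannot see.
    sh -c 'echo "${UNEXPORTED_DEMO:-unset} ${EXPORTED_DEMO:-unset}"'
}
```

Calling “show_child_env” prints “unset visible”: the child never sees the variable that wasn't exported.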

Learning more

Unfortunately the documentation for the modules isn’t very extensive. If you see an interesting-looking module but get an error trying to run it you’ll probably need to look inside its definition file to figure out how to get it working. That means knowing some Python.

Fortunately Python is easier to pick up than many other scripting languages. It’s also used by a lot of system tools in many distributions, so a little Python knowledge can help in other areas of system administration. A good place to start is the Beginner’s Guide to Python.

Summary

You should have a good grasp on dstat’s potential now. What’s left is experimentation and discovery. And probably the dstat man page. That helps too.







© 2011-2013 Rackspace US, Inc.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License

