Thursday, March 18, 2010

XYMON Custom Graphs for your Hobbit Server

If you have used Hobbit in the past and wonder how to customize your Xymon graphs, this short write-up is for you. The material here is taken from one of the many documents posted on the net, and I am reposting it because such useful how-tos deserve another home and reference. Read on.

How to setup custom graphs

This document walks you through the setup of custom graphs in your Xymon installation. Although Xymon comes with pre-defined setups for a lot of common types of graphs, it is also extensible allowing you to add your own tests. For many kinds of tests, it is nice to view them over a period of time in a graph - this document tells you how to do that.

Make a script to collect the data

First create your test data. Typically, this is an extension script that sends in some data to Xymon, using a status or data command. If you use status, it will show up as a separate column on the display, with a green/yellow/red color that can trigger alerts. If you use data, Xymon just collects the data into a graph - you must go to the trends column to see the graph. For this example, we'll use status.
So we create an extension script. Here is an example script; it picks two numbers out of the Linux kernel's memory statistics, and reports these to hobbit.

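 #!/bin/sh

 # Pick the dentry and inode cache lines out of the kernel slab statistics
 # and write them as "name : value" lines (value = field 3 * field 4).
 grep -E "^dentry_cache|^inode_cache" /proc/slabinfo | \
    awk '{print $1 " : " $3*$4}' >/tmp/slab.txt

 # Send the report to Xymon as a green status for the "slab" column.
 $BB $BBDISP "status $MACHINE.slab green `date`

 `cat /tmp/slab.txt`
 "

 exit 0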
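
To try the extraction step without sending anything to Xymon, the awk part can be wrapped in a small function and fed sample input. This is only a sketch: the function name slab_ncv and the two input lines are made up for illustration (the numbers are chosen so that field 3 times field 4 gives the values in the sample report further down), and unlike the egrep in the script it matches the two cache names exactly rather than as prefixes.

```shell
#!/bin/sh
# Sketch: turn slabinfo-style lines on stdin into "name : value" lines.
# As in the script above, the value is field 3 multiplied by field 4.
slab_ncv() {
    awk '$1 == "dentry_cache" || $1 == "inode_cache" { print $1 " : " $3 * $4 }'
}

# Hypothetical sample input. On a real host: slab_ncv </proc/slabinfo
slab_ncv <<'EOF'
inode_cache 600 656 504 8 1
dentry_cache 270000 276291 148 26 1
EOF
# prints:
# inode_cache : 330624
# dentry_cache : 40891068
```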

Get hobbitlaunch to run the script

Save this script in ~hobbit/client/ext/slab, and add a section to the ~hobbit/client/etc/clientlaunch.cfg to run it every 5 minutes:

 [slabinfo]
         ENVFILE /usr/lib/hobbit/client/etc/hobbitclient.cfg
         CMD /usr/lib/hobbit/client/ext/slab
         INTERVAL 5m
(On the Xymon server itself, you must add this to the file ~hobbit/server/etc/hobbitlaunch.cfg)

Check that the script data arrives in Xymon

After a few minutes, a slab column should appear on your Xymon view of this host, with the data it reports. The output looks like this:

 Sun Nov 20 09:03:44 CET 2005

 inode_cache : 330624
 dentry_cache : 40891068

Arrange for the data to be collected into an RRD file

This is obviously a name-colon-value formatted report, so we'll use the NCV module in Xymon to handle it. Xymon will find two datasets here: The first will be called inodecache, and the second dentrycache (note that Xymon strips off any part of the name that is not a letter or a number; Xymon also limits the length of the dataset name to 19 letters max. since RRD will not handle longer names). To enable this, on the Xymon server edit the ~hobbit/server/etc/hobbitserver.cfg file. The TEST2RRD setting defines how Xymon tests (status columns) map to RRD datafiles. So you add the new test to this setting, by adding slab=ncv at the end:

TEST2RRD="cpu=la,disk,<...lots more stuff...>,hobbitd,mysql=ncv,slab=ncv"
slab is the status column name, and =ncv is a token that tells Xymon to send these data through the built-in NCV module.
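
If you want to predict the dataset name Xymon will derive from a report line, the rule described above can be approximated in the shell. This is a sketch of the rule as stated in the text (drop everything that is not a letter or a number, keep at most 19 characters), not Xymon's own code, and ncv_dsname is a made-up helper name.

```shell
#!/bin/sh
# Approximate the NCV dataset naming rule described in the text:
# remove every character that is not a letter or digit, then keep
# at most the first 19 characters.
ncv_dsname() {
    name=$(printf '%s' "$1" | tr -cd 'A-Za-z0-9' | cut -c1-19)
    printf '%s\n' "$name"
}

ncv_dsname "inode_cache"     # prints inodecache
ncv_dsname "dentry_cache"    # prints dentrycache
```

The names it prints are the ones you need later in the NCV_slab setting and in the graph definition.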
By default, the Xymon NCV module expects data to be some sort of counter, e.g. number of bytes sent over a network - it uses the RRD DERIVE datatype by default, which is for data that is continuously increasing in value. Some data are not like that - the data in our test script is not - and for those data you'll have to make an extra setting to tell Xymon what RRD data type to use. The RRDtool rrdcreate(1) man-page has a detailed description of the various RRD datatypes. It is available online at http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/doc/rrdcreate.en.html
Our test script provides data that goes up and down in value (it is the number of bytes of memory used for a Linux kernel buffer), and for that kind of data we'll use the RRD GAUGE datatype. So we add an extra setting to hobbitserver.cfg:

 NCV_slab="inodecache:GAUGE,dentrycache:GAUGE"
This tells the hobbitd_rrd module that it should create an RRD file with two datasets of type GAUGE instead of the default (DERIVE). The setting must be named NCV_ followed by the status column name (here, NCV_slab).
The hobbitserver.cfg file is not reloaded automatically, so you must restart Xymon after making these changes. Or at least, kill the hobbitd_rrd processes (there are usually two) - hobbitlaunch will automatically restart them, and they will then pick up the new settings.

Check that the RRD collects data

The next time the slab status is updated, Xymon will begin to collect the data. You can check this by looking for the slab.rrd file in the ~hobbit/data/rrd/HOSTNAME/ directory. If you want to check the data it collects, the rrdtool dump ~hobbit/data/rrd/HOSTNAME/slab.rrd will tell you what it got:

 
 
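   <version> 0001 </version>
   <step> 300 </step>
   <lastupdate> 1132474725 </lastupdate>

   <ds>
                     <name> inodecache </name>
 RRD datatype------> <type> GAUGE </type>
                     <minimal_heartbeat> 600 </minimal_heartbeat>
                     <min> 0.0000000000e+00 </min>
                     <max> NaN </max>

                     <!-- PDP Status -->
 current value-----> <last_ds> 330624 </last_ds>
                     <value> 0.0000000000e+00 </value>
                     <unknown_sec> 0 </unknown_sec>
   </ds>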
  
If you go and look at the status page for the slab column, you should not see any graph yet, but a link to hobbit graph ncv:slab. One final step is missing.
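
The dump is long, and usually you only want to confirm the dataset names and their RRD datatypes. A small filter can pull those out. This is a rough sketch that assumes the dump is the XML that rrdtool dump writes, with each dataset's name and type in name and type elements on separate lines; rrd_ds_types is a made-up helper name.

```shell
#!/bin/sh
# Print one "dataset type" pair per line from rrdtool dump XML on stdin.
# Assumes every <name> line is followed by the <type> line of the same
# dataset, so the two extracted values can be joined in pairs.
rrd_ds_types() {
    sed -n \
        -e 's:.*<name> *\([^ <]*\) *</name>.*:\1:p' \
        -e 's:.*<type> *\([^ <]*\) *</type>.*:\1:p' | paste -d ' ' - -
}

# Usage on the Xymon server (HOSTNAME is a placeholder, as in the text):
#   rrdtool dump ~hobbit/data/rrd/HOSTNAME/slab.rrd | rrd_ds_types
```

For the slab example you would hope to see both datasets listed with the type GAUGE.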

Setup a graph definition

The final step is to tell Xymon how to create a graph from the data in the RRD file. This is done in the ~hobbit/server/etc/hobbitgraph.cfg file.

 [slab]
  TITLE Slab info
  YAXIS Bytes
  DEF:inode=slab.rrd:inodecache:AVERAGE
  DEF:dentry=slab.rrd:dentrycache:AVERAGE
  LINE2:inode#00CCCC:Inode cache
  LINE2:dentry#FF0000:Dentry cache
  COMMENT:\n
  GPRINT:inode:LAST:Inode cache \: %5.1lf%s (cur)
  GPRINT:inode:MAX: \: %5.1lf%s (max)
  GPRINT:inode:MIN: \: %5.1lf%s (min)
  GPRINT:inode:AVERAGE: \: %5.1lf%s (avg)\n
  GPRINT:dentry:LAST:Dentry cache\: %5.1lf%s (cur)
  GPRINT:dentry:MAX: \: %5.1lf%s (max)
  GPRINT:dentry:MIN: \: %5.1lf%s (min)
  GPRINT:dentry:AVERAGE: \: %5.1lf%s (avg)\n
[slab] is the name of this graph, and it must match the name of your status column if you want the graph to appear together with the status. The TITLE and YAXIS settings define the graph title and the legend on the Y-axis. The rest are definitions for the rrdgraph(1) tool - you should read the RRDtool docs if you want to know in detail how it works. For now, all you need to know is that you must pick out the data you want from the RRD file with a DEF line, like

  DEF:inode=slab.rrd:inodecache:AVERAGE
which gives you an "inode" definition that has the value from the inodecache dataset in the slab.rrd file. This is then used to draw a line on the graph:

  LINE2:inode#00CCCC:Inode cache
The line gets the color #00CCCC (red-green-blue), which is a light greenish-blue color. Note that you can have several lines in one graph, if it makes sense to compare them. You can also use other types of visual effects, e.g. stack values on top of each other (like the vmstat graphs do) - this is described in the rrdgraph man-page. An online version is at http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/doc/rrdgraph.en.html. The GPRINT lines at the end of the graph definition also use the inode and dentry values to print summary lines showing the current, maximum, minimum and average values from the data that has been collected.
Once you have added this section to hobbitgraph.cfg, refresh the status page in your browser, and the graph should show up.
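
If you track more datasets, typing the DEF and LINE2 pairs by hand gets tedious. The sketch below prints the two lines for one dataset in the same form as the definition above. graph_lines is a made-up helper, it does not generate the GPRINT lines, and it prints the lines dataset by dataset, so reorder them if you prefer all DEF lines first as in the example.

```shell
#!/bin/sh
# Print the DEF and LINE2 lines for one dataset, in the form used in
# the [slab] graph definition.
# Arguments: variable name, RRD file, dataset name, hex color (no #), legend.
graph_lines() {
    printf '  DEF:%s=%s:%s:AVERAGE\n' "$1" "$2" "$3"
    printf '  LINE2:%s#%s:%s\n' "$1" "$4" "$5"
}

graph_lines inode slab.rrd inodecache 00CCCC "Inode cache"
graph_lines dentry slab.rrd dentrycache FF0000 "Dentry cache"
```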

Add the graph to the collection of graphs on the trends column

If you want the graph included with the other graphs on the trends column, you must add it to the GRAPHS setting in the ~hobbit/server/etc/hobbitserver.cfg file.

 GRAPHS="la,disk,<... lots more ...>,bbproxy,hobbitd,slab"
Save the file, and when you click on the trends column you should see the slab graph at the bottom of the page.

Common problems and pitfalls

If your graph nearly always shows 0

You probably used the wrong RRD datatype for your data - see "Arrange for the data to be collected into an RRD file" above. By default, the RRD file expects data that is increasing constantly; if you are tracking some data that just varies up and down, you must use the RRD GAUGE datatype. Note that when you change the RRD datatype, you must delete any existing RRD files - the RRD datatype is defined when the RRD file is created, and cannot be changed on the fly.
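
A little arithmetic shows why the wrong datatype gives a flat graph. A DERIVE dataset stores the rate of change per second between two samples, not the sample itself, so a value that hardly moves produces a rate close to 0. The sketch below is simplified (it ignores how RRDtool fits samples onto step boundaries), and derive_rate is a made-up helper name.

```shell
#!/bin/sh
# Per-second change between two samples, which is what DERIVE records.
# Arguments: previous value, current value, seconds between the samples.
derive_rate() {
    awk -v prev="$1" -v cur="$2" -v secs="$3" \
        'BEGIN { print (cur - prev) / secs }'
}

derive_rate 330624 330624 300   # value unchanged over 5 minutes: prints 0
derive_rate 330624 333624 300   # grew by 3000 in 300 seconds: prints 10
```

With GAUGE the stored value would be the sample itself (330624 or 333624), which is what you want for a memory size.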

No graph on the status page, but OK on the trends page

Make sure you have ncv listed in the GRAPHS setting in hobbitserver.cfg. (Don't ask why - just take my word that it must be there).

 reference: http://www.hswn.dk/hobbit/help/howtograph.html
