Friday, February 15, 2013

Centos 5 with ganglia and rrdcached

 

 As the number of metrics in your environment grows, you start to see a huge impact on system IO performance. At some stage disk utilization stays at 100% and the CPU spends a lot of time waiting for IO. At that point the IO demand exceeds the theoretical IO the disk can support. We might go ahead and add fast 15K disks or a RAID 10 array, but the disk contention comes back as long as we keep expanding the cluster and adding new metrics.
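 To get a feel for the numbers, here is a rough sketch that counts the rrd files and estimates how many updates per second they generate. It assumes one write per rrd file per polling interval and a 15 second interval (an assumption, check the data_source lines in your gmetad.conf). updates_per_sec is just a helper name made up for this post.

```shell
#!/bin/sh
# updates_per_sec FILE_COUNT INTERVAL_SECONDS
# Rough estimate: every rrd file receives one update per polling interval.
updates_per_sec() {
    echo $(( $1 / $2 ))
}

RRD_DIR=/var/lib/ganglia/rrds   # rrd directory used later in this post
INTERVAL=15                     # assumed polling interval in seconds

# Only run the count when the directory actually exists on this host.
if [ -d "$RRD_DIR" ]; then
    count=$(find "$RRD_DIR" -name '*.rrd' | wc -l)
    echo "rrd files: $count, approx updates/sec: $(updates_per_sec "$count" "$INTERVAL")"
fi
```

 With 30000 rrd files this works out to about 2000 small random writes per second, which is far more than a single spinning disk can usually keep up with.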

 

 A simple solution is to add rrdcached as a layer in the middle. It sits between gmetad and the rrd files, collects updates in memory and writes them to disk in batches. There are certain things to consider, such as updating the rrdtool package and recompiling ganglia against the new rrdtool.

 

 Here is a step-by-step guide to doing it.

 

A. Steps for building and installing rrdtool and ganglia.

 

1. Uninstall the existing rrdtool

yum -y remove rrdtool rrdtool-perl

 

2. Download the rrdtool 1.4.7 packages

 

wget http://apt.sw.be/redhat/el5/en/x86_64/dag/RPMS/rrdtool-1.4.7-1.el5.rf.x86_64.rpm

wget http://apt.sw.be/redhat/el5/en/x86_64/dag/RPMS/perl-rrdtool-1.4.7-1.el5.rf.x86_64.rpm

wget http://apt.sw.be/redhat/el5/en/x86_64/dag/RPMS/rrdtool-devel-1.4.7-1.el5.rf.x86_64.rpm

 

3. Install all three in a single rpm command. Otherwise rpm reports some weird Perl dependency errors.

 

rpm -ivh rrdtool-1.4.7-1.el5.rf.x86_64.rpm rrdtool-devel-1.4.7-1.el5.rf.x86_64.rpm perl-rrdtool-1.4.7-1.el5.rf.x86_64.rpm

 

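4. Get the ganglia source rpm. Quote the URL so the shell does not treat the & characters as command separators, and use -O so the file is saved under the name used in step 6.

wget -O ganglia-3.4.0-1.src.rpm "http://downloads.sourceforge.net/project/ganglia/ganglia%20monitoring%20core/3.4.0/ganglia-3.4.0-1.src.rpm?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fganglia%2Ffiles%2Fganglia%2520monitoring%2520core%2F3.4.0%2F&ts=1360839947&use_mirror=citylan"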

 

5. Install the dependencies required for building the rpm.

 

yum -y install libpng-devel libart_lgpl-devel python-devel libconfuse-devel pcre-devel freetype-devel

 

6. Build the rpm

 rpm -ivh ganglia-3.4.0-1.src.rpm

rpmbuild -ba /usr/src/redhat/SPECS/ganglia.spec

 

7. Install the newly built ganglia packages

 

rpm -ivh /usr/src/redhat/RPMS/x86_64/ganglia-* /usr/src/redhat/RPMS/x86_64/libganglia-3.4.0-1.x86_64.rpm
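 Before moving on to rrdcached it is worth checking that the installed rrdtool is new enough, since rrdcached support arrived in rrdtool 1.4. The sketch below is only an illustration: version_ge is a helper written for this post, and /usr/sbin/gmetad is an assumed install path for the gmetad binary.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A >= B (numeric fields only).
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}          # leading field of each version
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
        # Drop the field just compared; empty string when none is left.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Ask rpm for the installed rrdtool version (skipped if rpm or the package is missing).
if command -v rpm >/dev/null 2>&1 && installed=$(rpm -q --qf '%{VERSION}' rrdtool 2>/dev/null); then
    if version_ge "$installed" 1.4; then
        echo "rrdtool $installed is new enough for rrdcached"
    else
        echo "rrdtool $installed is too old for rrdcached" >&2
    fi
fi

# Show which librrd the rebuilt gmetad is linked against (assumed path).
if [ -x /usr/sbin/gmetad ]; then
    ldd /usr/sbin/gmetad | grep librrd || echo "gmetad does not appear to link librrd" >&2
fi
```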

 

B. Configuring rrdcached to work with ganglia.

 

1. gmetad runs as the ganglia user, rrdcached needs write access to the rrd directory, and apache needs access to the same directory. So add ganglia to the apache group.

usermod -a -G apache ganglia

 

2. Correct the permissions

chown -R ganglia:apache /var/lib/ganglia/rrds/
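A quick check before continuing: the sketch below confirms the group membership from step 1. in_group is a helper name made up for this post, not part of any package.

```shell
#!/bin/sh
# in_group USER GROUP: succeed if USER is a member of GROUP.
in_group() {
    # id -nG lists all group names of the user, one per line after tr.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

if in_group ganglia apache; then
    echo "ganglia is in the apache group"
else
    echo "ganglia is NOT in the apache group" >&2
fi
```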

 

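3. Update rrdcached sysconfig startup options. The -s and -m options set the group and mode of the socket given by the -l that follows them, and -P limits the following socket to the listed commands. So this creates one full-access socket for gmetad and apache plus a limited one that only accepts FLUSH, STATS and HELP. -b sets the base directory and -B restricts writes to it.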

cat /etc/sysconfig/rrdcached

OPTIONS="rrdcached -p /tmp/rrdcached.pid -s apache -m 664 -l unix:/tmp/rrdcached.sock -s apache -m 777 -P FLUSH,STATS,HELP -l unix:/tmp/rrdcached.limited.sock -b /var/lib/ganglia/rrds -B"

RRDC_USER=ganglia

4. Update the gmetad sysconfig file so that it knows the rrdcached socket.
cat /etc/sysconfig/gmetad
RRDCACHED_ADDRESS="unix:/tmp/rrdcached.sock"
5. Update the ganglia-web config so that apache talks to the rrdcached daemon when fetching rrd data.
grep rrdcached_socket /var/www/html/gweb/conf_default.php
$conf['rrdcached_socket'] = "/tmp/rrdcached.sock";
6. Stop gmetad
service gmetad stop
7. Start rrdcached
service rrdcached start
8. Start gmetad
service gmetad start
If everything is fine, you should see graphs populating in the ganglia frontend.
At the same time you'll see that disk IO utilization drops dramatically :)
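To confirm that rrdcached is really queueing updates, you can ask it for its statistics over the limited socket configured above. This is a sketch under a few assumptions: that your nc supports -U for unix sockets, and that the STATS reply uses "Name: value" lines such as QueueLength. stat_value is a helper name made up for this post.

```shell
#!/bin/sh
# stat_value KEY: read rrdcached STATS output on stdin and print the value of KEY.
stat_value() {
    awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

SOCK=/tmp/rrdcached.limited.sock   # limited socket from the sysconfig options

# Only query when the socket exists and nc is available.
if [ -S "$SOCK" ] && command -v nc >/dev/null 2>&1; then
    printf 'STATS\nQUIT\n' | nc -U "$SOCK" | stat_value QueueLength
fi
```

A QueueLength that stays above zero between flushes suggests updates are being held in memory and written in batches, which is exactly what takes the pressure off the disk.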