BixData is a system, application, and network monitoring tool that lets you easily monitor nearly every aspect of your servers. It can be used for general reporting, for sending notifications when problems arise, or for automatic maintenance and repairs, by executing scripts when errors or particular conditions arise.
The BixData system is made of three separate parts. The BixAgent runs on any machine you want to monitor. The BixServer is used to monitor machines remotely and to keep track of many different machines. The BixDesktop is a graphical interface that is used to set up and interact with the rest of the system.
I will give a simple example of how to set up BixData. It will hopefully be obvious how to adapt it to your situation. Although BixData is designed to handle large clusters of servers, I’ll just show you how to maintain one. In this simple example I will assume you have a webserver and your own workstation. If the CPU usage on the webserver ever stays above 70% for 10 minutes, you want Bix to restart Apache and send you an email. If the webserver becomes unreachable, you want to get an email. If the machine is still down after 1 hour, you want to send an email to your coworker. All of this is very easy to set up with BixData.
Here is how we will do it. We will install the BixAgent on the webserver to monitor the load, and the BixServer on your workstation to monitor the webserver. We will also install the BixDesktop on your workstation to set up and change the notifications.
First install BixDesktop on your workstation. (I’ll assume your workstation is running Linux, though there are also versions of the BixDesktop for OS X and Windows.) Simply download, untar and run:
tar -zxvf BixDesktop-2.4.2-linux-1.tar.gz
cd bixdata; ./rundesktop
Now install the BixAgent on the webserver. Simply download and untar:
tar -zxvf BixAgent-2.4.2-linux-1.tar.gz
Now we run the agent so it doesn’t stop when we close the console:
nohup ./bixagent >out &
Instead of nohup you can also use screen.
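If you want to confirm that a process started this way is still alive after the launch, one option is to record its PID. The following is a minimal sketch and not part of BixData; the function name start_detached and the log file name are made up for illustration.

```shell
# Hypothetical helper: start a command under nohup, send its output
# to a log file, and print the PID of the background process.
#   $1 = log file, remaining arguments = command to run
start_detached() {
  log=$1
  shift
  nohup "$@" >"$log" 2>&1 &
  echo $!
}

# Example (assumes you are in the directory that contains bixagent):
#   pid=$(start_detached out ./bixagent)
#   kill -0 "$pid" && echo "agent is running as PID $pid"
```

kill -0 sends no signal; it only reports whether the process still exists.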
Now install and run the server on your workstation:
tar -zxvf BixServer-2.4.2-linux-1.tar.gz
nohup ./runserver.sh >out &
BixServer can also show you graphs related to service availability, keep records of notifications and store data from BixAgents. This requires a SQL database. A number of databases are supported, and it is quite easy to set up a connection. However, if you’d like to have BixServer run out of the box, simply change the one line above to download the package that includes a database.
Connecting Desktop and Server
Now that everything is installed and running we need to connect the BixDesktop to the BixServer and BixAgent. At the login screen choose the “Guest” account and hit the login button. If you hit the (+) and create your own account, BixDesktop will save your desktop layout and remember usernames and passwords for BixData components.
Since in this example the BixServer is running on the same machine as the BixDesktop, you can just refer to it as “localhost”. In other setups you would just use its IP.
Click on the recently used link for server://localhost or type “server://localhost” in the connection bar and hit Connect; you will be connected to the BixServer and see the main screen of BixDesktop.
The first tab we are interested in is the Situation Room. From here we can add the webserver that we want to monitor. In the machines list click the + button to add a new server to monitor. This brings up a dialog. Simply add the IP of the webserver that is running the BixAgent.
Now that the BixServer knows about the BixAgent we can set up the notifications.
BixData has a very flexible notification system that can be a little confusing at first but is quite powerful once you become familiar with it. In our case we want to create two notifications: one for when the webserver is completely down and another for when the load on the webserver is too high. We do all this from the Notification Setup tab.
Webserver Down Notification
For the first notification we make what is called a Service check to check if HTTP is running on the webserver. We do this by clicking + by the Service checks. This brings up a dialog that lets you choose a name and a type of service to check. Select the HTTP check.
The new HTTP check is now added to the list of service checks we have configured. Click on this new HTTP check and set the options for your particular setup. In our case we just want to add the webserver from the hosts column as the machine to be checked. All the other values are fine left as default. Click the Apply button to save your changes to the BixServer.
HTTP Check Options
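To see by hand what an HTTP service check is looking for, you can request the page yourself and inspect the status code. This sketch only approximates the idea and is not how BixServer implements its check; the function name http_is_up, the example host name, and the rule that 2xx and 3xx responses count as up are all assumptions.

```shell
# Hypothetical helper: succeed if an HTTP status code looks healthy.
# Assumption: 2xx and 3xx mean "up"; anything else means "down".
http_is_up() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with curl (replace webserver.example.com with your host).
# curl prints 000 as the status code when it gets no HTTP response.
#   code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 http://webserver.example.com/)
#   if http_is_up "$code"; then echo up; else echo "down ($code)"; fi
```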
Now we need to set up the two Actions that are needed for this notification. We need an action that mails us and an action that mails our coworker if we haven’t fixed the problem in a set amount of time. Click on the + in the Action box. Select Email Action and name it something distinguishable like “Email me”. Click on the newly created email action. Now you can set the address it should email; remember to also set the SMTP server to use. Click Apply to save your changes. Do the same thing again to email your coworker, except name it “Email Bob” and put in Bob’s email and the SMTP server to use. Click Apply to save your changes.
Email Action Options
Now we can create the actual Notification. Click on + by Notifications and give it a name. Click on the new notification you just created.
Notification State Diagram
Here we have what is called the Notification State Diagram. It is really fun to use! The diagram lets you specify actions to be taken at each stage of a particular service going down and then back up. First select the service check this notification should apply to by checking its checkbox; in this case, our HTTP check from above.
Now we want to add our actions. We want it to email us when the service goes from up to down, so in the box labeled Up->Down right-click on Do nothing, select Set Action, and then select “Email me.”
If the webserver stays down for more than an hour we want to send a mail to Bob, since I’m obviously on a date or something. For this we move to the box labeled While Down. Here we again right-click on Do nothing, but now select Wait and set the time to 1 hour. Now we see the clock icon has been added to the diagram. Now we right-click on Do nothing again and select Set Action and then “Email Bob.” Click Apply to save your changes. Bob will definitely handle the problem since he doesn’t date much. The final state diagram should look like this:
Completed State Diagram
This system is very flexible and becomes quite intuitive once you have used it a couple of times. You can have actions repeat or add any number of actions to each change in state.
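If it helps to think of the webserver-down diagram as plain logic, it can be modelled roughly as a function that is evaluated each time the service is checked. This is only an illustrative sketch of the behavior described above, not BixData’s implementation; the function name notification_action and its arguments are hypothetical.

```shell
# Hypothetical model of the webserver-down notification.
#   $1 = previous state (up|down)
#   $2 = current state (up|down)
#   $3 = seconds the service has been down so far
#   $4 = 1 if Bob has already been emailed for this outage, else 0
notification_action() {
  prev=$1
  cur=$2
  down_secs=$3
  bob_emailed=$4
  if [ "$prev" = up ] && [ "$cur" = down ]; then
    echo "Email me"       # Up->Down box
  elif [ "$cur" = down ] && [ "$down_secs" -ge 3600 ] && [ "$bob_emailed" != 1 ]; then
    echo "Email Bob"      # While Down box, after the 1 hour wait
  else
    echo "Do nothing"
  fi
}

# Example:
#   notification_action up down 0 0        # prints: Email me
#   notification_action down down 3600 0   # prints: Email Bob
```

The fourth argument exists only so that this sketch emails Bob once per outage rather than on every later check.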
High Load Notification
This is similar to the process above. This time we make a new service check of type Remote Agent CPU. We do this by clicking + by the Service Checks. We click on the new service check, set the load threshold to 70%, and add the webserver from the hosts column as the machine to be checked. All the other values are fine left as default.
Remote Agent CPU Load Check
We can reuse the “Email me” action we created above, but we need to make a “Restart Apache” action in order to try to handle the situation automatically. Click on + in the Actions box and select the Execute Command Remotely type. Name the action “Restart Apache”. Click on the newly created action. This action type executes a command on a machine that is running the BixAgent. Now we set what command to execute. Since the BixAgent isn’t running as root, we must sudo the Apache restart.
sudo apachectl restart
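Note that sudo normally asks for a password, which a command launched by the agent has no way to type. One common way around this is a sudoers rule that lets the agent’s account run only this one command without a password. The fragment below is a sketch under assumptions: the account name bixagent, the file name, and the path /usr/sbin/apachectl are all hypothetical, so check yours (for example with `which apachectl`) and edit the file with visudo so that syntax errors are caught.

```
# /etc/sudoers.d/bixagent  (hypothetical file and account name)
# Allow the agent's account to restart Apache, and nothing else, without a password.
bixagent ALL=(root) NOPASSWD: /usr/sbin/apachectl restart
```

With a rule like this in place, the action’s command should use the same full path that appears in the rule.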
As a security precaution, BixAgent will not execute remote commands unless authentication is enabled (SSL is not needed). You can read more about enabling authentication here: http://www.bixdata.com/security_agent
BixServer can also execute commands locally on the BixServer machine, which obviously does not require authentication.
Execute Remote Command Options
Now we create a new notification called “Apache Notification”. This time we want both actions to happen only after the problem has persisted, not at the moment the state goes from Up to Down. So in the While Down box we change Do nothing to wait 10 minutes, then “Email me”, and then “Restart Apache.” The finished state diagram should look like this:
This notification waits for 10 minutes after CPU usage goes above 70%; if the CPU usage is still above 70% at that point, BixServer emails you and restarts Apache.
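The “above the threshold for 10 minutes” condition can be pictured as a check over a series of CPU readings. The sketch below is only an illustration of that idea and not how BixServer evaluates it; the function name sustained_above and the assumption of one whole-number reading per minute are mine.

```shell
# Hypothetical helper: succeed if the most recent readings have all been
# above the threshold for at least the required number of samples.
#   $1 = threshold in percent (integer)
#   $2 = required number of consecutive samples
#   remaining arguments = CPU readings, oldest first (integers)
sustained_above() {
  threshold=$1
  required=$2
  shift 2
  run=0
  for sample in "$@"; do
    if [ "$sample" -gt "$threshold" ]; then
      run=$((run + 1))   # still above: extend the current run
    else
      run=0              # dropped to or below the threshold: start over
    fi
  done
  [ "$run" -ge "$required" ]
}

# Example, with one reading per minute:
#   if sustained_above 70 10 71 72 80 90 75 88 91 77 73 99; then
#     echo "high load for 10 minutes"
#   fi
```

A single dip to or below 70% resets the count, which matches the idea that the load must stay high for the whole wait.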
Kick back and relax
Your notifications are now all set up. You can close the BixDesktop until you want to make changes to your setup again. You will probably want to make your BixAgents and BixServers start automatically when the machine boots.
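One simple way to start the agent at boot, on cron implementations that support the @reboot keyword (as those shipped with most Linux distributions do), is a crontab entry for the account that runs it. This is a sketch rather than an official BixData recommendation, and /home/you/bixdata is a hypothetical install path; use the directory that actually contains bixagent.

```
# Edit with: crontab -e   (as the account that should run the agent)
# /home/you/bixdata is a hypothetical path
@reboot cd /home/you/bixdata && ./bixagent >out 2>&1
```

The same pattern should work for ./runserver.sh on the machine that runs the BixServer.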
You will also want to set up security on any machine that is reachable from the public internet: http://www.bixdata.com/security_agent
There are many other things BixData can monitor and do for you (we only briefly looked at 2 of the 9 tabs). So explore; it can make your life as a sysadmin much easier! For more information: www.bixdata.com