Monitoring your applications using Monit

Lately, I have been adding support for Java applications in depro.pl (a deployment engine I coded a few years ago). I have a cron script that deploys new changes automatically.

While working on it I realized that deploying stand-alone Java applications is not the same as deploying PHP web applications. The process is not quite as simple. For one, Java applications can just blow up and refuse to even start. For another, despite our best efforts as professional software engineers, we do manage to introduce quite a few bugs in our code, and these can crash an application even after it has managed to lift off the ground successfully.

For example, earlier versions of the TeaBreak aggregation engine (which is written in Java) used to run smoothly but had a memory leak that would eventually exhaust the heap space and terminate the process with an OutOfMemoryError (OOME). The point here is that Java applications (or stand-alone applications in general) can cease to run after you’ve successfully deployed them. What I needed was a small monitoring engine to alert me and, if needed, restart an application.

There are various options out there, such as Watchdog, Monit, Nagios and Munin. Nagios and Munin are advanced, full-featured monitoring packages that typically include graphing modules to represent data points visually. They seemed like overkill for my requirements:

  • Easily configure a Java application for monitoring when it is deployed
  • Poll process / port to see if the application is up
  • Alert and restart if necessary

I didn’t need a graphing solution, nor did I require plugins to do other fancy stuff. All I needed was an easy way to tell the monitoring engine that application fooBar has started and that it should now start monitoring it on so-and-so port. I found Monit to be the right fit for the use case: it’s simple, lightweight and very easy to configure.
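To make the "poll process / port" requirement concrete, here is a minimal sketch in Python of what a port poll boils down to: a TCP connect attempt with a timeout. This is only an illustration of the idea, not how Monit implements its checks, and the function name port_is_open is made up for this example.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only proves that something is
    listening, not that the application behind the port is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8020 is the port used by the example application later in this post.
    print("up" if port_is_open("localhost", 8020) else "down")
```

A bare connect like this cannot tell a hung application from a healthy one, which is why an HTTP-level check against a known URL is the more useful test.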

Monit’s Syntax

Monit’s official documentation is a bit confusing (at least at first). It uses a pseudo-natural-language syntax, something like:

set alert mailbox@myDomain.com but not on { instance }

Now, the thing about natural-language syntax is that there are multiple ways of saying the same thing, so either your interpreter has to be really flexible or your documentation needs to make a clear distinction between what can and can’t be done. Otherwise the user starts anticipating other natural-looking constructs that might not be correct (which is what bogged me down for a while), e.g.:

set alert mailbox@myDomain.com on { uid } and not on { instance }

Setting Up Monit and Automating Monitoring

Anyway, after spending some time playing around with their documentation and examples, I finally figured out the right configuration for my applications. A couple of things to note:

  • You don’t need to be root to start Monit’s daemon process
  • When starting up, Monit looks for a configuration file in a predefined order. Alternatively, you can provide it a path to the config file
  • You can include multiple configuration files by using the standard include keyword (globs are supported)

So the setup that I went for was to have the main Monit configuration file (a.k.a. the control file) in one of the standard directories that Monit looks in. Within the main control file I instructed Monit to include all of my applications’ configuration files from a specific directory, something like:

include /home/asim/depro/var/apps.*

This way I can automate the process by dropping a Monit config file for my application into the directory whenever I deploy it. Similarly, when removing or stopping an application, I simply remove the config file. Of course, Monit needs a reload to pick up any configuration changes.
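The drop-in and removal steps can be sketched in a few lines of Python. This is a rough illustration rather than what depro.pl actually does: the function names are made up, and the apps.<name> file naming is an assumption chosen to match the include glob shown above. The only Monit command used is monit reload.

```python
import os
import subprocess
import tempfile


def install_config(conf_dir, app_name, config_text):
    """Write <conf_dir>/apps.<app_name> atomically and return its path.

    The "apps." prefix is an assumption made to match the include glob.
    """
    os.makedirs(conf_dir, exist_ok=True)
    target = os.path.join(conf_dir, "apps." + app_name)
    # Write to a temp file first so Monit never reads a half-written file;
    # the ".tmp-" prefix keeps it from matching the apps.* glob.
    fd, tmp_path = tempfile.mkstemp(dir=conf_dir, prefix=".tmp-")
    with os.fdopen(fd, "w") as handle:
        handle.write(config_text)
    os.replace(tmp_path, target)
    return target


def remove_config(conf_dir, app_name):
    """Delete the app's config file; return False if it was not there."""
    try:
        os.remove(os.path.join(conf_dir, "apps." + app_name))
        return True
    except FileNotFoundError:
        return False


def reload_monit():
    """Ask the running Monit daemon to re-read its control files."""
    subprocess.run(["monit", "reload"], check=True)
```

A deploy would call install_config followed by reload_monit, and an undeploy would call remove_config followed by reload_monit.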

Example Configuration

Here’s an example of a Monit config that my application automatically generates and drops into the directory when it is deployed:

####
## Project Name: helloWorld
## Base DIR:     /home/asim/depro/deploys/helloWorld
####
check process java_helloWorld
    with pidfile /home/asim/depro/deploys/helloWorld/var/StartJavaApp.pl.pid
    start program = "/bin/sh -c '/home/asim/depro/deploys/helloWorld/bin/kickoff_detached.sh'"
                       with timeout 60 seconds
    stop program = "/bin/sh -c '/home/asim/depro/deploys/helloWorld/bin/stop.sh'"
    if cpu > 60% for 3 cycles then restart
    if totalmem > 250.0 MB for 5 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 8 cycles then alert
    if failed host localhost port 8020 protocol http
       and request "/ping"
       with timeout 10 seconds for 3 cycles
       then restart
    if 3 restarts within 5 cycles then timeout
    group java_services

#--#

Basically, the configuration instructs Monit to monitor the PID file, the port, the HTTP response and the usage of system resources (CPU, memory, number of child processes and load average). It either alerts or restarts the application when a threshold is breached.
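Since a config like the one above differs between applications only in a few values, generating it is mostly a matter of filling in a template. Here is a minimal sketch in Python, assuming the project name, base directory and port are the only variable parts; the template is a trimmed copy of the example (resource thresholds omitted), and render_monit_config is a hypothetical name, not part of depro.pl.

```python
import re
from string import Template

# Trimmed version of the example config; ${...} marks the variable parts.
MONIT_TEMPLATE = Template("""\
check process java_${name}
    with pidfile ${base_dir}/var/StartJavaApp.pl.pid
    start program = "/bin/sh -c '${base_dir}/bin/kickoff_detached.sh'"
        with timeout 60 seconds
    stop program = "/bin/sh -c '${base_dir}/bin/stop.sh'"
    if failed host localhost port ${port} protocol http
       and request "/ping"
       with timeout 10 seconds for 3 cycles
       then restart
    if 3 restarts within 5 cycles then timeout
    group java_services
""")


def render_monit_config(name, base_dir, port):
    """Return the Monit config text for one deployed application."""
    # Reject names that could break the generated config syntax.
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", name):
        raise ValueError("unsafe project name: %r" % name)
    if not 0 < int(port) < 65536:
        raise ValueError("port out of range: %r" % port)
    return MONIT_TEMPLATE.substitute(
        name=name, base_dir=base_dir.rstrip("/"), port=int(port)
    )


if __name__ == "__main__":
    print(render_monit_config(
        "helloWorld", "/home/asim/depro/deploys/helloWorld", 8020))
```

Validating the name before substitution matters because the value ends up inside a quoted shell command in the generated file.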

I found dealing with Monit simple enough and was able to automate it quickly. It does its job as expected. One other thing: if you have configured the Monit daemon to start when your server boots, all your monitored services will also start up automatically without any extra configuration (they will fail the check and Monit will start them).
