Rusty Japikse
My Life in Bits
Catching Spinning Zope Databases and Errant Helper Processes
Problem Background
Zope, the application server and object database that powers Plone, can occasionally get stuck in a spinning state where the CPU load goes to 99%+ and the database is effectively locked. Solving this is easy: you just need to restart the Zope session. Of course, that is an entirely useless approach if you are unable to personally monitor your server all the time. Additionally, some of the helper applications commonly used with Plone occasionally encounter problems of their own and lock up at a high CPU load. This is not ideal behavior either.
Additional reading on spinning Zope databases: http://www.zope.org/Members/4am/debugspinningzope
The Solution
Not wanting to spend my days monitoring our intranet server for spinning Zope sessions or errant helper processes, I wrote a monitoring application in Python (zope_monitor.py). This application is then daemonized using a script downloaded from the Python Cookbook. I have included a copy here, but you can use whatever you like.
To use this script, download zope_monitor.py and zope_monitord.py, then edit the configuration variables in zope_monitor.py. These variables determine which helper processes are monitored, the user accounts these processes run under, and how long they are allowed to run before they are killed (kill -9 PID). In my configuration, instances of pdftotext, pdftohtml, wvWare, and xlhtml running as 'www' are monitored. These applications are allowed to run for twenty seconds before they are killed. The status of these processes is checked every two seconds.
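To make the helper-process check concrete, here is a minimal sketch of how such a check could work. It is not the code from zope_monitor.py: the variable names (HELPER_NAMES, HELPER_USERS, HELPER_MAX_SECONDS, POLL_SECONDS) and the functions are made up for illustration, and it assumes a ps that accepts the POSIX -A and -o options. The values mirror my configuration described above.

```python
import os
import signal
import subprocess

# Hypothetical configuration names; the real zope_monitor.py may differ.
HELPER_NAMES = {"pdftotext", "pdftohtml", "wvWare", "xlhtml"}
HELPER_USERS = {"www"}
HELPER_MAX_SECONDS = 20   # allowed run time before a helper is killed
POLL_SECONDS = 2          # how often the caller would repeat the check


def parse_etime(etime):
    """Convert a ps elapsed time ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:          # pad missing hours with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def overdue_pids(ps_output):
    """Return PIDs of monitored helpers that have run too long."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)   # pid, user, etime, full command
        if len(fields) < 4:
            continue
        pid, user, etime, command = fields
        name = os.path.basename(command.split()[0])
        if (user in HELPER_USERS and name in HELPER_NAMES
                and parse_etime(etime) > HELPER_MAX_SECONDS):
            pids.append(int(pid))
    return pids


def kill_overdue():
    """List all processes and send SIGKILL (kill -9) to overdue helpers."""
    out = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "user=", "-o", "etime=", "-o", "args="],
        capture_output=True, text=True, check=True).stdout
    for pid in overdue_pids(out):
        os.kill(pid, signal.SIGKILL)
```

A monitoring loop would call kill_overdue() and then sleep for POLL_SECONDS. Keeping the parsing in overdue_pids() separate from the actual kill makes it possible to try the matching rules on captured ps output before letting the script loose on a live server.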
I have also set the CPU threshold for a spinning Zope session to 99% with a time interval of one minute before the Zope server is restarted. Be sure to also configure the ZOPE_RESTART variable. This constant contains the command used to restart a Zope session. It is currently set to '/usr/local/etc/rc.d/zope.sh restart', a common path for restarting Zope on a FreeBSD system.
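The spinning check can be sketched in the same spirit. Only ZOPE_RESTART and its value come from the script; ZOPE_CPU_THRESHOLD, ZOPE_SPIN_SECONDS, SpinDetector, and restart_zope are hypothetical names. The sketch also assumes that "a time interval of one minute" means the CPU reading must stay at or above the threshold for a full minute without dipping below it.

```python
import subprocess

# ZOPE_RESTART is the constant described above; the other names are
# hypothetical and may not match the real zope_monitor.py.
ZOPE_CPU_THRESHOLD = 99.0   # percent CPU that counts as spinning
ZOPE_SPIN_SECONDS = 60      # how long the load must persist
ZOPE_RESTART = "/usr/local/etc/rc.d/zope.sh restart"


class SpinDetector:
    """Track CPU samples and report when Zope has been spinning too long."""

    def __init__(self, threshold=ZOPE_CPU_THRESHOLD, limit=ZOPE_SPIN_SECONDS):
        self.threshold = threshold
        self.limit = limit
        self.high_since = None   # time of the first sample above threshold

    def update(self, cpu_percent, now):
        """Record one CPU sample; return True when a restart is due."""
        if cpu_percent < self.threshold:
            self.high_since = None       # load dropped, start over
            return False
        if self.high_since is None:
            self.high_since = now
        if now - self.high_since >= self.limit:
            self.high_since = None       # reset so we do not restart twice
            return True
        return False


def restart_zope():
    """Run the configured restart command."""
    subprocess.run(ZOPE_RESTART.split(), check=False)
```

The caller would feed update() the Zope process's current CPU percentage together with time.time() on each polling pass, and call restart_zope() whenever it returns True. Passing the clock value in as an argument keeps the detector easy to exercise with made-up timestamps.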
Note: I run this script as root. If you are not comfortable doing this, you will have to figure out a way to restart the Zope server as a non-root user.
Downloads
- The Zope monitoring script (zope_monitor.py)
- The daemonizing script (zope_monitord.py)