[Cosmo] A Brief Introduction

Brian Kirsch bkirsch at osafoundation.org
Mon Dec 5 15:04:41 PST 2005


Hello All,
My name is Brian Kirsch. I am the Email Framework and 
Internationalization engineer for OSAF's Chandler project.

I was asked to help out with Cosmo development by tracking down memory 
leaks and performance bottlenecks in the current Cosmo server 
implementation.

Although I have been working in Python and C++ during my time at OSAF, I 
do have a background in Java server development.

I have developed with Tomcat, Struts, and many of the other technologies 
leveraged in Cosmo at previous jobs with Zaplet, Appiant, and JobFlash.

At Excite at Home I wrote a high-performance, web-based file server in 
C++ and did quite a bit of memory debugging and profiling for that 
project using gprof and Rational Purify.

This is my first time profiling and searching for memory leaks in a 
Java server-based application.

Java, as we all know, poses a unique set of performance and memory 
debugging challenges, being a garbage-collected language.
Another big issue is class loaders not getting unloaded because a 
reference to them remains in the container.

Complicating matters further is the number of third-party libraries 
(jars) that both Cosmo and its underlying application containers leverage.
 

I spent my initial time on this task doing in-depth research on the web 
about garbage collection tuning and memory profiling. I also took some 
time to catch up on the latest additions to Java in 1.5 (it's been over 
a year and a half since I have developed in Java).

Next I installed the Java 1.5 JDK on my OS X machine and downloaded 
Cosmo 2.4 from the SVN tree.
I quickly found that very few commercial profilers exist for OS X. In 
fact, the only one I found was JProfiler. So I downloaded it and
requested an evaluation license.

The profiler attaches quite easily to a Tomcat instance, which was nice. 
However, the tutorial documentation was out of date, which made it hard 
to figure out how to use the thing for memory leak detection.

I then tried to fire up hammer, the Cosmo load testing tool written in 
Python. I quickly discovered that hammer requires pycurl, which is a 
Python interface to the libcurl library. Pycurl requires GCC 4.0 and a 
newer version of libcurl to work. So I ended up spending about three 
hours downloading and configuring OS X with the required dependencies, 
but I did finally get pycurl working.

So I fired up JProfiler with Cosmo and began load testing the server 
with the hammer.py threaded web client.
I could see the memory profile increasing over time. With each request 
the number of collection instances (Hashtable, HashMap, etc.) 
increased dramatically, as did the number of Java String instances. This 
is a clear sign that something is storing references to items at the 
session or application scope, thus preventing the garbage collector from 
reclaiming the no-longer-used instances.
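
To illustrate the pattern described above, here is a minimal sketch of 
how an application-scoped collection can pin per-request objects. This 
is not Cosmo code; the class name SessionLeakSketch and the static map 
REQUEST_CACHE are hypothetical, made up only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not taken from Cosmo.
class SessionLeakSketch {
    // Assumed application-scoped registry: it lives as long as the class
    // stays loaded, so everything it references stays reachable.
    private static final Map<String, String> REQUEST_CACHE =
            new HashMap<String, String>();

    // Simulates handling one request. The per-request data is stored in
    // the long-lived map and never removed, so the GC cannot reclaim it.
    static void handleRequest(int requestId) {
        String key = "request-" + requestId;
        REQUEST_CACHE.put(key, "payload for " + key);
    }

    // Number of entries still held by the long-lived map.
    static int retainedEntries() {
        return REQUEST_CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest(i);
        }
        System.out.println("Retained entries after 1000 requests: "
                + retainedEntries());
    }
}
```

In a profiler this kind of code would show up as steadily growing 
HashMap entry and String counts, since each request adds entries that 
nothing ever removes.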

What one would like to see is a stable increase in memory per request 
that is then reclaimed on the next garbage collection.
Lingering references will result in long garbage collection runs and 
out-of-memory errors, both of which Cosmo is currently experiencing.
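
As a rough way to watch for this without a profiler, the sketch below 
samples used heap after a garbage collection hint between rounds of 
simulated requests. The class HeapTrendSketch and its handleRequest 
method are hypothetical stand-ins, not part of Cosmo or hammer, and 
System.gc() is only a request that the JVM is free to ignore, so the 
numbers should be read as a trend rather than an exact measurement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from Cosmo.
class HeapTrendSketch {
    // Used heap in bytes as reported by the running JVM.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Simulates a well-behaved request: it allocates temporary data and
    // keeps no reference to it once the method returns.
    static int handleRequest(int requestId) {
        List<String> scratch = new ArrayList<String>();
        for (int i = 0; i < 100; i++) {
            scratch.add("request-" + requestId + "-item-" + i);
        }
        return scratch.size();
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 5; round++) {
            for (int i = 0; i < 10000; i++) {
                handleRequest(i);
            }
            System.gc(); // a hint only; the JVM may ignore it
            System.out.println("round " + round
                    + ": used heap after GC hint = "
                    + (usedHeapBytes() / 1024) + " KB");
        }
    }
}
```

If the figure stays roughly flat from round to round, temporary 
objects are being reclaimed. If it climbs steadily, something is 
probably holding references to them.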

I found it very hard with JProfiler to really get any specific 
information as to which components were causing the memory leaks.
I could find that HashMaps were growing over time, but determining which 
code was causing these problems proved difficult.

I decided that working with JProfiler and the Apple VM was not the way 
to go.

We mentioned JRockit, an excellent JVM, in the Cosmo server meeting as a 
better choice than the Sun JVM for efficient garbage collection. However, 
JRockit is not officially supported on Debian Linux, our Cosmo server 
platform.

Nevertheless, I was able to get it to install and run on Debian, which 
is good news since JRockit includes some excellent memory and 
performance tools as well as a number of options to fine-tune garbage 
collection.

It also offers a lightweight management console which can attach to a 
running production server for memory and performance profiling without 
adversely affecting that production server's performance.

Since JRockit is also supported on Windows, my other development 
machine, I decided to install and run Cosmo with JRockit on Windows XP. 
Another advantage is that Optimizeit, JProbe, and a host of other 
commercial profilers run on Windows.


I used Cygwin to leverage the existing Cosmo build and startup scripts.
I then installed JRockit and Maven. I downloaded a Windows version of 
pycurl but found that it needed a Python 2.4 Windows registry entry, 
which Cygwin does not provide even though my Cygwin had a Python 2.4 
executable.
So I downloaded and installed the Python 2.4 installer release from 
python.org. I set all my paths to point to JRockit and Python 2.4.

I also installed and configured the memory leak and performance tools 
for JRockit.

So now I have everything working on Windows and it is on to testing.

The JRockit memory profiler has a nice feature that helps pinpoint the 
code points responsible for memory leaks, something that JProfiler does 
not do.

I welcome any thoughts, comments, suggestions, or help with Cosmo memory 
and performance profiling.

As I move further along in the project I will formalize my thoughts and 
experiences more either via wiki or email.

Since starting on this profiling challenge I have come across a number 
of very good articles which I will share below.

The first is a series of articles by Attila Szegedi which illustrates 
just how hard a task memory and performance profiling can be:

http://www.szegedi.org/articles/memleak.html
http://www.szegedi.org/articles/memleak2.html
http://www.szegedi.org/articles/memleak3.html

Other articles:
http://tomcat.apache.org/faq/memory.html
http://www-128.ibm.com/developerworks/java/library/j-jtp11225/index.html?ca=dgr-lnxw06PlugLeaks
http://opensource2.atlassian.com/confluence/spring/pages/viewpage.action?pageId=2669
http://java.sun.com/j2se/1.5.0/docs/guide/vm/gc-ergonomics.html
http://www.informit.com/guides/content.asp?g=java&seqNum=27&rl=1
http://dev2dev.bea.com/pub/a/2005/06/memory_leaks.html
http://e-docs.bea.com/wljrockit/docs50/tuning/config.html#1014163
http://wiki.apache.org/jakarta-commons/Logging/UndeployMemoryLeak?action=print


-Brian




-- 
Brian Kirsch - Email Framework Engineer
Open Source Applications Foundation
543 Howard St. 5th Floor
San Francisco, CA 94105
(415) 946-3056
http://www.osafoundation.org



