[Cosmo-dev] Infrastructure for metrics

Jared Rhine jared at wordzoo.com
Tue Jan 2 12:11:19 PST 2007


Proposal:
Enhance Cosmo code to include account names in the "user" field of the
common log format generated by Tomcat.

Background:
When Preview launches, the hosted service will want to collect some
metrics about usage.

We're primarily interested in "number of visitors" from the different
user profiles (Chandler/hub, casual collaborator, consultative,
standalone, interop/iCal/Outlook, dev/mashup).  Identifying "accounts"
and "anonymous" accesses is key to counting up visitors.

Under Cosmo 0.5 and earlier, I relied on post-processing of
Apache/Tomcat HTTP access logs for extracting core metrics.  In
particular, I would slice-and-dice the URL path to infer from
"/cosmo/home/jared/Work/..." that account "jared" was accessed.
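
As a rough illustration of that path-slicing step, here is a minimal
Python sketch.  It is not the actual post-processing script; the
function name and the assumption that every account path starts with
"/cosmo/home/" are mine, taken only from the example path above.

```python
from typing import Optional

# Prefix taken from the example path in this message; assumed, not verified
# against every Cosmo 0.5 URL layout.
HOME_PREFIX = "/cosmo/home/"


def account_from_path(path: str) -> Optional[str]:
    """Return the account name from a /cosmo/home/<account>/... path.

    Returns None when the path does not follow that layout, which is the
    case this proposal is worried about (for example, UUID-based URLs).
    """
    if not path.startswith(HOME_PREFIX):
        return None
    # The first path segment after the prefix is treated as the account.
    account = path[len(HOME_PREFIX):].split("/", 1)[0]
    return account or None
```

Any path that does not start with the assumed prefix yields None, so
those requests cannot be attributed to an account by this method.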

With Cosmo 0.6 emphasizing URLs with UUIDs instead of paths, in some
cases it won't be easy to determine from post-analysis which accounts
were accessed.

In terms of baby-step features to improve the ability to analyze Cosmo
instances from access logs, the idea has been floated to replace the
"extract account name from URL" procedures with direct access to the
account name, placed into the standard access logs.  The common log
format has a dedicated field for this piece of data for each request.
Seems like I could implement a lot of analysis myself if this field got
filled in.
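
To make the idea concrete, here is a hedged Python sketch of the kind
of analysis that becomes possible once the field is populated.  It
assumes the standard common log format layout (host, ident, authuser,
bracketed timestamp, quoted request line, status, bytes), where "-" in
the authuser position means no authenticated user.  The regex and the
function name are illustrative only, not existing tooling.

```python
import re
from collections import Counter
from typing import Iterable

# Common log format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)'
)


def count_requests_by_user(lines: Iterable[str]) -> Counter:
    """Count requests per value of the CLF user field.

    The key "-" collects requests with no authenticated user, i.e. the
    accesses this proposal would treat as anonymous.
    """
    counts: Counter = Counter()
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in common log format
        counts[match.group(3)] += 1
    return counts
```

Feeding an access log through this, e.g.
count_requests_by_user(open(path)) with path pointing at a log file,
would give a per-account request count without touching the URL at all.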

If it's agreed that this is feasible and a good approach, I'll open the
Cosmo ticket.

It's also been discussed that I could probably get some of this info
from osafsrv.log now that the HttpLogFilter has been added to the code
and tracks the user.  But this feels hacky and overloads a debug-type
feature to provide mission-critical usage analysis.

I have a short-list of additional "events" I'll want to be able to
clearly identify through log analysis:

* Successful new account signups
* Successful logins to the web UI
* Anonymous access to a collection via web UI (versus
authenticated-by-password access)
* Authenticated-by-password users accessing the web UI
* Updates to events by Chandler
* Updates to events via the web UI

(These are "base markers", not metrics by themselves.  If I can count
these events by looking at logs, then I can build the "real" metrics on
top of these measurements.)

Some of these I think already have unique markers in access logs; I need
to do a full analysis of recent log output.  Others may need Cosmo
enhancements/changes to implement.  If you've immediate thoughts upon
seeing this list, please comment.  Otherwise, I'll revisit this longer
list soon in a more detailed "metrics" plan for the hosted service.

-- 
Jared Rhine <jared at wordzoo.com>



