[Design] Collecting Usage Data/Central Logging

Ashkan Soltani ashkan at osafoundation.org
Mon Jun 26 15:59:49 PDT 2006


Hey all

I'm Ashkan, one of the interns working with the PPD team this
summer.  My project is to instrument a usage tracking component
within Chandler in an attempt to better understand how dogfooders are
using the app.  The idea is that if we can gather logging data, we can
make better assumptions about what people actually do on a regular
basis and focus our energy on improving those activities.

With a lot of JohnA's help, I've managed to put together a small
component that logs each blockEvent and some of its context to a file
or stderr.  (I should preface this by saying that I don't have any
previous experience with Python; most of my development has been in
Perl and C.)
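As a rough illustration of the kind of component described above, here is a minimal sketch using the standard logging module. It is written in current Python 3 syntax, and every name in it (make_usage_logger, log_block_event, the "usage" logger name, the event and context fields) is hypothetical rather than taken from Chandler.

```python
import logging
import sys


def make_usage_logger(path=None):
    """Return a logger that writes to `path`, or to stderr when path is None."""
    logger = logging.getLogger("usage")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()              # avoid duplicate handlers on re-setup
    if path:
        handler = logging.FileHandler(path)
    else:
        handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False             # keep usage data out of the root log
    return logger


def log_block_event(logger, event_name, context):
    """Write one event and its context as sorted key=value pairs on one line."""
    fields = " ".join(f"{key}={value}" for key, value in sorted(context.items()))
    logger.info("%s %s", event_name, fields)
```

Keeping each event on a single line of key=value pairs is only one possible layout; it was chosen here because it is easy to parse later for statistics.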

Now that I'm able to log the events, I need to find a way to
consolidate this logging information and get it back to us for
review/analysis (both automated statistics crunching and individual
workflow analysis).

The 'logging' workflow is:
	- dogfooders enable 'logging' via an item in the Chandler Test menu
	- they are prompted for a 'username' (basically an arbitrary
identifier) and shown a privacy disclaimer, and then a unique
userid is generated for them (or possibly pulled from a system
variable if it exists?)
	- the system logs user data, either on the local filesystem or to
some remote logserver
	- if the logging is local, then at some interval Chandler would
'send' this information to us somehow
	- users can disable/re-enable logging at any time
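
The opt-in part of this workflow could be modeled roughly as follows. This is only a sketch: LoggingPrefs and should_record are hypothetical names, and using a random UUID for the userid is an assumption, since the approach for generating it is still open.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class LoggingPrefs:
    """Per-user logging settings (hypothetical structure, not Chandler's)."""
    username: str                      # arbitrary identifier typed by the user
    userid: str = field(default_factory=lambda: uuid.uuid4().hex)
    enabled: bool = True

    def disable(self):
        self.enabled = False

    def enable(self):
        # The userid is kept, so logs from before and after line up.
        self.enabled = True


def should_record(prefs):
    """Only record events for users who have opted in and not disabled."""
    return prefs is not None and prefs.enabled
```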

Questions:

What would be the best way to 'collect' this data, given that users
may or may not have network access, or could be firewalled, etc.?
(*caveat: I'm looking for quick, low-hanging fruit so that more time
can be spent on the analysis side of the project.)

One idea I've considered is using HTTP, either in real time or via
a buffering system, to log the data to a central server.  There's
some direct support for this in the Python logging module under
HTTPHandler: http://docs.python.org/lib/module-logging.html
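
A minimal sketch of the buffered HTTP idea, using logging.handlers.HTTPHandler behind a logging.handlers.MemoryHandler. The host "logs.example.org", the "/usage" path, the "usage.http" logger name, and the capacity are all placeholders, not real endpoints or settings.

```python
import logging
import logging.handlers


def make_http_handler(host="logs.example.org", url="/usage"):
    """Build a handler that POSTs each record to host+url (placeholders)."""
    return logging.handlers.HTTPHandler(host, url, method="POST")


def make_buffered_logger(target, capacity=50):
    """Hold records in memory and pass them to `target` in batches."""
    buffer = logging.handlers.MemoryHandler(
        capacity,
        flushLevel=logging.CRITICAL,  # INFO events never force an early flush
        target=target,
    )
    logger = logging.getLogger("usage.http")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.addHandler(buffer)
    logger.propagate = False
    return logger, buffer
```

Usage would look something like make_buffered_logger(make_http_handler()). As far as I can tell, a record that fails to send is reported through the handler's error hook and then dropped rather than retried, so the no-network case would still need handling on top of this.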

An alternative would be to use a 'sync' procedure, either via rsync/ftp
or perhaps even the background sync module, to upload the logdata to
our servers.  The implementation for this would be a bit more
involved, especially since I'm not storing the logdata in the
repository and would need to figure out how to encapsulate it, but it
might be more compatible with the rest of the Chandler methodology.

The last/simplest approach would be to use plain SMTP to post the
information, since we know that more often than not users will have
'at least' SMTP access from within Chandler; it is a mail app
after all.
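
A minimal sketch of the SMTP option using the standard smtplib and email modules directly. It assumes the log is plain text, and the function names, subject line, and attachment name are hypothetical. Chandler's own mail code is not used here.

```python
import smtplib
from email.message import EmailMessage


def build_log_mail(userid, log_text, sender, recipient):
    """Wrap the log text as a plain-text attachment named after the userid."""
    msg = EmailMessage()
    msg["Subject"] = f"usage log {userid}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("Usage log attached.")
    msg.add_attachment(log_text, subtype="plain", filename=f"{userid}.log")
    return msg


def send_log_mail(msg, host, port=25):
    """Hand the message to an SMTP server; host and port come from the caller."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```

Splitting message construction from sending keeps the first part testable without a mail server, and the send step could later be swapped for whatever outgoing account the user has already configured.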

I've looked into all three approaches quite a bit and just wanted to  
get some feedback as to what people think would be the best way to do  
this.

-a





