[Design] Collecting Usage Data/Central Logging

Ted Leung twl at osafoundation.org
Wed Jun 28 14:14:02 PDT 2006


On Jun 26, 2006, at 3:59 PM, Ashkan Soltani wrote:

> Hey all
>
> I'm Ashkan, one of the interns working with the PPD team this
> summer.  My project is to instrument a usage tracking component
> within Chandler in an attempt to better understand how dogfooders
> are using the app.  The idea is that if we can gather logging data,
> we can make better assumptions about what people actually do on a
> regular basis and focus our energy on improving those activities.
>
> With a lot of JohnA's help, I've managed to put together a small
> component that logs blockEvent and some context to a file or
> stderr.  (I should preface this by saying that I don't have any
> previous exp w/ Python .... most of my development was in Perl and C)
>
> Now that I'm able to log the events, I need to find a way to  
> consolidate this logging information and get it back to us for  
> review/analysis (both automated statistics crunching as well as  
> individual workflow analysis)
>
> The 'logging' workflow is:
> 	- dogfooders enable 'logging' via a pulldown in the Chandler test
> menu
> 	- they are prompted for a 'username' (basically an arbitrary
> identifier), shown a privacy disclaimer, and then a unique
> userid is generated for them (or possibly pulled from a system
> variable if it exists?)
> 	- the system logs user data, either on the local filesystem or  
> some remote logserver
> 	- if the logging is local, then on some interval, chandler would  
> 'send' this information to us somehow
> 	- users can disable/re-enable logging at any time
>
> Questions:
>
> What would be the best way to 'collect' this data, given that users  
> may or may not have network access, or could possibly be  
> firewalled, etc. (*caveat: i'm looking for a quick low-hanging  
> fruit such that more time can be spent on the analysis side of the  
> project)
>
> One idea I've considered is using HTTP in real time (or via a
> buffering system) to log the data to a central server.  There's
> some direct support for this in the Python logging module under
> HTTPHandler:
> http://docs.python.org/lib/module-logging.html

I think that some kind of HTTP based solution is the way to go.
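
A minimal sketch of the buffered HTTP idea, using only the standard
library logging module.  The host, URL path, logger name, and helper
names below are placeholders made up for illustration, not anything
that exists in Chandler or on our servers.

```python
import logging
import logging.handlers
import uuid


def new_user_id():
    """Generate an arbitrary unique identifier for an opted-in user."""
    return uuid.uuid4().hex


def make_http_target(host="logs.example.org", url="/usage"):
    """Handler that POSTs each record's attributes to a central server.

    host and url are hypothetical; substitute the real collection endpoint.
    """
    return logging.handlers.HTTPHandler(host, url, method="POST")


def make_usage_logger(target, user_id, capacity=50):
    """Return (logger, buffer) where records are held in memory and
    forwarded to `target` in batches of `capacity` records."""
    buffered = logging.handlers.MemoryHandler(
        capacity=capacity,
        # Set above CRITICAL so severity alone never forces an early flush.
        flushLevel=logging.CRITICAL + 1,
        target=target,
    )
    logger = logging.getLogger("usage.%s" % user_id)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep usage events out of the normal app log
    logger.addHandler(buffered)
    return logger, buffered
```

Calling make_usage_logger(make_http_target(), new_user_id()) would give
a logger whose events go out over HTTP once the buffer fills, or when
flush() is called on the returned buffer (for example at shutdown).
Note that HTTPHandler does not retry: if the POST fails because the
user is offline, those records are dropped, so a more careful version
would probably spool to a local file first.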

>
> Alternatives would be to use a 'sync' procedure, either via
> rsync/ftp or perhaps even the background sync module, to upload the
> logdata to our servers.  The implementation for this stuff would be
> a bit more involved, especially since I'm not storing the logdata in
> the repository and would need to figure out how to encapsulate it,
> but it might be more compatible with the rest of the Chandler
> methodology.

I think you are going to end up with firewalling problems here if you
use a protocol other than HTTP, or something that can't tunnel over
HTTP.   Actually, I think that in some version of the future, storing
these traces in the repository might be interesting.

>
> The last/simplest approach would be to use simple SMTP to post the  
> information since we know that more often than not, users will have  
> 'at least' SMTP access from within Chandler since it is a mail app  
> after all.

I don't know that you can count on people having set all the SMTP  
settings.  I am dogfooding the calendar, but I do not have the SMTP  
or IMAP settings configured.

Another application for your instrumentation machinery is to build an  
Attention Recorder <http://www.attentiontrust.org/services> for  
Chandler.   While this is probably out of scope for Beta / 1.0, this  
is something that I am personally very interested in investigating /  
experimenting with.   Is your stuff checked in somewhere?

Ted



