[Design] Collecting Usage Data/Central Logging
Ted Leung
twl at osafoundation.org
Wed Jun 28 14:14:02 PDT 2006
On Jun 26, 2006, at 3:59 PM, Ashkan Soltani wrote:
> Hey all
>
> I'm Ashkan, one of the interns working with the PPD team this
> summer. My project is to instrument a usage tracking component
> within Chandler in an attempt to better understand how dogfooders
> are using the app. The idea is if we can gather logging data, we
> can make better assumptions of what people actually do on a regular
> basis and focus energies in improving those activities.
>
> With a lot of JohnA's help, I've managed to put together a small
> component that logs blockEvent and some of its context to a file or
> stderr. (I should preface this by saying that I don't have any
> previous exp w/ Python .... most of my development was in Perl and C)
>
> Now that I'm able to log the events, I need to find a way to
> consolidate this logging information and get it back to us for
> review/analysis (both automated statistics crunching as well as
> individual workflow analysis)
>
> The 'logging' workflow is:
> - dogfooders enable 'logging' via a pulldown in the Chandler test
> menu
> - they are prompted for a 'username' (basically an arbitrary
> identifier), displayed some privacy disclaimer, and then a unique
> userid is generated for them (or possibly pulled from a system
> variable if it exists?)
> - the system logs user data, either on the local filesystem or
> some remote logserver
> - if the logging is local, then on some interval, chandler would
> 'send' this information to us somehow
> - users can disable/re-enable logging at any time
>
> Questions:
>
> What would be the best way to 'collect' this data, given that users
> may or may not have network access, or could possibly be
> firewalled, etc. (*caveat: i'm looking for a quick low-hanging
> fruit such that more time can be spent on the analysis side of the
> project)
>
> One possible idea I've considered is using HTTP in real time (or via
> implementing a buffering system) to log the data to a central
> server. There's some direct support for this in the python logging
> module under HTTPHandler:
> http://docs.python.org/lib/module-logging.html
I think that some kind of HTTP based solution is the way to go.
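
A minimal sketch of the buffered HTTP idea, written against the current
Python standard library rather than anything in Chandler. The class name
BufferedHTTPHandler, the helper post_json, the JSON payload shape, and the
server URL are all hypothetical and only illustrate the approach. Events are
held in memory and posted in batches; if the post fails (no network,
firewall), the batch is kept and retried on the next flush.

```python
import json
import logging
import logging.handlers
import urllib.request


def post_json(url, payload, timeout=10):
    """POST a JSON payload; raises an OSError subclass on network failure."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


class BufferedHTTPHandler(logging.handlers.BufferingHandler):
    """Hypothetical handler: buffer records, send them over HTTP in batches."""

    def __init__(self, url, capacity=50, send=post_json):
        super().__init__(capacity)
        self.url = url      # assumed collection endpoint
        self.send = send    # injectable so the transport can be swapped

    def flush(self):
        self.acquire()
        try:
            if not self.buffer:
                return
            batch = [
                {"time": r.created, "event": r.getMessage()}
                for r in self.buffer
            ]
            try:
                self.send(self.url, batch)
            except OSError:
                # Offline or blocked: keep the records and retry later.
                return
            self.buffer.clear()
        finally:
            self.release()
```

Attaching it would look something like
logging.getLogger("usage").addHandler(BufferedHTTPHandler(url)), where url
points at whatever collection server ends up existing. A real version would
also want to cap the buffer size or spill to disk while the user is offline.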
>
> Alternatives would be to use a 'sync' procedure, either via rsync/
> ftp or perhaps even the background sync module to upload the
> logdata to our servers. The implementation for this stuff would be
> a bit more involved, especially since I'm not storing the logdata in
> the repository and would need to figure out how to encapsulate it,
> but it might be more compatible with the rest of the chandler
> methodology.
I think you are going to end up with firewalling problems here if you
use a protocol other than HTTP, or something that can't tunnel over
HTTP. Actually, I think that in some version of the future, storing
these traces in the repository might be interesting.
>
> The last/simplest approach would be to use simple SMTP to post the
> information since we know that more often than not, users will have
> 'at least' SMTP access from within Chandler since it is a mail app
> after all.
I don't know that you can count on people having set all the SMTP
settings. I am dogfooding the calendar, but I do not have the SMTP
or IMAP settings configured.
Another application for your instrumentation machinery is to build an
Attention Recorder <http://www.attentiontrust.org/services> for
Chandler. While this is probably out of scope for Beta / 1.0, this
is something that I am personally very interested in investigating /
experimenting with. Is your stuff checked in somewhere?
Ted