[Chandler-dev] Collecting Usage Data/Central Logging
Mike Taylor
bear at code-bear.com
Tue Jun 27 21:18:11 PDT 2006
On Jun 26, 2006, at 7:18 PM, Ashkan Soltani wrote:
> Questions:
>
> What would be the best way to 'collect' this data, given that users
> may or may not have network access, or could possibly be firewalled,
> etc. (*caveat: i'm looking for a quick low-hanging fruit such that
> more time can be spent on the analysis side of the project)
>
> Possible ideas I've considered include using HTTP in real time (or via
> implementing a buffering system) to log the data to a central server.
> There's some direct support for this in the python logging module
> under HTTPHandler: http://docs.python.org/lib/module-logging.html
>
> Alternatives would be to use a 'sync' procedure, either via rsync/ftp
> or perhaps even the background sync module to upload the logdata to
> our servers. The implementation for this stuff would be a bit more
> involved, especially since I'm not storing the logdata in the
> repository and would need to figure out how to encapsulate it, but it
> might be more compatible with the rest of the chandler methodology.
>
> The last/simplest approach would be to use simple SMTP to post the
> information since we know that more often than not, users will have
> 'at least' SMTP access from within Chandler since it is a mail app
> after all.
Here are some questions and concerns I thought about while reading your
post:
You're right to be concerned about outgoing ports and transport
mechanisms - unless we have a server listening on one of the well-known
ports (80, 8080, 443, etc.), the vast majority of users will not be
able to use the service. This may also be a concern if you decide to
use email - a lot of people are behind mail servers that limit
outbound email to X messages per minute or X amount of traffic, and
email would also require some sort of configuration step so Chandler
could authenticate to their SMTP server.
My biggest concern is that we could be logging personal or other
sensitive information, even indirectly, so either the transport stream
would have to be encrypted or the data itself would have to be
encrypted and then sent as binary data.
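As a rough sketch of the "encrypt the transport" option, assuming we
go with the logging module's HTTPHandler from your post: the host and
URL here are placeholders (no such OSAF endpoint exists that I know
of), and make_usage_handler is just a name I made up for illustration.

```python
import logging
import logging.handlers
import ssl


def make_usage_handler(host, url, capacity=200):
    """Return a buffering handler that forwards records over HTTPS.

    host and url are placeholders, not a real OSAF service.
    """
    # secure=True makes HTTPHandler use HTTPS, so records travel over TLS
    # on port 443, which most firewalls allow outbound.
    target = logging.handlers.HTTPHandler(
        host,
        url,
        method="POST",
        secure=True,
        context=ssl.create_default_context(),
    )
    # MemoryHandler keeps records in memory and passes them to the target
    # once `capacity` records have accumulated, or when a record at
    # flushLevel or above arrives. CRITICAL keeps ordinary usage records
    # buffered until the buffer fills.
    return logging.handlers.MemoryHandler(
        capacity, flushLevel=logging.CRITICAL, target=target
    )
```

Note that HTTPHandler still issues one request per record when the
buffer is flushed, so this only defers the traffic; it does not reduce
the number of requests.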
We would also need to keep an eye on the number of data packets
being sent to this server so we can find out sooner rather than later
whether front-end load (receiving the data) or back-end load
(expanding the data and putting it somewhere that devs can access)
will be the bottleneck.
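One way to keep the packet count down would be to batch records on the
client and compress each batch, which is also where the "expanding"
cost on the back end would come from. A minimal sketch, assuming the
records are plain dicts and that JSON lines plus gzip is an acceptable
format (both are my assumptions, nothing has been decided):

```python
import gzip
import json


def pack_batch(records):
    """Client side: serialize a list of dicts as gzip-compressed JSON lines."""
    lines = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    return gzip.compress(lines.encode("utf-8"))


def unpack_batch(blob):
    """Server side: expand a compressed batch back into a list of dicts."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]
```

Timing pack_batch and unpack_batch on realistic batches would give an
early read on which side of the pipe is likely to hurt first.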
Your idea of using rsync doesn't seem practical to me: all of this
data is newly generated on the client side, so rsync would just end
up sending it all anyway. We may as well stick to HTTP PUT or WebDAV
or something like that.
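For the HTTP PUT route, something along these lines would probably be
enough on the client. This is only a sketch: the URL layout and the
function names are hypothetical, and it assumes the server accepts an
opaque binary body.

```python
import urllib.request


def build_put_request(url, payload):
    """Build (but do not send) an HTTP PUT request carrying payload bytes."""
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def upload(url, payload, timeout=30):
    """Send the payload and return the HTTP status code.

    Raises urllib.error.URLError when the network or server is
    unreachable, so the caller can keep the data and retry later.
    """
    request = build_put_request(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Using an https:// URL here would also cover the transport encryption
concern without any extra code.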
Using sftp or ftp is, IMO, also a non-starter, as those protocols have
plenty of issues from a security standpoint. Just watch how quickly
our IP would be deluged by script kiddies when they find out we have
an FTP port that allows PUTs using computer-generated user IDs.
One method I would propose would be to use XMPP and a pub/sub setup.
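To give a feel for what that might look like on the wire, here is a
sketch that builds a publish request in the shape defined by XEP-0060
(Publish-Subscribe). The service JID, node name, and the
urn:example:chandler-usage payload namespace are all made up for
illustration, and a real client would send this through an XMPP
library over an authenticated stream rather than build it by hand.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"  # namespace from XEP-0060
USAGE_NS = "urn:example:chandler-usage"  # hypothetical payload namespace


def build_publish_iq(service_jid, node, stanza_id, fields):
    """Return a pub/sub publish <iq/> stanza as a string.

    fields maps element names (must be valid XML names) to values.
    """
    iq = ET.Element("iq", {"type": "set", "to": service_jid, "id": stanza_id})
    pubsub = ET.SubElement(iq, "{%s}pubsub" % PUBSUB_NS)
    publish = ET.SubElement(pubsub, "{%s}publish" % PUBSUB_NS, {"node": node})
    item = ET.SubElement(publish, "{%s}item" % PUBSUB_NS)
    # The payload is a single namespaced element with one child per field.
    entry = ET.SubElement(item, "{%s}usage" % USAGE_NS)
    for name, value in sorted(fields.items()):
        ET.SubElement(entry, "{%s}%s" % (USAGE_NS, name)).text = str(value)
    return ET.tostring(iq, encoding="unicode")
```

Devs who want the data would then subscribe to the node instead of
polling a drop directory.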
---
Bear
Build and Release Engineer
Open Source Applications Foundation (OSAF)
bear at osafoundation.org
http://www.osafoundation.org
bear at code-bear.com
http://code-bear.com
PGP Fingerprint = 9996 719F 973D B11B E111 D770 9331 E822 40B3 CD29