[Chandler-dev] tips for profiling

Andi Vajda vajda at osafoundation.org
Fri Apr 6 18:03:20 PDT 2007


Profiling Chandler is tricky because there are lots of moving parts. Here are 
a few tips and tricks that come to mind:

   1. To compare numbers, always use the same data.
      Recreating your repository everytime changes the data in subtle ways
      because you get new UUIDs everytime and the order of things, random in
      mappings, is conditioned by the hash of their keys, very often UUIDs.
      Instead, load and prepare your data, backup the repository and always
      restore it when re-running the testcase.

      rc -r ~/Desktop/__repository__.001  (or whatever your path may be)

   2. Preload all items, that way the vaguaries of item loading are out of the
      way, unless that's what you're interested in. Running check() loads all
      items into the UI view.

      rc -r ~/Desktop/__repository__.001 --undo check

      --undo check runs check() and undoes bogus versions until check() passes
      at startup time. If check() passes right away, as I'd expect from a
      clean restore, --undo will undo nothing and you will have started your
      chandler will all items pre-loaded.

   3. The -P flag using the perf tests produces profiles that crash hotshot (at
      least, for me, on my Mac). Also, with -P you cannot be sure how much of
      the perf test cruft you're profiling along with it. Know what you're
      measuring, and add the profiling hook yourself.
      Find the block of code or call you're interested in and wrap it
      with a prof.runcall() call, creating a local function for the block of
      code if need be. For example, to profile import, I just change
      ImportExport.py around line 267 as follows:
             import hotshot
             prof = hotshot.Profile('import.log')
             collection = prof.runcall(importICalendarFile,
                                       fullpath, self.view, coll,
                                       filterAttributes, activity,
                                       None, logger)
             prof.close()
      Then process the resulting import.log file with the hotshot APIs as is
      done, for example, in my t82.py file included below. Really, how to
      process the hotshot output is up to you, whatever works best for the kind
      of output you're looking for. t82.py, when run the first time on a .log
      file, produces a .stats file from it which is vastly faster to reload.
      This is can be a huge timesaver when profiling things like a large
      import, for example.

   4. Remove commit() calls. Currently, our code includes commits() in less
      than ideal places and this has the effect of slowing things down
      considerably. commit() is an expensive operation that we'd like to do
      *between* user actions instead of *during* user actions. Measuring it
      during a user action is pointless unless it is critical to the action.
      For example, the criticality of calling commit() at the end of
      CalendarCanvas.CreateEmptyEvent() is dubious.
      John has offered to review all such commit calls and remove them if
      possible, leaving it to the user event that started the action to do the
      commit, if needed.

   5. Instead of only looking at the amount of time spent doing things,
      consider how many times they're done. Currently, I sense that there are a
      lot of perf gains to be made in the app by doing things less instead of
      faster. Why are we loading 41 images when creating an event in the
      calendar, for example ? This leads to the next point.

   6. The app loads things lazily. Items are loaded lazily. Images are loaded
      lazily. If you see 41 images loaded, the next thing is to profile
      creating a second event instead. Indeed, the calendar doesn't load 41
      images the second time. Run the action multiple times so that all the
      init cases are out of the way. But, we're still calling
      MenusAndToolbars.appendDynamicBlocks 336 times nonetheless... something
      to look into there even if we're only spending 0.007s there, the fact
      it's called that often may lead you onto something where significant
      time is spent overall.

   7. When communicating about profiles, don't send the .log files around, they
      can be huge. Use a .stats file instead. As said before, the t82.py file
      below creates a .stats file from a .log file whenever it is invoked with
      a .log file. Don't send a text version out either, the .stats file can be
      opened with hotshot APIs and queried in various ways to look for callers,
      callees, different sort orders, etc, etc. The Stats class has several
      interesting APIs to poke around a .stats file:
         http://docs.python.org/lib/profile-stats.html

That's all I can think of for now.
See my t82.py script below.

Andi..
----------------------------------------------------

import os
import hotshot, hotshot.stats, pstats

def cum(stats, length=50):
     stats.sort_stats('cum')
     stats.print_stats(length)

def calls(stats, length=50):
     stats.sort_stats('time', 'calls')
     stats.print_stats(length)

def show(file, call='cum', length=50):
     if file.endswith('stats'):
         stats = pstats.Stats(file)
     else:
         stats = hotshot.stats.load(file)
         stats.strip_dirs()
         stats.dump_stats('.'.join((os.path.splitext(file)[0], 'stats')))
     if call == 'cum':
         cum(stats, int(length))
     elif call == 'tot':
         calls(stats, int(length))
     else:
         stats.sort_stats(call)
         stats.print_stats(int(length))

     return stats

if __name__ == "__main__":
     from sys import argv
     show(*argv[1:])


More information about the chandler-dev mailing list