[Dev] A short explanation of Collections
john at osafoundation.org
Mon Aug 29 10:33:56 PDT 2005
As the dust settles on the recent Collections and Sets work, I decided
to write up a short description of what every Chandler developer should
know about Collections. The idea of a query that automatically updates a
list of items, and notifies subscribers of changes, has been central to
Chandler from the beginning. Our design and implementation have evolved
many times, influenced by what we have learned through experience.
Although some of what I describe here might change slightly, I think the
basic ideas will remain unchanged.
The new Collections are a replacement for repository.query.Query, which
was used by ItemCollections. In the old ItemCollection world, which most
of you are probably familiar with, an ItemCollection was made up of a
query that specified a set of items, modified by adding in a list of
inclusion items and removing a list of exclusion items. The final
results were cached in a ref collection that was usually accessed like
an array. We ran into a number of problems using ItemCollections. For
example, when one ItemCollection, e.g. the "All" item collection, fed
its results into a new filtered ItemCollection, e.g. the subset of
calendar events, there were problems propagating changes and
notifications. Also we learned that the majority of ItemCollections in
Chandler were simply ordered lists of items, and the notion of order in
ItemCollections was not always maintained.
In the new Collections world we have a number of different types of
Collection:
KindCollection: all the items of a particular kind.
ListCollection: an explicit list of items.
FilteredCollection: all items in another source Collection that match a
Python expression. You must manually specify a list of attributes which
Items must have to be considered for filtering by the expression. In the
future we may limit what Python code FilteredCollections may use.
UnionCollection: the union of two or more source Collections
IntersectionCollection: the intersection of two or more source Collections
DifferenceCollection: the difference between two source Collections
InclusionExclusionCollection: a collection similar to our old
ItemCollection, that implements some convenience methods to access
inclusions, exclusions, the source Collection, and methods to add and
remove items. The InclusionExclusionCollection is made up of a union
collection, a difference collection, two list collections and a source
collection as follows:
InclusionExclusionCollection = ((source - exclusions) + inclusions).
To illustrate the power of Collections consider the new "All" Collection:
allCollection = ((((Notes - (Events filtered by (isGenerated = True))) -
Trash) - allExclusions) + allInclusions)
allCollection is an InclusionExclusionCollection. Notes and Events are
KindCollections. allInclusions, allExclusions and Trash are ListCollections.
There isn't any code necessary to exclude generated events or items in
the trash from the "All" Collection, which simplifies the design. It's
also easy to update the rules for what is contained in the "All"
Collection without having to update a bunch of code. So if you find
yourself writing a bunch of code to make sure items end up in the right
Collections in the sidebar or elsewhere, you could probably avoid it
completely by setting up the right Collections to start with.
You can subscribe to a collection by adding an item to notify to the
collection's subscribers attribute. By default, the method
"onCollectionEvent" is called on items that are subscribed; however, you
can specify a different method name in the collectionEventHandler
attribute of your item that is notified.
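
The dispatch rule can be sketched roughly as follows. The attribute and
method names subscribers, onCollectionEvent and collectionEventHandler
come from the description above; the ToyCollection class, the
onSidebarChanged handler, and the (op, collection, item) argument list
are assumptions made up for this sketch, not the real signature.

```python
class ToyCollection:
    """Hypothetical stand-in for a Collection; not the Chandler class."""
    def __init__(self):
        self.items = []
        self.subscribers = []  # items to notify

    def add(self, item):
        self.items.append(item)
        self._notify("add", item)

    def _notify(self, op, item):
        for subscriber in self.subscribers:
            # Use the default handler name unless the subscriber names another.
            name = getattr(subscriber, "collectionEventHandler",
                           "onCollectionEvent")
            # The argument list here is an assumption for illustration.
            getattr(subscriber, name)(op, self, item)

class DefaultSubscriber:
    def __init__(self):
        self.seen = []
    def onCollectionEvent(self, op, collection, item):
        self.seen.append((op, item))

class CustomSubscriber:
    collectionEventHandler = "onSidebarChanged"  # hypothetical handler name
    def __init__(self):
        self.seen = []
    def onSidebarChanged(self, op, collection, item):
        self.seen.append((op, item))

collection = ToyCollection()
default_sub = DefaultSubscriber()
custom_sub = CustomSubscriber()
collection.subscribers.extend([default_sub, custom_sub])
collection.add("note")
```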
Collections are not dependent on Blocks, but Blocks are the main user of
Collections.
That finishes the overview. For those that want to understand more
detail or the implementation, read on.
Collections are Items that provide a thin wrapper on repository Set
attribute values, where most of the work actually takes place. We need
this wrapper for a few reasons. First, it's difficult to manage lots of
references to an attribute, which is why Blocks, ContentItems, etc. are
not attributes. Second, the Item implements the support for
notifications. Finally, Set attributes require arguments that refer to
other Sets in order to create them. These arguments aren't known when
the Collection Item is created. This creates an awkward need to delay
creation of the Set attribute. The Item provides Python magic to handle
this awkward delayed creation. A further limitation of Sets is that they
are immutable, which means that changing a node in a Collection tree is
not supported. It may be possible to add more Python magic to the Item
that destroys and re-creates the correct Sets when one node changes.
These disadvantages imposed by making Sets an attribute made some of us
think that making Sets an Item would have been a better choice. The
counter argument was that we would face the same limitations even if
Sets were Items. There might also be situations where using Sets as
attributes would have an advantage, even though they are used that way today.
Collections have the same kind of index that ItemCollections had. If you
never index into a Collection it won't have an index. If you index into
it, you'll get an index. The index you get is determined by an
attribute on Collection. By default you'll get an ordered index, where
the order is the same as the iteration order of the Collection. If the
index attribute is the name of an attribute, you'll get an index sorted
by that attribute.
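
The lazy indexing behavior can be pictured with a small toy class. This
is only a sketch: IndexedCollection, indexName and hasIndex are
hypothetical names chosen for illustration, and a real index is
maintained by the repository rather than built as a Python list.

```python
from collections import namedtuple

Note = namedtuple("Note", "title priority")

class IndexedCollection:
    """Hypothetical stand-in showing an index built on first subscript."""
    def __init__(self, items, indexName=None):
        self._items = list(items)
        self.indexName = indexName  # name of the attribute to sort by, if any
        self._index = None          # nothing is built until first use

    def __iter__(self):
        return iter(self._items)

    def hasIndex(self):
        return self._index is not None

    def __getitem__(self, position):
        if self._index is None:
            if self.indexName is None:
                # Default: same order as iterating the collection.
                self._index = list(self)
            else:
                # Otherwise: sorted by the named attribute.
                self._index = sorted(
                    self, key=lambda item: getattr(item, self.indexName))
        return self._index[position]

notes = [Note("c", 3), Note("a", 1), Note("b", 2)]

plain = IndexedCollection(notes)
had_index_before = plain.hasIndex()
first_plain = plain[0].title
has_index_after = plain.hasIndex()

by_priority = IndexedCollection(notes, indexName="priority")
first_by_priority = by_priority[0].title
```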
Unlike ItemCollections, Collections, except for ListCollections, don't
cache their results.
Most Collections are used as contents for Blocks. As in the past, when
the Block is rendered it subscribes to notifications, and when it's
unrendered it unsubscribes from notifications. This is a simple
optimization to minimize the number of notifications, since only blocks
that are visible on the screen need to be notified to update themselves.
KindSets and FilteredSets maintain their indexes by using repository
monitors. We use that same mechanism to notify subscribers.
Notifications for Items coming and going to Collections are synchronous.
This doesn't work for changes to attributes on Items in other views, so
instead we use an asynchronous notification. In order to get these
notifications it's necessary to poll for them. Each time OnIdle is
called we do a repository update and poll for these notifications. Each
time a notification is received, the block that gets the notification is
added to a list of dirty blocks. At the end of OnIdle, each block in the
list is redrawn on the screen and removed from the list of dirty
blocks. This has the benefit of accumulating all of the changes to data
fairly quickly, and only redrawing the affected part of the screen when
there's nothing left to do.
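
The dirty-block bookkeeping can be sketched as below. This is a toy
model under stated assumptions: ToyBlock, IdleLoop, post and on_idle are
made-up names, the pending queue stands in for the notifications that
polling would return, and the repository update step is left out.

```python
from collections import deque

class ToyBlock:
    """Hypothetical stand-in for a Block that can redraw itself."""
    def __init__(self, name):
        self.name = name
        self.redraws = 0
    def redraw(self):
        self.redraws += 1

class IdleLoop:
    def __init__(self):
        self.pending = deque()  # notifications waiting to be polled
        self.dirty = []         # blocks that need a redraw, no duplicates

    def post(self, block, change):
        """Queue an asynchronous notification for a block."""
        self.pending.append((block, change))

    def on_idle(self):
        # Poll: drain the notifications and mark each affected block dirty.
        while self.pending:
            block, _change = self.pending.popleft()
            if block not in self.dirty:
                self.dirty.append(block)
        # Redraw each dirty block once, then clear the list.
        for block in self.dirty:
            block.redraw()
        self.dirty = []

loop = IdleLoop()
summary = ToyBlock("summary")
detail = ToyBlock("detail")

loop.post(summary, "title changed")
loop.post(summary, "date changed")
loop.post(detail, "body changed")
loop.on_idle()
redraws_after_first_idle = (summary.redraws, detail.redraws)

loop.on_idle()  # nothing pending, so nothing is redrawn
redraws_after_second_idle = (summary.redraws, detail.redraws)
```

Two changes to the same block between idle calls cost a single redraw,
which is the accumulation benefit described above.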
Finally, we plan to implement nestable "Freeze/Thaw" methods to
temporarily ignore and enable notifications, which will further improve