[Dev] Simplifying biref definition and kind extensions

Phillip J. Eby pje at telecommunity.com
Thu Sep 8 12:05:00 PDT 2005


Yesterday, I posted a short and rough proposal for making it possible to 
define a bidirectional reference from only one class.  However, discussion 
on IRC and some emails I got privately made it clear that I didn't really 
provide enough background on either the whys or hows of the proposal.  
There was also some IRC discussion that led to a better solution for one of 
the problems than the one I proposed here yesterday.  So, I'm going to 
restate the proposal to incorporate that enhancement, and also to provide 
some of the background that was asked for.

Donn asked, "why is circularity a problem?" and "why is it more of a 
problem now?"  The answer to both questions is that circularity breaks 
modularity.  If component A depends on component B, and B depends on A, 
then you can't use either one without the other, and so you no longer have 
any meaningful distinction between A and B - they might as well be the same 
component.  You lose the ability for someone to learn A and then B, and you 
lose the ability to have A first and then optionally add B later.

So, the problem is that bidirectional reference definitions being split 
across parcels breaks this modularity, and the problem is popping up more 
now as we try to enforce the modularization of the parcel structure.  We 
still want to have bidirectional references across parcels, of course, but 
we need to be able to define them without making A depend on B and B depend 
on A.  We'd like to be able to define the whole biref from *one* parcel, so 
that you can have A and then add B later, and if you never add B then A 
still works as-is.

Right now, however, if you define a biref with the schema API, you have to 
do half of it in A, and half in B.  This is fundamentally broken because it 
means you can't have *any* modularity and still relate things in different 
parcels.  So, we need a way to define both sides of a biref from only one 
place.

And that's the first part of my proposal, that we allow you to define a 
biref from only one "side".  In many of the cases of birefs in our current 
schema, the other side is only there because we *have to have it*; we never 
actually use the "A" side of the biref, we're only really using the "B" side.

Let's take Morgen's sharing use case as an example.  The sharing parcel 
needs to keep a collection mapping iCal UIDs to calendar events.  In this 
case, calendar events are part of parcel "A" - the "pim" parcel.  The pim 
parcel shouldn't have anything to do with sharing, or else it can't be used 
independently, or taught independently.  (That is, if you have to 
understand sharing before you can fully understand the pim parcel, we have 
a learning curve problem as well as an inability to deploy them separately.)

But the only way Morgen can have a bidirectional reference between the 
sharing parcel and calendar events is if he *modifies* the calendar event 
mixin class to add an attribute, which then makes the calendar parcel 
depend on sharing - making A depend on B, in other words.  This approach 
doesn't scale very well, and it definitely doesn't work for third-party 
parcels.  And at our current team and application size, we are starting to 
run into problems because effectively we are all "third-party" with respect 
to one another's code.  That is, in this example, Morgen is "third-party" 
when it comes to the calendar parcel.

So the first part of what I'm proposing, then, is that in the sharing 
parcel, Morgen should be able to do this:

     items = schema.Sequence(pim.CalendarEventMixin, inverse=schema.One())

and *not* have to go edit the CalendarEventMixin class, just to add the 
backward reference that he never uses anyway.  He just specifies that he 
wants a new 'One()' reference to be added to the CalendarEventMixin kind in 
the repository, and this will happen as soon as his parcel is 
installed.  The calendar parcel, meanwhile, can be loaded and used 
*without* the sharing parcel, because it doesn't have any references to 
sharing defined in its code.  The calendar developers don't have to ask, 
"what's this sharing thing in our code?", and so they are happy.  Morgen 
doesn't have to worry about annoying the calendar developers, or what to 
call the extra attribute he doesn't want anyway, and so Morgen is 
happy.  Life is good.  :)

There's an additional detail to this idea, which is how it's implemented 
internally.  When you create a "one-way biref" like this, it will actually 
add a new attribute to CalendarEventMixin for you.  You just don't have to 
give it a name, or add it to the class by hand.  The name this attribute 
will be automatically given is "osaf.sharing.UIDMap.items.inverse", which 
of course cannot collide with any of the calendar-specific attributes 
defined by the calendar parcel.  It does mean that it's more awkward to 
access that attribute, if you really need to access it for some reason, 
because you have to use getattr(ob, name) or ob.getAttributeValue(name) 
(where 'name' is "osaf.sharing.UIDMap.items.inverse").  You can't just say 
'ob.name' the way you can with attributes that are created explicitly.
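
To make that access pattern concrete, here is a minimal sketch in plain 
Python.  The 'Item' class and the 'uid_map' object are stand-ins I made up 
for illustration; they are not the real repository classes.  The sketch 
only shows that a dotted name can be set and read through setattr/getattr, 
and that ordinary attribute syntax can't reach it:

```python
# Stand-in for a repository item; NOT the real Chandler Item class.
class Item(object):
    pass


# The automatically generated name from the example above.
name = "osaf.sharing.UIDMap.items.inverse"

event = Item()
uid_map = Item()  # hypothetical object playing the role of the UIDMap

# A dotted string is not a valid identifier, so the attribute can only be
# set and read through setattr/getattr, never written as 'event.<name>'.
setattr(event, name, uid_map)
found = getattr(event, name)

# Ordinary attribute syntax treats the dots as nested lookups, so the
# first lookup ('event.osaf') already fails.
try:
    event.osaf
    reachable_by_dot = True
except AttributeError:
    reachable_by_dot = False
```

Here 'found' is the same object as 'uid_map', while the dotted lookup 
raises AttributeError, which is exactly why the calendar parcel can't 
stumble over the attribute by accident.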

This is a feature, though, not a bug.  The fact that you can't access it 
via 'ob.name' means that the calendar parcel can never *accidentally* use 
this attribute, or define a conflicting attribute.  This is a good thing, 
because it means that no matter what other parcels do to the kind, the 
calendar parcel never needs to know about it.  It can define whatever 
attributes it wants, and everybody else can have whatever attributes they 
want, and everybody is happy.  Life is still good.  :)

Okay, so what about the case where you really want to be able to use that 
attribute?  Or what if you just want to add an attribute to an existing 
kind, like in the AbstractCollection.color case?

Well, that's what the second proposed feature is for, and this part of the 
proposal is a bit different today, based on the IRC discussions 
yesterday.  It's an API to allow you to define these additional attributes, 
and to access them conveniently, without having to spell out attribute 
names like "osaf.sharing.UIDMap.items.inverse".  Here's an example, loosely 
based on a suggestion by Alec on IRC yesterday:

     class SidebarInfo(schema.Annotation):
         schema.annotates(pim.AbstractCollection)
         calendarColor = schema.One(blocks.ColorType)
         alertSound    = schema.One(schema.Lob)

If this class were defined in "some_module", then loading that module into 
the repository would add two new attributes to the AbstractCollection kind: 
"some_module.SidebarInfo.calendarColor", and 
"some_module.SidebarInfo.alertSound".

But, it also does one other thing, which makes it much more useful.  The 
SidebarInfo class is actually an "annotation wrapper" class that you can 
apply to an item, in order to access the attributes "normally".  That is, 
the Annotation subclass would have automatically-defined properties that 
look up the corresponding attributes on an underlying item.

So, if you wanted to get the calendar color of a collection, you would do this:

     the_color = SidebarInfo(some_collection).calendarColor

And if you wanted to set a collection's calendar color, you would do this:

     SidebarInfo(some_collection).calendarColor = the_color

And in each case, getting or setting the attribute on the annotation object 
would get or set the corresponding attribute (using its full, dotted, 
internal name) on the wrapped item.
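
For anyone wondering how such a wrapper could forward attribute access, 
here is a rough sketch using a plain Python descriptor.  It is not the 
proposed implementation: the 'One', 'Annotation' and 'Collection' classes 
below are toy stand-ins, the "some_module" prefix is hard-coded, and 
nothing here touches the repository.  (It relies on '__set_name__', so it 
assumes Python 3.6 or later.)

```python
class One(object):
    """Toy descriptor standing in for schema.One (an assumption)."""

    def __set_name__(self, owner, attr):
        # Build the full dotted name.  A real implementation would derive
        # the module part from the class; it is hard-coded for this sketch.
        self.full_name = "some_module.%s.%s" % (owner.__name__, attr)

    def __get__(self, wrapper, owner=None):
        if wrapper is None:
            return self
        # Read the value from the wrapped item, not from the wrapper.
        return getattr(wrapper.item, self.full_name)

    def __set__(self, wrapper, value):
        # Store the value on the wrapped item under the dotted name.
        setattr(wrapper.item, self.full_name, value)


class Annotation(object):
    """Toy wrapper base: remembers the item and stores nothing else."""

    def __init__(self, item):
        self.item = item


class Collection(object):
    """Stand-in for pim.AbstractCollection."""


class SidebarInfo(Annotation):
    calendarColor = One()
    alertSound = One()


some_collection = Collection()
SidebarInfo(some_collection).calendarColor = "blue"

# A second, freshly created wrapper sees the same value, because the
# data lives on the item rather than on the wrapper.
the_color = SidebarInfo(some_collection).calendarColor
```

After this runs, 'the_color' is "blue", and the only attribute stored on 
'some_collection' is "some_module.SidebarInfo.calendarColor".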

If you are doing lots of things with a particular annotation, you can of 
course save it in a variable, and use it more than once:

     sbi = SidebarInfo(some_collection)
     MessageBox(("Your color is %s" % sbi.calendarColor), sound=sbi.alertSound)

However, annotation wrappers aren't persistent and shouldn't be stored in 
the repository, although we could make that possible later if the need 
arises.  For now they're just a convenience for Python code; things like 
attribute editors should probably use the attributes' full dotted names 
rather than going through a wrapper.

In addition to annotation attributes, you can also define methods on 
Annotation classes, and then use these methods on the instances, e.g.:

     class SidebarInfo(schema.Annotation):

         schema.annotates(pim.AbstractCollection)

         calendarColor = schema.One(blocks.ColorType)
         alertSound    = schema.One(schema.Lob)

         def alert(self):
             MessageBox(
                 ("Your color is %s" % self.calendarColor),
                 sound=self.alertSound
             )

     # Alert about some_collection:
     SidebarInfo(some_collection).alert()

Thus, you get a kind of "dynamic mixin" capability that's ideal for adding 
extra information and behavior needed by "third party" parcels.  (Except 
that "third party" is a misleading name, since most of our parcels are 
"third party" relative to some other parcel.)

There are a couple more examples I need to present, in order to show how 
the two proposals above (i.e. "one-way" birefs and annotation classes) work 
together.  First, I'll revisit yesterday's Contact likers/likees example:

     class Friends(schema.Annotation):
         schema.annotates(pim.Contact)
         likes = schema.Many(pim.Contact)
         isLikedBy = schema.Many(pim.Contact, inverse=likes)

     Friends(somebody).likes       # get the contacts who somebody likes
     Friends(somebody).isLikedBy   # get the contacts who like somebody
     you in Friends(me).isLikedBy  # do you like me?
     me in Friends(you).likes      # no, really, do you like me?  :)

     Friends(everybody).likes.add(somebody)  # everybody likes somebody!
     Friends(me).likes.remove(you)           # I don't like you any more  :(

This of course is the special case where both attributes are annotating the 
same existing kind.  If we wanted to create a biref between two different 
existing kinds, we might have something like:

     class Favorites(schema.Annotation):

         schema.annotates(pim.Contact)

         favorite_feeds = schema.Many(feeds.Feed)
         favorite_movies = schema.Many(movies.Movie)

         # ... other 'favorite things' attributes here


     class FavoriteFeed(schema.Annotation):

         schema.annotates(feeds.Feed)

         favorite_of = schema.Many(
             pim.Contact, inverse=Favorites.favorite_feeds
         )

We could then use 'FavoriteFeed(some_feed).favorite_of' to find the people 
who consider 'some_feed' a favorite, and we can use 
'Favorites(some_contact).favorite_feeds' to find a person's favorite 
feeds.  (And we can do all this without modifying either the pim or feeds 
parcels.)
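
If it helps to see what "bidirectional" buys you here, the following toy 
model keeps the two sides of such a link in sync by hand.  Everything in it 
is made up for illustration: the 'Item' class, the 'link' and 'unlink' 
helpers, and the "some_module" prefix in the attribute names.  In the 
proposal the repository would do this bookkeeping for you.

```python
class Item(object):
    """Stand-in for a repository item (a Contact, a Feed, ...)."""

    def __init__(self, label):
        self.label = label


def link(a, a_name, b, b_name):
    """Add b to a's collection and a to b's inverse collection."""
    vars(a).setdefault(a_name, set()).add(b)
    vars(b).setdefault(b_name, set()).add(a)


def unlink(a, a_name, b, b_name):
    """Remove the link from both sides."""
    vars(a).get(a_name, set()).discard(b)
    vars(b).get(b_name, set()).discard(a)


# Hypothetical full dotted names for the two annotation attributes.
FEEDS = "some_module.Favorites.favorite_feeds"
FAVORITE_OF = "some_module.FavoriteFeed.favorite_of"

some_contact = Item("some contact")
some_feed = Item("some feed")

# Updating one side of the biref updates the other side as well.
link(some_contact, FEEDS, some_feed, FAVORITE_OF)
linked_forward = some_feed in getattr(some_contact, FEEDS)
linked_backward = some_contact in getattr(some_feed, FAVORITE_OF)

unlink(some_contact, FEEDS, some_feed, FAVORITE_OF)
still_linked = some_feed in getattr(some_contact, FEEDS)
```

Note that neither 'Item' instance needed a class-level attribute for any of 
this, which is the point: the link lives under names that belong to the 
annotating module, not to the pim or feeds parcels.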

The last example covers the case where a parcel wants to create a two-way 
link between an existing kind and a new kind:

     class SoccerMatch(pim.ContentItem):
         # ... various other attributes here
         referee = schema.One(pim.Contact)
         # ... more attributes here

     class SoccerReferee(schema.Annotation):
         schema.annotates(pim.Contact)
         refereed_games = schema.Sequence(
             SoccerMatch, inverse=SoccerMatch.referee
         )

We can now use some_match.referee to find a match's referee, and we can 
find out if a contact has refereed any games using 
'SoccerReferee(some_contact).refereed_games'.  We could also add methods to 
the SoccerReferee class to do things like compute statistics about the 
refereed games, etc.

Now, you could make an argument that this last use case should be 
implemented by creating a SoccerReferee kind, and I wouldn't necessarily 
disagree with you.  However, as the number of roles an individual plays 
increases, the number of mixin kinds grows as O(2^N).  That is, every time 
a mixin is added, the total number of kinds doubles.  Having just three 
mixins means eight kinds (what we have now for stamping), four mixins means 
sixteen kinds, and by the time you get to twenty mixins there are over a 
million potential kinds.  That's an awful lot of repository space just to 
store all the different kind mixtures.  :)
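
The arithmetic is easy to check.  In this small sketch the three mixin 
names are placeholders chosen only so there is something to combine, and 
'kind_count' is a helper invented for the example:

```python
from itertools import combinations

# Placeholder mixin names, used only to have something to combine.
mixins = ["Task", "Event", "Mail"]

# Every subset of the mixins (including the empty one) is a distinct kind.
kinds = [
    combo
    for size in range(len(mixins) + 1)
    for combo in combinations(mixins, size)
]


def kind_count(n):
    """Number of possible mixin combinations for n independent mixins."""
    return 2 ** n


print(len(kinds))       # 8
print(kind_count(4))    # 16
print(kind_count(20))   # 1048576
```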

The annotation approach, on the other hand, doesn't create any new kinds, 
but instead allows items to be of multiple "virtual" kinds at once.  As 
Donn pointed out in an email this morning, this means that annotations 
might end up being a better way to implement extensible stamping in future 
versions of Chandler.

Anyway, that's the updated proposal.  Comments?  Questions?


