Atompub collections and resource collections

There is a subtle but important difference in the use of the term “collection” in the context of resource oriented architectures and Atom standards. I’ve literally had this conversation 3 or 4 times with colleagues in the last week or two so thought it relevant to post.

At the crux of the matter is the fact that the term “collection” is used in both the Atom context and resource oriented context. The good news is that the thing each is referring to with the term “collection” is almost the same as what is referred to in the other – the bad news is that the things they are referring to are almost the same. It is the “almost” that I want to address here. Let me start out with the good news “almost.”

In resource oriented interfaces, i.e. REST interfaces, the term collection is colloquially used to refer to sets of resources. Examples of resource collections are all of my blog entries, all of my photos on Flickr, all of my snowboarding photos on Flickr or all of the blog entries from some source that are about Atom. Pretty simple concept. The Atom Syndication Format has a construct, a feed, that can be used to represent a resource collection. The Atom format doesn’t formally define a collection (it formally defines a feed), nor does any resource-oriented literature that I am aware of.

The term “collection” IS formally defined in Atompub.

The Atom Protocol defines Collection Resources for managing and organizing both kinds of Member Resource. A Collection is represented by an Atom Feed Document. A Collection Feed’s Entries contain the IRIs of, and metadata about, the Collection’s Member Resources. A Collection Feed can contain any number of Entries, which might represent all the Members of the Collection, or an ordered subset of them.

Okay, so far, so good – seems not to have any conflict with the less formal notion of a collection in resource-oriented architectures. But later on in the spec is where the difference comes in. Atompub defines a new element, app:edited, which is the date that the entry was most recently changed, and it is the semantics around this that introduce a subtle difference between the resource collection and an Atompub collection.

The Entries in the returned Atom Feed SHOULD be ordered by their “app:edited” property, with the most recently edited Entries coming first in the document order.
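
To make the ordering rule concrete, here is a minimal sketch in Python that checks whether a feed document lists its entries with the most recently edited first. The function names and the sample feed are my own inventions for illustration. The namespace URIs are the standard Atom and Atompub ones. The timestamp parsing is a simplification that handles the common RFC 3339 forms but not every variant.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# Standard namespace URIs for the Atom format and the Atom Publishing Protocol.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "app": "http://www.w3.org/2007/app",
}

# Hypothetical sample feed, trimmed to the elements this check needs.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:app="http://www.w3.org/2007/app">
  <title>Example collection</title>
  <entry>
    <id>urn:example:2</id>
    <app:edited>2008-03-02T10:00:00Z</app:edited>
  </entry>
  <entry>
    <id>urn:example:1</id>
    <app:edited>2008-03-01T09:30:00Z</app:edited>
  </entry>
</feed>"""


def edited_timestamps(feed_xml):
    """Return the app:edited value of each entry, in document order."""
    root = ET.fromstring(feed_xml)
    stamps = []
    for entry in root.findall("atom:entry", NS):
        text = entry.findtext("app:edited", namespaces=NS)
        if text is None:
            raise ValueError("entry has no app:edited element")
        # Simplification: normalize a trailing "Z" so fromisoformat accepts it.
        stamps.append(datetime.fromisoformat(text.strip().replace("Z", "+00:00")))
    return stamps


def is_ordered_by_edited(feed_xml):
    """True if every entry was edited no earlier than the entry after it."""
    stamps = edited_timestamps(feed_xml)
    return all(a >= b for a, b in zip(stamps, stamps[1:]))


print(is_ordered_by_edited(SAMPLE_FEED))  # True
```

A feed sorted alphabetically or by appointment time would typically fail this check, which is exactly the case where it is still a perfectly good Atom feed but not an Atompub collection.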

“SHOULD” is defined in RFC 2119 and should be read as “your compliant app really, really, really should abide by this but if it doesn’t, you had better have a really, really, really good reason for any deviation.” What that means is that the entries in an Atompub collection are to be ordered by their app:edited value. I know, I know, there are plenty of instances where we want a feed that is ordered by something other than a last edited date: contacts or even documents alphabetically, calendar entries by the appointment date/time (not by when the appointment was created or updated), etc. Sets of entries with such orderings can be safely represented as Atom feeds but they are NOT Atompub collections.

Why did Atompub add this constraint? Because Atompub is not only about the publishing or writing of content, it is equally about synchronization. Offline access. Take a feed reader for example. Many of them have offline support so that I can catch up on the latest even when I am on an airplane. If the server producing the feed is Atompub compliant and you have subscribed to feeds that are in fact collections, whenever you do come online the reader can request a refresh on a feed and be sure to get the latest changes without having to retrieve the entire feed. Because the most recently edited entries come first, the reader can stop as soon as it reaches an entry that has not changed since its last refresh.

Often when we have overloaded terms, like “collection”, the context of use makes it very clear which meaning is intended. In the case of resource collections and Atompub collections, however, the context doesn’t always help. One reason for that is that it is not at all uncommon for people to refer to feeds as “Atom collections”, even when they are not doing anything with Atompub. While this can lead to confusion it is technically still correct. But when someone says “Atom collections”, really means “Atompub collections”, and is not properly ordering the entries in the collection feeds, that is when things can break.

Simply put, all Atompub collections are feeds, but not all feeds and not all resource collections can be said to be Atompub collections. I personally use the term “Atom feed” when I am referring to an Atom format representation of a resource collection and I use the term “Atompub collection” when I am referring to a true Atompub collection.
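
For the curious, the synchronization argument can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular feed reader works: the function names and the sample feed are hypothetical, every entry is assumed to carry an app:edited element, and only a single feed document is handled (a real collection may be split across several documents linked with rel="next").

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Standard namespace URIs for the Atom format and the Atom Publishing Protocol.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "app": "http://www.w3.org/2007/app",
}

# Hypothetical collection feed, most recently edited entry first.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:app="http://www.w3.org/2007/app">
  <entry><id>urn:example:3</id><app:edited>2008-03-03T08:00:00Z</app:edited></entry>
  <entry><id>urn:example:2</id><app:edited>2008-03-02T10:00:00Z</app:edited></entry>
  <entry><id>urn:example:1</id><app:edited>2008-03-01T09:30:00Z</app:edited></entry>
</feed>"""


def parse_edited(entry):
    """Parse an entry's app:edited value (simplified RFC 3339 handling)."""
    text = entry.findtext("app:edited", namespaces=NS)
    if text is None:
        raise ValueError("entry has no app:edited element")
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def changed_since(feed_xml, last_sync):
    """Return ids of entries edited after last_sync.

    Relies on the Atompub ordering: once an entry at or before last_sync
    is reached, nothing further down the feed can be newer, so we stop.
    """
    changed = []
    for entry in ET.fromstring(feed_xml).findall("atom:entry", NS):
        if parse_edited(entry) <= last_sync:
            break
        changed.append(entry.findtext("atom:id", namespaces=NS))
    return changed


last_sync = datetime(2008, 3, 1, 12, 0, tzinfo=timezone.utc)
print(changed_since(FEED, last_sync))  # ['urn:example:3', 'urn:example:2']
```

The early `break` is only safe because of the ordering rule. Run the same loop over a feed sorted alphabetically and it could stop before reaching an entry that changed, which is the kind of breakage I mean.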
