Web Magazine for Information Professionals

Syndicated Content: It's More Than Just Some File Formats?

Paul Miller takes a look at issues arising from the current enthusiasm for syndicating content to portals or other web sites, and offers some guidelines for good practice.

There is, unsurprisingly, an increasing recognition that digital resources of all kinds are eminently suitable for repurposing and reuse. The Iconex Project [1], for example, was funded under JISC's 5/99 Programme to look at the creation, storage and dissemination of reusable learning objects. Service providers of the Arts & Humanities Data Service [2] concern themselves with collecting the digital outputs of scholarly activity in order to preserve them for posterity, but also with facilitating their ongoing use and reuse by learners, teachers and researchers across the community [3]. International developments such as the Open Archives Initiative [4] explicitly recognise the value of sharing metadata about resources with any number of service providers in order to raise visibility and draw greater attention to the underlying resources.

As the presentation frameworks generically labelled as 'portals' continue to gain ground across the community, there will be an increasing requirement for reusable content of all forms, whether drawn from within the organisation building the portal or gathered from elsewhere. Work on the PORTAL Project [5], funded under the JISC's Focus on Access to Institutional Resources (FAIR) Programme [6], is raising issues relating to the reuse and reintegration of digital resources of various forms, specifically in the context of 'surfacing' these resources within institutional portals.

In this article, a number of these issues will be explored. For the sake of simplicity, and because of the ready availability of helpful visual examples, the bulk of the article will concern itself with RSS-based 'news feeds' [7]. Many of the issues raised, though, are more generically applicable, and will be revisited in greater detail through deliverables from the PORTAL Project itself.

Readers who are already comfortable with RSS may wish to skip straight to the suggestions for good practice under 'Towards Good Practice'. Those who are interested in making use of the potential offered by RSS, but without infrastructure such as a portal to display feeds of interest, might be interested in RSS-xpress-Lite from UKOLN [8], which allows RSS to be displayed in traditional Web pages with the use of a single line of JavaScript.

RSS in context

The JISC's Information Environment architecture [9] identifies three main mechanisms for disclosure of resources to end users: searching, harvesting, and alerting.

Of these, searching basically encapsulates a long-standing area of JISC activity, encompassing current services such as Zetoc, AHDS and COPAC, as well as eLib Phase 3's Clumps, and the JISC-supported development of the Bath Profile for Z39.50. Harvesting takes a different approach, gathering metadata for remotely held resources to a central location. The Archaeology Data Service catalogue [10] was designed from the outset as an expression of this (then unstated) model, with the more recent development of OAI and JISC's funding of explorations thereof through the FAIR Programme formalising this approach.

Alerting is different again, concerning itself with the business of keeping users apprised of changing conditions or resources; at a basic level, telling an avid reader of Nature that the latest issue is now available, and perhaps delivering its table of contents to their desktop. The main mechanism recommended for alerting is the use of RDF (or Rich) Site Summary - RSS.

Some RSS nuts and bolts

UKOLN's Pete Cliff wrote a good introduction to RSS in Ariadne's sister e-publication, Cultivate Interactive [11]. Pete's article, and the references it contains, are useful in explaining the background to RSS, and point to some useful tools that will help to get an RSS feed up and running with a minimum of fuss.

In essence, RSS is a mechanism for sharing short snippets of information, along with a link back to the originating source for expansion. The excellent BBC News service [12], for example, makes RSS content available, as in the example below from the portal being built at the University of Hull [13]. Anyone following one of the links is taken back to the BBC site for the full news story.

Figure 1: the University of Hull development portal, including an embedded news channel from the BBC

Figure 2: selecting one of the headlines from the BBC news channel directs you to the full story on the BBC site

The UKOLN homepage, too, uses RSS. The main block of News on the page is, in reality, an RSS feed, and can therefore easily be picked up for use elsewhere, whether on the UKOLN Intranet, or by a wholly external service.

Figure 3: the UKOLN home page, displaying an RSS feed of recent News

Figure 4: the UKOLN intranet, displaying the same RSS, alongside a feed from the BBC

What is RSS?

RSS is an XML-based format whose tagged markup looks much like that of related formats such as HTML. Rather than including presentation information, though, the elements available in an RSS file are strictly limited to those required to define such things as a headline, a brief blurb, and a link to richer detail.

To confuse the otherwise suspiciously simple, there is more than one format of RSS; version 1.0 [14] is based on RDF [15], and includes such things as Dublin Core metadata about the RSS feed itself. Version 0.9x [16] is straight XML, and still widely deployed. Fortunately, presentation frameworks such as uPortal [17] are often capable of displaying any of the current flavours, and the end user rarely notices any difference.

<?xml version="1.0" encoding="ISO-8859-1" ?> 
  <!DOCTYPE rss (View Source for full doctype...)> 
  <rss version="0.91">
  <channel>
  <title>BBC News | UK | UK Edition</title> 
  <link>/go/rss/-/1/hi/uk/default.stm</link> 
  <description>Updated every minute of every day</description> 
  <language>en-gb</language> 
  <lastBuildDate>Wednesday, 19 February, 2003, 10:52 GMT</lastBuildDate> 
  <copyright>Copyright: (C) British Broadcasting Corporation, 
   http://news.bbc.co.uk/2/shared/bsp/hi/services/copyright/html/default.stm</copyright>
  
  <docs>http://www.bbc.co.uk/syndication/</docs> 

  <image>
  <title>BBC News Online</title> 
  <url>http://news.bbc.co.uk/furniture/syndication/bbc_news_120x60.gif</url> 
  <link>http://news.bbc.co.uk</link> 
  </image>

  <item>
  <title>Asylum seekers win ruling</title> 
  <description>A group of six asylum seekers has won its High Court challenge 
   against new rules denying them housing and benefits.</description> 
  <link>http://news.bbc.co.uk/go/rss/-/1/hi/uk/2779343.stm</link> 
  </item>

  <item>
  <title>'One in four teens a crime victim'</title> 
  <description>A quarter of 12 to 16-year-olds have been victims of crime - 
   mostly violence and assault - in the past year, says a study.</description>
  <link>http://news.bbc.co.uk/go/rss/-/1/hi/uk/2778255.stm</link> 
  </item>

  <item>
  <title>£1m boost for epilepsy care</title> 
  <description>A £1m strategy to improve epilepsy care is announced by the 
   government, but campaigners say the plan does not go far enough.</description>
  <link>http://news.bbc.co.uk/go/rss/-/1/hi/health/2776279.stm</link> 
  </item>
...

Figure 5: a sample of RSS 0.91, taken from the BBC News, © British Broadcasting Corporation

<?xml version="1.0" encoding="UTF-8" ?> 
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns="http://purl.org/rss/1.0/" 
  xmlns:dc="http://purl.org/dc/elements/1.1/" 
  xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" 
  xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/">

<channel rdf:about="http://dbweb.liv.ac.uk/ltsnpsc/">

<title>LTSN Physical Sciences News</title> 
<link>http://dbweb.liv.ac.uk/ltsnpsc/</link> 

<description>Latest teaching and learning additions to the LTSN Physical Sciences web 
  site.</description> 

<dc:language>en-gb</dc:language> 
<dc:rights>Liverpool University for LTSN Physical Sciences</dc:rights> 
<dc:date>Mon Dec 16 11:51:10 2002</dc:date> 
<dc:creator>rgladwin@liv.ac.uk</dc:creator> 

<items>
<rdf:Seq>
<rdf:li rdf:resource="http://dbweb.liv.ac.uk/ltsnpsc/devprojs.asp" /> 
<rdf:li rdf:resource="http://dbweb.liv.ac.uk/ltsnpsc/download.asp" /> 
<rdf:li rdf:resource="http://dbweb.liv.ac.uk/ltsnpsc/journal.asp" /> 
...
</rdf:Seq>
</items>

<image rdf:resource="http://dbweb.liv.ac.uk/ltsnpsc/images/ltsnpsc6.gif" /> 
</channel>

<image rdf:about="http://dbweb.liv.ac.uk/ltsnpsc/images/ltsnpsc6.gif">
<title>LTSN Physical Sciences</title> 
<url>http://dbweb.liv.ac.uk/ltsnpsc/images/ltsnpsc6.gif</url> 
<link>http://dbweb.liv.ac.uk/ltsnpsc/</link> 
</image>

<item rdf:about="http://dbweb.liv.ac.uk/ltsnpsc/devprojs.asp">
<title>More development project reports</title> 
<link>http://dbweb.liv.ac.uk/ltsnpsc/devprojs.asp</link> 
<description>The third tranche of projects have finished and reports are 
  now available (Dec 02).</description> 
</item>

<item rdf:about="http://dbweb.liv.ac.uk/ltsnpsc/download.asp">
<title>Development project resources downloads</title> 
<link>http://dbweb.liv.ac.uk/ltsnpsc/download.asp</link> 
<description>This link allows direct access to resources created through 
  the Centre's Development Projects (Dec 02).</description> 
</item>

<item rdf:about="http://dbweb.liv.ac.uk/ltsnpsc/journal.asp">
<title>Journal no 5 (Vol 3 Issue 2) published</title> 
<link>http://dbweb.liv.ac.uk/ltsnpsc/journal.asp</link> 
<description>This latest version has reviews of 4 software packages, 2 web 
  sites and 10 books (Dec 02).</description> 
</item>

Figure 6: a sample of RSS 1.0, taken from the LTSN Physical Sciences subject centre, © University of Liverpool

As Pete's article shows, there are a variety of ways to create RSS. Once a feed exists, all that remains is to publish its URL and wait for other sites to pick it up and start redisplaying your content.
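To make the difference between the two flavours concrete, the following is a minimal Python sketch, not taken from any of the tools mentioned in this article, that reads headlines from either format using only the standard library. The function names read_items and local_name, and the two tiny example.org feeds, are assumptions made purely for illustration; a real reader would also need to cope with character encodings, malformed feeds and network errors.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def read_items(rss_text):
    """Return a list of title/link/description dicts from RSS 0.91 or 1.0."""
    root = ET.fromstring(rss_text)
    items = []
    for element in root.iter():
        # RSS 1.0 also has an <items> table of contents; only <item> is wanted.
        if local_name(element.tag) != "item":
            continue
        # Collapse runs of whitespace so wrapped descriptions read as one line.
        fields = {
            local_name(child.tag): " ".join((child.text or "").split())
            for child in element
        }
        items.append(
            {key: fields.get(key, "") for key in ("title", "link", "description")}
        )
    return items


# Hypothetical feeds, invented for this sketch.
RSS_091 = """<rss version="0.91"><channel>
<title>Example 0.91 channel</title>
<link>http://example.org/</link>
<description>An invented channel</description>
<item><title>First headline</title>
<description>A short
 blurb.</description>
<link>http://example.org/first</link></item>
</channel></rss>"""

RSS_10 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://example.org/">
<title>Example 1.0 channel</title>
<link>http://example.org/</link>
<description>An invented channel</description>
<items><rdf:Seq>
<rdf:li rdf:resource="http://example.org/second"/>
</rdf:Seq></items>
</channel>
<item rdf:about="http://example.org/second">
<title>Second headline</title>
<link>http://example.org/second</link>
<description>Another short blurb.</description>
</item>
</rdf:RDF>"""

for feed in (RSS_091, RSS_10):
    for item in read_items(feed):
        print(item["title"], "->", item["link"])
```

Matching on the local element name, rather than on a fully qualified name, is what lets a single loop serve both the namespace-free 0.91 format and the namespaced 1.0 format.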

RSS in use

On first consideration, RSS is an invaluable gift to those tasked with gathering, sifting and re-presenting information to users. Take the example of an institutional portal. One important aspect of most portals is personalisation. Although there will certainly be occasions on which they need or want access to such material, it is generally true that the average researcher in Particle Physics neither needs nor wishes to see table of contents entries for the latest issue of Antiquity, nor news about the recent provision of online access to the census of 1981. They very well might, though, be interested in topical information on the latest funding opportunities from the relevant research council, happenings at CERN, and the table of contents for this month's Physical Review. Any and all of these could be available as an RSS feed, provided by their host organisation. Each feed could then either be made available automatically in the portal of the researcher's institution, on the basis of personal information that the institution holds about the individual, or be added manually to their portal layout by the physicists themselves.

In reality, though, the ad hoc manner in which RSS feeds are currently populated and maintained makes it difficult for portal creators to plan for meaningful inclusion of RSS content from any of the available sources, either directly by end users or on their behalf. Early experience in trying to do just that within the PORTAL Project [5] has led to a recognition of the need for agreed good practices amongst those creating and maintaining such channels. A first attempt at establishing these practices is offered for comment, below.

Towards Good Practice

In this section, I make a first attempt at establishing some guidelines for good practice in the use of RSS. I am assuming that the primary purpose of the RSS is to be embedded in someone else's Web site or portal, in order to deliver current awareness information relating to the originating site, institution or service. In order to illustrate the suggestions, there are references to existing RSS feeds that readers can look at for themselves.

Criticisms levelled at these feeds are intended to be general rather than specific in nature, and they are made within the context of my assumptions about the purpose of RSS and the guidelines that this article is attempting to formulate. It should be recognised that the creators of any RSS feed criticised in this manner may well have had a different purpose in mind for their channel; whilst their channel may be 'bad' for my intended use, it may be perfectly well suited to that for which it was created.

Adhere to the standards

This one may appear obvious, but a random check of some of the RSS feeds listed in UKOLN's Channel Directory [18] turned up very few that validated successfully against Mark Pilgrim's useful RSS Validator [19].

The RSS standards are not overly complex, and tools exist which are capable of producing compliant RSS. It therefore seems wasteful for content providers to invest time and effort in building a non-compliant feed that may not render properly when viewed by an end user.

Recommendation 1: In line with current JISC guidelines [20] on provision of content for the Information Environment, comply with the RSS 1.0 specification [14]. Validate the structure of any RSS using an RSS Validation tool.

Ensure persistence

If we presume that the principal use of RSS is alerting users to change, then it becomes important to ensure that the RSS feed itself is persistent and resilient. End users, such as our putative physicists, are likely to alter their subscribed feeds only infrequently, if at all, and will expect those they have selected to continue to deliver useful and timely content. Were a feed available from Physical Review, for example, our physicists might expect it to display the table of contents for the current issue, thus alerting them to the appearance of a new issue and giving an idea as to the subject matter covered. Although feeds created afresh for each issue of a journal do exist, it seems unlikely that our physicists would want a separate feed for each issue of Physical Review, with the corresponding need to receive some other notification that a new issue is out, and then to add it manually to those displayed through the portal or whatever presentation tool is being used. There are surely better technologies for browsing or searching back-issues of a journal than a page filled with RSS feeds, one for each issue!

Recommendation 2: Maintain one or more feeds with changing content, rather than creating new ones each time there is new content to deliver. Ensure that the locations of RSS feeds are persistent, and institute procedures to ensure that they are running and available.

Be brief and to the point... but not too brief!

An examination of the range of RSS content available in the UKOLN Channel Directory demonstrates a wide variation in the length and number of entries.

An RSS file incorporated into a presentation framework such as a portal will be displayed in its entirety, and often alongside other feeds or different portal services. It therefore becomes important to ensure that the feed does not take up more room than it needs to, whilst providing sufficient information about each item in order to allow the reader to evaluate whether or not any link is worth following.

Hubs of the Resource Discovery Network (RDN) provide potentially valuable feeds, highlighting resources newly added to their resource catalogues. They take very different approaches, with SOSIG providing no more than a resource title and date, and HUMBUL including quite rich textual descriptions. EEVL appears to take a middle ground.

Outside of the RDN, news feeds from CETIS and the Dublin Core Metadata Initiative appear to strike an excellent balance between brevity and content.

As well as the length of individual entries, consideration should be given to the number of entries in any feed. Given the need to fit the contents of any one feed within a page alongside other RSS sources or services, a guideline that limits individual entries to around 50 words, and a feed to no more than half a dozen entries, strikes a reasonable compromise. Note that although the RSS 1.0 specification [14] does not stipulate the number of entries allowed, it does suggest a maximum of 15. Experience would suggest that, for RSS feeds used in the manner envisioned here, this number is too high.

Figure 7: different approaches to RSS content; one (too?) short, one (too?) long

Recommendation 3: Restrict RSS feeds intended for embedding in external web sites, portals, etc. to not more than six <item>s, each with <description>s of up to 50 words in length.
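A feed can be checked against this recommendation mechanically. The Python sketch below is one possible approach, offered only as an illustration: the function name check_feed_size and the wording of its warnings are assumptions, and the limits of six items and 50 words are simply the figures suggested in this article, not part of any RSS specification.

```python
import xml.etree.ElementTree as ET

# Guideline figures from Recommendation 3; adjust to taste.
MAX_ITEMS = 6
MAX_WORDS = 50


def local_name(tag):
    """Strip any '{namespace}' prefix so RSS 0.91 and 1.0 are treated alike."""
    return tag.rsplit("}", 1)[-1]


def check_feed_size(rss_text, max_items=MAX_ITEMS, max_words=MAX_WORDS):
    """Return a list of warnings; an empty list means the feed is within limits."""
    root = ET.fromstring(rss_text)
    items = [el for el in root.iter() if local_name(el.tag) == "item"]
    warnings = []
    if len(items) > max_items:
        warnings.append(
            f"{len(items)} items; recommended maximum is {max_items}"
        )
    for position, item in enumerate(items, start=1):
        for child in item:
            if local_name(child.tag) != "description":
                continue
            # Count whitespace-separated words in the description text.
            words = len((child.text or "").split())
            if words > max_words:
                warnings.append(
                    f"item {position}: description has {words} words; "
                    f"recommended maximum is {max_words}"
                )
    return warnings
```

Calling check_feed_size on the text of a feed returns an empty list when the feed is within both limits, and otherwise one warning for an excess of items plus one for each over-long description.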

Stay current

RSS feeds play an important role in keeping their readers informed as to current or topical events. As such, it is important that they be topical and timely, and not overloaded with out-of-date content. At the time of writing (the end of February 2003), the LTSN Physical Sciences news feed, for example, contained items from June 2002, and its most recent item was from December of that year!

Recommendation 4: Ensure that RSS sources are kept fresh with topical and timely content. Archive or delete older material, rather than allowing it to languish at the bottom of the list.

Know what your feed is for

RSS feeds have a wide range of purposes, including delivering news headlines, showing a current table of contents, alerting users to newly loaded records, etc. The style and update cycle of each may differ, and it is therefore important to have a clear purpose in mind when creating a new feed. Where more than one purpose must be fulfilled, it may be more appropriate to set up additional feeds. This is preferable to overloading an existing feed (e.g. one checked for new database records) with extraneous material (e.g. job adverts).

Recommendation 5: Understand the purpose of each RSS feed you provide. Provide multiple feeds, rather than diluting the message of one with information irrelevant to its audience.

Market your content

Creation and maintenance of a useful RSS feed is a significant investment, but unless potential readers are told of its existence, that investment of effort will be wasted. Various sites exist on which RSS feeds can be registered, including the Channel Directory at UKOLN [18] or Syndic8 [21]. If relevant, the feed might also be promoted through other awareness-raising mechanisms already employed for the Web site or services of your organisation.

Recommendation 6: Register your RSS with appropriate sites, such as UKOLN's Channel Directory.

Remember that content may appear out of context

Given that RSS feeds will often be displayed on Web sites belonging to organisations other than the content creator, and alongside content from any number of other sources, it is important to ensure that the feed makes sense in that context.

As such, content creators should aim to use language and terminology suitable to being read on any number of external sites. Acronyms will need to be defined, for example, and the text of entries should be written for as general an audience as possible. Use of 'this site', 'here', etc (referring to the site of the content creator) should clearly be avoided.

Recommendation 7: Write text for entries that is suitable for delivery through a wide variety of display channels, and to audiences not necessarily accustomed to the subject matter.

Cater — within reason — for different viewers

Not all RSS viewers will display the entirety of an RSS feed. Some, for example, will omit the paragraph of descriptive text associated with an item of news, and only display its title. Creators of content should bear this in mind, and seek to provide meaningful and independent titles and descriptions.

For example, this snippet of RSS from the BBC:

<item>
<title>Asylum seekers win ruling</title>
<description>A group of six asylum seekers has won its High Court challenge
against new rules denying them housing and benefits.</description>
<link>http://news.bbc.co.uk/go/rss/-/1/hi/uk/2779343.stm</link>
</item>

could conceivably be displayed as either:

Asylum seekers win ruling

A group of six asylum seekers has won its High Court challenge against new rules denying them housing and benefits.

or just:

Asylum seekers win ruling

Recommendation 8: Ensure that the content of an item's <title> and <description> make sense when viewed in isolation.

Preserve your brand

There is some concern that the delivery of content to sites other than your own using technologies such as RSS will lead to a diminishing of your brand, and a drop in the number of visits to your site. This concern appears to be unfounded: techniques are available within the RSS format to display appropriate branding, and every reader clicking on a headline in one of your RSS feeds, wherever they read it, is directed back to your site to read the full story. RSS, potentially, leads to a massive increase in visits to your site, as you reach all the readers of all the other sites that display your headline content, rather than just the small number of visitors who choose to travel to your Web site on the off chance that there may be an item of interest to them there.

Many RSS feeds make use of the format's <image> subelement to ensure that an appropriate logo is displayed, along with the textual content, in order to impart branding.

Figure 8: logos of different sizes from a range of RSS feeds

As can be seen, use of this subelement varies widely, from no logo at all to logos of excessive size that take up far more room than is necessary, and possibly even encourage portal and web page designers to avoid use of the feed.

Recommendation 9: Make full use of the <title>, <description> and <image> subelements of <channel> within the RSS format in order to carry branding along with the news feed. Restrict <channel> titles to fewer than six words, descriptions to fewer than 15 words, and images to less than 90 pixels along the longest side.
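As a rough illustration of how this recommendation might be checked automatically, the Python sketch below inspects a feed's channel-level branding. It is an assumption-laden example rather than an established tool: the function name check_branding and its messages are invented here, and image size can only be tested where a feed declares it (RSS 0.91 allows optional <width> and <height> elements inside <image>), so a logo with no declared size passes unchecked.

```python
import xml.etree.ElementTree as ET

# Exclusive limits taken from Recommendation 9.
TITLE_WORD_LIMIT = 6
DESCRIPTION_WORD_LIMIT = 15
IMAGE_PIXEL_LIMIT = 90


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def child_text(parent, name):
    """Return the text of the first direct child called `name`, or None."""
    for child in parent:
        if local_name(child.tag) == name:
            return (child.text or "").strip()
    return None


def check_branding(rss_text):
    """Return a list of problems with the channel's branding elements."""
    root = ET.fromstring(rss_text)
    channel = next(
        (el for el in root.iter() if local_name(el.tag) == "channel"), None
    )
    if channel is None:
        return ["no <channel> element found"]

    problems = []
    limits = (("title", TITLE_WORD_LIMIT), ("description", DESCRIPTION_WORD_LIMIT))
    for name, limit in limits:
        text = child_text(channel, name)
        if not text:
            problems.append(f"channel <{name}> is missing or empty")
        elif len(text.split()) >= limit:
            problems.append(f"channel <{name}> should have fewer than {limit} words")

    # RSS 0.91 nests <image> inside <channel>; RSS 1.0 places it alongside.
    # Either way, the full image element is the one that carries a <url>.
    image = next(
        (el for el in root.iter()
         if local_name(el.tag) == "image" and child_text(el, "url")),
        None,
    )
    if image is None:
        problems.append("no <image> with a <url> found")
    else:
        # <width> and <height> are optional, so only declared sizes are checked.
        for side in ("width", "height"):
            value = child_text(image, side)
            if value and value.isdigit() and int(value) >= IMAGE_PIXEL_LIMIT:
                problems.append(
                    f"image {side} should be less than {IMAGE_PIXEL_LIMIT} pixels"
                )
    return problems
```

Only direct children of <channel> are read for the title and description, so the <title> nested inside an RSS 0.91 <image> is not mistaken for the channel title.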

Integrating other types of content

The bulk of this article has been concerned with using the RSS format to deliver headline-type content to remote Web sites and portals. Related issues arise in delivering other forms of content through portals, and the PORTAL project and others will need to develop guidelines and procedures in order to streamline this process before it can become truly effective.

The Resource Discovery Network (RDN) has made some progress in this area, with their Working with the RDN document outlining a number of the interfaces available to external services such as portals [22].

The PORTAL project is currently planning a workshop for content providers, in order to continue this process and work with the community to develop guidelines that meet the different needs of content owners and presenters/aggregators.

Staying informed

There are several mailing lists and web sites devoted to discussion of portal issues. One that is extremely valuable for those working in UK Further or Higher Education is the portals list on JISCmail.

To join this list, send a message to jiscmail@jiscmail.ac.uk with the body of the message reading

join portals Your_Firstname Your_Lastname
--

e.g.

join portals Paul Miller
--

References

  1. The Iconex project is at: http://www.iconex.hull.ac.uk/.
  2. The Arts & Humanities Data Service (AHDS) is at: http://ahds.ac.uk/.
  3. The Archaeology Data Service's PATOIS project is at: http://ads.ahds.ac.uk/learning/.
  4. The Open Archives Initiative (OAI) is at: http://www.openarchives.org/.
  5. The PORTAL project is at: http://www.fair-portal.hull.ac.uk/.
  6. The JISC's Focus on Access to Institutional Resources (FAIR) Programme is at: http://www.jisc.ac.uk/index.cfm?name=programme_fair.
  7. Some information on RSS is available at: http://www.ukoln.ac.uk/metadata/resources/rss/.
  8. UKOLN's RSS-xpress-Lite tool is available at: http://rssxpress.ukoln.ac.uk/lite/.
  9. Information about the JISC's Information Environment Architecture is available at: http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/.
  10. The Archaeology Data Service's Catalogue is available at: http://ads.ahds.ac.uk/catalogue/.
  11. 'RSS - Sharing Online Content Metadata' by Pete Cliff appeared in issue 7 of Cultivate Interactive, and is available at: http://www.cultivate-int.org/issue7/rss/.
  12. The BBC News site is at: http://news.bbc.co.uk/.
  13. Information on the University of Hull's Digital University Project is available at: http://www.digital.hull.ac.uk/.
  14. RDF Site Summary, version 1.0, is available at: http://purl.org/rss/1.0/spec.
  15. Information on the Resource Description Framework (RDF) is available at: http://www.w3.org/RDF/.
  16. Rich Site Summary, version 0.91, is available at: http://my.netscape.com/publish/formats/rss-spec-0.91.html.
  17. Information on uPortal is available at: http://mis105.mis.udel.edu/ja-sig/uportal/.
  18. UKOLN's Channel Directory is at: http://rssxpress.ukoln.ac.uk/.
  19. Mark Pilgrim's RSS Validator is at: http://feeds.archive.org/validator/.
  20. '5 step guide to becoming a content provider in the JISC Information Environment' by Andy Powell appeared in issue 33 of Ariadne, and is available at: http://www.ariadne.ac.uk/issue33/info-environment/.
  21. Syndic8 is at: http://www.syndic8.com/.
  22. Working with the RDN is available at: http://www.rdn.ac.uk/publications/workingwithrdn/.

Acknowledgements

Thanks to Paul Browning, Pete Johnston, Liz Pearce, Andy Powell, and Robert Sherratt for commenting on a draft of this paper.

UKOLN is funded by Resource: The Council for Museums, Archives & Libraries, the Joint Information Systems Committee (JISC) of the United Kingdom's Further and Higher Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the Universities of Bath and Hull where staff are based.

The Presenting natiOnal Resources To Audiences Locally (PORTAL — http://www.fair-portal.hull.ac.uk/) Project is a joint activity of UKOLN and Academic Services Interactive Media at the University of Hull, funded under the JISC's Focus on Access to Institutional Resources (FAIR) Programme.

Author Details

Paul Miller
Interoperability Focus
UKOLN

Email: p.miller@ukoln.ac.uk
Web site: www.ukoln.ac.uk/interop-focus/