RDF Site Summary. A technique of broadcasting a newsfeed from a website
summarising changes. RDF stands for Resource Description Framework. Nature
Magazine sends its daily updates that way. They provide headlines, summaries
and links for all the new web content on their own and associated sites. The
feed is in RDF/XML.
Introduction
You don’t need special server software to serve feeds. You create XML
files and upload them. The only thing you have to do to your server is ensure it
serves XML with the correct MIME type: application/xml or text/xml.
You can check if your server is already
configured by uploading an XML file and checking it with MimeCheck.
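The same check can be sketched by hand. The snippet below is only an illustration, not how MimeCheck works: the function names are hypothetical, the accepted set contains just the two types named above, and `fetch_content_type` assumes the server answers HEAD requests.

```python
from urllib.request import Request, urlopen

# The two MIME types named in the text above.
ACCEPTED_FEED_TYPES = {"application/xml", "text/xml"}


def is_feed_mime_type(content_type: str) -> bool:
    """Return True if a Content-Type header value names an accepted XML type.

    Parameters such as "; charset=utf-8" are ignored and the comparison
    is case-insensitive.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ACCEPTED_FEED_TYPES


def fetch_content_type(url: str) -> str:
    """Issue a HEAD request for url and return its Content-Type header."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")
```

You would call `is_feed_mime_type(fetch_content_type(url))` with the URL of the XML file you uploaded.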
The reader handles the cleverness of polling, downloading and merging what it
has with recent postings.
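The merging step might look roughly like the sketch below. It assumes each item is a dict carrying a unique identifier under a hypothetical `"guid"` key; real readers differ in how they recognise an item they have already seen.

```python
def merge_items(stored, fetched):
    """Merge freshly fetched items into the reader's stored items.

    Both arguments are lists of dicts with a "guid" key. Stored items are
    kept even if the feed no longer lists them. Fetched items with an
    unseen guid are placed in front. Returns (merged, new_items).
    """
    seen = {item["guid"] for item in stored}
    new_items = []
    for item in fetched:
        if item["guid"] not in seen:
            seen.add(item["guid"])
            new_items.append(item)
    return new_items + stored, new_items
```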
There are seven
RDF variants, Atom and RSS. You probably want to use RSS version 2.0, or
support multiple formats including Atom 3. Atom was designed after RSS, and is considered cleaner.
Websites called aggregators spider the various feeds
and keep a central database of them, and redistribute the information in various
ways.
RSS-2
You can see what a typical RSS-2 feed looks like in the one I generate for the
Java Glossary RSS2 feed:
The items are stored in reverse chronological order. Even if you stop posting
old items, your viewers may optionally retain copies of them. Normally you would
just post a week through a month’s worth of the most recent items, then
discard them. People who want to keep old ones, can keep their own copies. You
don’t need to serve an archive.
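As a sketch of that housekeeping, the function below keeps only recent items and orders them newest first. The dict layout, the `"published"` key and the 30-day default are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def trim_feed(items, now, max_age_days=30):
    """Drop items older than max_age_days and sort the rest newest first.

    Each item is a dict whose "published" value is a timezone-aware
    datetime. now is passed in so the result is reproducible.
    """
    cutoff = now - timedelta(days=max_age_days)
    recent = [item for item in items if item["published"] >= cutoff]
    return sorted(recent, key=lambda item: item["published"], reverse=True)
```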
Most of the fields are pretty obvious. You can learn details about the many
optional fields.
Unlike ordinary XML, there is no DTD.
The ttl field means Time To Live
in minutes, how long a copy of your feed could be cached before it should be
considered stale.
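A reader could apply the ttl along these lines. This is a minimal sketch with a hypothetical function name; it treats a copy as stale once the full ttl has elapsed.

```python
from datetime import datetime, timedelta, timezone


def is_stale(fetched_at, ttl_minutes, now):
    """Return True if a cached copy fetched at fetched_at has outlived its ttl.

    fetched_at and now are timezone-aware datetimes; ttl_minutes is the
    value of the feed's ttl field.
    """
    return now - fetched_at >= timedelta(minutes=ttl_minutes)
```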
The image is a logo for your feed, by default 88 ×
31 pixels.
The optional guid (not guide) stands for globally
unique identifier. It is a string that uniquely identifies the
item. When present, software reading your feed can use this unique string to
determine if an item is new. In my HTML Static Macros implementation, I created
it by computing an MD5 digest of the website, feed, link,
title and publish date, giving 16 bytes (128 bits), which I display as a
32-digit hex number. You want a globally unique id, not one that is merely unique
within the file, unique within the feed over time or unique over your entire
website. It is not used to fetch the item. If you say:
<guid isPermaLink="false">2000220155</guid>
then the id is just a meaningless hash.
But if you say:
<guid isPermaLink="true">http://mindprod.com/jgloss/yakshaving.html?123</guid>
then the link is also a URL to the item described by the feed. An undecorated
URL might not be unique if you generated a second RSS item for the same URL when
the story changed.
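A hash-style guid of the kind described above can be sketched as follows. This is not the HTML Static Macros code: the function name, the field order, the separator and the UTF-8 encoding are all assumptions.

```python
import hashlib


def make_guid(website, feed, link, title, pub_date):
    """Return a 32-digit hex MD5 digest of the identifying fields of an item.

    All arguments are strings. They are joined with a separator character
    that should not occur in the fields, so that ("ab", "c") and
    ("a", "bc") do not produce the same digest.
    """
    material = "\x1f".join([website, feed, link, title, pub_date])
    return hashlib.md5(material.encode("utf-8")).hexdigest()
```

The result would go inside a guid element with isPermaLink="false", since it is not a URL.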
The pubDate uses the computer-unfriendly, English-only
alphabetic month and day-of-week names of RFC 822.
It won’t accept UTC, only GMT
or UT. Java does not support UT. The format supports only
a subset of the Java TimeZone abbreviations, so it would be dangerous to use the
Locale default.
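One way to sidestep the zone-abbreviation problem is to convert every date to GMT before formatting. The sketch below uses Python's standard `email.utils.format_datetime`, which always emits English names regardless of locale; the wrapper name is hypothetical, and the four-digit year follows common RSS practice rather than the two-digit year of the original RFC 822.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def format_pub_date(moment):
    """Format a timezone-aware datetime as an RFC 822 style date in GMT."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc_moment = moment.astimezone(timezone.utc)
    # usegmt=True makes the zone read "GMT" instead of "+0000".
    return format_datetime(utc_moment, usegmt=True)


# A time given with a UTC-8 offset is converted to GMT first.
pacific = timezone(timedelta(hours=-8))
print(format_pub_date(datetime(2005, 4, 13, 1, 30, tzinfo=pacific)))
# prints: Wed, 13 Apr 2005 09:30:00 GMT
```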
One field I would have expected is missing: the URL of the xxx.xml feed itself.
The channel’s link element gives the URL of the website, not of the feed.
To get a browser to notice your feed, you need a special entry in the <head>
section of the HTML page hosting the feed that looks like this:
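The usual form of that entry is a link element with rel="alternate" and type="application/rss+xml". The helper below is a small sketch that generates one; the function name, the title and the example.com URL are placeholders, not values from my site.

```python
from html import escape


def feed_link_tag(title, href, mime_type="application/rss+xml"):
    """Build the <link> element that advertises a feed in a page's <head>."""
    return '<link rel="alternate" type="{}" title="{}" href="{}">'.format(
        escape(mime_type, quote=True),
        escape(title, quote=True),
        escape(href, quote=True),
    )


print(feed_link_tag("Example Feed", "http://example.com/feed.xml"))
# prints: <link rel="alternate" type="application/rss+xml" title="Example Feed" href="http://example.com/feed.xml">
```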
RDF
You can see what a typical RDF feed looks like on this Nature
Magazine feed:
FeedBurner
You can see what a typical FeedBurner feed looks like on this FeedBurner
Blast the Right feed:
Note how it permits you to format the text with
HTML tags.
Atom 3
Atom was developed to address some of the problems with the rather Mickey Mouse
RSS. The syntax of the feed is defined in RFC 4287. The companion APP (AtomPub
or Atom Publishing Protocol) is defined in RFC 5023. Atom is considerably more
complicated and includes a server protocol for handling individual items. You
can see what a typical Atom feed looks like on this BlogSpot
feed:
The display is not quite right because the text is being displayed in ISO-8859-1,
where the original is in UTF-8. Each item is sandwiched in entry
tags. UTF-8 encoding allows special characters without entities. It supports
both plain and HTML-formatted text. It allows you to ascribe an author to each
entry. The scheme allows you to update entries. There is a mechanism to reply to
individual entries, as you would in a blog.
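To show how the entry tags and per-entry authors can be read, here is a minimal sketch using Python's standard ElementTree. It assumes the Atom 1.0 namespace of RFC 4287; older pre-standard Atom feeds use a different namespace and would need `ATOM_NS` changed. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of Atom 1.0 feeds (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def list_entries(atom_xml):
    """Return a (title, author name) tuple for each entry in an Atom feed.

    A missing title or author is reported as an empty string.
    """
    root = ET.fromstring(atom_xml)
    result = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        author = entry.findtext(
            "atom:author/atom:name", default="", namespaces=ATOM_NS
        )
        result.append((title, author))
    return result
```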
Viewing Feeds
The feed is in XML. Here is how it will render in various browsers:
How Various Browsers Render RSS Feeds
| Browser | How It Renders |
|---|---|
| Opera | Special RSS-feed column format |
| Firefox | CSS-style sheet HTML. Displays the feed logo. |
| Sea Monkey | raw HTML tags, colourised |
| Mozilla | CSS-style sheet HTML |
| Netscape | CSS-style sheet HTML |
| IE 7 | A blank screen |
| IE 8 | HTML layout |
Advantages
A person interested in your website can use a program called rss2email
to subscribe to sites of interest. The program polls the feeds every so often and
emails anything new. This way, the user does not have to check each site for new
stuff; the new stuff is delivered right to the inbox. If the user is interested
in a particular item, she can follow the link included in the feed.
The Opera browser can manage your RSS feeds. Firefox supports RSS feeds, calling
them live bookmarks. Some sites refer to them as live feeds.
Because the date of each item is encoded in a standard way, it makes it possible
to display just the new unviewed material no matter what the frequency of the
automated visiting.
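A sketch of that filtering: parse each item's pubDate and keep only those later than the previous visit. The dict layout and function name are assumptions; `parsedate_to_datetime` is the standard-library parser for RFC 822 style dates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def unviewed_items(items, last_visit):
    """Return the items published after last_visit.

    Each item is a dict with a "pubDate" string in RFC 822 style with an
    explicit zone such as GMT. last_visit is a timezone-aware datetime.
    """
    return [
        item
        for item in items
        if parsedate_to_datetime(item["pubDate"]) > last_visit
    ]
```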
Setting Up an RSS Feed
If you wanted to set up an RSS feed on your website there are several ways to do
it:
- Use an ordinary text editor to compose an XML file. When the file gets too big,
drop the oldest items.
- Use any of the tools mentioned below.
- You can use Rome to produce
RSS feeds in various formats. The library handles the differences between
supported formats without impacting the developer. Once you have a servlet (or
something similar) to produce a feed using Rome, you can hook it up with FeedBurner,
which will take care of HTML-ing the feed, as well as giving the users options
to bookmark it on various social bookmarking sites, add the feed to various
readers, provide statistics for the click-thru on the feed items, etc. You can
also get a user-friendly URL by doing so. An example of this technique is at Techcrunch.
- Roll your own. RSS is not that complicated. In the simplest case, for each item
you have a date, a title and a subject wrapped in boilerplate. It is not that
difficult to write a program that formats the data in any RSS form and also in
HTML ready to include as a side effect. You can hard code the data into a
program you compile and run each time you update the feed, or you could put it
in a CSV file or a file with one line per field. Or you might use HTML Static
Macros to publish the data both in HTML and RSS. You might find that easier than
learning a third party package. That is how I do the Java
Glossary RSS feed for my own website. I embed magic comments on my webpages
that the HTML Static Macro engine finds, sorts, groups and compiles into RSS 2
feeds.
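To give a feel for the roll-your-own route, here is a minimal sketch that wraps a list of items in RSS 2 boilerplate using Python's standard ElementTree. It is not the HTML Static Macros approach; the function name and the dict keys are assumptions, and only a handful of the available fields are emitted.

```python
import xml.etree.ElementTree as ET


def build_rss2(channel_title, channel_link, channel_description, items):
    """Return an RSS 2.0 document as a string.

    items is a list of dicts with "title", "description" and "pubDate"
    strings, already in the order they should appear. ElementTree escapes
    characters such as & and < in the text for us. No XML declaration is
    emitted.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "description").text = item["description"]
        ET.SubElement(node, "pubDate").text = item["pubDate"]
    return ET.tostring(rss, encoding="unicode")
```

You would write the returned string to the XML file you upload, encoded as UTF-8.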
Aggregators
Aggregators are people who will help publicise your feed. They may also proxy it.
That means they probe it periodically, then serve it to others. This takes a
load off your server. They also convert it to other formats. They may also
provide statistics about your feed’s usage. They may also provide
tools for syndicating your content (such as embedding
headlines into web pages). Aggregators may also convert ordinary blogs into
feeds for you.
Once your feed is operational, you might want to register it with the following
aggregators.
Books
Recommended book: Developing Feeds with RSS and Atom
| | paperback |
|---|---|
| ISBN13: | 978-0-596-00881-9 |
| ISBN10: | 0-596-00881-3 |
| publisher: | O’Reilly |
| published: | 2005-04-13 |
| by: | Ben Hammersley |

Covers RSS-2 and Atom to create website update feeds, and website update consolidating feeds, especially for news websites.