Zope.org - RDFSummary - display RSS files

This is the README file for RDFSummary 2.x. If you have installed version 1.x, then read this file.

RDFSummary is a product to display content from other web sites provided they make it available in RSS 0.9, 0.91, 0.92 or RSS 1.0 format.

The benefit of doing it this way is that the data you get is not encumbered with HTML, giving you more flexibility when applying your own look and feel.

How to use it

First, install RDFSummary.tgz in the Products folder and restart Zope. You will then be able to create objects of the type "RDF Summary". The form asks four questions: the id, the title, the URL of the RSS file, and an optional proxy server. This product understands versions 0.9, 0.91, 0.92 and 1.0 of the RSS format, so enter a URL that returns a file in one of those formats. If your RDF file is password protected, you can put the authentication parameters in the URL, in the form http://user:password@host/file.rdf

If nothing comes to mind, you can always try http://www.slashdot.org/slashdot.rdf. Last time I checked it was version 0.9. Otherwise, I have created some test files on Zope.org.

Prefix all names with http://www.zope.org/Members/EIONET/RDFSummary/
rdfexample1.rdf	The first example from section 7 of the RSS 1.0 core specification. It uses RSS and Dublin Core.
rdfexample2.rdf	The second example from section 7. Since this example uses an unsupported module RDFSummary will not be able to import it.
eventexample.rdf	An example of an event parsed by the experimental support for events.
slashdot.rdf	This is a copy of Slashdot from around 11 April 2001. It is RSS 0.9. Note the use of ISO-8859-1 encoding.

There is also an optional property for a proxy server. Enter the URL of the proxy, as in http://proxy.mycompany.com:8080. Authenticated proxies are supported if you are using Python 2.1. To use an authenticated proxy, enter the URL in the form http://user:password@proxy.mycompany.com:8080.

When you have created the object, you must update (synchronize) it with the content on the remote web server. Click Update to do so. The most common mistakes are a badly encoded file, in which case you get a syntax error, and HTML tags somewhere in the title or description, which are not supported.
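
Both mistakes can be spotted before you click Update. The following is a rough pre-flight check in Python; it is not part of RDFSummary, and the function name preflight is made up for this example. It only checks that the file is well-formed XML and that title and description elements contain plain text.

```python
import xml.etree.ElementTree as ET

# Elements that should contain text only (no HTML markup).
TEXT_ONLY = ("title", "description")

def preflight(xml_bytes):
    """Return a list of problems found; an empty list means none were detected."""
    try:
        # Passing bytes lets the parser honour the encoding named in the file.
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        return ["not well-formed XML: %s" % err]
    problems = []
    for element in root.iter():
        local = element.tag.rsplit("}", 1)[-1]   # drop any '{namespace}' part
        if local in TEXT_ONLY and len(element) > 0:
            problems.append("markup inside <%s>" % local)
    return problems

# Invented sample: a title that contains an HTML tag.
sample = b"<rss><channel><title>Hello <b>world</b></title></channel></rss>"
print(preflight(sample))
```

An empty result does not guarantee that RDFSummary will accept the file; it only rules out these two mistakes.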

Let's say you have created a channel called slashdot. Then insert this in your DTML document:

<dtml-with slashdot>
  <dtml-var "channel()['title']">
  <dtml-var picture>
  <dtml-in items mapping>
    <p>
    <a href="<dtml-var link>">
    <dtml-var title></a><br>
    <dtml-if "_.has_key('description')">
      <dtml-var description>
    </dtml-if>
    </p>
  </dtml-in>
</dtml-with>

or if you prefer ZPT:

<metal:block tal:condition="here/slashdot"
	     tal:define="news here/slashdot"
 >
   <div tal:content="python:news.channel()['title']" />
   <div tal:repeat="item news/items">
       <a href=""
          tal:attributes="href item/link"
          tal:content="item/title"
       />
   </div>
   <div>
       <a href=""
          tal:attributes="href python:news.channel()['link']">More ...</a>
   </div>
</metal:block>

If you want your Site Summary to import data on a regular basis, you can write a program which updates the channel by doing a GET on the update method as in:

 lynx -source http://www.mysite.com/slashdot/update >/dev/null

Generally, it's polite to ask the owner of the RSS file if you can use their file, and also not to retrieve it more than once per hour.
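
The lynx one-liner above can also be written as a small Python script. This is only a sketch: the host www.mysite.com and the channel id slashdot are the same placeholders used above, and the function names are invented for this example.

```python
import urllib.request

def update_url(channel_url):
    """Return the URL of the channel's update method."""
    return channel_url.rstrip("/") + "/update"

def trigger_update(channel_url, opener=urllib.request.urlopen, timeout=30):
    """GET the update method and throw the body away, like lynx ... >/dev/null."""
    with opener(update_url(channel_url), timeout=timeout) as response:
        response.read()
        return response.status

# Example call (needs a reachable Zope site, so it is left commented out):
# trigger_update("http://www.mysite.com/slashdot")
```

Run it from cron or a similar scheduler, no more often than once per hour, in line with the advice above.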

The Image

The image can be referenced in two ways:

<dtml-if "image().has_key('data')">
    <img src="&dtml-absolute_url;/view_image">
</dtml-if>
   or
<dtml-var picture>

The latter is a convenience method that also puts an anchor tag around the img tag if the RSS file contains a link for the image.

Textinput

The textinput could be used like this:

<dtml-with slashdot>
<form method="GET" action="<dtml-var "textinput()['link']">">
<dtml-var "textinput()['description']"><br>
<dtml-var "textinput()['title']">
<input type="text" name="<dtml-var "textinput()['name']">">
</form>
</dtml-with>

How it works

An RDF Site Summary file consists of four main parts. The channel part (inside <channel> tags) describes the summary file itself. The optional image part contains a URL to an image file of about 88x31 pixels. The optional textinput part contains the elements necessary to set up a search for the site you are retrieving the summary from. Finally, there is the items part. The first three are implemented as Python dictionaries called channel, image and textinput. The last one is implemented as a list of dictionaries.

A Python dictionary is also known as an associative array. It is kind of like a sack, where you can put all your goodies tagged with a keyword you can use to get them back.
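
As a rough illustration of the four parts, the data might look like this in plain Python. The values are invented, and the exact keys RDFSummary stores depend on the tags present in the feed; only the overall shape (three dictionaries and a list of dictionaries) comes from the description above.

```python
# Hypothetical data showing the shape only; real keys depend on the feed.
channel = {
    "title": "Example News",
    "link": "http://www.example.com/",
    "description": "News from an example site",
}
image = {
    "title": "Example News",
    "url": "http://www.example.com/logo.png",
    "link": "http://www.example.com/",
}
textinput = {
    "title": "Search",
    "description": "Search Example News",
    "name": "query",
    "link": "http://www.example.com/search",
}
items = [
    {"title": "First story", "link": "http://www.example.com/1",
     "description": "Optional summary of the first story"},
    {"title": "Second story", "link": "http://www.example.com/2"},
]

# Values are fetched by keyword, just as the templates do.
print(channel["title"])
for item in items:
    print(item["title"], item["link"])
```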

RDFSummary parses the summary file, and stores each tag inside the four main parts under a keyword. Since there are only a few mandatory tags, you must typically check that the dictionary contains a key before you use it.
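
A minimal sketch of that check in plain Python, using invented items. It does the same job as the _.has_key('description') test in the DTML example; the function names are made up.

```python
def summarize(item):
    """Return 'title: description', or just the title if there is no description."""
    text = item["title"]                 # assumed present in this example
    if "description" in item:            # optional, so check before use
        text += ": " + item["description"]
    return text

def description_or_default(item, default=""):
    """dict.get() returns a fallback instead of raising KeyError."""
    return item.get("description", default)

with_description = {"title": "First story", "description": "A short summary"}
without_description = {"title": "Second story"}

print(summarize(with_description))
print(summarize(without_description))
```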

RDFSummary supports the core RDF Site Summary 1.0 and two modules: Syndication and Dublin Core. The support is very simple: it maps the namespaced tags to easily usable keywords. Dublin Core has one tag for dates, but RDFSummary doesn't try to understand the date; it just treats it as a string.
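
What such a mapping could look like is sketched below with Python's standard xml.etree.ElementTree. This is not RDFSummary's own parser. It assumes that the keyword is simply the tag name without its namespace (so dc:date becomes date), which is a guess, and the helper names are made up.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
SY = "http://purl.org/rss/1.0/modules/syndication/"
KNOWN_NAMESPACES = {RSS, DC, SY}

def keyword(tag):
    """Turn ElementTree's '{namespace}name' into a plain keyword."""
    if not tag.startswith("{"):
        raise ValueError("tag without namespace: %s" % tag)
    namespace, _, local = tag[1:].partition("}")
    if namespace not in KNOWN_NAMESPACES:
        raise ValueError("unsupported namespace: %s" % namespace)
    return local

def channel_dict(xml_text):
    """Collect the children of <channel> into a dictionary of strings."""
    root = ET.fromstring(xml_text)
    channel = root.find("{%s}channel" % RSS)
    return {keyword(child.tag): (child.text or "").strip() for child in channel}

# Invented sample feed with one Dublin Core tag.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://www.example.com/">
    <title>Example</title>
    <link>http://www.example.com/</link>
    <dc:date>2001-04-11T12:00:00+00:00</dc:date>
  </channel>
</rdf:RDF>"""

info = channel_dict(SAMPLE)
print(info["date"])   # the date stays an uninterpreted string
```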

Restrictions & peculiarities

Unknown tags

The intention of the RSS 1.0 modules is that old parsers should simply ignore tags they don't know about. This product does not do that: an unknown tag gives you a syntax error.

Encoding

The encoding from the xml processing instruction is saved and added to the channel dictionary.

HTML

HTML (or XHTML) is not allowed inside an RDF file. This may come as a surprise to some, but allowing it would defeat what RSS is trying to achieve.

Entities

All known and unknown entities are supported.

Semantic augmentation

Most of the RDF-elements expect just a simple text string. Sometimes this is not good enough. What if you want to provide an email address and phone number for the creator of a resource? The specification makes this possible by semantic augmentation. An example of it:

<dc:creator>
  <addr:address rdf:value="John Smith (+45 1234 5678)">¹
    <addr:name>John Smith</addr:name>
    <addr:phone>+45 1234 5678</addr:phone>
  </addr:address>
</dc:creator>

You just need to know that RDFSummary isn't able to handle semantic augmentation. You will get a syntax error. If you have these kinds of issues maybe RDFGrabber can help you.

Cut-n-paste and Rename

If you use copy-and-paste, then RDFSummary will also copy the pickle-file and load it into the object. But if you use rename, it won't. You must do a manual update.

Note 1: Specifying rdf:value as an attribute is actually legal according to figure 14 in the W3C RDF specification.

Experimental support for events

With version 1.4 I've added experimental support for the event RSS module, which I authored myself and proposed to the RSS-DEV working group. The event module makes it possible for you to grab events such as meetings and conferences from remote websites and create a calendar listed in order of event time.

How does it work? Assume you grab event announcements from three different sources, KDE, Usenix and O'Reilly, and save them to the objects kde, usenix and oreilly. Then you can make a simple calendar like this:

<dtml-var standard_html_header>
<h2>Calendar of events</h2>
<dtml-comment>
  Make one big list from three sources
</dtml-comment>
<dtml-call "REQUEST.set('evlist',[])">
<dtml-in "(kde,usenix,oreilly)">
  <dtml-in items>
  <dtml-call "evlist.append(_['sequence-item'])">
  </dtml-in>
</dtml-in>
<dtml-comment>
  Go through the list sorted on startdate.
  Event adds the elements startdate, enddate, location, organizer and type.
</dtml-comment>
<dtml-in evlist mapping sort=startdate>
    <p>
    <a href="<dtml-var link>"><dtml-var title></a><br>
      <dtml-var startdate>
    <dtml-if "_.has_key('enddate')">
     - <dtml-var enddate>
    </dtml-if>
    <dtml-if "_.has_key('location')">
     <br><dtml-var location>
    </dtml-if>
    </p>
</dtml-in>
<dtml-var standard_html_footer>
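
The same merge-and-sort logic, sketched in plain Python with invented event data. It assumes every item has a startdate and that all sources write dates in the same ISO 8601 layout; dates are kept as strings, so the sort is a plain string sort.

```python
def build_calendar(*sources):
    """Merge the item lists of several channels and sort them by start date."""
    events = []
    for items in sources:
        events.extend(items)
    # String sort: chronological only if all dates share one ISO 8601 layout.
    return sorted(events, key=lambda event: event["startdate"])

# Hypothetical items, standing in for kde.items(), usenix.items(), oreilly.items().
kde = [{"title": "KDE meeting", "link": "http://www.example.org/kde",
        "startdate": "2001-08-20", "location": "Somewhere"}]
usenix = [{"title": "Usenix conference", "link": "http://www.example.org/usenix",
           "startdate": "2001-06-25", "enddate": "2001-06-30"}]
oreilly = [{"title": "O'Reilly conference", "link": "http://www.example.org/oreilly",
            "startdate": "2001-07-23"}]

for event in build_calendar(kde, usenix, oreilly):
    line = event["startdate"]
    if "enddate" in event:               # optional, as in the DTML above
        line += " - " + event["enddate"]
    print(event["title"], line)
```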

Acknowledgements

This product would probably never have seen the light of day if I hadn't been able to build on top of Edd Dumbill's SiteSummary product.