Synthesizing guids for podcast feeds

Posted by Dave Winer, 12/2/04 at 12:22:56 AM.

This subject came up on the ipodder-dev list a couple of weeks ago -- should we require guids, which are optional in RSS 2.0, on podcast feeds?

The problem is that there are feeds that re-use the same url for an MP3, so you can't look at the url to see if an enclosure is new. If the feed has guids for each item, there's no problem telling that something is new; that's what the guid is for. But for feeds whose items have enclosures, like all podcast feeds, you can synthesize a pretty good, if not perfect, guid by hashing the combination of the enclosure url and the size (the enclosure's length attribute). Here's some pseudo-code that illustrates.

guid = string.hashmd5 (enclosure.url + string (enclosure.length))

You get a string like this: "72562c84bb8e6e7636f5b7c756e0a19e" (32 hex characters, the usual form of an MD5 hash).
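For anyone who wants to try it, here's a minimal Python sketch of the pseudo-code. It assumes the enclosure url and length have already been pulled out of the feed; the function name and the example url are made up for illustration and aren't part of any aggregator.

```python
import hashlib


def synthesize_guid(enclosure_url: str, enclosure_length: int) -> str:
    """Return the MD5 hex digest of the enclosure url followed by its length."""
    key = enclosure_url + str(enclosure_length)
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# Hypothetical enclosure values, for illustration only.
guid = synthesize_guid("http://example.com/audio/latest.mp3", 12216320)
print(guid)  # prints a 32-character hexadecimal string
```

MD5 is used here only to boil the url and size down to a short fixed-length string, not for security, so any stable hash would do as long as the aggregator always uses the same one.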

In fact, the size alone may be good enough as a guid; enclosure lengths are usually such large numbers that you can guess that each one is unique. It seems you won't be wrong very often. But when combined with the url, it seems close to fool-proof: if the url and the length are the same, we say we've already seen it. The one case this misses is a new file at the same url that happens to be exactly the same size. Of course, the best way to be sure is for the feed to provide a guid.
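Here's a rough Python sketch of how an aggregator might apply that rule: use the feed's guid when there is one, fall back to the url-plus-length hash when there isn't, and skip anything already seen. The dictionary keys (guid, enclosure_url, enclosure_length) and the function names are assumptions made for this example, not the format of any particular feed parser.

```python
import hashlib


def item_key(item: dict) -> str:
    """Prefer the feed's own guid; otherwise hash enclosure url + length."""
    if item.get("guid"):
        return item["guid"]
    raw = item["enclosure_url"] + str(item["enclosure_length"])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def new_items(items: list, seen: set) -> list:
    """Return the items not seen before, recording their keys in `seen`."""
    fresh = []
    for item in items:
        key = item_key(item)
        if key not in seen:
            seen.add(key)
            fresh.append(item)
    return fresh


# Hypothetical feed that re-uses one url for every show.
seen = set()
monday = {"enclosure_url": "http://example.com/latest.mp3", "enclosure_length": 9041123}
tuesday = {"enclosure_url": "http://example.com/latest.mp3", "enclosure_length": 11872004}

print(len(new_items([monday], seen)))           # 1: first scan, Monday's show is new
print(len(new_items([monday, tuesday], seen)))  # 1: same url, different size, so Tuesday's is new
print(len(new_items([monday, tuesday], seen)))  # 0: nothing changed since the last scan
```

In a real aggregator the seen set would have to be saved between scans; it's kept in memory here to keep the sketch short.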
