InfoWorld

  Friday, August 29, 2003 

Well-formed writing and information routing

The tagging conventions I've been applying for the last four months are really springing to life, now that structured search of my blog is available. For example, my convention has been to write quotations like so:

<p class="quotation" source="...">...</p>

On the search page, one of the canned queries uses this XPath expression to find all the places where I quote Ward Cunningham:

//*[@class='quotation' and @source='Ward Cunningham']

If I want to find Don Box quotes, I can just change that -- in the form's accompanying input field -- to:

//*[@class='quotation' and @source='Don Box']
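For readers who want to try this kind of query outside the search form, here is a minimal sketch in Python. It assumes the standard library's xml.etree.ElementTree, whose XPath subset has no `and` operator, so the two conditions are written as chained predicates instead. The sample markup, the quote text, and the find_quotations helper are hypothetical stand-ins, not the code behind the search page.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed content that follows the tagging convention above.
SAMPLE = """<div>
  <p class="quotation" source="Ward Cunningham">Hypothetical quote one.</p>
  <p class="quotation" source="Don Box">Hypothetical quote two.</p>
  <p class="tip">Not a quotation.</p>
  <p class="quotation" source="Ward Cunningham">Hypothetical quote three.</p>
</div>"""


def find_quotations(root, source):
    """Return the text of every element tagged as a quotation from `source`."""
    # ElementTree's limited XPath has no `and`; chaining two predicates
    # selects the same elements as
    # //*[@class='quotation' and @source='...'] would in full XPath.
    # Note: a source name containing an apostrophe would break this simple
    # string interpolation.
    path = f".//*[@class='quotation'][@source='{source}']"
    return [el.text for el in root.findall(path)]


root = ET.fromstring(SAMPLE)
ward_quotes = find_quotations(root, "Ward Cunningham")
don_quotes = find_quotations(root, "Don Box")
```

The query only works because the content parses as XML in the first place; feed it tag soup and ET.fromstring raises an error before any search can happen.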

While I'm at it, I might as well acknowledge all of the voices that have enriched my blog over the past four months. A snippet of XSLT found them:

Adam Curry, Alf Eaton, Allie Rogers, Annrai O'Toole, Bernard Teo, Bill de hÓra, Bill Gates, Bob Clary, Brendan Eich, Brian Marick, Chad Dickerson, Chris Brumme, Crazy Apple Rumors, Dan Brickley, Danny Ayers, Dave Winer, Don Box, Douwe Osinga, Gordon Weakliem, Hiawatha Bray, Ian Hixie, James Farmer, Jenny Levine, Jesse James Garrett, Jim O'Halloran, John Markoff, Ken Manheimer, Les Orchard, Matt Griffith, Micah Alpern, Mitch Kapor, Nancy McGough, Patrick Logan, Paul Everitt, Paul Graham, Paul Philp, Pete Cole, Peter Wayner, Phil Wainewright, Philip Brittan, Ray Kurzweil, Ray Ozzie, Rob Howard, Robert Ivanc, Robert L. Vaessen, Robert Scoble, Sam Ruby, Samuel Pepys, Sandeepan Banerjee, Scott Reynen, Sean McGrath, Stefano Mazzocchi, Ted Leung, Ted Neward, Tiernan Ray, Tim Bray, Tim Oren, Tom Yager, Tonico Strasser, Ward Cunningham
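I used XSLT to build that list, but the same idea can be sketched in a few lines of Python: walk every element, keep the ones tagged as quotations, and collect the distinct source names. The sample markup and the distinct_sources helper below are hypothetical, and the plain sorted() call is an assumption about ordering rather than a reproduction of what my stylesheet does.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; the real input would be the blog's own XHTML.
SAMPLE = """<div>
  <p class="quotation" source="Ward Cunningham">First.</p>
  <p class="quotation" source="Don Box">Second.</p>
  <p class="quotation" source="Ward Cunningham">Third.</p>
  <p class="quotation" source="Sam Ruby">Fourth.</p>
  <p class="tip">A tip, not a quotation.</p>
</div>"""


def distinct_sources(xml_text):
    """Return the sorted, de-duplicated source names of all quotations."""
    root = ET.fromstring(xml_text)
    names = {
        el.get("source")
        for el in root.iter()                     # every element in the tree
        if el.get("class") == "quotation" and el.get("source")
    }
    return sorted(names)


sources = distinct_sources(SAMPLE)
```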

It's great to be able to reuse content like this. A point I made yesterday bears repeating, because it's central to what Steve Gillmor calls the "information routing" aspect of RSS and blogging. Well-formed content is a powerful enabler for a couple of reasons.

First, you have more control over your own material. If you want to develop a series of elements -- mine include quotations, mini-reviews, tips, and code snippets -- there's no special content-management machinery needed to do so. Just start tagging things accordingly; structured search immediately brings these views to life. Some will merit formalization in the CMS, others won't. This exploratory mode is to the CMS world what dynamic languages and interactive environments are to the world of programming.

The second reason is subtler. Your content doesn't just live on your blog. It flows through the RSS network. If others can perform structured search of your content, and use automated methods to recombine it, then your stuff can resonate more powerfully and is more likely to retain its fidelity as it gets routed around.

To ante up for this game, you have to produce well-formed content. The mainstream blog-writing tools aren't helping at all. Most well-formed writing is done in emacs, still. Can we please change that soon?

 

RSS to replace email? Nah.

I've heard a lot about how Outlook 2003, both alone and in combination with Exchange Server 2003, has been beefed up to fight the war on spam. From a client-only perspective, it doesn't look too promising. Apart from filtering messages that have been externally processed -- for example, by SpamAssassin -- the primary strategy appears to be blacklisting or whitelisting senders. As this screenshot illustrates, Sobig-like worms destroy that strategy. I can neither whitelist nor blacklist email appearing to be from Dave Ogle or Anne Manes or Tom Thompson or Lowell Rapaport. Quite likely, none of these folks has even been infected with the worm. Their names just happened to be chosen randomly from the address books of users who were infected.

For what it's worth, my current lines of defense are:

  1. SpamPal, a local proxy that I use for RBL (realtime blacklist) checking. I point Outlook 2000 at SpamPal on localhost; it rewrites the headers of RBL positives; Outlook filters send them straight to Deleted Items for review.

  2. SpamAssassin. Mail to my InfoWorld address is checked by SpamAssassin. Until Sobig came along, I wasn't getting much mileage out of SpamAssassin, because the IW guys have it running in a conservative mode. SpamBayes, my third line of defense, was doing most of the work. But this SpamAssassin rule has been highly effective against Sobig:

    MICROSOFT_EXECUTABLE (10.0 points) RAW: Message includes Microsoft executable program

    Again, Outlook filters send these straight to Deleted Items for review.

  3. SpamBayes. I'm quite sure that SpamBayes alone would have adapted to Sobig. But by letting SpamAssassin do the grunt work, I reserve SpamBayes for subtler discrimination. You'd think that during this onslaught, my MaybeSpam folder -- where SpamBayes puts messages it's not sure about -- would be overflowing. In fact, only five or 10 messages a day land there, and as usual they are messages that I legitimately have to decide how I want to handle.
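The three layers above amount to a chain of header checks, each run only if the earlier ones let the message through. Here is a rough sketch of that routing logic in Python, not my actual Outlook rules. The header names (X-SpamPal, X-Spam-Flag, X-Spambayes-Classification) are my assumptions about what each tool writes, and the "Spam" folder name is hypothetical; only Deleted Items and MaybeSpam come from the setup described here.

```python
from email import message_from_string


def route(raw_message):
    """Pick a destination folder from headers added by upstream filters."""
    msg = message_from_string(raw_message)

    # Layer 1: the local RBL proxy marks positives (assumed header name).
    if msg.get("X-SpamPal", "").strip().upper().startswith("SPAM"):
        return "Deleted Items"

    # Layer 2: server-side SpamAssassin verdict (assumed header name).
    if msg.get("X-Spam-Flag", "").strip().upper() == "YES":
        return "Deleted Items"

    # Layer 3: SpamBayes classification for whatever is left
    # (assumed header name, value like "unsure; 0.45").
    verdict = msg.get("X-Spambayes-Classification", "")
    verdict = verdict.split(";")[0].strip().lower()
    if verdict == "spam":
        return "Spam"        # hypothetical folder name
    if verdict == "unsure":
        return "MaybeSpam"
    return "Inbox"


flagged = "From: a@example.com\nX-Spam-Flag: YES\n\nbody\n"
unsure = "From: b@example.com\nX-Spambayes-Classification: unsure; 0.45\n\nbody\n"
clean = "From: c@example.com\nSubject: hello\n\nbody\n"
results = [route(flagged), route(unsure), route(clean)]
```

The ordering is the point: the cheap, blunt checks run first, so the statistical filter only has to rule on the hard cases.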

The only real accommodation I've had to make is to reduce the amount of mail I leave on the server, because the volume -- which seems not to be slackening -- was causing quota problems. Also, to be fair, I spend more time scanning for false positives, though nowhere near the amount of time I used to spend sorting things out before I implemented this layered strategy.

There's been a lot of talk about replacing email with RSS. I don't buy it. Although I am a huge fan of RSS, and expect it to largely replace email for subscription-related purposes (e.g., mailing lists), I don't see it as a general solution for ad-hoc person-to-person communication. Nor do I buy the argument that we need to toss SMTP. Obviously, we need to use it in a slightly different way. Of the various proposals floating around, the RMX idea -- a DNS-based solution that enables a receiving mail server to verify whether the sender's IP address is authorized to send from the domain within the sender's address -- seems particularly interesting. (I mentioned RMX in the Canning Spam article last month.) But it would be nuts to throw out the SMTP baby with the spam bathwater, and I'd be really surprised if that were to happen.
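To make the RMX idea concrete, here is a minimal sketch of the check a receiving server would perform. It does not do a real DNS lookup and does not follow the actual RMX record format: the AUTHORIZED table is a hypothetical stand-in for what a domain would publish in DNS, and the domains and addresses are reserved documentation values.

```python
import ipaddress

# Hypothetical stand-in for per-domain records published in DNS:
# each domain lists the networks allowed to send mail on its behalf.
AUTHORIZED = {
    "example.com": ["192.0.2.0/24"],
    "example.org": ["198.51.100.17/32"],
}


def sender_is_authorized(envelope_sender, client_ip, table=AUTHORIZED):
    """True if client_ip is listed for the domain in the sender's address."""
    domain = envelope_sender.rsplit("@", 1)[-1].lower()
    ip = ipaddress.ip_address(client_ip)
    return any(
        ip in ipaddress.ip_network(net)
        for net in table.get(domain, [])   # no record means no match
    )


ok = sender_is_authorized("alice@example.com", "192.0.2.25")
forged = sender_is_authorized("alice@example.com", "203.0.113.9")
```

In this sketch a domain with no entry simply fails the check. A real deployment would more likely treat "no record published" as unknown rather than as a forgery, at least until most domains had signed on.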

 

