Tuesday, May 4, 2010

Back to basics: quantifying the web

I stumbled upon developing an iGoogle gadget. These small web programs float on the customizable front page of Google account holders. They can be used to push information to the user, and in the near future they may well develop into real applications.


The point isn't actually about this specific technology. What got me thinking was: could we, in the very near future, test, index, and handle the web in a systematic, reliable way? Let's say something along these lines:

if (blog("John Smith") has new content since yesterday) then Display(the content)
or:

find (the best article (about "NASA space mission")) with length of (3-4 pages)
or:
find (a free picture (of "venomous snake") at least (size 1024 x 768))
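Queries like these can be sketched as structured filters over article metadata. Here is a minimal Python sketch, with hypothetical made-up records; the hard part, which the rest of this post argues for, is getting the web to expose its content this cleanly in the first place:

```python
from datetime import date

# Hypothetical structured records -- the web itself rarely exposes
# content this cleanly today.
articles = [
    {"author": "John Smith", "topic": "NASA space mission",
     "pages": 3, "updated": date(2010, 5, 4)},
    {"author": "Jane Doe", "topic": "venomous snake",
     "pages": 10, "updated": date(2010, 4, 1)},
]

def new_since(items, author, since):
    """Mimics: if blog("John Smith") has new content since yesterday."""
    return [a for a in items if a["author"] == author and a["updated"] > since]

def find_article(items, topic, min_pages, max_pages):
    """Mimics: find the best article about <topic> with length 3-4 pages."""
    matches = [a for a in items
               if topic in a["topic"] and min_pages <= a["pages"] <= max_pages]
    # "Best" is a stand-in here: simply pick the most recently updated match.
    return max(matches, key=lambda a: a["updated"], default=None)

yesterday = date(2010, 5, 3)
print(new_since(articles, "John Smith", yesterday))
print(find_article(articles, "NASA space mission", 3, 4))
```

Nothing in this sketch is clever; the interesting question is who supplies the structured records.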

Not just tech, but vision

What I mean is that instead of the technically diverse, sometimes spaghetti-like content we now see in HTML source code, we could have content that can genuinely be utilized in flexible ways. The problem isn't actually technical; I'm sure solutions for this kind of thing exist. The problem is really one of awareness: people's understanding of what openness can mean for the entire world. Just like the common mathematics we use in science, a common language for the web could mean a significant step forward if its information content could easily be used in several contexts.

Origin of WWW

The original WWW architecture was invented by (or is at least attributed to) Sir Timothy John Berners-Lee, an Oxford graduate. While working at CERN, Tim felt that the world's scientific papers and documents were very hard to access. So he started thinking about a platform-neutral solution for information access and exchange, and eventually came up with what became known as the World Wide Web. Technically it was built on a client-server protocol called HTTP, or Hypertext Transfer Protocol, together with HTML for markup and URLs for addressing.

Multiple modalities: context counts

What is interesting is to see how the command line fares against graphical user interfaces, and how both combine with sensor interfaces like those used in virtual-reality glasses. Yet another spice we will probably be using a bit more is context/location sensitivity. You could sort your email based on which city you are in: requests for a meeting, a coffee break, and so on would be prioritized first, since you're far more likely to attend a meeting 5 km from your position than one across the globe.
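That location-sensitive sorting idea fits in a few lines of Python. This is a toy sketch with invented data, assuming each mail is already tagged with the coordinates of the event it concerns (which is itself a big assumption):

```python
import math

def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

# Hypothetical inbox: each mail tagged with the location of its event.
inbox = [
    {"subject": "Coffee break downtown", "where": (60.17, 24.94)},   # Helsinki
    {"subject": "Meeting in Sydney",     "where": (-33.87, 151.21)},
]

here = (60.19, 24.83)  # your current position (somewhere in Helsinki)

# Mails about nearby events float to the top.
prioritized = sorted(inbox, key=lambda m: distance_km(here, m["where"]))
print(prioritized[0]["subject"])  # -> "Coffee break downtown"
```

In practice the distance would be just one signal among many (sender, deadline, calendar), but the sorting key is where the context awareness plugs in.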

Getting the juice out of bits

I've always wanted to develop intelligent filters for email. Email, though quite aged, is still the strong backbone of our communications. It is complemented by mobile communications and instant messaging, but there's still a need for email to carry your "mail" just as in the physical world. I don't think email is likely to be completely displaced by other means of communication.

But, on the downside, I have to say that at least my inbox has become a kind of wild bazaar: sample any 30 consecutive mails and they probably come from at least 27 different contexts. In other words, the email stream has become less coherent and more diversified, which sometimes blocks efficient access to important information. At one point my inbox held over 600 unread mails, and they really did start to weigh on my brain!

Lot of possibilities ahead

So it's a mix-and-match game of picking the important signal out of the noise. Automation, learning filters, and precise custom-made filters can probably help a lot. Perhaps also visualization of emails: a tree, a timeline, or perhaps extracting unique keywords and drawing a mesh network of how the day's emails relate to each other? Time will tell!
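To make the "learning filter" idea concrete, here is a toy sketch (not any real filter I've built): score incoming mail by how often its words appeared in mails previously marked important versus mails marked noise. All the data below is made up for illustration:

```python
from collections import Counter

# Toy training data: subjects the user previously marked important vs. noise.
important_history = ["project deadline friday", "meeting notes project"]
noise_history = ["lottery winner claim prize", "newsletter weekly digest"]

def word_weights(important, noise):
    """Weight each word by how much more often it appears in important mail."""
    good, bad = Counter(), Counter()
    for text in important:
        good.update(text.split())
    for text in noise:
        bad.update(text.split())
    return {w: good[w] - bad[w] for w in set(good) | set(bad)}

def score(mail, weights):
    """Sum the learned weights of the words in a mail; unknown words score 0."""
    return sum(weights.get(w, 0) for w in mail.split())

weights = word_weights(important_history, noise_history)
incoming = ["project status update", "claim your prize now"]
ranked = sorted(incoming, key=lambda m: score(m, weights), reverse=True)
print(ranked[0])  # -> "project status update"
```

A real filter would want smarter weighting (e.g. a Bayesian model, as in spam filtering) and far more training data, but the shape is the same: let the user's own marking behavior teach the filter what "important" means.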