There has been a lot of fuss lately about Stephen Wolfram’s ambitious project to create a comprehensive “computational knowledge engine” called Wolfram|Alpha.
UPDATE: Stephen Wolfram has now also started a blog at http://blog.wolframalpha.com/
The Berkman Center for Internet & Society at Harvard University hosted a sneak preview of the Wolfram|Alpha system yesterday, 27 April 2009.
This was a full two-hour webcast, with no screenshots (at least not during the webcast): just a talking head for two hours, plus Q&A from the audience.
I finally got hold of a screenshot via TechCrunch:
There is already some good coverage on this by Larry Dignan, Editor in Chief of ZDNet, on his blog here.
Larry summarizes well:
“Four big pieces are behind Wolfram|Alpha:
- Curated data: Free, licensed and feed data, run through human and automated processes to verify the data and make sure it’s “clean and curatable.” At some point, you need a human domain expert.
- Algorithms: Wolfram|Alpha uses a bevy of algorithms, including 5 million to 6 million lines of mathematical code.
- Linguistics: The goal is to interpret free-form language input. Wolfram said Wolfram|Alpha uses various components and techniques to figure out what people are actually asking. Part of that process is filtering out fluff. “We’ve been pretty good at removing linguistic fluff,” said Wolfram, adding that people eventually get to the point where they speak as if they were talking to an expert. “People quickly begin to just type in concepts as they come to them.”
- Presentation: Algorithms try to pick out what’s important to the searcher. Again, Wolfram noted that human-aided algorithms are needed.
Instead of delivering up a bunch of links, the Wolfram|Alpha search engine tries to put a narrative around a user’s question and allow them to drill down. Indeed, the result presentation features graphics and other computational features. Think part calculator, part search engine.”
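As an aside, the “fluff removal” Wolfram describes is essentially query normalization: stripping filler words so that only the concept terms remain. This is my own toy sketch of the idea, not Wolfram|Alpha’s actual pipeline, and the stopword list is a small hand-picked sample:

```python
# Illustrative only: a toy version of "removing linguistic fluff"
# from a free-form query, keeping just the concept terms.
FLUFF = {
    "what", "is", "the", "a", "an", "of", "please", "me", "tell",
    "can", "you", "how", "much", "many", "in", "to", "about",
}

def strip_fluff(query: str) -> list[str]:
    """Lowercase, tokenize, and drop filler words."""
    tokens = query.lower().replace("?", "").split()
    return [t for t in tokens if t not in FLUFF]

print(strip_fluff("What is the population of France?"))
# -> ['population', 'france']
```

After this step, a query reduced to bare concepts ("population france") can be matched against curated data rather than parsed as a full sentence, which fits Wolfram’s observation that users quickly start typing concepts directly.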
Interesting to see that Google upgraded and announced some new features during the Wolfram demo, thereby taking all the attention away from Wolfram and back to Google, with Wolfram firing back a couple of minutes/hours later. Some other coverage about this here and here and on TechCrunch.
The webcast itself was pretty boring. After 43 minutes of monologue, Stephen Wolfram opened the floor for questions. And the first question was right on.
A journalist from O’Reilly wanted to know more about the consistency of the data, and whether you can trust the algorithms this much. Answer: what they are doing is creating an (or the?) authoritative source of data, with a mechanism for people to contribute data and for Wolfram to “audit” that data. Source identification is the key challenge in all this. All this makes me think of Ken Steel’s BSR (Basic Semantic Repository) Beacon project in the mid-’90s, where he would be THE owner of the semantic repository that was going to keep all the tags and semantic meanings of data carried around in XML-like tagged data.
- APIs: three levels of APIs: presentation, underlying XML, and symbolic expressions of the underlying Mathematica source data.
- Metadata: when they open up, the plan is to expose some of the ontology through RDF.
- Upload personal data to the system: the intention is to have a professional, subscription-based version of Wolfram|Alpha.
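To make the “underlying XML” API level a bit more concrete: a common pattern for this kind of answer engine is to return results as titled “pods,” each with a plain-text rendering. The element names and payload below are my invention for illustration, not Wolfram’s published schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical payload in the spirit of a pods-of-results XML response;
# the tag names and values are invented for illustration.
SAMPLE = """\
<queryresult success="true">
  <pod title="Input interpretation">
    <plaintext>population of France</plaintext>
  </pod>
  <pod title="Result">
    <plaintext>64.1 million people</plaintext>
  </pod>
</queryresult>"""

def pods(xml_text: str) -> dict[str, str]:
    """Map each pod's title to its plaintext content."""
    root = ET.fromstring(xml_text)
    return {
        pod.get("title"): pod.findtext("plaintext")
        for pod in root.findall("pod")
    }

result = pods(SAMPLE)
print(result["Result"])  # -> 64.1 million people
```

The appeal of an XML layer over the presentation layer is exactly this: a client can pick out the computed answer programmatically instead of scraping rendered output.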
David Bermaste: what about questions where scientists have a difference of opinion, such as “Are certain classes of PCBs carcinogenic to humans?”, or in other words, “who has the real truth?”
Who is this for? Kids or scientists? Answer: “To make expert knowledge available to anybody, anywhere, anytime.” Wow. That’s ambitious.
What if the question does not make sense? For example, “what is the 300th biggest state in Europe?” At this stage and in this version, Wolfram|Alpha does not return a result.
The challenge also seems to be how you keep the info and the universes of knowledge up to date. Today this project has +/- 100 people working on it (maybe 250 over the last period), but what army of people do you need when this really goes live in a big way? Answer: it’s probably going to end up with about 1,000 people. Sounds a bit underestimated to me, if you ask me.
In essence, all this is about Knowledge Management. And I know quite a few companies that would be interested in throwing in all their unstructured data and having an engine that can make meaning out of it all. So the professional version may be on to something. But in its current state, for the public in general, competing head to head with Google? No, I don’t think so.
I also suggest you have a look at Mendeley, a start-up (initially from Germany, but now based in London) that parses university research papers and discovers patterns in them; just think how this could be applied to basically any type of information. One of their VCs is ex-Last.fm and ex-Skype (and also a professor, or even a doctor, in Economics at the University of Hamburg), and it’s interesting to see how these young net-generation guys are capable of telling their story in less than two minutes, monetization included, and still leave you with a hunger and curiosity to know more.
I never got that thrill of “wanting to know more” during the two-hour Wolfram webcast. I felt bored, kept asking myself “what have I missed here?”, and felt a sort of compassion and respect for somebody’s life’s work of the last 25-30 years. I was also somewhat disturbed by what I consider a form of self-complacency, a bit of an ivory-tower type of discourse, not really accessible to non-experts.
Stephen Wolfram is definitely a very smart and wise man, and it’s clear he is passionate about his work and in search of “intellectual satisfaction”, but I am afraid he won’t match the power and sexiness of Google and the many other newcomers on this stage.
But does this withstand what I would call the “Jeff Jarvis Google Test” of new types of relationships, architecture, publicness, elegant organization, the new economy and business reality, new attitudes, ethics and, last but not least, speed?