Monday, December 25, 2006

Blog Moved! - PLEASE UPDATE this Feed/Links

This blog has moved to

Tuesday, December 12, 2006

This Blog has moved!

Please update your bookmarks and rss subscriptions. The new url is:, the address of the feed is

Sorry for the inconvenience.

The new blog is pretty much the same*; the main reason for the switch is that I wanted to move the blog over to my own server and domain.

I'll keep this blog online as an archive but will stop posting here.

* There is one new feature: I have enabled comments on the new blog.

Cyc Google TechTalk

Google Video has a video of a talk given by Douglas Lenat, the President and CEO of Cycorp. It's more than 70 minutes long, but worth the time of anyone interested in AI. I want to highlight two parts that I found particularly interesting:

It's been my belief for a while that general purpose reasoners and theorem provers are only good for very few tasks (such as proving the correctness of a program) and that most real world tasks instead need faster, task-specific reasoners or heuristics. For me this thought was always motivated by ideas from cognitive psychology (see for example the research into "Fast and Frugal heuristics" by the ABC Research Group in Berlin). However, I always lacked good computer science arguments to back up this point - now at least I can say that Cycorp sees it the same way:

There is a single correct monolithic reasoning mechanism, namely theorem proving; but, in fact, it's so deadly slow that, really, if we ever fall back on our theorem prover, we're doing something wrong. By now, we have over 1,000 specialized reasoning modules, and almost all of the time, when Cyc is doing reasoning, it's running one or another of these particular specialized modules. (~32:20)
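To make the idea concrete, here is a minimal sketch of such a dispatch in Python. This is only my own illustration, not how Cyc is actually built: the query format, the module `numeric_comparison_module` and the placeholder `general_prover` are all made up for this example. The point is just the control flow: cheap specialized modules get the first try, and the slow general mechanism is only the fallback.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical query format: a plain tuple such as ("greater_than", 5, 3).
Query = tuple
Module = Callable[[Query], Optional[bool]]


def numeric_comparison_module(query: Query) -> Optional[bool]:
    """Specialized module (made up): answers numeric comparisons directly."""
    if (
        len(query) == 3
        and query[0] == "greater_than"
        and all(isinstance(x, (int, float)) for x in query[1:])
    ):
        return query[1] > query[2]
    return None  # None means "this module does not apply"


def general_prover(query: Query) -> bool:
    """Placeholder for a slow general-purpose theorem prover.

    It proves nothing here; it only marks where the fallback would run.
    """
    return False


def answer(query: Query, modules: List[Module], fallback: Module) -> Tuple[bool, str]:
    """Try each specialized module in order, fall back only if none applies."""
    for module in modules:
        result = module(query)
        if result is not None:
            return result, module.__name__
    return bool(fallback(query)), fallback.__name__
```

Calling `answer(("greater_than", 5, 3), [numeric_comparison_module], general_prover)` is handled by the specialized module; a query no module recognizes ends up at the fallback, which in the spirit of the quote should be the rare case.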

I also think that humans are almost constantly reorganizing the knowledge structures in their heads - most of the time becoming more effective at reasoning and quicker at learning. An example of this process is the forming of "thought entities". There seems to be a limit on the number of thought entities that humans can manipulate in their short-term memory. This limit seems to be fixed for life and seems to be somewhere between 5 and 8. What does change with experience is the structure and complexity of these thought entities.

A famous example of the effect of experience on thought entities is the ability of expert chess players and amateurs to recall chess positions. If you show the positions of chess pieces from a normal game to expert players and amateurs, the experts will be much better at recalling the exact positions. But when you place the pieces at random, both will perform equally badly. The common explanation for this phenomenon is that the expert has more complex thought entities at her disposal. In normal chess positions she can find large familiar patterns - like "white has played opening A in variant B". These large and complex thought entities allow the expert to fit the position of up to 32 chess pieces into the available 8 slots. When the chess pieces are placed at random, the structures familiar to the expert no longer appear and she loses her advantage.

I have always wondered what the equivalent of this knowledge reorganization process could be in logic-based systems. Cyc has one interesting answer:

Often what we do in a case like this, if we see the same kind of form occurring again and again and again, is we introduce a new predicate, in this case a relational exists, so that what used to be a complicated looking rule is now a ground atomic formula, in this case a simple ternary assertion in our language (~21:15)
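A rough sketch of what such a rewrite could look like, again in Python and again purely my own illustration: the nested-tuple formula encoding and the predicate name `forAllExists` are assumptions made for this example, not Cyc's actual syntax or vocabulary. The function recognizes one recurring rule shape ("every C1 has some C2 related by R") and replaces it with a single ternary assertion.

```python
def compact(formula):
    """Rewrite a rule of the recurring shape
        forall x: isa(x, C1) -> exists y: isa(y, C2) and R(x, y)
    into one ternary assertion ("forAllExists", R, C1, C2).

    Formulas are nested tuples (an encoding assumed for this sketch).
    Anything that does not have exactly this shape is returned unchanged.
    """
    try:
        (q1, x,
         (imp,
          (isa1, x1, c1),
          (q2, y,
           (conj,
            (isa2, y1, c2),
            (rel, x2, y2))))) = formula
    except (TypeError, ValueError):
        return formula  # wrong structure, nothing to compact

    shape_ok = (
        (q1, imp, q2, conj, isa1, isa2)
        == ("forall", "implies", "exists", "and", "isa", "isa")
        and x == x1 == x2   # the same universally quantified variable
        and y == y1 == y2   # the same existentially quantified variable
        and x != y
    )
    return ("forAllExists", rel, c1, c2) if shape_ok else formula


# Hypothetical example rule: every person has some female person as mother.
rule = ("forall", "x",
        ("implies",
         ("isa", "x", "Person"),
         ("exists", "y",
          ("and",
           ("isa", "y", "FemalePerson"),
           ("hasMother", "x", "y")))))
```

Here `compact(rule)` turns the quantified rule into one flat assertion with three arguments (the relation and the two collections), which is exactly the kind of "bigger chunk" a specialized reasoning module can then work on directly.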

Friday, December 08, 2006

Sporadic Link Post

I haven't made a link post for a long time and the number of links amassed in this time is just too large to post them all here - so here is a selection (you can find all links at, the newest ones are always shown in the sidebar of this blog).

AT&T accurately predicts the future, incorrectly picks delivering company.
OLPC usability - a look at the UI of the $100 laptop (really quite different)
Riya's Is First True Visual Image Search.
Google Ad revenue 'to surpass TV'
Amazon's Elastic Compute Cloud
Pope says that AI researchers risk fate of Icarus

The Future of Entertainment? Very well written article on the making of Lonelygirl15 and what it might mean for the future of entertainment.

Ontology Maturing with Lightweight Collaborative Ontology Editing Tools

Another publication. Will be presented at the Workshop on Productive Knowledge Work : Management and Technological Challenges (ProKW), 4th Conference on Professional Knowledge Management - Experiences and Visions (WM 2007).

Authors are Simone Braun, Andreas Schmidt and me.

Ontology building is an important prerequisite for state-of-the-art semantic technologies for knowledge worker support. But ontology engineering methods have so far neglected the early phase of ontology building where a conceptualization only exists rather informally and underlies continuous evolution through collaboration and interaction within the community. We have to view ontology building as a maturing process that requires collaborative editing support and the integration into the daily work processes of knowledge workers. In spirit of current Web 2.0 applications, we present an AJAX-based lightweight ontology editor as a first approach to this problem.

I won't be at the conference, but the other two will be. My role in writing the paper was rather small anyway. I did, however, do most of the work in defining and implementing one "lightweight collaborative ontology editing tool" presented in the paper. A rather nice AJAX application. A collaborative editor for a subset of SKOS. The cool thing about the editor is that it really supports truly collaborative work - Google Spreadsheet style; i.e. two people can really change the same concept at exactly the same time and nothing will break. Users see almost realtime* updates of the changes other people make to the same taxonomy.

The paper is not yet online, but someday you'll find it here.

* depending on configuration and connection - but the delay is maybe a third of a second.

Ask City And The Semantic Web

Ask City is the new local search portal released by Ask, and no - it's not a Semantic Web application. But it should be.

For me one of the main new ideas I took home from this year's International Semantic Web Conference was that for many Semantic Web technologies there is only a limited window of opportunity to move into the mainstream. If the SW technologies don't make it in time, other technologies will have been used to solve most of the problems that they were conceived for. The other technologies may not solve the problems as completely or as elegantly - but their existence makes SW technologies a harder sell.

Take Ask City as an example. In a way it's a traditional mashup - it integrates data from (at least) Citysearch, Yelp, Judy's Book, Ticketweb and Urban Mapping. Exactly the kind of data integration challenge that the Semantic Web wanted to solve. However, it's probably not created with RDF or OWL because other technologies were more mature, more tools existed, people understood them better...

And there is the "window of opportunity" closing a little bit - SW technologies could solve this problem in a more elegant and flexible manner - but it just got a little bit harder to convince people of that. It's gotten a little bit harder to show a visible(!) added benefit when people already see large scale web information integration happening without RDF.

The BAsAS Architecture For Semantic Web Annotations

A poster I presented at the 1st Semantic Web Authoring and Annotations Workshop at the ISWC 2006.

We describe a generic architecture for the (semi-automatic) creation, storage and querying of annotations of web resources. Our BAsAS architecture uses recent advances from the Semantic Web and Web 2.0 communities to make Semantic Web annotations a reality. The BAsAS architecture makes it easy for users to start to annotate and easy for developers to use the annotations that get created.

Besides describing the general architecture we will also detail an implementation of this architecture built for a Semantic Web community portal.

Think of it as Annotea but better. The presented system addresses some of the most important shortcomings of Annotea: that there are only plugins for the Firefox browser (shutting out the majority of web users) and that there is no query language for annotations.

Actually I'm still quite annoyed that it only got accepted as a poster. It was not "innovative enough", the changes to Annotea not big enough. Ahh well, I put it down to my bad writing. In a way I even agree that we don't need another Semantic Annotation Paper - we need applications that come with a nice user interface and are usable "out of the box" (in particular without the need for the user to worry about finding a server - something you have to do with current Annotea tools).

The long version of the paper is here.  

It's been a long time ...

Yeah, it's been a long time since the last post. I've started to be a bit more serious about sports and ended up going to training/playing every day - which cut down the amount of time I can spend on stuff like this blog ... but such a long time without posts won't happen again ... probably

Saturday, November 18, 2006

SPAM blog

Wondering about the long time without any posts? Google decided that this Blog is a SPAM blog and hence that I shouldn't be allowed to post anymore. Only tonight did I finally get the email from Google:

Your blog has been reviewed, verified, and cleared for regular use so that it will no longer appear as potential spam.

I'm actually quite annoyed by that since I really haven't done anything questionable. I'm at a loss to explain what caused them to suddenly think that this is a SPAM blog ... maybe that I posted a pretty long link list - but that would be a pretty shallow criterion. Maybe that I configured www.valentinzacharias/blog to forward to this site? That I added the link widget to the sidebar? Guess you can just accept a high false positive rate when the cost of false positives is not borne by you. But yes - this is a free service so I shouldn't expect too much.

In the end this is just another reminder not to give Google too much control over your data / your life (or the Internet as a whole, for that matter). I for one will be moving my blog over to my own server. For the time being I'll still use Blogger - but with FTP publishing. At least I'll have the final say and cannot be locked out anymore.

Tuesday, October 24, 2006

The Return Of The Link List

In the early days of the Web everyone had a carefully collected list of links - it was just so difficult to find anything that you had to store whatever you found on your computer in order to be able to find it again. Then along came Google and search seemed so simple that the importance of the link list decreased. But now it seems the link list is making a comeback - albeit in a different form:

Google Co-op allows a user to create and launch a search engine with just a few specific websites included. Searches will return results from only that website.

And actually Google is not the first to offer such a service; Rollyo, Eurekster and Yahoo did it before.

So after first having only link lists and then only search, we now have a combination of the two. I think the next step is a search engine that automatically gives priority to the sites that you visited in the past (Google Toolbar and/or Google Desktop Search know these sites already anyway).
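Just to show how little it would take, here is a small Python sketch of that "next step". Everything in it is an assumption on my part: the `(url, score)` result format, the per-host visit counts and the `boost` factor are invented for illustration and have nothing to do with how any real search engine ranks results. It simply multiplies each result's base score by a factor that grows with the number of past visits to that host.

```python
from urllib.parse import urlparse


def rerank(results, visit_counts, boost=0.1):
    """Re-order search results so that previously visited sites rank higher.

    results:      list of (url, base_score) pairs (format assumed here)
    visit_counts: dict mapping host name -> number of past visits
    boost:        made-up weight per visit; 0 disables personalization
    """
    def personalized_score(item):
        url, base_score = item
        host = urlparse(url).netloc
        return base_score * (1 + boost * visit_counts.get(host, 0))

    # sorted() is stable, so results with equal scores keep their order.
    return sorted(results, key=personalized_score, reverse=True)


# Hypothetical data: two results and a browsing history for one of them.
results = [
    ("http://a.example/page", 1.0),
    ("http://b.example/page", 0.9),
]
history = {"b.example": 5}
```

With this history, `rerank(results, history)` puts the page from `b.example` first even though its base score is lower; with an empty history the original order is kept.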