Sunday, October 30, 2005

Internet and Technology Trends in Germany

The German edition of Technology Review commissioned a Delphi study on long-term technology trends in Germany. The (German) report about the study can be read here.

Asked about technology trends that are important for Germany, the experts named nanotechnology (81%), medical technology (79%) and automation technology (70%). In last place came - and that is why I'm writing about it here - "Internet technology" (which I would assume includes the Semantic Web). It is worrying that only very few experts seem to believe that there is great potential in Internet technology (at least for Germany).

Friday, October 28, 2005

Semantic Web + Web 2.0 + eLearning = ?

I posted my newest paper here [pdf]. I will present it at the International Conference on Intelligent Agents, Web Technology and Internet Commerce at the end of next month.

In this paper I describe how we are trying to take successful Semantic Web and Web 2.0 ideas (like RSS, tagging and REST) and slightly extend them in order to get a bit closer to the vision of the Semantic Web. The title of the paper is "A Metadata Registry For Community Driven E-Learning Sites"; the abstract is:

We present the architecture and the interface of a metadata registry for a large e-learning site. The metadata registry is very simple to integrate by content and application providers and thereby tries to motivate more members of the community to contribute. It takes its inspiration from currently successful Semantic Web architectures and aims to be an evolutionary change to the web – using long established standards where possible.

The same application is also described in a poster at this year's ISWC. Due to an upcoming deadline, however, I can't afford to be away for a week and won't be there in person.

The idea of using RSS as an example of a successful Semantic Web application and carefully building upon it is a bit older - we already described similar ideas last year in the paper Semantic Announcement Sharing. There we analyzed RSS's success, looked for a domain that is similar in some important aspects, replicated the basic RSS structure, and added taxonomies and great extensibility. The result was the specification of a Semantic Web application for sharing information about events. Sadly we never had the resources to really push this - I still think it was a good idea.
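The actual specification isn't reproduced here, but the general shape - an RSS-like item extended with a taxonomy reference - might look something like this Python sketch (the element names and taxonomy URI are my own inventions for illustration, not the actual Semantic Announcement Sharing vocabulary):

```python
# Illustration only: an RSS-item-like entry for an event, extended with
# a category that points into a shared taxonomy. All element names and
# the taxonomy URI below are made up, not taken from the paper.

def event_item(title, date, taxonomy_uri, category):
    return (
        "<item>\n"
        f"  <title>{title}</title>\n"
        f"  <ev:startdate>{date}</ev:startdate>\n"
        f'  <category domain="{taxonomy_uri}">{category}</category>\n'
        "</item>\n"
    )

print(event_item("ISWC 2005", "2005-11-06",
                 "http://example.org/event-taxonomy", "conference"))
```

The point of reusing the RSS item structure is that existing aggregators can still display such a feed, while taxonomy-aware clients get the extra structure for free.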

Tags: , , ,

Tuesday, October 25, 2005

Google Base & the Semantic Web

Google is rumoured to unveil a service called "Google Base" at tomorrow's Zeitgeist conference. At base.google.com it said (for a while):

Google Base is Google’s database into which you can add all types of content. We’ll host your content and make it searchable online for free. [...] You can describe any item you post with attributes, which will help people find it when they search Google Base.

The way I see it, this is very similar to a simple Semantic Web vision (a database-like web), just with central storage. It tries to solve part of what the Semantic Web set out to solve (having structured content on the web). But everybody should be clear that you can't remove the "Web" from "Semantic Web" without cost - having all this data under their control hands Google a unique position in the search market - a position that would then be almost impossible to challenge. Just like Google Print and Froogle, this is an attempt to bind customers to Google by acquiring exclusive content that is only searchable through Google. Semantic Web researchers should be aware that Google cannot be interested in the Semantic Web, because the Semantic Web would make it easier to challenge Google. Google wants structured data - but it wants it exclusively.

Update: Google did not unveil the service, but they wrote a short bit about it on their blog.

Tags: , ,

Vision for the Future Web

The context: Danny Ayers recently asked about alternatives to the Semantic Web vision:

One of the reasons the Semantic Web vision appeals to me is I lack the imagination to think of alternatives. But there must be other strategies for enhancing the Web. As far as I can see it does seem a natural progression from a Web of Documents to a more general Web of Data, and it also seems to make sense to use URIs as the key identifiers. Er… but that’s the Semantic Web.
Readers of his site proposed some alternatives: Syn Web, Padabam and TagTriples.


It seems to me that all the proposed alternatives are actually just different instantiations of the Semantic Web vision:

the idea of having data on the web defined and linked in such a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications
The proposed alternatives differ along the axis of Semi-Structured Web vs. Database Web vs. Knowledge-Base Web - surely an interesting debate, but one within the Semantic Web vision.

I think the first question when searching for an alternative to the Semantic Web vision should be: "Isn't the Internet too large and diverse for just one vision?" And I believe yes, it is - development IS progressing along many dimensions, and the Semantic Web vision is just one of many. Richard Reisman made a nice compilation of dimensions of the future web: the Content Web (the one we know and love), the Social Web, the Semantic Web, the Service Web, the Spatial Web, the Temporal Web, the Sensor Web and the Communication Web. I have my doubts about some of these dimensions, but the Spatial Web and the Social Web are surely interesting visions for the future web.

Tags:

Sunday, October 23, 2005

Next Generation Knowledge Access

"Next Generation Knowledge Access" is the title of one paper from the special issue "Semantic Knowledge Management" of the Journal of Knowledge Management.

The paper is mostly about work done in the SEKT project, but it also includes a nice compilation of trends in knowledge access. These trends are (more about each point in the paper):

  • Desktop search
  • Categorisation (like Verity or Clusty)
  • Integrated search (searching by highlighting something in Microsoft Word ...)
  • Seamless search (implicit queries based on user activities)
  • Personalised search
  • Beyond search (help the user in analysing, working with the results)
  • Visualization
  • Device independence

Tags: , ,

Friday, October 21, 2005

Virtual Sticky Notes

Leo Sauermann gets REALLY excited about Semapedia, an application that allows you to tag the real world with links to Wikipedia articles. While I was still wondering "why only Wikipedia articles?", I came across an article about Socialight - they allow you to tag the real world with anything you want and let you see the tags/annotations of your friends.

While I do think that this is a nice idea, I don't understand the excitement. First, the idea is pretty old, and secondly the infrastructure just isn't there yet.

The idea is pretty old because I saw a similar idea in some EU project years ago, the augmented reality people have been talking about virtual billboards for some time now, and - whether you believe it or not - I thought about starting my own startup with an idea very similar to Socialight four years ago (and I don't think I was the first to come up with this idea). I was serious enough to apply to Jamba (to work there for a while in order to learn how to program mobile phones) and to go to CeBIT to talk to everybody at the T-Mobile stand (at that time they had just released their first location-aware application for mobile phones - a Michelin restaurant guide) ...

The reason they are going to run into technical problems is still the same one that stopped me from actually doing it back then (well, besides me being a coward): With current technology you're talking about a best-case accuracy of maybe 50m in inner cities - outside of cities you measure the accuracy in kilometers. That's just not good enough to annotate your favorite shop or bar - don't even think about that doctor's office in a multistory building. You could of course add GPS to your phone, but this still hasn't happened on a big scale (it's difficult because the power consumption of GPS receivers is pretty high), and GPS has its problems in cities and inside buildings. Socialight only works on phones with GPS - but then, it only works with exactly one type of phone (the Motorola i860).
BTW: I believe that these technical problems are also the reason why Google hasn't started with location-aware advertisements yet (they are thinking about it - read the EULA for the Google Maps JavaScript API). These advertisements may one day pay your mobile phone bill!

I do think that one day we will have applications similar to Socialight, and maybe now is the right time to start, but I'm sceptical that I will be seeing "StickyShadows" in Karlsruhe anytime soon. It may be a little different in America, because there the mobile phone companies were under pressure to provide decent localisation of calls made to the emergency services - I did not follow these developments after I dropped the idea of my own company.

Tags: , ,

OOPS, they rejected my paper ...

Sadly OOPS (Object-Oriented Programming Languages and Systems) rejected my "Simple F-Logic Data Objects" paper. That's actually the first time this has happened to me ... and some reviews were actually quite positive (I'm especially grateful to the reviewer that gave me 8 out of 10 for style and clarity - guess he/she can't speak English either :-)). I'm not sure if I'll find the time to enhance it and send it to another conference, so I'll post the paper here.

Abstract:

We present SFDO - Simple F-logic Data Objects - a lightweight Java middleware that makes it possible to put Java objects into an F-logic knowledge base and to reconstitute the Java objects from the data in the knowledge base. Unlike in object-relational mapping frameworks, the data that is used to create Java objects need not be explicitly contained in the data store but may be the result of complex inference processes.

It may interest some that most reviewers agreed that OO - logic programming integration is a very interesting topic. On the other hand most reviewers - quite rightly - agreed that this paper is too shallow ... (nevertheless, the framework works quite nicely).
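Since the paper itself isn't reproduced here, the core idea from the abstract can only be illustrated roughly: store an object's attributes as facts, and rebuild the object from the knowledge base, where some attribute values are inferred by a rule rather than stored explicitly. This Python sketch uses invented names and a toy inference rule, not SFDO's actual (Java) API:

```python
# Toy analogue of the SFDO idea: objects flattened to facts, plus a
# knowledge base that can derive facts the store never held explicitly.

class Person:
    def __init__(self, name, birth_year, age=None):
        self.name, self.birth_year, self.age = name, birth_year, age

def store(obj, kb):
    """Flatten the object's explicit attributes into (subject, attribute, value) facts.
    Using the name as subject identifier is a simplification."""
    for attr, value in vars(obj).items():
        if value is not None:
            kb.add((obj.name, attr, value))

def infer(kb, current_year=2005):
    """A toy inference rule: derive 'age' from 'birth_year'."""
    derived = {(s, "age", current_year - v)
               for (s, a, v) in kb if a == "birth_year"}
    kb |= derived

def reconstitute(name, kb):
    """Rebuild a Person from explicit plus inferred facts."""
    facts = {a: v for (s, a, v) in kb if s == name}
    return Person(name, facts["birth_year"], facts.get("age"))

kb = set()
store(Person("Ada", 1970), kb)     # 'age' is never stored ...
infer(kb)
print(reconstitute("Ada", kb).age)  # ... yet comes back as 35
```

The interesting part, as in the abstract, is that `reconstitute` cannot tell whether a value was stored or inferred - the knowledge base answers both kinds of question uniformly.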

And I changed the template for this blog; you can now find all my publications to the right.

Thursday, October 20, 2005

Asian Semantic Web Conference 2006

For all of you in organisations with unlimited travel budget: for the first time, next year there will be an Asian Semantic Web Conference (ASWC 2006).

Tags:

Future challenges for law

I wrote about future challenges for IT, and there are quite a few people thinking about those - but I'm wondering: are there researchers in law doing the same somewhere? I'm really annoyed by hearing about "law problems" that prevent robotics from being used. I just heard the chief scientist of VW talk about "Stanley" (the robot that won the Grand Challenge). He was asked how long it will take until Stanley's descendants drive us to work. He couldn't give an estimate because of "law problems". Yes - of course - it is a difficult question who is responsible when a robot causes an accident - but then, a lot of technical challenges need to be solved as well, and we need a solution anyway: why not start now? It's ridiculous that there are now cars (very close to the market) that can park by themselves - but are not allowed to; the driver still needs to be in control of acceleration and the brake, because of "law problems" ...

And maybe we could use a good idea of how to tax work performed by robots & computers? (not sure if we want that, though) And don't forget about new intellectual property rights that can actually work in the digital age.

Actually, I'm willing to admit that there is a good chance that there are researchers worrying about these things - and it's politicians that are just too concerned with terrorism and bird flu. At least in this case we can count on industry buying the attention of politicians as soon as these things get financially more interesting ;-).

Tags: , ,

Tuesday, October 18, 2005

Grand Challenges For IT Research

Last year a group of British scientists tried to identify the future grand challenges in computing research - the results are still interesting to read. The grand challenges are:

  • In Vivo-in Silico - computer simulation of entire living organisms, from the cellular scale on upwards
  • Science for global ubiquitous computing
  • Memories for life: managing information over a human lifetime
  • Scalable ubiquitous computing systems
  • The architecture of brain and mind
  • Dependable systems evolution
  • Journeys in non-classical computation
You can read more about each challenge in the report they published.

Tags: ,

A Cool Semantic Search Engine - Finally

Search Engine Watch yesterday wrote about the new specialized medical search engine Healthline. I had a look at it, and I'm happy to report that it is the first really cool semantic search engine in the wild. Granted, building such a system is not rocket science - but still, I'm not aware of any other public website with a comparable semantic search engine.

The architecture of Healthline is a classic "pre/post semantic search engine" - the simplest kind of semantic search engine (I'll write about the other possibilities, "pseudo-semantic search engine" and "semantic search engine", some other day). At the core of such a search engine is a traditional text index; retrieval is based only on the text of the documents, no metadata is used whatsoever. The background knowledge is used only to augment a query before it is posed to the text index and, in the end, to enrich the result that is displayed to the user. Once a query comes in, it is first examined for references to the ontology/taxonomy; if some are found, the terms that reference the taxonomy/ontology may be removed, replaced, or further terms may be added. An example for Healthline would be a search using a casual term for a disease, which is replaced by the standardized medical term. The altered query is then used to query the text index. In the end, the result returned from the text index is enriched based on the background knowledge. Healthline enriches the result with links offering query refinement ("narrow your search") and query relaxation ("broaden your search"). As a nice touch they sometimes offer "information maps" (search for "Prostate cancer" to see one) for important topics.
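The pre/post pipeline described above can be sketched in a few lines of Python. The taxonomy, documents and matching below are toy assumptions for illustration - not Healthline's actual data or implementation:

```python
# A minimal "pre/post semantic search" sketch: a plain text index in the
# middle, background knowledge used only before and after it.

TAXONOMY = {
    # casual term -> standardized medical term (invented examples)
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}

NARROWER = {
    # standardized term -> more specific terms, for "narrow your search"
    "myocardial infarction": ["anterior myocardial infarction"],
}

DOCUMENTS = {
    1: "new treatment options after a myocardial infarction",
    2: "lifestyle changes to manage hypertension",
}

def preprocess(query):
    """Pre-step: replace casual terms with standardized taxonomy terms."""
    for casual, standard in TAXONOMY.items():
        query = query.replace(casual, standard)
    return query

def text_search(query):
    """The traditional full-text core: naive substring matching here."""
    terms = query.split()
    return [doc_id for doc_id, text in DOCUMENTS.items()
            if any(t in text for t in terms)]

def postprocess(query, hits):
    """Post-step: enrich the result with query-refinement suggestions."""
    refinements = []
    for standard, specific in NARROWER.items():
        if standard in query:
            refinements.extend(specific)
    return {"hits": hits, "narrow_your_search": refinements}

query = preprocess("heart attack recovery")
print(postprocess(query, text_search(query)))
```

Note that the text index itself stays completely ordinary - all the "semantics" lives in `preprocess` and `postprocess`, which is exactly why this architecture is so easy to bolt onto an existing search engine.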

If you think you need such a search engine, you can contact me or one of my employers (FZI or ontoprise). We have already built similar systems (sadly only used in corporate intranets, so I can't link to them here) and would love to make more!

Update: In their feedback to my feedback, the makers of Healthline point out that

we're doing more than just full text index/retrieval.
My guess is that they are increasing recall by either using automatic classification (unlikely, because they would need training data) or latent semantic indexing.

Tags:

Monday, October 17, 2005

Semantic Knowledge Management

The Journal of Knowledge Management has a special issue, "Semantic Knowledge Management"; the entire content is available online.

Tags: ,

Another Semantic Newsportal ...

Inform started its public beta today. It looks similar to Yahoo or Google News, but offers a unique way to dig deeper: for each article there are links for people, places, organizations, topics, industries and products mentioned in the article; clicking on one of these links brings you to articles about that topic. It also proposes new articles based on what you looked at in the past. The New York Times has an article about Inform.

The unique selling point of this site seems to be the good application of named entity recognition (unlike its competitor Topix.net, where it was automatic classification; they use named entity recognition only for locations). I would assume that this algorithm works with some kind of background knowledge (at least a long list of possible persons, products etc.). They also do automatic classification (for "industries" and "topics"), and I give it a fair chance that they are using the output of their named entity recognition plus background knowledge to increase the classification performance (you could have a list of people in sports as background knowledge; recognizing that an article talks about one of these persons increases the chance that it is about sports). Inform does not have as much background knowledge about places as Topix.net.
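That guess can be made concrete with a toy Python illustration. The entity lists, base scores and boost value below are all invented for the example, not Inform's actual algorithm:

```python
# Sketch: using named-entity recognition output plus background
# knowledge (who belongs to which domain) to bias a topic classifier.

SPORTS_PEOPLE = {"Michael Schumacher", "Ronaldinho"}      # invented background knowledge
POLITICS_PEOPLE = {"Angela Merkel"}

def recognize_entities(text):
    """Stand-in for a real NER component: plain dictionary lookup."""
    known = SPORTS_PEOPLE | POLITICS_PEOPLE
    return {name for name in known if name in text}

def classify(text, base_scores):
    """Boost a category's score when a known person of that domain appears."""
    scores = dict(base_scores)
    for entity in recognize_entities(text):
        if entity in SPORTS_PEOPLE:
            scores["sport"] = scores.get("sport", 0.0) + 0.3
        if entity in POLITICS_PEOPLE:
            scores["politics"] = scores.get("politics", 0.0) + 0.3
    return max(scores, key=scores.get)

article = "Ronaldinho scored twice in yesterday's match."
# the text classifier alone slightly prefers politics (0.45 vs 0.4),
# but the recognized sports person tips the decision to "sport"
print(classify(article, {"sport": 0.4, "politics": 0.45}))
```

The same mechanism also works in reverse: the predicted category can help disambiguate which "Springfield" or which person an ambiguous mention refers to.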

Overall it looks like it could become a nice site for news junkies (once they index more stories). For me the named entities look like a more useful way to browse through the news than the many categories of Topix (well, actually I want both! And a decent ontology of places, integrated with Google Earth, Maps and their competitors).

Update:
Someone from Topix.net informed me that they are using named entity recognition as part of their classification, and for more than just locations ... Mea culpa! So the difference comes down to how the classification / named entity recognition results are presented to the user. Topix.net makes it easy to find stories for a particular topic and to integrate the classification results in other sites (see for example the nice Topix.net integration in healthline.com). Inform.com is more tailored to people that want to browse through the news, maybe not looking for a particular topic but exploring.

Tags: , , ,

The no-computer virus

I found this article in an old Economist while cleaning my desk. I really liked it, because it clearly describes the extent of the problems that could at least be lessened by IT, explains why interoperability is so central to IT for health care, and highlights another advantage that may result from more IT in the health care sector: that people finally get full access to and possession of information about their own health.

Some quotes:

People on the right side of the digital divide increasingly take for granted that they can go online to track their FedEx package, to trade shares, file taxes and renew drivers' licenses, and to do almost anything else - unless, of course, it involves their own health. That information, crumpled and yellowing, is spread among any number of hanging folders at all the clinics they have ever visited, and probably long since forgotten about. The most intimate information is, in effect, locked away from its owners in a black box.
This [the reluctance of actors in the health care industry to use IT in the back-office] has perverse consequences. According to the Institute of Medicine, a non-governmental organisation in Washington, DC, preventable medical errors - from unplanned drug interactions, say - kill between 44,000 and 98,000 people each year in America alone. This makes medical snafus the eighth leading cause of death, ahead of car accidents, breast cancer and AIDS. "It's like crashing two 747s a day," says Mark Blat.
A study from the clinical research centre at Dartmouth College [...] estimates that a third of America's $1.6 trillion in annual health-care spending (as of 2003) goes to procedures that duplicate one another or are inappropriate.
Estimating how much IT could save, after taking account of the considerable cost of applying it widely, is not easy. Writing in Health Affairs [...] Jan Walker and five colleagues [...] concluded that a fully interoperable network of electronic health records would yield $77.8 billion a year in net benefits, or 5% of America's annual healthcare spending. This includes savings from faster referrals between doctors, fewer delays in ordering tests and getting results, fewer errors in oral or hand-written reporting, fewer redundant tests, and automatic ordering and re-fills of drugs. It does not include, however, perhaps the biggest potential benefit: better statistics that would allow faster recognition of disease outbreaks (such as SARS or avian flu).

I don't agree, however, with what the Economist sees as "perhaps the biggest potential benefit". In my opinion that is the chance to use data mining on the medical data for quality assurance: to continuously evaluate which treatments work and to check for unexpected side effects of drugs and specific treatments.

Tags: ,

Saturday, October 08, 2005

Congratulations Stanford ..?

Sadly the grandchallenge.org site is a bit broken - the timer of a team that has crossed the finish line does not stop, but keeps increasing... But I was just checking the status board when the first teams crossed the finish line: at that time the team from Stanford was ahead of Red Team Too (they started 5 minutes earlier than "Stanley"), and I'm pretty sure that it was more than two minutes until the other Red Team car crossed the finish line. Three other cars are still running, but they all have more than 50 miles to go, not more than 2 1/2 hours left (to beat Stanley's time), and all took more than 5 hours for their first 50 miles ...

Update:
Engadget is also confused ... they also saw Stanley win, but then changed their mind (I guess they haven't realized that the timers did not stop). However, on this page there is also a CMU student posting a comment saying that the times on the Grand Challenge page are not corrected for pauses (the referees can pause a car, for example to let another car pass; these times are deducted from the total) ... if this is true, then there may be a different winner.

Update 2:
Alright, I guess by now everybody knows, but for the sake of completeness: Yes, Stanford won, it's now official.

Anyway: both Red Teams and Stanford did the track in under 8 hours - an impressive feat indeed.
On the other hand, many very qualified teams did not make it through the track, and I'm especially sad for Team DAD: they were moving faster than all the top teams, and that while relying only on visual sensors and with a team consisting of four technicians! From the little information that is on their website it also appears that they are using some really interesting technology. I'm certain they will try to market their stereo vision technology - surely a company to watch.

Tags:

Friday, October 07, 2005

Moving from Wordpress.com

I'm moving my blog from Wordpress.com to Blogger today ... I really liked Wordpress.com - especially their style sheets are beautiful - but when it comes to reliability they suck big time. In the last week there were three occasions when the entire server was down (one was planned), two times my RSS feed was marked as unreachable by Bloglines, and twice when I tried to edit posts the server was so slow that I gave up.

Wordpress.com is not a commercial service; they do not put ads on your blog and don't charge you money. A main goal of Wordpress.com seems to be to test new versions of the WordPress software, so I could not and did not expect very high reliability. Nevertheless, the way it turns out, it's just unusable. I guess the target audience is people wanting to play with the current WordPress software, not people that want to create and maintain a blog.

If - after all this - you still want an account at wordpress, send me an email (I still have 1 invite). My email address is username "zach" at the server "fzi" in Germany (tld = "de").

Tags:

Thursday, October 06, 2005

Entertaining AI links

Autonomous robotic fish made it into the London Aquarium (video), while Japanese robots learn to bicycle (actually: to keep their balance on the bike even when stationary - bicycling robots were already built many years ago).

The German weekly "Die Zeit" has a quite interesting (but loooong) article about AskJeeves.

And finally, although not really AI-related: need a bigger mobile display? You may want to tattoo one into your skin.

Tags:

Wednesday, October 05, 2005

Topix.net - A Concept Based News Portal?

I just came across an article describing Topix.net - a site that automatically classifies news stories according to their topics. Another major feature seems to be the integrated background knowledge about US geography. Search for Springfield to understand what I mean: not only does the system ask for disambiguation about which Springfield you mean, it also offers a list of "Nearby Cities".


Tags: ,

The RFID Conspiracy …

Wired has a review of the book "Spychips: How Major Corporations and Government Plan to Track Your Every Move with RFID"; almost the same book is also available as "The Spychips Threat: Why Christians Should Resist RFID and Computer Tracking".
Albrecht and McIntyre make a staggering accusation in Spychips: that Philips, Procter and Gamble, Gillette, NCR and IBM are conspiring with each other and the federal government to follow individual consumers everywhere, using embedded radio tags planted in their clothing and belongings.

I'm always wondering when privacy advocates will wake up to the dangers of face recognition; after all, RFID tags need to be put on a person, both the RFID tag and the transceiver are easily detectable, you can destroy RFID chips, and you can easily put things in a conducting bag - stopping anyone from "seeing" the RFID tags inside. With face recognition things are different: all you need is (a couple of) cameras that can be easily concealed, or you just reuse existing CCTV cameras. There is no need to put anything on a person, no way to tell if your face is being recognized, and face recognition software has already shown its robustness against disguises.

The potential of this technology is endless - finally supermarkets can track how often customers come to the store, even if they pay cash or don't buy anything at all. After you have paid with your credit card once, all employees in all stores of the same chain will greet you by name ... Expect face-recognition-enabled CCTV cameras in the supermarket near you in 5 years.

Tags: ,

Grand Challenge finalists announced

DARPA has announced the finalists for the Grand Challenge. I'm saddened that Team Jefferson and their robot (probably the only Java bot) didn't make it :-(

Tags: ,

Tuesday, October 04, 2005

Altova SemanticWorks 2006

From Businesswire:

Altova(R) (www.altova.com), creator of XMLSpy(R) and other leading XML, data management, UML, and Web services tools, today announced a new addition to its award-winning line of XML applications. Altova SemanticWorks(TM) 2006 is a visual Semantic Web development tool with support for Resource Description Framework (RDF) and Web Ontology Language (OWL) creation and editing.
Altova SemanticWorks allows developers to graphically create and edit RDF instance documents, RDF Schema (RDFS) vocabularies, and OWL ontologies with full syntax checking. Context-sensitive entry helpers present developers with a list of permitted choices based on the RDF or OWL dialect they are using, so they can create valid documents quickly and easily.
More at the Altova page. It costs $249, but there is a 30-day trial version available.

Tags:

First finalists for DARPA Grand Challenge ..

DARPA has changed the qualification schedule - 10 teams are already qualified for the finals and don't need to make more track runs (out of a total of 43 semifinalists, 20 teams will be selected to compete in the finals).

The lucky ones are:

  • Red Team (Sandstorm and H1ghlander)
  • IVST1
  • Team TerraMax
  • Team Cornell
  • SciAutonics
  • Virginia Tech team Rocky
  • Axion racing
  • Stanford Racing
  • Team Caltech

BTW: Grand Challenge pictures at Flickr.

Update: DARPA now says that:

DARPA cautioned teams and observers to draw no conclusions from today's approach to runs and practice. The NQE evaluation process includes many tests and standards, of which actual run completion is only one measure.

So I might have been a bit quick in calling these ten teams "finalists" (but hey - Team Cornell even says so on their homepage). Still, these teams did very well, and I would be surprised if one of them got eliminated in the semifinals. Tonight we'll know.

Update 2: Tom's Hardware Guide has nice Grand Challenge coverage.

Tags: ,

Saturday, October 01, 2005

Ugly Ants and Google Earth

From the Google Blog:

At a time when the power of information technology doubles every 12 to 15 months and extends to capture every scrap we have, digitizing biodiversity information is a final frontier for IT. It's an essential step to ensure society maintains and hopefully increases bio-literacy. Toward this end, there's Antweb. It's a project from the California Academy of Sciences that has incorporated the Google Earth interface to provide location-based access to the diversity and wonder of ants: from your backyard to the Congo Basin.

You can find more information at the Google Blog and learn how to try it yourself at the AntWeb page. In the Google Blog entry you can also see a picture of the (extremely ugly) ant "Proceratium google", named this way to thank Google for their support in building the application.

In case you're wondering why I write about this in the Semantic Web context: everybody can have her data shown in Google Earth if she just creates a suitable XML file - and that's exactly the idea of the Semantic Web: more machine-understandable web data makes the web more user-friendly and powerful. And now that this KMZ file is on the web, you can use it in other applications - for example your tourism information system for people with myrmecophobia (fear of ants). Alright ... that may not be the best example, but you get the idea.
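The "suitable XML file" in question is a KML file (KMZ is just its zipped form), and a minimal one can be produced with plain string formatting. This sketch uses made-up coordinates and names; the KML 2.0 namespace URI is an assumption about the format Google Earth accepted at the time:

```python
# Generate a minimal KML document with a single placemark. Everything
# here (name, coordinates) is invented for illustration.

def placemark(name, lon, lat):
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://earth.google.com/kml/2.0">\n'
        '  <Placemark>\n'
        f'    <name>{name}</name>\n'
        f'    <Point><coordinates>{lon},{lat},0</coordinates></Point>\n'
        '  </Placemark>\n'
        '</kml>\n'
    )

print(placemark("Proceratium google find site", 15.3, -0.7))
```

Save the output as a .kml file and any KML-aware application - not just Google Earth - can place it on a map, which is exactly the reuse argument made above.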

Tags: ,

Grand Challenge

The qualification for DARPA's robot race "Grand Challenge" started on the 29th. The race itself is planned for October the 8th. You can get much more information at www.grandchallenge.org (click on NQE for the newest qualification results). Some pictures can be found at C-Net.

Tags: ,

NSF Funding for Semantic Web Development

From Genetic Engineering News:

The National Center for Genome Resources (NCGR) announced today a $1.7 million grant from the National Science Foundation (NSF) to develop a Virtual Plant Information Network (VPIN) in collaboration with Cold Spring Harbor Laboratory (CSHL), Cold Spring Harbor, N.Y., and The Institute for Genomic Research (TIGR), Rockville, Md.
The VPIN will greatly advance semantic Web development for biologists by allowing multiple plant information Web sites to associate their data and services with publicly accessible ontologies.

(I'm always happy to hear that even without DARPA there is still some public US money for Semantic Web research.)

Tags: