software analysis – ImageNotion

One thing that is still missing is the analysis of a semantic annotation tool for images, so let's have a look at ImageNotion. Unfortunately, I cannot find the demo of the system anymore; I think the system is going to be used commercially.

ImageNotion was created by the FZI (Forschungszentrum Informatik) in Karlsruhe. It is a visual technique for the semantic annotation of images and their segments with so-called imagenotions. The system stands out from the crowd of semantic annotation tools with its easy-to-use interface and its use of Web 2.0 techniques, for example drag & drop for the annotation of image regions. A user can point to a specific region of an image by drawing a rectangle on it and combining this selected region with an imagenotion. The system integrates the development of the ontology into the annotation process: if a relation is missing, the user can simply define it.

For example, it is possible to draw a relation like the following: you've got an image depicting the Eiffel Tower in Paris.

So you start by drawing a rectangle on the region that depicts the Eiffel Tower and enter a label in the form of a string. Now you can add further imagenotions by stating that the Eiffel Tower can be found in Paris, Paris is the capital of France, France is a member of the European Union, and so on.
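
The chain of relations above can be sketched as plain subject–predicate–object triples. This is only an illustration of the idea; the concept and predicate names (`locatedIn`, `capitalOf`, `memberOf`) are my own choices, not ImageNotion's actual data model.

```python
# Sketch: imagenotion-style relations as RDF-like triples.
# All names below are illustrative, not ImageNotion's actual vocabulary.
triples = [
    ("image1#region1", "depicts", "EiffelTower"),
    ("EiffelTower", "locatedIn", "Paris"),
    ("Paris", "capitalOf", "France"),
    ("France", "memberOf", "EuropeanUnion"),
]

def related_concepts(start, triples):
    """Follow relations transitively from a starting concept."""
    found, frontier = set(), {start}
    while frontier:
        nxt = {o for s, p, o in triples if s in frontier}
        frontier = nxt - found
        found |= frontier
    return found

# The annotated region is transitively connected to all four concepts.
print(related_concepts("image1#region1", triples))
```

This transitive chaining is what makes the approach interesting for retrieval: a search for "France" could also find images annotated only with "Eiffel Tower".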

So the creation of the ontology is left to the user via simple and intuitive techniques, ranging from the definition of concrete terms like “Angela Merkel” to abstract terms like “politics”, by creating new notions.

Additionally, the system provides techniques for the automatic extraction and annotation of images by detecting faces, emotions, gender, objects and text. These methods enable fast and clever annotation of images. If something is incorrect or missing, the user has the possibility to refine or correct the result.

Future development will concentrate on a mashup service loading pictures from external sources and a web service returning automatic annotations for any image just by sending a URL.

For the structured representation of the annotations, MPEG-7 is used. Persons are described using the Semantic Web standard FOAF.
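
As a rough sketch of what such a FOAF description of a depicted person could look like, here is a small Python helper that emits a Turtle snippet; the person data and URI are invented for illustration.

```python
# Sketch: building a minimal FOAF description (Turtle syntax) for a
# person depicted in an image. URI, name and mailbox are invented.
def foaf_person(uri, name, mbox):
    """Return a Turtle snippet describing a person with FOAF."""
    return (
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n\n"
        f"<{uri}> a foaf:Person ;\n"
        f'    foaf:name "{name}" ;\n'
        f"    foaf:mbox <mailto:{mbox}> .\n"
    )

print(foaf_person("http://example.org/people/alice",
                  "Alice Example", "alice@example.org"))
```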

The system is quite impressive concerning the annotations and ontologies that can be created. As a summary for the development of my prototype:

  • the use of MPEG-7 for the description of technical aspects and image regions should be replaced by the W3C recommendations Ontology for Media Resources and Media Fragments
  • automatic processes should be integrated in future releases
  • FOAF should be used for the semantic description of persons
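
The Media Fragments recommendation addresses rectangular image regions with a `#xywh=` URI fragment, which is what would replace MPEG-7's region descriptions here. A quick sketch (image URL and coordinates are made up):

```python
# Sketch: addressing a rectangular image region with a W3C Media
# Fragments URI (#xywh=x,y,w,h; pixel units by default).
def region_uri(image_uri, x, y, w, h):
    return f"{image_uri}#xywh={x},{y},{w},{h}"

# A 160x90 region with its top-left corner at (120, 40):
print(region_uri("http://example.org/photo.jpg", 120, 40, 160, 90))
```

The nice property of this scheme is that the region is itself a URI, so it can be the subject of further RDF statements.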

useful tools, slides, software and implementations

    • source code: a Java web service based on Apache Axis2 that reads metadata of images in different formats (Dublin Core and MPEG-7 in this case) and displays these properties mapped to the W3C Ontology for Media Resources via its API
    • OpenCyc for the Semantic Web
    • DBpedia
    • Photometadata on Webpages
    • Adobe XMP
    • Raphaël Troncy – addressing and annotating multimedia fragments
    • yuma.min.js – open-source media annotation toolkit
    • EuropeanaConnect – Media Annotation Prototype

vocabularies for the prototype

So after reviewing the discussed vocabularies that can be used to represent image metadata, I've decided to use the following for the implementation of my prototype:

  • Technical, copyright and image region information – W3C Ontology for Media Resources
    This ontology is a W3C recommendation as core vocabulary for the description of media resources on the web. Within the ontology it is possible to relate to fragments of an item by using Media Fragments, which is a W3C recommendation, too.
  • Description of depicted persons – FOAF
    The Friend of a Friend Vocabulary should be used to annotate a person depicted in a picture.
  • DBpedia
    For the semantic description of any other object, DBpedia should be used. It offers a consistent ontology for 1.83 million things (416,000 persons, 52,600 places, …). With the DBpedia Lookup Service it could be possible to ease the creation of semantically defined objects – for example the United States, cities, buildings, animals or famous persons – relying on user-contributed knowledge that is kept up to date. It is also possible to access the DBpedia data set over the web via a SPARQL query endpoint and as Linked Data.
    I've decided to consider only persons depicted in a picture for the implementation of the prototype.
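
Querying the DBpedia SPARQL endpoint boils down to an HTTP GET with the query as a parameter. The sketch below only constructs the request URL (no network call is made); the example query asking for the English abstract of the Eiffel Tower is my own.

```python
from urllib.parse import urlencode

# Sketch: building a request URL for the public DBpedia SPARQL endpoint.
def dbpedia_query_url(sparql):
    params = urlencode({"query": sparql,
                        "format": "application/sparql-results+json"})
    return "https://dbpedia.org/sparql?" + params

query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/Eiffel_Tower> dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}
"""
url = dbpedia_query_url(query)
print(url[:60], "...")
```

Fetching this URL with any HTTP client would return the query results as JSON.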

main facts for prototype implementation – summary

So after analysing two applications I can make a first summary of the main facts concerning the implementation of a prototype:

  • the prototype should match the requirements of the embedded metadata manifesto and the Metadata Working Group
  • therefore, the metadata should be stored in the image file by using XMP
  • the implementation should follow best practices using
  1. MVC-framework
  2. Apache Tomcat Server??
  3. JSF??
  • W3C Ontology for Media Resources as core vocabulary
  • other ontologies for the semantic description of persons, objects and everything else ;) (the prototype should be extensible in this case)
  • Media Fragments specification for the annotation of regions

What's missing right now is the analysis of a semantic image annotation tool, which will be done soon.

Any comments and suggestions are welcome ;)

software analysis – stipple

There are lots of nice and interesting tools for annotating pictures with enriching information that do not focus on semantic markup but on interactive pictures. So I've decided to take a look at Stipple to get a clue about what might be interesting and important for the semantic annotation of images and their regions. In my opinion, this service stands out from the crowd (in comparison to services like ThingLink or LUMINATE™) because of its clean design and lovely usability.

Stipple offers a clean and easy-to-use editor for the annotation of images and their regions. Just by clicking on a part of the image, an editor pops up, and all you have to do is insert any kind of URL. If this URL is related to e.g. Wikipedia, Vimeo, Flickr or YouTube, some information is automatically extracted and displayed in the annotated region.
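
One way such a service could decide which provider a pasted URL belongs to is a simple hostname lookup. This is purely my guess at the mechanism, not Stipple's actual code:

```python
from urllib.parse import urlparse

# Sketch: guessing the content provider of an annotation URL by its
# hostname. Illustrative only, not Stipple's actual implementation.
PROVIDERS = {
    "wikipedia.org": "wikipedia",
    "vimeo.com": "vimeo",
    "flickr.com": "flickr",
    "youtube.com": "youtube",
}

def detect_provider(url):
    host = urlparse(url).hostname or ""
    for domain, name in PROVIDERS.items():
        if host == domain or host.endswith("." + domain):
            return name
    return "generic link"

print(detect_provider("https://en.wikipedia.org/wiki/Eiffel_Tower"))
```

Once the provider is known, the service can fetch a title, thumbnail or summary in a provider-specific way and render it in the annotated region.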

This is an example of a “stippled” image.

So it's really simple to add content and links to your images and to what's depicted inside. Additionally, Stipple supports affiliates by adding products and shops that link directly to the store where you can buy, for example, the shoe shown in the picture.

So, Stipple covers many aspects that might be interesting in a picture and offers a way to point to related resources on the web.

Covering this information in a semantic way would enable machines to know what's inside the picture, opening up new possibilities for the search and retrieval of those images.

software analysis – photordf

There are plenty of software tools and projects related to image annotation already. So I have to analyse these existing solutions to identify nice features and missing features, so that my prototype application for region-based image annotation can build on all of them.

First, I will present PhotoRDF, a project published by the W3C in 2002.

It was a project for the demonstration of technologies then under development, like RDF Schema or the Jigsaw server. The goal was to emphasize the relevance and potential of RDF Schema for metadata on the web.

The system provides three main parts:

  • the RDF Schema
  • an editor to annotate pictures (JPEG images only)
  • a module for the Jigsaw server that can serve either the JPEG image data or the RDF description stored in it, using HTTP content negotiation to determine which of the two a client wants.

    Diagram of the parts of the photo-RDF system. Top left: the pictures are digitized and stored as JPEG images. Bottom left: metadata is written into the pictures with the data-entry program (and can also be edited if corrections are necessary). Right: requests from the Web are served by Jigsaw, by sending either the picture or the metadata, depending on the form of the request.
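
The content-negotiation step can be sketched as: look at the client's Accept header and decide whether to serve the JPEG bytes or the RDF description. This is a deliberately simplified matcher, not Jigsaw's actual implementation:

```python
# Sketch of HTTP content negotiation as the Jigsaw module uses it:
# the same URL serves either the image or its RDF description,
# depending on the Accept header. Simplified substring matching only.
def negotiate(accept_header):
    if "application/rdf+xml" in accept_header:
        return "rdf"    # serve the RDF metadata stored in the file
    if "image/jpeg" in accept_header or "*/*" in accept_header:
        return "jpeg"   # serve the picture itself
    return "406 Not Acceptable"

print(negotiate("application/rdf+xml"))
print(negotiate("image/jpeg"))
```

A real implementation would parse quality values (`q=`) and media-type ranges instead of doing substring checks.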

To sum it up, I will only present the results of the analysis:

Nice features:

  • metadata is stored directly in the image file –> this suits the requirements of the embedded metadata manifesto (buuuuut, the data is stored in the COM section of the JPEG file – an alternative would be XMP, which had just been released at the time of PhotoRDF; it supports different file formats and is stored in the APP1 section of the file)
  • an editor for the annotation of images
  • a server-side service for querying image data or metadata via HTTP requests using content negotiation
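
To make the COM-vs-APP1 point concrete, a JPEG file is a sequence of marker segments; COM is marker 0xFFFE, and APP1 (where XMP and EXIF live) is 0xFFE1. The sketch below lists the segments of a JPEG byte stream and is exercised only on a hand-built byte string, not a real photo:

```python
import struct

# Sketch: listing the marker segments of a JPEG byte stream.
# COM (0xFFFE) is where PhotoRDF stores its RDF; APP1 (0xFFE1)
# is where XMP and EXIF are embedded.
def jpeg_segments(data):
    assert data[:2] == b"\xff\xd8"             # SOI marker
    pos, segments = 2, []
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:                     # SOS: compressed data follows
            break
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker, data[pos + 4:pos + 2 + length]))
        pos += 2 + length
    return segments

# Hand-built fragment: SOI, a COM segment containing "rdf", then SOS.
demo = b"\xff\xd8" + b"\xff\xfe" + struct.pack(">H", 2 + 3) + b"rdf" + b"\xff\xda"
for marker, payload in jpeg_segments(demo):
    print(hex(marker), payload)
```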

Features to be extended:

  • no ability to annotate segments of the picture
  • the vocabulary used to describe the picture is not very expressive, covering only some elements from Dublin Core and some technical and subject features (e.g. whether the picture is a landscape or a portrait) –> the vocabulary must be extended
  • the editor should be web-based too, not a desktop application
  • it should build on best practices, using for example an Apache server instead of the rarely used Jigsaw server
  • the editor does not care about existing metadata – a new application should read that data to avoid duplicated work

region-based image annotation prototype – basic requirements

To get clear about what kind of vocabularies and technologies I will need for the development of my region-based image annotation prototype, I've created a wireframe capturing the main features the prototype should support.

I will explain the main features of the prototype as well as suggestions for implementation.

Main principles of the prototype should be as follows:

  • the prototype should read existing metadata of the image and map overlapping fields according to the mappings defined by the W3C Ontology for Media Resources (a W3C recommendation) as core vocabulary
  • it should be possible to annotate fragments of the image; therefore, Media Fragments should be used
  • it should be possible to edit this metadata; the core vocabulary could be edited via the defined API, for other properties there must be another solution
  • it should be possible to embed the applied metadata directly into the file; Raphaël Troncy proposed using XMP for this
  • for other properties, like the definition of a person depicted in an image, additional vocabularies have to be used (WordNet, rNews)

  1. Depicted Person – link a person shown in the picture to profiles and databases on the web to make identification easy (maybe by using the FOAF vocabulary)
    • E-Mail
    • name, given name
    • URL (e.g. facebook, own homepage etc)
    • etc
  2. Linked Information – for example, you've found the Eiffel Tower and you want to link the segment of the image to the corresponding Wikipedia article
  3. Place – you want to link the location where the photo has been taken
  4. Media – You want to embed additional media items to the segment (maybe another shot of the same object from different perspective)
  5. The image you want to annotate (uploaded)
  6. Information about the author (from the W3C Ontology for Media Resources)
  7. Information about copyright and licence, so others know how they can use the image (from the W3C Ontology for Media Resources)
  8. Technical and administrative information extracted from the image file (from the W3C Ontology for Media Resources; EXIF data that isn't mapped to the ontology, IPTC data)
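
Putting feature 1 together with Media Fragments, an annotation of a depicted person tied to an image region could look like the Turtle generated below. The property choice (`foaf:depicts`) as well as all URIs and coordinates are my assumptions, not a fixed design decision:

```python
# Sketch: annotating an image region (Media Fragments #xywh) with a
# depicted person (FOAF). URIs and coordinates are invented examples.
def region_annotation(image_uri, x, y, w, h, person_uri, name):
    region = f"{image_uri}#xywh={x},{y},{w},{h}"
    return (
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n\n"
        f"<{region}> foaf:depicts <{person_uri}> .\n"
        f'<{person_uri}> a foaf:Person ; foaf:name "{name}" .\n'
    )

ttl = region_annotation("http://example.org/photo.jpg", 10, 20, 80, 120,
                        "http://example.org/people/alice", "Alice Example")
print(ttl)
```

Because the region is a plain URI, the same pattern works for features 2–4 by swapping the person for a Wikipedia article, a place, or another media item.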

So the main question that comes up is how to integrate other vocabularies and ontologies into the Ontology for Media Resources to have a clear base for interoperable metadata?

The features presented are inspired by several other applications and can be traced back to Stipple, PhotoRDF and the extraction of EXIF data as it is done on Flickr.

A brief summary and analysis of each application will follow.

Other thoughts concerning this prototype are future directions like:

  • how could other ontologies be imported to annotate whatever you like in a semantic way? (this kind of application can never reach a level of completeness)
  • what about search engines and other web crawlers? how can they index the content of the annotations? maybe you could produce an automatically generated HTML document for each image with embedded microformats, RDFa (the preferred way – a W3C recommendation) or microdata that can be understood by search engines
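
The RDFa idea could be sketched like this: generate a tiny HTML fragment per image whose attributes carry the metadata for crawlers. The vocabulary choice (Dublin Core terms) and the markup shape are my own simplified example:

```python
# Sketch: a generated HTML fragment with RDFa attributes so crawlers
# can index the annotations. Vocabulary and values are illustrative.
def rdfa_page(image_url, title, creator):
    return (
        '<div prefix="dc: http://purl.org/dc/terms/" '
        f'about="{image_url}">\n'
        f'  <img src="{image_url}" alt="{title}"/>\n'
        f'  <span property="dc:title">{title}</span>\n'
        f'  <span property="dc:creator">{creator}</span>\n'
        "</div>"
    )

print(rdfa_page("http://example.org/photo.jpg",
                "Eiffel Tower", "Alice Example"))
```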

Blog Release

Today, this blog has been launched to document the progress of my bachelor thesis and to serve as a communication tool for everybody supporting and following the development of my thesis.

Please stay tuned – the blog will be filled with further content in the next days.