In nearly all areas of everyday life and work where people deal with digitized data, the number of electronic documents and media items is growing steadily. As a result, finding relevant information efficiently within document and media collections is becoming ever more difficult. However, personal media collections (e.g. photos and videos) contain knowledge that reflects the context and individual view of the owner, i.e. experiences, memories, emotions, or attitudes. This inherent knowledge acts as the key to managing digital media collections in a way that suits a broad user group. Exploiting it, however, requires an appropriate machine-processable description that allows applications to present and process navigation paths and arrangements based on the human knowledge "behind" the data.
The question that arises with regard to user comfort is how such semantic descriptions could be generated automatically or semi-automatically from existing information and features. Information about media exists to some extent as explicit definitions in the form of annotations and metadata; for specific formats these are quite comprehensive and follow established standards (cf. EXIF, IPTC, XMP, etc.). On the other hand, a substantial portion of the knowledge results implicitly from the content itself (e.g. persons or locations depicted in a photo) and from the structure, creation context, and characteristic features of a document (e.g. size and layout of a text).
Today, common management systems and applications support such knowledge structures only weakly, as they are typically limited to hierarchical navigation and storage of information and extract only selected sets of characteristic metadata or features. The problems and barriers that users typically encounter in search and management tasks within personal media collections result mainly from the lack of expressiveness and flexibility of traditional data models for representing individual knowledge.
The aim of this project is to develop a concept for the semantic-based management of personal media collections that allows users to apply individual knowledge models and paths with as little effort as possible, and that ensures machine-processability and interchangeability.
From this outline of the project goal, the following four working areas arise:
A characteristic of the project is that it explores these working areas jointly, whereas related work mostly examines them in isolation. In the context of personal media collections it is particularly necessary to investigate the acquisition, modeling, storage, and usage of information as interdependent parts, and thus to develop a holistic concept. Furthermore, long-term aspects of ontology-based personal media and knowledge management have so far been examined only rarely or not specifically.
Media items should be processed and analyzed appropriately to acquire information about their content and context. Valuable information can be found in file system information and format-specific file header entries (EXIF, IPTC, XMP, etc.). Depending on the media type, structural information (global features, or local features after segmentation or decomposition) can be extracted and used for clustering or classification according to prototype patterns. Furthermore, structural information such as color distribution can be used to arrange and visualize media collections efficiently and to support navigation through "loose" categories.
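As an illustration of how a global feature such as color distribution could support classification against prototype patterns, the following minimal sketch computes a coarse RGB histogram and assigns an item to the nearest prototype. It is not the project's method: the function names, the bin count, the L1 distance, and the "sunset" and "forest" prototypes are all assumptions chosen for the example, and the pixels are synthetic rather than decoded from a real image.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=2):
    """Normalized coarse RGB histogram (a global feature) of (r, g, b) tuples."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_distance(h1, h2):
    """L1 distance between two sparse histograms (assumed metric)."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)


def nearest_prototype(hist, prototypes):
    """Name of the prototype whose histogram is closest to `hist`."""
    return min(prototypes, key=lambda name: histogram_distance(hist, prototypes[name]))


# Hypothetical prototype patterns for two "loose" categories.
prototypes = {
    "sunset": color_histogram([(255, 0, 0)]),
    "forest": color_histogram([(0, 255, 0)]),
}

# Synthetic photo: three reddish pixels and one greenish pixel.
photo_pixels = [(250, 10, 10)] * 3 + [(10, 250, 10)]
photo_hist = color_histogram(photo_pixels)
category = nearest_prototype(photo_hist, prototypes)
```

In a real setting the pixels would come from an image decoder, and local features after segmentation would need a richer representation than a single histogram.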
To enable semantic modeling and storage of content- and context-oriented information about media items, it is necessary to develop and apply an appropriate ontology model. With the help of semantic technologies (RDF, OWL), the model can be specified in a machine-processable and interchangeable way. Descriptions can therefore also be shared and connected within cooperative web applications and information systems (semantic wikis, etc.).
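To make the idea of a machine-processable, interchangeable description concrete, the sketch below expresses a few statements about one photo as RDF triples and serializes them as N-Triples using only the Python standard library. The `http://example.org/media#` namespace, the `takenAt` property, and the helper names are hypothetical; `rdf:type`, `foaf:Image`, and `foaf:depicts` are existing vocabulary terms. It stands in for an ontology model, which the project would still have to define.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/media#"  # hypothetical namespace for this example


def iri(value):
    """Format an IRI as an N-Triples term."""
    return f"<{value}>"


def literal(value):
    """Format a plain string literal, escaping characters N-Triples reserves."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) term tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(triples))


photo = iri(EX + "photo42")
triples = {
    (photo, iri(RDF_TYPE), iri(FOAF + "Image")),
    (photo, iri(FOAF + "depicts"), iri(EX + "alice")),
    (photo, iri(EX + "takenAt"), literal("Dresden")),  # hypothetical property
}
document = to_ntriples(triples)
```

Because every statement is a self-contained line of IRIs and literals, such a description can be merged with statements from other systems without any shared file layout.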
Regarding the semantics of media items and media collections, individual views and contexts play an important role. This leads to the question of how users deal with media collections, what strategies they follow, and how their individual knowledge can be exploited. With the help of questionnaires and evaluations we try to clarify how individual knowledge can best be accessed and which problems and barriers of management and search could be tackled with the chosen ontology-based approach. Moreover, it is necessary to consider where and to what extent the user can or must be granted freedom and intelligent assistance. Additionally, with regard to semantic modeling and the applied ontology model, the benefit and usability of personal knowledge models and ontology extensions are investigated.
Personal media collections and the related knowledge usually exist over years (or decades) and are subject to continuous change. As views and attitudes shift, information is added, refined, replaced, or removed. In ontology-based knowledge management over long periods of time, consistency and usability must be preserved, and change and evolution should be transparent to the user.
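One possible way to keep such change traceable, shown here only as a sketch, is to record every addition and removal of a statement in an append-only log so that any earlier state can be reconstructed. The `VersionedStore` class, its method names, and the example statements are assumptions made for illustration and are not taken from the project.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    revision: int
    action: str  # "add" or "remove"
    triple: tuple


@dataclass
class VersionedStore:
    """Set of statements plus an append-only log of every change."""
    triples: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def add(self, triple):
        if triple not in self.triples:
            self.triples.add(triple)
            self.log.append(Change(len(self.log) + 1, "add", triple))

    def remove(self, triple):
        if triple in self.triples:
            self.triples.discard(triple)
            self.log.append(Change(len(self.log) + 1, "remove", triple))

    def state_at(self, revision):
        """Replay the log up to and including `revision`."""
        state = set()
        for change in self.log[:revision]:
            if change.action == "add":
                state.add(change.triple)
            else:
                state.discard(change.triple)
        return state


# The owner first files a photo under "holiday", then reclassifies it.
store = VersionedStore()
holiday = ("photo42", "category", "holiday")
family = ("photo42", "category", "family")
store.add(holiday)
store.remove(holiday)
store.add(family)
```

Here `store.state_at(1)` still yields the original "holiday" classification, while `store.triples` reflects the current view; consistency checking against an evolving ontology is not covered by this sketch.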