SAWA (Similarity Algorithm based-on WikipediA) was developed to suggest semantic annotations for web services. It computes text-to-text semantic similarity between phrases and returns a value between 0 and 1, where 0 means that the phrases are not similar at all and 1 means that they are completely similar. SAWA was created as an extension of a word-to-word similarity algorithm that uses a Wikipedia dump as its corpus. The algorithm is heavily optimized and can annotate an entire web service in a few seconds.
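As an illustration of how a word-to-word measure can be lifted to a text-to-text score in the range 0 to 1, here is a minimal sketch. It is not SAWA's actual method, which is not detailed here: the best-match averaging scheme, the function names, and the toy word similarity table (standing in for a Wikipedia-based measure) are all assumptions made for the example.

```python
# Hypothetical toy word-to-word similarities; a real system would derive
# these from a corpus such as a Wikipedia dump.
TOY_WORD_SIMILARITY = {
    frozenset(("car", "automobile")): 0.9,
}


def toy_word_sim(a, b):
    """Return a word-to-word similarity in [0, 1] (illustrative only)."""
    if a == b:
        return 1.0
    return TOY_WORD_SIMILARITY.get(frozenset((a, b)), 0.0)


def text_similarity(phrase_a, phrase_b, word_sim):
    """Symmetric best-match average of word similarities, in [0, 1]."""
    words_a = phrase_a.lower().split()
    words_b = phrase_b.lower().split()
    if not words_a or not words_b:
        return 0.0

    def directed(source, target):
        # For each source word, keep its best match among the target words.
        best = [max(word_sim(s, t) for t in target) for s in source]
        return sum(best) / len(source)

    # Average both directions so the result does not depend on argument order.
    return (directed(words_a, words_b) + directed(words_b, words_a)) / 2.0


if __name__ == "__main__":
    print(text_similarity("rent car", "hire automobile", toy_word_sim))
```

Averaging the two directions keeps the score symmetric, and because every word similarity lies in [0, 1], so does the result.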
SENSE (SEmantic N-levels Search Engine) is an IR system that tries to overcome the limitations of the ranked keyword approach by introducing semantic levels that integrate (and do not simply replace) the lexical level represented by keywords. Semantic levels provide information about word meanings, as described in a reference dictionary, and about named entities. SENSE can manage documents indexed at three separate levels (keywords, word meanings, and entities) and can combine keyword search with the semantic information provided by the other two indexing levels.
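One simple way to picture the combination of the three levels is a weighted sum of per-level relevance scores. The sketch below assumes exactly that; the weighting scheme, the level names used as dictionary keys, and the function name are hypothetical and are not taken from SENSE itself.

```python
def combine_levels(level_scores, weights):
    """Merge per-level document scores into one ranked list.

    level_scores: {level_name: {doc_id: score}}
    weights:      {level_name: weight}, assumed to be chosen by the caller
    """
    combined = {}
    for level, scores in level_scores.items():
        weight = weights.get(level, 0.0)
        for doc_id, score in scores.items():
            combined[doc_id] = combined.get(doc_id, 0.0) + weight * score
    # Highest combined score first; ties broken by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    scores = {
        "keyword": {"d1": 1.0, "d2": 0.75},
        "meaning": {"d2": 1.0, "d3": 0.5},
        "entity": {"d3": 1.0},
    }
    weights = {"keyword": 0.5, "meaning": 0.25, "entity": 0.25}
    for doc_id, score in combine_levels(scores, weights):
        print(doc_id, score)
```

A document that matches only at a semantic level still receives a score here, which reflects the idea that the semantic levels complement the keyword level rather than replace it.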
OTTHO (On the Tip of my THOught) is an information seeking system designed to solve a language game that demands knowledge covering a broad range of topics, such as movies, politics, literature, history, proverbs, and popular culture. OTTHO implements a knowledge infusion process that provides background knowledge, allowing a deeper understanding of the items it deals with. The knowledge infusion process consists of two steps: 1) extracting and modeling relationships between words drawn from several knowledge sources; 2) reasoning on the induced models in order to generate new knowledge. In the first step, OTTHO extracts knowledge from sources such as a dictionary, news, Wikipedia, and various unstructured repositories, and builds a memory of linguistic knowledge and world facts. In the second step, starting from external stimuli (e.g. words) that depend on the task to be accomplished, the reasoning mechanism retrieves specific pieces of knowledge from that memory. OTTHO has potential for practical applications beyond solving a language game: it could be used to implement an alternative paradigm for associative information retrieval, as well as for computational advertising and recommender systems.
Natural Language Processing (NLP) has a significant impact on many relevant Web-based and Semantic Web applications, such as information filtering and retrieval. Tools supporting the development of NLP applications are playing a key role in text-based information access on the Web.
META (MultilanguagE Text Analyzer) is a tool for text analysis, designed with the aim of providing a general framework for NLP tasks over different languages. The system implements both basic and advanced NLP functionalities, such as Word Sense Disambiguation.
FIRSt (Folksonomy-based Item Recommender syStem) is a semantic content-based recommender system capable of providing recommendations for items in several domains (e.g., movies, music, books), provided that descriptions of the items are available as text documents (e.g., plot summaries, reviews, short abstracts). The core idea behind FIRSt is to include folksonomies in a classic content-based recommendation model, so that the process of learning user profiles integrates the static content describing items with dynamic user-generated content (namely tags, obtained through social tagging of the items to be recommended).
STaR (Social Tag Recommender) is a content-based tag recommender system. STaR is based on two main assumptions: 1) resources with similar content should be annotated with similar tags; 2) the previous tagging activity of users should be taken into account. The recommendation approach consists of two steps: 1) a preprocessing step, which uses the Apache Lucene engine to build indexes containing information about the resources already tagged (by the user and by the community), and 2) a filtering phase, in which the system retrieves the most similar resources and builds a set of candidate tags, assigning each one a score based on its occurrences in the similar resources. STaR participated in the ECML/PKDD Discovery Challenge 2009.
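The filtering phase can be sketched as follows, assuming the most similar resources have already been retrieved (for instance from the indexes built in the preprocessing step, which is not shown). The choice to weight each tag occurrence by the similarity of the resource it appears in, and the names `candidate_tags` and `top_k`, are assumptions for illustration rather than STaR's documented scoring formula.

```python
from collections import defaultdict


def candidate_tags(similar_resources, top_k=5):
    """Rank candidate tags from already-retrieved similar resources.

    similar_resources: list of (similarity, tags) pairs, where similarity
    is the retrieval score of a resource and tags are its existing tags.
    """
    scores = defaultdict(float)
    for similarity, tags in similar_resources:
        # Count each tag once per resource, weighted by how similar
        # that resource is to the one being annotated (an assumption).
        for tag in set(tags):
            scores[tag] += similarity
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    retrieved = [
        (0.9, ["python", "nlp"]),
        (0.5, ["nlp", "search"]),
        (0.2, ["search"]),
    ]
    for tag, score in candidate_tags(retrieved, top_k=2):
        print(tag, round(score, 2))
```

Tags that occur in several highly similar resources rise to the top, which matches the first assumption above: resources with similar content should carry similar tags.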