SWAP - Semantic Web Access and Personalization Research Group

Semeraro.PhDGraduates History

July 21, 2011, at 10:49 AM EST by 193.204.187.218 -
Added lines 7-8:
[[#Caputo|Annalina Caputo]] (ciclo XXIII)
Deleted lines 9-10:

[[#Caputo|Annalina Caputo]] (ciclo XXIII)
July 21, 2011, at 10:48 AM EST by 193.204.187.218 -
Changed lines 468-469 from:
-----
[[#Iaquinta]][[#Top|(^)]]
to:
[[#Caputo]][[#Top|(^)]]
Changed lines 524-525 from:
\\\
\\\
to:
-----
[[#Caputo]][[#Top|(^)]]
July 21, 2011, at 10:46 AM EST by 193.204.187.218 -
Added lines 9-10:
[[#Caputo|Annalina Caputo]] (ciclo XXIII)
Added lines 468-469:
-----
[[#Iaquinta]][[#Top|(^)]]
Added lines 471-524:
[[http://www.di.uniba.it/~swap/index.php?n=Membri.Caputo|'''[++Annalina Caputo++]''']]

'''''Semantics and Information Retrieval: Models, Techniques and Applications'''''

[-'''Abstract'''-]

The dialogue between humans and machines takes place on two different levels.
Along the path from the user’s mind to a machine representation, concepts,
relationships and meanings are translated into a flat, unstructured form deprived
of its original meaning. This process, which also affects text representation,
has an impact on Information Access systems, and on Information Retrieval (IR)
systems in particular. The key concept in such systems is the word ''information'',
but when text is represented as an unordered sequence of words the retrieval
task becomes a mere string-matching process. In this context, the user’s
vagueness and word ambiguity become a major challenge for IR systems.

Over the past decades, several attempts have been made to move away from
the traditional keyword search paradigm, often by introducing techniques
to capture word meanings. The result is a vast body of approaches that aim
to harness semantics in Information Retrieval, working on two
different fronts. The first tries to introduce semantics by modeling word
meaning directly in the document representation. The second tries to build an
improved query representation by shifting from what the user asks to what
the user wants. However, the general feeling is that dealing explicitly with ''only''
semantic information does not significantly improve the performance of text
retrieval systems.

The work presented in this thesis explores the usage of semantics in Information
Retrieval on two separate fronts: documents and queries. ''Semantics'' has
many facets and several interpretations in Computer Science, but this thesis
focuses on ''lexical semantics''.

The first part of this dissertation deals with semantics in documents. First,
it presents SENSE (SEmantic N-levels Search Engine), an IR system that
tries to overcome the limitations of the ranked keyword approach by introducing
''semantic levels'' that integrate (rather than simply replace) the lexical level
represented by keywords.

Two algorithms are proposed for representing word meanings in SENSE: the former is based on Word Sense Disambiguation, while the latter exploits Word
Sense Discrimination.

The second part of this work tackles semantics in queries. One
such approach is Query Expansion (QE). Two well-known QE algorithms
are investigated within the SENSE framework: Rocchio and Local Context
Analysis.
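
As a rough illustration of the Rocchio algorithm mentioned above, the following Python sketch shows the textbook relevance-feedback formula: the expanded query is a weighted sum of the original query, the centroid of the relevant documents and, with a negative sign, the centroid of the non-relevant ones. It is not the SENSE implementation. The function name `rocchio`, the dictionary-based term vectors and the default weights `alpha`, `beta` and `gamma` are assumptions made for this example.

```python
from collections import Counter


def rocchio(query, relevant_docs, non_relevant_docs=(),
            alpha=1.0, beta=0.75, gamma=0.15):
    """Return an expanded query as a term -> weight dictionary.

    query and every document are term -> weight dictionaries.
    The weights alpha, beta and gamma are illustrative defaults.
    """
    def centroid(docs):
        # Average the term weights over a set of documents.
        total = Counter()
        for doc in docs:
            total.update(doc)
        count = len(docs)
        return {term: weight / count for term, weight in total.items()} if count else {}

    relevant = centroid(relevant_docs)
    non_relevant = centroid(non_relevant_docs)

    expanded = {}
    for term in set(query) | set(relevant) | set(non_relevant):
        weight = (alpha * query.get(term, 0.0)
                  + beta * relevant.get(term, 0.0)
                  - gamma * non_relevant.get(term, 0.0))
        if weight > 0:  # terms with non-positive weight are dropped
            expanded[term] = weight
    return expanded


# Hypothetical feedback data for the ambiguous query term "jaguar".
expanded_query = rocchio(
    {"jaguar": 1.0},
    relevant_docs=[{"jaguar": 1.0, "cat": 2.0}, {"cat": 2.0, "wild": 1.0}],
    non_relevant_docs=[{"car": 1.0}],
)
print(sorted(expanded_query.items()))
```

In this toy run the query gains the terms "cat" and "wild" from the relevant documents, while "car" is discarded because its weight becomes negative.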

Lastly, this thesis addresses the problem of building complex queries able to
represent concepts and their relationships. Complex queries are built by exploiting
the quantum algebra for structured queries within the Quantum IR framework.

All proposed algorithms and approaches are evaluated on standard test collections,
and the results show that most of them are effective ways of improving the
retrieval task. The methods presented in this thesis demonstrate that the point
in question is ''how'', rather than whether, to add semantics to IR.
-----
Added line 526:
\\\
May 26, 2011, at 07:14 AM EST by 193.204.187.33 -
Changed lines 5-21 from:
[[#Basile|Pierpaolo Basile]] ciclo XXI

[[#deGemmis|Marco de Gemmis]] ciclo XVI

[[#Gentile|Anna Lisa Gentile]] ciclo XXII

[[#Iaquinta|Leo Iaquinta]] ciclo XXII

[[#Licchelli|Oriana Licchelli]] ciclo XVII

[[#Lops|Pasquale Lops]] ciclo XVI

[[#Palmisano|Ignazio Palmisano]] ciclo XIX

[[#Redavid|Domenico Redavid]] ciclo XX

[[#Tinelli|Eufemia Tinelli]] ciclo XXI
to:
[[#Basile|Pierpaolo Basile]] (ciclo XXI)

[[#deGemmis|Marco de Gemmis]] (ciclo XVI)

[[#Gentile|Anna Lisa Gentile]] (ciclo XXII)

[[#Iaquinta|Leo Iaquinta]] (ciclo XXII)

[[#Licchelli|Oriana Licchelli]] (ciclo XVII)

[[#Lops|Pasquale Lops]] (ciclo XVI)

[[#Palmisano|Ignazio Palmisano]] (ciclo XIX)

[[#Redavid|Domenico Redavid]] (ciclo XX)

[[#Tinelli|Eufemia Tinelli]] (ciclo XXI)
May 26, 2011, at 07:13 AM EST by 193.204.187.33 -
Changed line 50 from:
[[#Top|(^)]][[#Lops]]
to:
[[#Lops]][[#Top|(^)]]
Changed line 79 from:
[[#Top|(^)]][[#Licchelli]]
to:
[[#Licchelli]][[#Top|(^)]]
Changed line 136 from:
[[#Top|(^)]][[#Palmisano]]
to:
[[#Palmisano]][[#Top|(^)]]
Changed line 169 from:
[[#Top|(^)]][[#Redavid]]
to:
[[#Redavid]][[#Top|(^)]]
Changed line 252 from:
[[#Top|(^)]][[#Basile]]
to:
[[#Basile]][[#Top|(^)]]
Changed line 271 from:
[[#Top|(^)]][[#Tinelli]]
to:
[[#Tinelli]][[#Top|(^)]]
Changed line 330 from:
[[#Top|(^)]][[#Gentile]]
to:
[[#Gentile]][[#Top|(^)]]
Changed line 389 from:
[[#Top|(^)]][[#Iaquinta]]
to:
[[#Iaquinta]][[#Top|(^)]]
May 26, 2011, at 07:11 AM EST by 193.204.187.33 -
Changed line 50 from:
[[#Lops]]
to:
[[#Top|(^)]][[#Lops]]
Changed line 79 from:
[[#Licchelli]]
to:
[[#Top|(^)]][[#Licchelli]]
Changed line 136 from:
[[#Palmisano]]
to:
[[#Top|(^)]][[#Palmisano]]
Changed line 169 from:
[[#Redavid]]
to:
[[#Top|(^)]][[#Redavid]]
Changed line 252 from:
[[#Basile]]
to:
[[#Top|(^)]][[#Basile]]
Changed line 271 from:
[[#Tinelli]]
to:
[[#Top|(^)]][[#Tinelli]]
Changed line 330 from:
[[#Gentile]]
to:
[[#Top|(^)]][[#Gentile]]
Changed line 389 from:
[[#Iaquinta]]
to:
[[#Top|(^)]][[#Iaquinta]]
May 26, 2011, at 07:08 AM EST by 193.204.187.33 -
Added line 50:
[[#Lops]]
Deleted line 51:
[[#Lops]]
Added line 79:
[[#Licchelli]]
Deleted line 80:
[[#Licchelli]]
Added line 136:
[[#Palmisano]]
Deleted line 137:
[[#Palmisano]]
Added line 169:
[[#Redavid]]
Deleted line 170:
[[#Redavid]]
Added line 252:
[[#Basile]]
Deleted line 253:
[[#Basile]]
Added line 271:
[[#Tinelli]]
Deleted line 272:
[[#Tinelli]]
Added line 330:
[[#Gentile]]
Deleted line 331:
[[#Gentile]]
May 26, 2011, at 07:06 AM EST by 193.204.187.33 -
Added line 389:
[[#Iaquinta]]
Deleted line 390:
[[#Iaquinta]]
May 26, 2011, at 07:04 AM EST by 193.204.187.33 -
Added lines 22-23:

May 26, 2011, at 07:03 AM EST by 193.204.187.33 -
Deleted lines 22-26:




\\\
Added line 25:
May 26, 2011, at 07:02 AM EST by 193.204.187.33 -
Added lines 5-6:
[[#Basile|Pierpaolo Basile]] ciclo XXI
Changed lines 8-12 from:
[[#Lops|Pasquale Lops]] ciclo XVI
to:

[[#Gentile|Anna Lisa Gentile]] ciclo XXII

[[#Iaquinta|Leo Iaquinta]] ciclo XXII
Added lines 14-16:

[[#Lops|Pasquale Lops]] ciclo XVI
Added line 18:
Changed line 20 from:
[[#Basile|Pierpaolo Basile]] ciclo XXI
to:
Deleted lines 21-22:
[[#Gentile|Anna Lisa Gentile]] ciclo XXII
[[#Iaquinta|Leo Iaquinta]] ciclo XXII
May 26, 2011, at 07:00 AM EST by 193.204.187.33 -
Added lines 4-13:
\\
[[#deGemmis|Marco de Gemmis]] ciclo XVI
[[#Lops|Pasquale Lops]] ciclo XVI
[[#Licchelli|Oriana Licchelli]] ciclo XVII
[[#Palmisano|Ignazio Palmisano]] ciclo XIX
[[#Redavid|Domenico Redavid]] ciclo XX
[[#Basile|Pierpaolo Basile]] ciclo XXI
[[#Tinelli|Eufemia Tinelli]] ciclo XXI
[[#Gentile|Anna Lisa Gentile]] ciclo XXII
[[#Iaquinta|Leo Iaquinta]] ciclo XXII
Added lines 15-19:




\\\
Changed line 23 from:
to:
[[#deGemmis]]
Changed lines 45-47 from:
[[http://www.di.uniba.it/~swap/index.php?n=Membri.Lops|'''[++Paquale Lops++]''']]

to:
[[#Lops]]
[[http://www.di.uniba.it/~swap/index.php?n=Membri.Lops|'''[++Pasquale Lops++]''']]

Added line 74:
[[#Licchelli]]
Changed lines 81-128 from:
to:
The rapid evolution of Internet services has led to a constantly increasing number of
web sites and to an increase in the available information. Today, the main challenge is
to support web users in order to facilitate navigation through web sites and to improve
searching among the extremely large web repositories, such as Digital Libraries or other
generic information sources. Personalization, a possible approach to the problem, involves
techniques and mechanisms that reduce this information overload and facilitate
the delivery of relevant information tailored to the preferences of
individual users. Machine Learning techniques have a significant role to play in the
development of personalized services within Digital Libraries. For example, many
Machine Learning techniques are well suited for transforming user-activity data into
useful preference rules as part of a user profile. In web systems, user profiles handle
information that refers to user knowledge in a domain, to her/his personality,
her/his preferences, or to any other information on the user that can be useful in the
configuration of an application.

This thesis explores the role of user profiles in web applications such as online
bookshops, Digital Libraries and e-Learning. In particular, it analyzes the possibility of
enlarging the availability of teaching materials provided by an e-learning system by reusing
materials existing in external sources, such as digital libraries. Therefore, the major
research topic addressed by the thesis is improving the search for educational
materials on the Web. From an educational perspective, related questions
include: What type of search tool should be provided to students to assist them in their
search for course-related materials on the Web? Should it leave the students in control
of their search strategy, or should it use a meta-search-like automatic modification of
their search queries? During a search session in an e-learning system, the learner can
obtain a sequence that can help motivate her/him to learn and prevent her/him from
being frustrated. This sequence is the result of a search modified on the basis of the
information contained in the student model, which describes the preferences, needs and
interests of the student and her/his learning performance.

This thesis describes the design and implementation of a personalization system
(Profile Extractor) that analyzes the data arising from the interaction between
the users and the web application to automatically discover, using Machine Learning techniques,
the user preferences, needs and interests. Moreover, it shows the possible
uses of the user profiles created by the Profile Extractor system in two domains, online
bookshops and digital libraries, where several different experiments have been carried
out in order to measure the efficiency of the user profiles. These two domains have
been used as test beds for the implemented techniques and, since the results of the
experiments have been encouraging, these techniques have been applied in the area of
Student Modelling, that is, the adoption of user profiles in the e-learning domain.
Several experiments have been carried out to check the efficiency of the user profiles
in this context and to compare the effectiveness of the numeric algorithms implemented
by the Profile Extractor system with the symbolic ones, from the area of Inductive
Logic Programming, implemented by another system, along with an evaluation of their
efficiency in order to decide how best to exploit them in the induction of student profiles
in future work.

Added line 131:
[[#Palmisano]]
Added line 164:
[[#Redavid]]
Changed line 175 from:
and other equipment (e.g.: networked scienti c instruments for e-Science or
to:
and other equipment (e.g.: networked scientific instruments for e-Science or
Changed line 186 from:
di erent organizations o er. The emergence of Web Service technology allowed this
to:
different organizations offer. The emergence of Web Service technology allowed this
Changed lines 192-193 from:
entities grounded within the real world (such as those o ered by network or
utility providers) that may o er some provision of value in some domain. As the
to:
entities grounded within the real world (such as those offered by network or
utility providers) that may offer some provision of value in some domain. As the
Changed line 199 from:
ow de nition languages such as
to:
of definition languages such as
Changed line 213 from:
These de nitions can be applied both to Web Services applications and to agentbased
to:
These definitions can be applied both to Web Services applications and to agent-based
Changed lines 216-217 from:
following analogy illustrates these concepts and their di erences:
to:
following analogy illustrates these concepts and their differences:
Changed lines 220-222 from:
From the Web service perspective, an orchestration is a declarative speci -
cation that describes a work
ow to support the execution of a speci c business
to:
From the Web service perspective, an orchestration is a declarative specification that describes a work
to support the execution of a specific business
Changed line 225 from:
orchestrate in a set of services, we need to be able to nd, select, combine and
to:
orchestrate in a set of services, we need to be able to find, select, combine and
Changed line 230 from:
ofWeb Service remaining entirely in the SemanticWeb sphere. In particular,
to:
of Web Service remaining entirely in the SemanticWeb sphere. In particular,
Added line 244:
Added line 247:
[[#Basile]]
Added line 266:
[[#Tinelli]]
Deleted line 300:
Added line 325:
[[#Gentile]]
Added line 384:
[[#Iaquinta]]
October 01, 2010, at 04:26 AM EST by 193.204.187.101 -
Changed line 244 from:
This thesis aims to advance the state of the art in research on efficient reasoning
to:
This thesis aims at advancing the state of the art in research on efficient reasoning
Changed lines 250-253 from:
The contributions of this research could be summarized as follows:
* We describe a preliminary matchmaking approach which investigates instance modeling in a relational database. It is not dependent on
the domain and it implements several match classes, exploiting SQL standard only. Moreover, two ontologies modeling different domains are used to built several datasets of instances in order to better verify services performance.
* On the basis of results of the above mentioned approach, we present a complete matchmaking algorithm specially suitable for skill matching. Distinguishing features include: the possibility to express both strict requirements and preferences in the user request, a logic-based ranking of retrieved instances and the explanation of rank results. All services only rely on ad hoc queries translated in standard SQL: no built-in operator and/or new constructor are exploited. In this approach, both requests and offers must be expressed using the same reference template which is necessary to define their structure and expressiveness.
to:
The contribution of this research could be summarized as follows:
* We describe a preliminary matchmaking approach which investigates instance modeling in a relational database. It is not dependent on the domain and it implements several match classes, exploiting SQL standard only. Moreover, two ontologies modeling different domains are used to built several datasets of instances in order to better verify services performance.
* On the basis of results of the above mentioned approach, we present a complete matchmaking algorithm specially suitable for skill matching. Distinguishing features include: the possibility to express both strict requirements and preferences in the user request, a logic-based ranking of retrieved instances and the explanation of rank results. All services only rely on ad hoc queries translated in standard SQL: no built-in operator and/or new constructor are exploited.
October 01, 2010, at 04:06 AM EST by 193.204.187.101 -
Changed line 210 from:
In order to offer a powerful and automated retrieval process, the simple keywordbased
to:
In order to offer a powerful and automated retrieval process, the simple keyword-based
Changed lines 212-213 from:
process can be time-consuming and unsatisfactory. Generally speaking, those systems
are keyword-based and then a user can express only her mandatory requirements (there
to:
process can be time-consuming and unsatisfactory. Generally the user can express only her mandatory requirements (there
Changed line 214 from:
systems return not ranked and often irrelevant results without explanations. The efficiency
to:
systems return often irrelevant results without explanations. The efficiency
Changed lines 216-221 from:
frameworks able to perform the match among user requests and offers. From
this point of view, it is noteworthy that non-logical approaches to resource retrieval
and matchmaking have serious limitations. For example, by exploiting standard relational
database techniques to model a resource retrieval framework, there is the need
to completely align the attributes of the offered and requested resources, in order to
perform a match. If requests and offers are simple names or strings, the only possible
to:
frameworks able to perform the match among user requests and offers. If requests and offers are simple names or strings, the only possible
Deleted lines 220-224:
Moreover, in real contexts, very often there are no offers that are better than the
others ones from every user selection criteria. We consider that in these cases, i.e.,
when exact matches are lacking, instead of receiving an empty set as search result, user
could accept worse alternatives gradly or she could negotiate the original requirements
for compromises.
Changed line 230 from:
The final goal is to retrieve only the best offers, opportunely ranked, w.r.t. the user
to:
The final goal is to retrieve only the best offers, opportunely ranked, with respect to the user
Changed lines 233-239 from:
The problemof reasoning efficiency is not new in literature. Knowledge
Compilation, infact, is a technique exploited for making reasoning computationally
easier in a knowledge base (KB) typicallymodelled using a logical formalism. The
idea of knowledge compilation is to split query answering into two phases:
* in the first one the knowledge base is preprocessed, thus obtaining an appropriate data structure (such a phase is sometimes called off-line reasoning);
* in the second phase, the query is actually answered using the output of the first phase (such a phase is sometimes called on-line reasoning).
to:
Changed lines 236-237 from:
classical relational database systems (RDBMS) and languages i.e., SQL, for storing the
KB and to perform reasoning tasks. Several approaches have been presented in which
to:
classical relational database systems (RDBMS) and languages i.e., standard SQL, for storing the
KB and to perform reasoning tasks respectively. Several approaches have been presented in which
Changed line 241 from:
with this work approaches will be discussed because the problem of preference
to:
with the approaches of this work will be discussed because the problem of preference
Changed line 246 from:
intends to show how appropriatemodeling of the KB can improve semantic matchmaking
to:
intends to show how appropriate modeling of the KB can improve semantic matchmaking
Changed lines 248-255 from:
aspects and dimensions:

* '''Application''' – For what is resource retrieval used? And resource composition? Application fields are several and different.
* '''Efficient semantic matchmaking''' – What language is necessary to build both user request and offer semantic description? What data structure enables to perform reasoning tasks? How can we evaluate the reasoning efficiency?
** '''KB modeling''' – How is it possible to respect an Open-world Assumption by means of an RDBMS based on the Closed-world Assumption? Other issue is related to stored information useful for retrieval. In other terms, we discuss which data (structured and not, instances and ontological inforation) have to be stored in order to provide services such as matchmaking, ranking and match explanation.
** '''Match classes''' – Which match classes are allowed? Which algorithms are implemented? Is the system scalable in the sense that the retrieval time quite linearly increases with the data size?
** '''Complementary facilities''' – Which information is used to explain the score obtained for each result? For the end user is important to express both necessary requirements and desiderable ones in her request. Hence, the matchmaker have to be able to deal efficiently with strict and soft constraint, respectively.
to:
aspects and dimensions.
Changed lines 251-254 from:
* We describe a preliminary matchmaking approach which investigates instance modeling in a relational database1. It is domain independent and it implements several match classes, exploiting SQL standard only. Limits are the followings: no ranked list of results is returned and the potential match is not complete because generally it retrieves a bigger set of results containing irrelevant results also. Moreover, two ontology modeling different domains are used to built several datasets of instances in order to better verify services performance.
* On the basis of results of the above mentioned approach, we present a complete matchmaking algorithmspecially suitable for skill matching. Distinguishing features include: the possibility to express both strict requirements and preferences in the user request, a logic-based ranking of retrieved instances and the explanation of rank results. All services only rely on ad hoc queries translated in standard SQL: no built-in operator and/or new constructor are exploited. In this approach, both requests and offers must be expressed using the same reference template which is necessary to define their structure and expressivity.
* As proof-of-concept, we present a tool developed for providing skill matching and team-work composition. A main aim is the design of an user-friendly GUI both for browsing easily the domain ontology (in order to compose the query) and for better explain the retrieved results.
* We describe other efficient techniques for resource retrieval and composition in domains as Ubiquitous Computing and Business Process. We present a possible integration between semantic matchmaking services and user profiling ones and, finally, we investigate the problem of core competence extraction.
to:
* We describe a preliminary matchmaking approach which investigates instance modeling in a relational database. It is not dependent on
the domain and it implements several match classes, exploiting SQL standard only. Moreover, two ontologies modeling different domains are used to built several datasets of instances in order to better verify services performance.
* On the basis of results of the above mentioned approach, we present a complete matchmaking algorithm specially suitable for skill matching. Distinguishing features include: the possibility to express both strict requirements and preferences in the user request, a logic-based ranking of retrieved instances and the explanation of rank results. All services only rely on ad hoc queries translated in standard SQL: no built-in operator and/or new constructor are exploited. In this approach, both requests and offers must be expressed using the same reference template which is necessary to define their structure and expressiveness.
* As proof-of-concept, we present a tool developed for providing skill matching and team-work composition. The main issue is to design an user-friendly GUI both for browsing easily the domain ontology (in order to compose the query) and for better explain the retrieved results.
* We describe other efficient techniques for resource retrieval and composition in domains such as Ubiquitous Computing and Business Process. Moreover, we present a possible integration between semantic matchmaking services and user profiling ones and, finally, we investigate the problem of core competence extraction.
September 28, 2010, at 05:07 AM EST by 193.204.187.33 -
Added lines 201-277:
'''''Efficient Reasoning Techniques for Large Datasets of DLs Instances: Approaches And Applications'''''

[-'''Abstract'''-]

Nowadays more and more people choose to employ the Internet and/or automated procedures
as infrastructure and means for communication, search and resource repository.
In this context, both services and goods are considered resources. The main aim is to
provide new business opportunities allowing a more efficient management of information.

In order to offer a powerful and automated retrieval process, the simple keyword-based
search is not sufficient. In fact, on on-line websites and portals the search
process can be time-consuming and unsatisfactory. Generally speaking, those systems
are keyword-based, so a user can express only her mandatory requirements (there
is no possibility to select features according to wishes or negotiable constraints). Such
systems return unranked and often irrelevant results without explanations. The efficiency
of such retrieval engines is therefore determined by the efficacy of their underlying
frameworks, which perform the match between user requests and offers. From
this point of view, it is noteworthy that non-logical approaches to resource retrieval
and matchmaking have serious limitations. For example, when standard relational
database techniques are exploited to model a resource retrieval framework, there is the need
to completely align the attributes of the offered and requested resources in order to
perform a match. If requests and offers are simple names or strings, the only possible
match would be identity, resulting in an all-or-nothing outcome. On the other hand,
pure knowledge-based approaches require heavy computational capabilities, hence response
times are often unacceptable.

Moreover, in real contexts, very often there are no offers that are better than the
others with respect to every user selection criterion. We consider that in these cases, i.e.,
when exact matches are lacking, instead of receiving an empty set as search result, the user
could gladly accept worse alternatives or she could negotiate the original requirements
for compromises.
In business scenarios, another important issue is to deal with very large datasets of
resources. Hence, the retrieval efficiency is measured both by data scalability and by
parameters such as allowed match classes, obtained relevant results, ranking functions
and query language expressivity.
Of course, it is not possible in this work to cover the full range of reasoning services.
Instead, the focus of the thesis is the presentation of efficient resource matchmaking and composition
approaches in several business contexts and, in particular, the contribution that
Knowledge Representation (KR), specifically Description Logics (DLs), can provide
to improve scenarios where demand (user request) meets offers (goods and services).
The final goal is to retrieve only the best offers, appropriately ranked, with respect to the user
request.

The problem of reasoning efficiency is not new in the literature. Knowledge
Compilation, in fact, is a technique exploited for making reasoning computationally
easier in a knowledge base (KB) typically modelled using a logical formalism. The
idea of knowledge compilation is to split query answering into two phases:
* in the first one the knowledge base is preprocessed, thus obtaining an appropriate data structure (such a phase is sometimes called off-line reasoning);
* in the second phase, the query is actually answered using the output of the first phase (such a phase is sometimes called on-line reasoning).
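
The two phases listed above can be sketched in Python on a deliberately tiny example. This is only an illustration of the general idea, not the thesis' approach: the knowledge base is reduced to an acyclic class hierarchy, and the names `compile_kb`, `is_subsumed_by` and the sample classes are hypothetical.

```python
def compile_kb(subclass_of):
    """Off-line phase: precompute the set of all ancestors of every class.

    subclass_of maps a class name to its direct superclasses.
    The hierarchy is assumed to be acyclic.
    """
    closure = {}

    def ancestors(cls):
        if cls not in closure:
            found = set()
            for parent in subclass_of.get(cls, ()):
                found.add(parent)
                found |= ancestors(parent)
            closure[cls] = found
        return closure[cls]

    for cls in subclass_of:
        ancestors(cls)
    return closure


def is_subsumed_by(closure, cls, candidate):
    """On-line phase: answer a subsumption query with a set lookup only."""
    return cls == candidate or candidate in closure.get(cls, set())


# Hypothetical skill hierarchy.
hierarchy = {
    "JavaProgrammer": ["Programmer"],
    "Programmer": ["Engineer"],
    "Engineer": [],
}
compiled = compile_kb(hierarchy)                                # done once
print(is_subsumed_by(compiled, "JavaProgrammer", "Engineer"))   # done per query
```

The cost of walking the hierarchy is paid once during compilation, so each on-line query needs no further traversal.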

Matchmaking approaches presented in this work are based on KB pre-processing in order
to reduce on-line reasoning. A relevant aspect of the thesis work is the exploitation of
classical relational database systems (RDBMS) and languages, i.e., SQL, for storing the
KB and for performing reasoning tasks. Several approaches have been presented in which
databases allow users and applications to access both ontologies and other structured
data in a seamless way. An overview and a comparison of these will
also be presented. Finally, preference-based models and systems sharing some characteristics
with the approaches of this work will be discussed, because the problem of preference
handling in RDBMS is not new in information retrieval systems.

This thesis aims to advance the state of the art in research on efficient reasoning
techniques for managing very large datasets of DLs instances; in particular, the work
intends to show how appropriate modeling of the KB can improve semantic matchmaking
and, eventually, match explanation. The discussion will take into account several
aspects and dimensions:

* '''Application''' – What is resource retrieval used for? And resource composition? The application fields are many and varied.
* '''Efficient semantic matchmaking''' – What language is necessary to build the semantic description of both user requests and offers? What data structure makes it possible to perform reasoning tasks? How can we evaluate the reasoning efficiency?
** '''KB modeling''' – How is it possible to respect an Open-world Assumption by means of an RDBMS based on the Closed-world Assumption? Another issue is related to the stored information useful for retrieval. In other words, we discuss which data (structured and unstructured, instances and ontological information) have to be stored in order to provide services such as matchmaking, ranking and match explanation.
** '''Match classes''' – Which match classes are allowed? Which algorithms are implemented? Is the system scalable, in the sense that the retrieval time increases roughly linearly with the data size?
** '''Complementary facilities''' – Which information is used to explain the score obtained for each result? For the end user it is important to express both necessary and desirable requirements in her request. Hence, the matchmaker has to be able to deal efficiently with strict and soft constraints, respectively.

The contributions of this research can be summarized as follows:
* We describe a preliminary matchmaking approach which investigates instance modeling in a relational database. It is domain independent and it implements several match classes, exploiting standard SQL only. Its limits are the following: no ranked list of results is returned, and the potential match is not complete because it generally retrieves a bigger set of results that also contains irrelevant ones. Moreover, two ontologies modeling different domains are used to build several datasets of instances in order to better verify service performance.
* On the basis of the results of the above-mentioned approach, we present a complete matchmaking algorithm especially suitable for skill matching. Distinguishing features include: the possibility to express both strict requirements and preferences in the user request, a logic-based ranking of retrieved instances and the explanation of rank results. All services rely only on ad hoc queries translated into standard SQL: no built-in operators and/or new constructors are exploited. In this approach, both requests and offers must be expressed using the same reference template, which is necessary to define their structure and expressivity.
* As a proof of concept, we present a tool developed for providing skill matching and team-work composition. A main aim is the design of a user-friendly GUI both for easily browsing the domain ontology (in order to compose the query) and for better explaining the retrieved results.
* We describe other efficient techniques for resource retrieval and composition in domains such as Ubiquitous Computing and Business Process. We present a possible integration between semantic matchmaking services and user profiling ones and, finally, we investigate the problem of core competence extraction.
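
To make the distinction between strict requirements and preferences more concrete, here is a minimal Python sketch that uses the standard `sqlite3` module and plain SQL. It is an assumption-laden toy, not the algorithm of the thesis: the `offers` table, the `rank_candidates` function and the scoring rule (counting satisfied preferences instead of a logic-based rank) are all invented for this example, and both skill lists are assumed to be non-empty.

```python
import sqlite3


def rank_candidates(conn, strict, preferred):
    """Return (candidate, score) rows.

    Only candidates holding every strict skill are kept; they are
    ranked by the number of preferred skills they also hold.
    """
    strict_marks = ",".join("?" * len(strict))
    pref_marks = ",".join("?" * len(preferred))
    sql = f"""
        SELECT candidate,
               SUM(CASE WHEN skill IN ({pref_marks}) THEN 1 ELSE 0 END) AS score
        FROM offers
        GROUP BY candidate
        HAVING SUM(CASE WHEN skill IN ({strict_marks}) THEN 1 ELSE 0 END) = ?
        ORDER BY score DESC, candidate
    """
    # Parameters follow the order of the placeholders in the statement.
    params = [*preferred, *strict, len(strict)]
    return conn.execute(sql, params).fetchall()


# Hypothetical data: one row per (candidate, skill) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offers (candidate TEXT, skill TEXT, PRIMARY KEY (candidate, skill))"
)
conn.executemany("INSERT INTO offers VALUES (?, ?)", [
    ("anna", "java"), ("anna", "sql"), ("anna", "english"),
    ("bruno", "java"), ("bruno", "sql"),
    ("carla", "sql"), ("carla", "english"),
])

print(rank_candidates(conn, strict=["java"], preferred=["sql", "english"]))
```

A candidate lacking a strict skill is excluded outright, whereas a missing preferred skill only lowers the score, so the result is a ranked list instead of an all-or-nothing answer.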
September 24, 2010, at 05:36 AM EST by 193.204.187.33 -
Added lines 206-258:
'''''Entities and Identities: Named Entity Processing with cultural Knowledge'''''

[-'''Abstract'''-]

Natural Language is a means to express and discuss concepts, objects,
events, i.e. it carries semantic contents. Reading a written text
implies the comprehension of the information that words are carrying.
Comprehension is an intrinsic capacity for a human, but not for a machine.
One of the ultimate roles of Natural Language Processing techniques
is identifying the meaning of the text, providing effective ways to
make a proper linkage between textual references and real world objects,
thus enabling machines to have a bit of the understanding which is proper
of a human.

A proper name is a word or a list of words that refers to a real world
object. Linguistic Expressions with the same reference may have different
senses, so it is necessary to disambiguate between them.

Natural Language Processing (NLP) operations include text normalization,
tokenization, stop words elimination, stemming, Part Of Speech
tagging, lemmatization. Further steps, such as Word Sense Disambiguation
(WSD) or Named Entity Recognition (NER), are aimed at enriching
texts with semantic information. Named Entity Disambiguation (NED)
is the procedure that solves the correspondence between real-world entities
and mentions within text. One of the ultimate goals of NLP techniques
is to identify the meaning of the text, providing effective ways to
make a proper linkage between textual references and real world objects.
The thesis addresses the problem of giving a sense to ''proper names'' in a
text, that is the problem of automatically associating words representing
''Named Entities'' with their ''identities'', that is unique real world objects.
Also, the thesis copes with the problem of lack of training and testing
data for such a task.

Proposed approaches automatically associate each entity in a text with
a unique identifier, a URI from Wikipedia, which is used as an "entity-provider".

The main contribution consists of proposing knowledge based approaches
for NED, which do not require training data. Specifically the
thesis proposes two solutions:

* a completely knowledge-based algorithm for NED, exploiting Wikipedia data
* a Semantic Relatedness (SR) approach for the NED task: SR scores are obtained by a graph-based model over Wikipedia

The first solution has been tested for the Italian language: due to the lack of Italian
testing data for such a task, the thesis shows a method to automatically
build a testbed dataset from Wikipedia. The second solution has been
tested over a gold-standard dataset for NED: the proposed algorithm
achieves results competitive with the state of the art.

Both suggested solutions are completely knowledge-based, with the
advantage that no training data is needed: indeed, manually annotated
data for this task is not easily available and acquiring such data can be
expensive.
September 24, 2010, at 05:24 AM EST by 193.204.187.33 -
Changed lines 152-162 from:
->Consider a dance with more than one dancer. Each dancer has a set of
steps that he will perform; they orchestrate their own steps because they
are in complete control of their domain (their body). A choreographer
ensures that the steps all of the dancers make are according to some overall
scheme; we call this a choreography. The dancers have a single view
point of the dance, while the choreography has a multi-party or global
view point of the dance. Orchestration is about describing and executing
a single view point model, while choreography is about describing and
guiding a global model. It is possible to derive the single view point
model from the global model by projecting based on participant.
to:
->Consider a dance with more than one dancer. Each dancer has a set of steps that he will perform; they orchestrate their own steps because they are in complete control of their domain (their body). A choreographer ensures that the steps all of the dancers make are according to some overall scheme; we call this a choreography. The dancers have a single view point of the dance, while the choreography has a multi-party or global view point of the dance. Orchestration is about describing and executing a single view point model, while choreography is about describing and guiding a global model. It is possible to derive the single view point model from the global model by projecting based on participant.
Changed lines 161-162 from:
perform them. Therefore, it is needed to realize the following use cases: Discovery,
Selection, Composition and Invocation.
to:
perform them. Therefore, it is needed to realize the following use cases: ''Discovery'',
''Selection'', ''Composition'' and ''Invocation''.
September 24, 2010, at 05:22 AM EST by 193.204.187.33 -
Changed line 152 from:
''Consider a dance with more than one dancer. Each dancer has a set of
to:
->Consider a dance with more than one dancer. Each dancer has a set of
Changed line 161 from:
model from the global model by projecting based on participant.''
to:
model from the global model by projecting based on participant.
September 24, 2010, at 05:21 AM EST by 193.204.187.33 -
Added lines 101-186:
'''''Towards the Orchestration of Semantic Web Services'''''

[-'''Abstract'''-]

The evolution and ubiquity of the Internet has facilitated the proliferation of distributed
resources, such as computer systems and software applications. Organizations
are increasingly utilizing resources that span traditional organizational boundaries,
like shared databases or processor farms, to share expensive computing resources
and other equipment (e.g.: networked scientific instruments for e-Science or
cyber infrastructure projects) or to pool together Enterprise resources distributed
widely across networks and geographies. Software applications are also evolving
from monolithic, stove-pipe applications to loosely federated, interacting services
that are dependent on networked resources to provide optimal functionality. This
evolution, powered by the dot-com bubble at the turn of the century, emerged
to automate and outsource business processes to a worldwide audience, both at
the Business-to-Business (B2B) level and for Business-to-Customer (B2C) applications
one improving the user experience. This new software engineering approach
enabled distributed, heterogeneous software components to communicate and interoperate,
through declarative, machine-readable descriptions of the services that
different organizations offer. The emergence of Web Service technology allowed this
migration for both enterprise and Grid-based applications due to its exploitation of
the near ubiquitous World-Wide-Web infrastructure, cross-platform interoperability,
and the fact that it is built upon de facto Web standards for syntax, addressing,
and communication protocols. Several conceptualizations of a service have been
proposed, ranging from electronic services that facilitate B2B e-commerce, to business
entities grounded within the real world (such as those offered by network or
utility providers) that may offer some provision of value in some domain. As the
provision of these services has moved from a developer driven mechanism to one
involving automatic runtime selection (requiring service discovery support), the descriptions
of the APIs and protocols have been increasingly declarative. The Web
Service paradigm introduced the concept of homogeneous, XML-based representation
of service descriptions using interface and workflow definition languages such as
WSDL, BPEL4WS, and WS-Choreography. Nevertheless, whilst these approaches
facilitated easier access and usage of web services for developers, they have failed
to address many of the knowledge-based problems associated with the diversity of
service providers, i.e. interface and data heterogeneity. Semantic Web Services
(SWS) address this problem by providing a declarative, ontological framework for
describing services, messages, and concepts in a machine-readable format that can
also facilitate logical reasoning. Thus, service descriptions can be interpreted based
on their meanings, rather than simply being a symbolic representation. Semantic
Web Services aim to extend the Web Service integration process in order to facilitate
automated (or semi-automated) composition, discovery, dynamic binding, and
invocation of services within open, scalable environments. Where there is need to
use a SWS infrastructure without human intervention, two very important concepts
can be used to describe the issues to be solved: Orchestration and Choreography.
These definitions can be applied both to Web Services applications and to agent-based
systems, and in general to any system for which the notion of collaboration
and planning makes sense, i.e. systems including more than one active entity. The
following analogy illustrates these concepts and their differences:

''Consider a dance with more than one dancer. Each dancer has a set of
steps that he will perform; they orchestrate their own steps because they
are in complete control of their domain (their body). A choreographer
ensures that the steps all of the dancers make are according to some overall
scheme; we call this a choreography. The dancers have a single view
point of the dance, while the choreography has a multi-party or global
view point of the dance. Orchestration is about describing and executing
a single view point model, while choreography is about describing and
guiding a global model. It is possible to derive the single view point
model from the global model by projecting based on participant.''

From the Web service perspective, an orchestration is a declarative specification
that describes a workflow to support the execution of a specific business
process, operation or service; i.e., it describes how Web Services can interact with
each other at the message level, including the business logic and execution order of
their interactions. On the contrary, from a SWS perspective, in order to automatically
orchestrate a set of services, we need to be able to find, select, combine and
perform them. Therefore, it is needed to realize the following use cases: Discovery,
Selection, Composition and Invocation.

The aim of this thesis is the formalization of some aspects inherent in the orchestration
of Web Services while remaining entirely in the Semantic Web sphere. In particular,
it presents an in-depth analysis of the Semantic Web Services from the orchestration
perspective, including the state of the art and the comparison between the most
widely adopted SWS representation languages. A solution for the SWS Composition
use case is presented. A prototype based on a backward chaining algorithm has
been implemented using SWRL (Semantic Web Rule Language) as representation
language for OWL-S services. It is an original solution since it has been realized
entirely using Semantic Web technologies. Furthermore, there are two Semantic
Web problems that unavoidably impact the development of Semantic Web Services:
ontology alignment and monotonic knowledge base management. These issues are
also discussed and some possible solutions developed in the Semantic Web context
are proposed for the Semantic Web Services. Finally, an empirical evaluation of our
prototype and the conclusions are presented.
September 24, 2010, at 05:11 AM EST by 193.204.187.33 -
Added line 72:
Added line 109:
September 24, 2010, at 05:10 AM EST by 193.204.187.33 -
Changed lines 60-64 from:
to:
'''''Personalization in Digital Libraries for Education'''''

[-'''Abstract'''-]

Added lines 69-94:
'''''A Machine Learning Approach to Ontology Alignment'''''

[-'''Abstract'''-]
The main problem that this thesis is aimed to address is Ontology Alignment. It can
be described as the problem of how to move knowledge between different possible
representations or formalizations, not in the sense of different knowledge representation
formalisms, but in the sense of different conceptualizations within the same
expression language and the same knowledge domain. Conceptualization here is
intended as ontology, where the key expression is "An ontology is a formalization
of a conceptualization". The term ontology is borrowed from philosophy,
where its sense is "discourse about being"; in current Computer Science, its meaning
could be described as "formalization of relationships between entities, both physical
and abstract ones".

In this work, my aim is to use Machine Learning techniques applied to ontologies
expressed in Description Logics (DL) formalisms in order to solve some of
the issues that arise in trying to address an Ontology Alignment problem; Description
Logics (a family of knowledge representation formalisms aimed at describing
knowledge as concepts and relations between concepts and entities - abstract or
real world ones) have been chosen as logic foundation for many languages that the
W3 Consortium has endorsed, in particular those involved in the realization of the
Semantic Web, the evolution of current web that aims at formally capturing the
knowledge expressed by web contents, so that automatic reasoning can be applied
to accomplish a wide variety of tasks, to name a few: increase search effectiveness,
simplify data migration, automatize knowledge exchange between systems, enhance
automatic service discovery and service composition planning.
September 21, 2010, at 04:52 AM EST by 193.206.186.106 -
Added lines 75-86:
'''''Word Sense Disambiguation and Intelligent Information Access'''''

[-'''Abstract'''-]
In the field of computational linguistics, researchers are mainly concerned with the computational processing of natural language. A number of results have already been obtained, ranging from concrete and applicable systems able to understand or produce language to theoretical descriptions of the underlying algorithms. However, a number of important research problems have not been solved. A particular challenge for computational linguistics pertaining to all levels of language is ambiguity. Most people are quite unaware of how vague and ambiguous human languages really are, and they are disappointed when computers are hardly able to understand language and linguistic communication the way humans do. Ambiguity means that a word can be interpreted in more than one way, that is, it has more than one meaning. Mostly, ambiguity does not pose a problem for humans and is therefore not perceived as such; for a computer, however, ambiguity is one of the main problems encountered in the analysis and generation of natural languages.

Moreover, advances in the Internet and the creation of huge stores of digitized text have opened the gateway to a deluge of information that is difficult to navigate. Although the information is widely available, exploring Web sites and finding information relevant to a user's needs is a challenging task. One of the obstacles is language ambiguity: for example, if you want to search for all the documents about ''bat'' as a small nocturnal creature, most probably the retrieval system will also return the documents that contain ''bat'' as a piece of sport equipment (a club used for hitting a ball in various games). In order to solve this problem, a method able to disambiguate word meanings across the documents is needed.

The central argument of this dissertation is the use of Word Sense Disambiguation for Intelligent Information Access. Word Sense Disambiguation (WSD) refers to the resolution of lexical semantic ambiguity and its goal is to attribute the correct sense to a word used in a given context, while Intelligent Information Access is a ''user-centric'' and ''semantically rich'' approach to access information.

After a brief introduction to the problem of lexical semantic ambiguity, I propose several methods for word sense disambiguation that attempt to disambiguate words by exploiting a semantic knowledge resource like WordNet. WordNet is a lexical database whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets (''synsets''), each representing one underlying lexical concept.

The second part of the dissertation takes into account the evaluation of WSD strategies. In particular, the proposed algorithms are first tested independently of any application, using specially constructed benchmarks, and are then evaluated in terms of their contribution to the overall performance of a system designed for Intelligent Information Access, such as Semantic Searching and Intelligent User Profiling.
September 21, 2010, at 04:48 AM EST by 193.204.187.140 -
Changed lines 11-27 from:
to:
'''''Learning User Profiles from Text for Personalized Information Access'''''

[-'''Abstract'''-]

Advances in the Internet and the creation of huge stores of digitized text have opened the gateway to a deluge of information that is difficult to navigate. Although the information is widely available, exploring Web sites and finding information relevant to a user's interests is a challenging task.
The first obstacle is research, where you must first identify the appropriate information sources and then retrieve the relevant data. Then, you have to sort through this data to filter out the unfocused and unimportant information. Lastly, in order for the information to be truly useful, you must take the time to figure out how to organize and abstract it in a manner that is easy to understand and analyze. To say the least, all of these steps are extremely time consuming.
This "relevant information problem" leads to a clear demand for automated methods able to support users in searching large document repositories in order to retrieve relevant information with respect to their preferences. Catching user interests and representing them in a structured form is a problematic activity. Algorithms designed for this purpose base their relevance computations on so-called user profiles in which representations of the users' interests are maintained.
The central argument of this dissertation is the use of Supervised Machine Learning techniques to induce user profiles from text data for Intelligent Information Access.
Intelligent Information Access is a user-centric and semantically rich approach to access information: information preferences vary greatly across users, therefore information access must be highly personalized by profiles to serve the individual interests of the user. Moreover, users want to retrieve information on the basis of conceptual content, but individual words provide unreliable evidence about the meaning of documents. Thus, methods for extracting meaning from documents must be considered in order to effectively find relevant information.

First, we describe content-based learning algorithms designed to learn about users' interests.
The input, given as a set of text documents marked by the user as relevant or not relevant, is used to find characteristics that distinguish relevant documents from irrelevant ones. The induced target concept is a user profile appropriate for the classification of new documents. Documents are represented as bags of words (BOW): a document is encoded as a feature vector, with each element in the vector indicating the presence or absence of a word in the document. This approach was used as a baseline to determine how well a standard keyword-based learner performs on this task.
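
The binary bag-of-words encoding described above can be sketched in a few lines of Python. This is a minimal sketch, assuming whitespace tokenization and lowercasing; the function names and the two sample documents are hypothetical and are not taken from the dissertation.

```python
def build_vocabulary(documents):
    """Collect every distinct lowercase token, in alphabetical order."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def bow_vector(document, vocabulary):
    """Encode a document as a 0/1 vector: 1 if the vocabulary word occurs in it."""
    words = set(document.lower().split())
    return [1 if term in words else 0 for term in vocabulary]


# Hypothetical toy documents.
docs = ["semantic web services", "web search engines"]
vocabulary = build_vocabulary(docs)
vectors = [bow_vector(doc, vocabulary) for doc in docs]
```

Each position in a vector corresponds to one vocabulary word, so word order and word meaning are both discarded, which is the limitation the concept-based profiles discussed later in the abstract are meant to address.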

Second, current limits in the state of the art in profiles generated from the BOW-represented documents are analyzed. Though many linguistic techniques have been employed, some problems still remain unsolved, such as polysemy and synonymy. A possible solution for this kind of issue is explored: the shift of the level of abstraction from words up to concepts.
Profiles will not contain words anymore. They will contain references to concepts defined in lexicons or, in a further step, ontologies. A first advance in this direction consists of employing WordNet as a reference lexicon in substituting word forms with word meanings into profiles. We show how the described content-based algorithms can be extended using a new, enriched document representation obtained by adding features generated using a new WordNet-based procedure.

The dissertation concludes with the description of the empirical study that evaluates the effectiveness of the proposed approach.
September 20, 2010, at 04:46 PM EST by 93.43.209.83 -
Changed line 25 from:
''Recommender systems'' constitute one of the fastest growing segments of the Internet economy today. They help reduce information overload and provide customized information access for targeted domains. Such systems take input directly or indirectly from users and, based on their ''needs'', ''preferences'' and ''usage patterns', provide personalized advices about products or services and can help people to filter useful information, thus giving users easing the information search and decision processes.
to:
''Recommender systems'' constitute one of the fastest growing segments of the Internet economy today. They help reduce information overload and provide customized information access for targeted domains. Such systems take input directly or indirectly from users and, based on their ''needs'', ''preferences'' and ''usage patterns'', provide personalized advices about products or services and can help people to filter useful information, thus giving users easing the information search and decision processes.
September 20, 2010, at 04:45 PM EST by 93.43.209.83 -
Added lines 17-39:
'''''Hybrid Recommendation Techniques based on User Profiles'''''

[-'''Abstract'''-]

Nowadays users are overwhelmed by the abundant amount of information, and it is not just a problem for a minority of the population; it is a problem for everyone in their daily life. In fact, we now get information not just from newspapers, colleagues, family members and friends, but also largely from the Internet.

How can people deal with this information overload problem? Individuals tend to filter and ignore information as effective ways to cope with information overload.

''Recommender systems'' constitute one of the fastest growing segments of the Internet economy today. They help reduce information overload and provide customized information access for targeted domains. Such systems take input directly or indirectly from users and, based on their ''needs'', ''preferences'' and ''usage patterns', provide personalized advices about products or services and can help people to filter useful information, thus giving users easing the information search and decision processes.

Among the different recommendation techniques proposed in the literature, the ''collaborative filtering'' approach is the most successful and widely adopted to date. Collaborative filtering by itself cannot always guarantee a good prediction. The effectiveness of predictions relies on the confidence of the computation of the ''similarity'' between users. Correlation between users can only be computed if they have rated a sufficient number of common items. Since users can choose among thousands of items to rate, especially in online catalogues, and new items become available continuously, it is likely that the overlap of rated items between two users will be minimal in many cases. Therefore, many of the computed correlation coefficients are based on just a few observations. As a result, correlation based only on co-rated items cannot be regarded as a reliable similarity measure.
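
The dependence of user similarity on co-rated items can be made concrete with a minimal Python sketch. It assumes Pearson correlation as the similarity measure, which is a common choice in collaborative filtering but is not confirmed by the abstract; the function name, the @@min_overlap@@ parameter and the ratings are hypothetical.

```python
from math import sqrt


def pearson_on_corated(ratings_a, ratings_b, min_overlap=2):
    """Pearson correlation over the items both users rated.

    Returns (correlation, overlap_size). The correlation is None when the
    overlap is smaller than min_overlap or when either user gave the same
    rating to every co-rated item (zero variance).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    n = len(common)
    if n < min_overlap:
        return None, n
    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None, n
    return cov / sqrt(var_x * var_y), n


# Hypothetical ratings on a 1-5 scale.
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3, "d": 5}
carol = {"d": 1}

sim_ab, overlap_ab = pearson_on_corated(alice, bob)
sim_ac, overlap_ac = pearson_on_corated(alice, carol)
```

Returning the overlap size together with the coefficient makes the weakness visible: a coefficient computed on two or three co-rated items carries little evidence, and for users with no common items no similarity can be computed at all.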

One of the primary contributions of this thesis is the investigation of how the knowledge about users can be exploited to improve recommendations. In particular, it is investigated how overlaps between users' interests could be used to define the similarity among users in order to improve recommendations.

The combination of classic collaborative filtering techniques and ''user profiles'' inferred using content-based methods for designing a new ''hybrid recommendation technique'' is presented.

The study describes the process of learning content-based profiles to be used in one of the main steps of the process for producing social recommendations: the ''neighborhood formation''. A clustering technique for grouping user profiles is proposed in order to identify the set of neighbors for those users for which recommendations must be produced.

More specifically, the process of grouping user profiles learns two different profiles of the user: one, learned from positive examples of interesting items, represents the interests of the user; the other, learned from negative examples, represents items the user dislikes. The observation is that two users can be considered similar not only if they like the same items, but also if they dislike the same ones.

Finally, advanced semantic user profiles based on ''concepts'' instead of ''keywords'' have been used for improving the accuracy of collaborative recommendations.

Several experiments have been carried out in order to evaluate the effectiveness of the approaches. Some baseline experiments on classic collaborative filtering have been carried out as benchmark. The final experimental analysis provides evidence of the improvements of the proposed approaches.
September 20, 2010, at 05:14 AM EST by 193.204.187.33 -
Deleted lines 19-23:


-----
\\\
'''[++Luigi Iannone++]'''
July 29, 2010, at 10:31 AM EST by 193.204.187.33 -
Changed line 9 from:
[[http://www.di.uniba.it/~swap/index.php?n=Membri.Degemmis|'''[++Degemmis Marco++]''']]
to:
[[http://www.di.uniba.it/~swap/index.php?n=Membri.Degemmis|'''[++Marco de Gemmis++]''']]
July 29, 2010, at 05:13 AM EST by 193.204.187.33 -
Added line 11:
Deleted lines 13-14:
\\\
Added line 16:
Deleted lines 18-19:
\\\
Added line 21:
Deleted lines 23-24:
\\\
Added line 26:
Deleted lines 28-29:
\\\
Added line 31:
Deleted lines 33-34:
\\\
Added line 36:
Deleted lines 38-39:
\\\
Added line 41:
Deleted lines 43-44:
\\\
Added line 46:
Deleted lines 48-49:
\\\
Added line 50:
July 29, 2010, at 05:11 AM EST by 193.204.187.33 -
Deleted line 59:
\\\
July 29, 2010, at 05:11 AM EST by 193.204.187.33 -
Deleted line 61:
July 29, 2010, at 05:09 AM EST by 193.204.187.33 -
Added lines 1-139:
[[#Top]]
'''[++Ph.D. Graduates++]'''

\\\
%center%Attach:Main/linea.gif
\\\


[[http://www.di.uniba.it/~swap/index.php?n=Membri.Degemmis|'''[++Degemmis Marco++]''']]

-----
\\\
\\\

[[http://www.di.uniba.it/~swap/index.php?n=Membri.Lops|'''[++Pasquale Lops++]''']]

-----
\\\
\\\

'''[++Oriana Licchelli++]'''

-----
\\\
\\\

'''[++Luigi Iannone++]'''

-----
\\\
\\\

'''[++Ignazio Palmisano++]'''

-----
\\\
\\\

'''[++Domenico Redavid++]'''

-----
\\\
\\\

[[http://www.di.uniba.it/~swap/index.php?n=Membri.Basile|'''[++Pierpaolo Basile++]''']]

-----
\\\
\\\

[[http://www.di.uniba.it/~swap/index.php?n=Membri.Tinelli|'''[++Eufemia Tinelli++]''']]

-----
\\\
\\\

'''[++Anna Lisa Gentile++]'''

-----
\\\
\\\

[[http://www.di.uniba.it/~swap/index.php?n=Membri.Iaquinta|'''[++Leo Iaquinta++]''']]

'''''Serendipity in Context: Context-aware Recommendations of Serendipitous Items'''''

[-'''Abstract'''-]

When a person searches for a piece of information about a topic, she finds so
much information available that she hardly unearths web pages, books, papers,
articles, music, videos, etc. actually relevant to the searched topic. For instance,
most search engines on the Internet return thousands of results on every query,
while only a few of those results are really relevant for the searcher and they
are not always at the top of the returned list. Furthermore, what is relevant
and interesting for one searcher may not be relevant and interesting for another
searcher, even if they submit the same query.

The extensive options lead the user to feel that she loses control over handling
the amount of information and she becomes worried whether something
interesting or important is being missed. This problem is often referred to as
information overload.

Recommender systems help to reduce information overload and provide customized
information access for targeted domains. Such systems take direct or
indirect input from users and, based on their needs, preferences and usage patterns,
provide personalized advice about products or services so that users are
assisted to filter useful information.

Recommender systems have been an important research area since the appearance
of the first papers on collaborative filtering in the mid-1990s. There
has been much work done both in the industry and academia to develop and
to improve new approaches to recommendations over the last decade. The
interest in this area still remains high because it constitutes a problem-rich research
area and because of the abundance of practical applications that help users
to deal with the information overload and that provide them with personalized
recommendations, content and services. In addition, despite all the advances,
the current generation of recommender systems still requires further improvements
to make recommendation methods more effective and applicable to an
even broader range of real-life applications. These improvements include better
methods for representing the user behavior and the information about the items
to be recommended, more advanced recommendation modeling methods, and
exploitation of contextual information into the recommendation process.

For some approaches, such as the content-based one, the item representation
plays a key role, thus choosing proper facets to represent items is a fundamental
task for deploying effective recommender systems. Contextual facets are often
marginally relevant to learn and predict user preferences, but in some domains
disregarding contextual facets makes recommendations useless. Consequently
the thesis deals with the contextual dimension proposing a strategy to improve
the effectiveness of a content-based recommender system by the exploitation
of contextual facets. The demonstrative scenario concerns the dynamic
suggestion of personalized tours within a museum: the contextual facets deal
with the physical layout of items and the interaction of users with the physical
environment.

The thesis also deals with the serendipitous dimension. Indeed, recommender
systems commonly recommend items that score highly against a user’s
profile and, consequently, the user is recommended items similar to those
already rated. If this feature becomes a limitation, the recommender system
suffers from over-specialization, which frustrates the common expectations concerning
novelty and surprise. Indeed novelty occurs when the system suggests an
unknown item that the user might have autonomously discovered. On the other
hand, a serendipitous recommendation helps the user to find a surprisingly interesting
item that she might not have otherwise discovered (or it would have
been really hard to discover). Although the serendipity is a difficult concept
to research because it is by definition not particularly susceptible to systematic
control and prediction, the thesis deals with the serendipitous dimension,
proposing a strategy to mitigate the over-specialization exploiting the learned
user profiles.

Finally, the contextual dimension and serendipitous dimension are synergistic.
Indeed, the contextual dimension is used to refine the selection of supposed
serendipitous items and to provide a practical interpretation of the serendipity-augmented
recommendation task. On the other hand, the serendipity dimension
makes it possible to introduce more dynamism in the handling of contextual facets.

-----
\\\
\\\