Semantics-aware Content Representations for Reproducible Recommender Systems (SCoRe)

Tutorial at the 30th ACM Conference on User Modeling, Adaptation and Personalization | July 4-7, 2022 | Barcelona (Spain)



Abstract

Overview

Content-based recommender systems suggest items similar to those the user liked in the past by building a representation of users and items based on descriptive features, which are usually obtained by processing textual content. A classic approach to dealing with textual content is a keyword-based representation, where a few extracted terms represent the whole content. A sharp limitation of classic keyword-based representations is that they are often not enough to correctly capture the preferences of the users or the informative content conveyed by the items. A sub-optimal comprehension of the informative content leads to a sub-optimal representation of users and items and, in turn, to inaccurate recommendations. Hence, it is necessary to improve such representations in order to fully exploit the potential of content-based features and textual data.

Semantics-aware recommender systems represent one of the most innovative lines of research, whose goal is to use semantic approaches for representing content. Thanks to these representations, it is possible to give meaning to information expressed in natural language and to obtain a deeper comprehension of it. This tutorial aims to present, from both a theoretical and a practical point of view, semantics-aware techniques for content representation with the purpose of realizing effective and accountable recommender systems. We will introduce ClayRS, a comprehensive Python framework developed by the SWAP research group, which will be publicly released in the coming months, aiming to provide a common ground for both researchers and practitioners interested in the latest semantics-aware techniques for user modeling and recommender systems.
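As an illustration of the keyword-based representation discussed in this overview, the following is a minimal sketch in plain Python. It is not taken from ClayRS or from the tutorial material: the toy catalogue, the function names, and the specific TF-IDF weighting are all assumptions made for this example. It builds keyword vectors for items, averages the liked items into a user profile, and ranks the remaining items by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical toy catalogue: item id -> textual description.
ITEMS = {
    "gravity": "space survival astronaut orbit",
    "interstellar": "space astronaut black hole family",
    "alien": "space horror crew alien ship",
    "titanic": "romance love ship ocean",
    "notebook": "romance love letters summer",
    "apollo": "cosmos mission spaceman moon",
}


def tokenize(text):
    """Lowercase the text and keep purely alphabetic whitespace-separated tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf_vectors(docs):
    """Map each item id to a sparse {term: tf * idf} keyword vector."""
    tokens = {item: tokenize(text) for item, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many item descriptions each term appears.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for item, toks in tokens.items():
        total = len(toks) or 1  # guard against empty descriptions
        counts = Counter(toks)
        vectors[item] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_profile(liked, vectors):
    """User profile: the average of the keyword vectors of the liked items."""
    profile = Counter()
    for item in liked:
        for term, weight in vectors[item].items():
            profile[term] += weight / len(liked)
    return dict(profile)


def recommend(liked, vectors, k=2):
    """Rank the items the user has not liked yet by similarity to the profile."""
    profile = build_profile(liked, vectors)
    candidates = [item for item in vectors if item not in liked]
    candidates.sort(key=lambda item: cosine(profile, vectors[item]), reverse=True)
    return candidates[:k]


vectors = tfidf_vectors(ITEMS)
print(recommend(["gravity"], vectors, k=2))
print(cosine(build_profile(["gravity"], vectors), vectors["apollo"]))
```

In this toy setting, the item described with "cosmos" and "spaceman" shares no keyword with the profile built from "space" and "astronaut", so its similarity is zero even though the topic is the same. This is the kind of limitation of keyword matching that motivates semantics-aware representations.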

Motivations and Relevance

Thanks to semantics, we can take into account the meaning of the content, and this is crucial to improve the quality of user profiles and the effectiveness of intelligent information access platforms, such as recommender systems. The literature on semantics-aware recommender systems is rich and constantly evolving, but unfortunately also scattered, even in terms of the software libraries that implement it. Indeed, countless software libraries exist to process content and extract the most significant features, or to encode semantics using external knowledge sources or embedding techniques. Moreover, the experimental workflow related to recommender systems is becoming more and more complex, making the reproducibility of experiments a challenge. Hence, on the one hand, this tutorial provides a common ground for both researchers and practitioners interested in the latest semantics-aware techniques for user modeling and recommender systems. Guided by several use cases implemented as Google Colab notebooks, the tutorial will allow participants to put into practice core concepts related to semantics-aware recommender systems. On the other hand, the use of the ClayRS Python framework will make the entire recommendation pipeline simple, fast, and replicable.

Aims and Learning Objectives

State of the art overview


We will provide an overview of the most recent trends in the area of semantics-aware content-based recommender systems, covering methods based on exogenous and endogenous representation techniques. The former rely on the integration of external knowledge sources, while the latter are based on the hypothesis that the meaning of words depends on their usage in large corpora of textual documents. The most recent trends based on knowledge graphs and embedding techniques will be discussed.
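The hypothesis behind endogenous representations, namely that words used in similar contexts have similar meanings, can be sketched with a simple co-occurrence model. The example below is a minimal plain-Python illustration under stated assumptions: the tiny corpus, the stopword list, the window size, and the function names are invented for this sketch and do not come from ClayRS or from the tutorial material. Real distributional models are trained on large corpora and usually apply weighting or dimensionality reduction.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; real models rely on large collections of documents.
CORPUS = [
    "the astronaut boarded the spaceship",
    "the spaceman boarded the spaceship",
    "the astronaut repaired the station",
    "the spaceman repaired the station",
    "the chef cooked the pasta",
    "the chef tasted the pasta",
]

# Assumed minimal stopword list, so that function words do not dominate contexts.
STOPWORDS = {"the", "a", "an"}


def cooccurrence_vectors(sentences, window=2):
    """Represent each word by the counts of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = [t for t in sentence.lower().split() if t not in STOPWORDS]
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v.get(term, 0) for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


word_vectors = cooccurrence_vectors(CORPUS)
print(cosine(word_vectors["astronaut"], word_vectors["spaceman"]))
print(cosine(word_vectors["astronaut"], word_vectors["chef"]))
```

In this corpus "astronaut" and "spaceman" never co-occur, yet they end up with matching vectors because they appear in the same contexts, while "chef" shares no context with either. Exogenous approaches would instead obtain such a relation from an external knowledge source.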

Hands-on Session


We will provide practical hands-on sessions using ClayRS, an upcoming comprehensive Python framework which makes available most of the methods needed to implement semantic representation techniques, along with the entire pipeline for the reproducible evaluation of recommender systems. ClayRS will be released before the tutorial and will allow participants to easily implement, run, and evaluate semantics-aware recommender systems, while also improving their accountability.




Target Audience


This tutorial will benefit researchers and practitioners with a broad interest in user modeling and recommender systems who wish to get a complete picture of advanced semantics-aware techniques for building intelligent services in these areas. The technical level of the tutorial will be intermediate. Basic knowledge of recommender systems, user modeling, and natural language processing is required.




Speakers

We estimate that the tutorial will last 3 hours, i.e., two blocks of 1.5 hours. The speakers are listed below.


Pasquale Lops

Associate Professor

University of Bari Aldo Moro, Italy

Pasquale Lops is Associate Professor at the Department of Computer Science, University of Bari Aldo Moro, Italy. He received his Ph.D. in Computer Science from the University of Bari in 2005 with a dissertation on “Hybrid Recommendation Techniques based on User Profiles”. His research interests include recommender systems and user modelling, with a specific focus on the adoption of techniques for semantic content representation. He has authored over 200 articles, and he is one of the authors of the textbook "Semantics in Adaptive and Personalized Systems: Methods, Tools and Applications", published by Springer. He regularly serves on the program committees of the top conferences in his areas. He was Area Chair of User Modelling for Recommender Systems at UMAP 2016, and co-organized more than 20 workshops related to user modeling and recommender systems. He gave a tutorial on “Semantics-Aware Techniques for Social Media Analysis, User Modeling, and Recommender Systems” at UMAP 2016 and 2017, and he was a speaker at two editions of the ACM Summer School on Recommender Systems. He was a keynote speaker at the 1st Workshop on New Trends in Content-based Recommender Systems (CBRecSys) at RecSys 2014. Finally, he gave the interview “Beyond TF-IDF” in the Coursera MOOC on Recommender Systems.




Cataldo Musto

Assistant Professor

University of Bari Aldo Moro, Italy

Cataldo Musto is Assistant Professor at the Department of Informatics, University of Bari. He completed his Ph.D. in 2012 with a thesis investigating the impact of distributional semantics models in content-based recommender systems. His research focuses on the definition of semantics-aware content representations in recommender systems and user modeling platforms, exploiting natural language processing and knowledge graphs. He is one of the authors of the textbook "Semantics in Adaptive and Personalized Systems: Methods, Tools and Applications", published by Springer. He acts as a program committee member for the ACM Recommender Systems Conference, the Conference on User Modeling, Adaptation and Personalization, and many other top-tier conferences in the area (IJCAI, WWW, IUI, etc.). He has organized several events related to user modeling and recommender systems, such as the workshop series on Holistic and Explainable User Modeling. In 2016 and 2017 he gave a tutorial at the UMAP conference on the exploitation of semantics-aware representations in content-based personalized systems.


Marco Polignano

Assistant Professor

University of Bari Aldo Moro, Italy

Marco Polignano is Assistant Professor at the Department of Informatics, University of Bari, Italy, in the SWAP (Semantic Web Access and Personalization) research group. He was a Ph.D. student from 2014 to 2017, and he received a Ph.D. in Computer Science and Mathematics in 2018, at the same university, with a thesis titled "An affect-aware computational model for supporting decision-making through recommender systems". He has been a program committee member for many international conferences, including IJCAI, ECAI, IUI, AIVR, and WWW. He was a local organizing committee member for the Ai*iA 2017 conference and an organizer of the ABSITA challenge on aspect-based sentiment analysis at Evalita 2018, of ATE_ABSITA at Evalita 2020, and of the ExUm workshop on user modeling and personalization at UMAP 2020 and 2021. He has been a reviewer for many international journals and conferences. In 2016 and 2018, he was a Marie Skłodowska-Curie Research and Innovation Staff Exchange (MSCA-RISE) fellow, involved in project N. 691071, titled "Seo-Dwarf: Semantic EO Data Web Alert and Retrieval Framework". His research interests are Information Filtering, Recommender Systems, Natural Language Processing, and Cognitive Computing. During his career, he has gained skills in Artificial Intelligence, Machine Learning, and Data Mining over big data.

Outline of Activities


Introduction (30 minutes)
  • Recommender Systems and the role of content
  • Content representation: issues and challenges
  • Towards accountable recommender systems
  • Reproducibility and Replicability issues
  • Frameworks for advanced content representations and accountability


Semantics-aware techniques for content representation (1 hour)
  • Basics of Content Representation
  • Encoding Exogenous Semantics
    • Linked Open Data and DBpedia
    • Entity Linking
  • Encoding Endogenous Semantics
    • Distributional Semantic Models
    • Word Embedding Techniques
    • Contextualized Word Representations


ClayRS: a Python framework for content representation and reproducibility in recommender systems (1.5 hours)
  • Architecture of the framework
  • Focus on the Content Analyzer Module + Hands-on session
    • TF-IDF and data pre-processing
    • Word Embedding Techniques and Contextualized Word Representations
    • Entity Linking
  • Focus on the Recommendation Module + Hands-on session
    • Strategies based on Neighborhoods
    • Strategies based on Neural Networks
    • Strategies based on Graphs
  • Focus on the Evaluation Module + Hands-on session
  • Focus on the Configuration Files and Graphical User Interface + Hands-on session

Resources

Our Location

Department of Informatics

University of Bari Aldo Moro

Via E. Orabona 4, 70125, Bari, Italy

How Can We Help?

pasquale.lops@uniba.it

cataldo.musto@uniba.it

marco.polignano@uniba.it