Enhancing Entity Resolution and Retrieval through Dataset Decomposition - Yannis Velegrakis (Utrecht University)

Dates
2025-07-09 19:11

Title: Enhancing Entity Resolution and Retrieval through Dataset Decomposition

Speaker: Prof. Yannis Velegrakis (Utrecht University, The Netherlands)

Abstract: A traditional data management process is typically applied to a dataset as a whole. We advocate that slicing the dataset and treating each slice differently can lead to better performance in a number of scenarios. We present two works in which this principle has been applied and demonstrate its effectiveness. First, we show that even for short documents (forum posts), slicing allows for better discovery of documents related to a given post. Second, we illustrate how entity resolution can benefit from slicing. It is known that different entity resolution algorithms perform better on different datasets. We apply this idea within a single dataset: we first slice the dataset and then select, for each slice, the method that performs best on it. Doing so improves the overall performance of the entity resolution process. The two main challenges in this approach are deciding how to slice the dataset and how to select the resolution method that is best for a given slice.
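The per-slice selection idea in the abstract can be illustrated with a small sketch. This is not the speaker's method: the abstract does not say how slices are formed or how methods are chosen. The sketch assumes that each labeled record pair already carries a slice name and that a method is picked by its accuracy on that labeled sample. The three matchers, the 0.8 threshold, and the toy data are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical candidate resolution methods; each decides whether two strings match.
def exact_match(a, b):
    return a.strip().lower() == b.strip().lower()

def fuzzy_match(a, b, threshold=0.8):  # threshold is an assumed value
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def token_match(a, b):
    return set(a.lower().split()) == set(b.lower().split())

MATCHERS = {"exact": exact_match, "fuzzy": fuzzy_match, "token": token_match}

def accuracy(matcher, pairs):
    """Fraction of labeled pairs on which the matcher agrees with the label."""
    return sum(matcher(a, b) == label for a, b, label in pairs) / len(pairs)

def best_matcher_per_slice(labeled):
    """Group labeled pairs by slice, then pick the most accurate matcher per slice."""
    slices = defaultdict(list)
    for slice_name, a, b, label in labeled:
        slices[slice_name].append((a, b, label))
    return {
        name: max(MATCHERS, key=lambda m: accuracy(MATCHERS[m], pairs))
        for name, pairs in slices.items()
    }

# Toy labeled sample: (slice name, record A, record B, is the same entity?)
LABELED = [
    ("person", "John Smith", "Smith John", True),
    ("person", "Ana Lee", "Lee Ana", True),
    ("person", "John Smith", "Mary Jones", False),
    ("product", "iPhone 15 Pro", "iPhone 15 Pr0", True),
    ("product", "Galaxy S24 Ultra", "Galaxy S24 Ultraa", True),
    ("product", "Pixel 8", "ThinkPad X1", False),
]

best = best_matcher_per_slice(LABELED)
print(best)
```

In this toy sample, word order varies in the person slice and typos dominate the product slice, so no single matcher is right on every pair, while choosing a matcher per slice is.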

Short Biography: Yannis Velegrakis is a Computer Science professor at Utrecht University (Netherlands), where he holds the chair on Very Large Data Management, heads the Data Intensive Systems Group, and leads the Master’s programme in Data Science. His research expertise includes Data Preparation and Curation, Data Quality, Big Data Management, Knowledge Engineering, Graph Management, and Highly Heterogeneous Information Integration. He holds a PhD degree in Computer Science from the University of Toronto. He has been a professor at the University of Trento and a researcher at the AT&T Research Labs. He is also a PI at the Archimedes Unit of the Athena Research Center. He has spent time on research at the IBM Almaden Research Center, the Huawei European Research Center in Munich, the Center of Advanced Studies of the IBM Toronto Lab, the University of California, Santa Cruz, and the University of Paris-Saclay. He has been the general chair of VLDB 2013 and ICDE 2024, the PC Chair of EDBT 2021, and an area chair at multiple VLDB, SIGMOD, ICDE, and EDBT conferences. He currently serves on the board of the EDBT Association (as president), on the VLDB Board of Trustees, on the SIKS Research School board, and as associate editor for Systems on the SIGMOD Record editorial team.

Microsoft Teams meeting
Meeting ID: 384 980 428 744 4
Passcode: 7nd7hB6Q

The project “ARCHIMEDES Unit: Research in Artificial Intelligence, Data Science and Algorithms” with code OPS 5154714 is implemented by the National Recovery and Resilience Plan “Greece 2.0” and is funded by the European Union – NextGenerationEU.


 

Stay connected! Subscribe to our mailing list by emailing sympa@lists.athenarc.gr
with the subject "subscribe archimedes-news Firstname LastName"
(replace with your details)