Uncovering the true value of data

Current AI makes poor use of the goldmine of information present in rare data, extracting trivial notions rather than high-level concepts.

We hypothesize that this is due to several key weaknesses. We make simplistic assumptions about how data are distributed that do not reflect the real world, and our training processes seem to favour frequent, common data over rare data. Moreover, poor information is sometimes extracted, seemingly at random, either because we find suboptimal data representations or because models latch on to non-useful artefacts. We want to understand why such inconsistency exists, and we propose to devise methods that combat it and hence improve how we optimize learning functions. We propose to introduce stronger (causal) assumptions to robustly extract high-level concepts. There is an as-yet-unexploited opportunity here: rare data may reveal unique causal relationships. We hope to investigate this tantalising prospect thoroughly. We lay herein the underpinnings for an AI that is data-efficient and robust. We will stress-test our ideas and methods on synthetic and benchmark data. We will explore key healthcare applications in multimodal cancer datasets to illustrate potential gains and provide material for follow-up work.

 
 

The project “ARCHIMEDES Unit: Research in Artificial Intelligence, Data Science and Algorithms” with code OPS 5154714 is implemented by the National Recovery and Resilience Plan “Greece 2.0” and is funded by the European Union – NextGenerationEU.

