ILSP & Archimedes NLP Theme Talk, Thursday 7 March, 16:00 (Greek time)
Speaker: Preslav Nakov (https://mbzuai.ac.ae/study/faculty/preslav-nakov/)
Title: "Jais and Jais-chat: Building the World's Best Open Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models"
Room: Zampolli, Athena Main Building (6 Artemidos str., Marousi, ground floor)
and virtually via MS Teams:
https://teams.microsoft.com/l/meetup-join/19%3ameeting_NDgzMjM1MTQtYWFiMS00ZDk4LTlhYmItOTg1NDlhYWRjNTg3%40thread.v2/0?context=%7b%22Tid%22%3a%226ae07702-c5f7-4f38-9b87-acad62a75d93%22%2c%22Oid%22%3a%22735f6987-4242-47ec-98d6-f1eb55fb371f%22%7d
- Meeting ID: 364 885 596 720
- Passcode: jbYNHY
Abstract:
I will discuss Jais and Jais-chat, two state-of-the-art Arabic-centric foundation and instruction-tuned open generative large language models (LLMs). The models are based on the GPT-3 decoder-only architecture and are pretrained on a mixture of Arabic and English texts, including source code in various programming languages. In extensive evaluations, the models demonstrate better knowledge and reasoning capabilities in Arabic than previous open Arabic and multilingual models by a sizable margin, and they are competitive in English with English-centric open models of similar size, despite being trained on much less English data. I will discuss the training, the tuning, the safety alignment, and the evaluation, as well as the lessons we learned.
Speaker Bio:
Preslav Nakov is Professor and Department Chair for NLP at the Mohamed bin Zayed University of Artificial Intelligence. Previously, he was Principal Scientist at the Qatar Computing Research Institute, HBKU, where he led the Tanbih mega-project, developed in collaboration with MIT, which aims to limit the impact of "fake news", propaganda, and media bias by making users aware of what they are reading, thus promoting media literacy and critical thinking. He received his PhD degree in Computer Science from the University of California at Berkeley, supported by a Fulbright grant. He is Chair-Elect of the European Chapter of the Association for Computational Linguistics (EACL), Secretary of ACL SIGSLAV, and Secretary of the Truth and Trust Online board of trustees. Formerly, he was PC chair of ACL 2022 and President of ACL SIGLEX. He is also a member of the editorial boards of several journals, including Computational Linguistics, TACL, ACM TOIS, IEEE TASL, IEEE TAC, CS&L, NLE, AI Communications, and Frontiers in AI. He authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and 250+ research papers. He received a Best Paper Award at ACM WebSci'2022, a Best Long Paper Award at CIKM'2020, a Best Demo Paper Award (Honorable Mention) at ACL'2020, a Best Task Paper Award (Honorable Mention) at SemEval'2020, a Best Poster Award at SocInfo'2019, and the Young Researcher Award at RANLP'2011. He was also the first to receive the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer. His research has been featured in over 100 news outlets, including Reuters, Forbes, Financial Times, CNN, Boston Globe, Aljazeera, DefenseOne, Business Insider, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget, among others.
Stay tuned for future events:
For ways to receive news about the Archimedes Unit and its meetings, check https://archimedesai.gr/en/. To subscribe to the Archimedes mailing list, send a message with the subject "subscribe archimedes-news Firstname LastName" (where Firstname and LastName are your first and last name, respectively) to the mailing-list address listed on the website. The body of the message may be blank.
If you are an AI researcher or practitioner, please consider becoming a member of the Hellenic Artificial Intelligence Society (EETN, http://www.eetn.gr/en/).