My research interests are in natural language processing, machine translation, information retrieval, language and multimodal resources, and the evaluation of linguistic and interactive systems.
My recent research projects include:
- DOMAT (On-demand Knowledge for Document-level Machine Translation) supported by the SNSF
- UNISUB (Unsupervised NMT with innovative multilingual subword models) supported by Armasuisse
- Decrypt (Décrypter les perturbations prosodiques pour améliorer l’apprentissage de la communication) supported by InnoSuisse (with TALK Sàrl)
- Digital Lyric (Création Poétique Assistée par Ordinateur, et Anthologies Numériques Personnalisables) supported by the SNSF Agora program (main applicant: University of Lausanne)
Previously, I led the MODERN SNSF Sinergia project (2013-2017) on Modeling Discourse Entities and Relations for Coherent Machine Translation, which followed the COMTIS Sinergia project (2010-2013) on Improving the Coherence of Machine Translation Output by Modeling Intersentential Relations. I was involved in the IM2 NCCR from its start (2002-2013), working on dialogue processing, data management and HCI, as head or deputy head of two individual projects. I also contributed to the SUMMA EU project (2016-2017), to the inEvent European project (2011-2014), and to the AMIDA European integrated project (2007-2010), where I headed a work package on interfaces and evaluation.
Publication list | Google Scholar profile
Recent articles
- Wolleb B., Silvestri R., Vernikos G., Dolamic L., & Popescu-Belis A. (2023) – Assessing the Importance of Frequency versus Compositionality for Subword-based Tokenization in NMT. Proceedings of the 24th Annual Conference of the European Association for Machine Translation (EAMT), Tampere, Finland, 10 p.
- Atrio A.R., Allemann A., Dolamic L., & Popescu-Belis A. (2023) – A Simplified Training Pipeline for Low-resource and Unsupervised Machine Translation. Proceedings of the Sixth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT), EACL, Dubrovnik, Croatia, p. 47-58.
- Popescu-Belis A., Atrio A.R., Bernath B., Boisson E., Ferrari T., Theimer-Lienhard X., & Vernikos G. (2023) – GPoeT: a Language Model Trained for Rhyme Generation on Synthetic Data. Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL), EACL, Dubrovnik, Croatia, p. 10-20.
- Atrio A.R. & Popescu-Belis A. (2022) – On the Interaction of Regularization Factors in Low-resource Neural Machine Translation. Proceedings of the 23rd Annual Conference of the European Association for Machine Translation (EAMT), Ghent, Belgium, p. 111-120.
- Popescu-Belis A., Atrio A.R., Minder V., Xanthos A., Luthier G., Mattei S., & Rodriguez A. (2022) – Constrained Language Models for Interactive Poem Generation. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), Marseille, France, p. 3519-3529.
- Atrio A.R. & Popescu-Belis A. (2021) – Small Batch Sizes Improve Training of Low-Resource Neural MT. Proceedings of the 18th International Conference on Natural Language Processing (ICON), Online and Silchar, India.
- Vernikos G. & Popescu-Belis A. (2021) – Subword Mapping and Anchoring across Languages. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), Online and Punta Cana, Dominican Republic, p. 2633-2647.
- Atrio A.R., Luthier G., Fahy A., Vernikos G., Popescu-Belis A., & Dolamic L. (2021) – The IICT-Yverdon System for the WMT 2021 Unsupervised MT and Very Low Resource Supervised MT Task. Proceedings of the 6th Conference on Machine Translation (WMT), Online and Punta Cana, Dominican Republic, p. 978-986.
- Luthier G. & Popescu-Belis A. (2020) – Chat or Learn: a Data-Driven Robust Question-Answering System. Proceedings of LREC 2020 (12th Language Resources and Evaluation Conference).
- Melly J., Luthier G. & Popescu-Belis A. (2020) – A Consolidated Dataset for Knowledge-based Question Generation using Predicate Mapping of Linked Data. Proceedings of ISA-16 (16th Joint ACL-ISO Workshop on Interoperable Semantic Annotation), Online.
- Popescu-Belis A. (2019) – Context in Neural Machine Translation: A Review of Models and Evaluations. arXiv:1901.09115, Jan. 2019, 16 p.
Recent talks at conferences
- Popescu-Belis A., Xanthos A., Minder V., Atrio À. R., Luthier G., and Rodriguez A. (2020) – Interactive Poem Generation: when Language Models support Human Creativity. Presentation at the 5th Swiss Text Analytics and 16th KONVENS Conference (SwissText-KONVENS 2020), Zurich, Switzerland.
- Popescu-Belis A. and Luthier G. (2020) – Learning, optimizing, or hand-crafting chatbot components. Talk at the 4th Applied Machine Learning Days (AMLD 2020), EPFL, Lausanne, Switzerland.
- Popescu-Belis A. (2019) – Context in NMT evaluation: methods and implications for MT. Invited talk, SMART Select Workshop on Document-level MT Evaluation, Luxembourg, 19 November 2019.
- Luthier G. and Popescu-Belis A. (2019) – PLACAT: A user-friendly question answering system for smart speaker devices. Talk at the 4th Swiss Text Analytics Conference (SwissText 2019), Winterthur, June 2019.
- Large contexts in neural machine translation. Talk at the 3rd Applied Machine Learning Days (AMLD 2019), EPFL, Lausanne, Switzerland, 29 January 2019.