The research I have supervised and contributed to has produced a number of software and data repositories. They are accessible from Idiap’s Data Distribution Portal (which points to Zenodo repositories), Idiap’s GitHub repository, the HEIG-VD IDA group’s GitHub repository, and the Hugging Face model hub. The software, data, and models are listed in alphabetical order, with co-authors (mostly PhD students) in parentheses.
Software
- ACT: metric for the Accuracy of Connective Translation (with N. Hajlaoui, Th. Meyer)
- APT: metric for the Accuracy of Pronoun Translation (with L. Miculicich Werlen)
- CBrec: content-based recommender (with N. Pappas)
- CRPO: computer-assisted poetic creation (with A.R. Atrio, G. Luthier, V. Minder, A. Xanthos)
- DiscoConn: classifier for connective labeling in English texts (with Th. Meyer)
- DocRec: keyword extraction and document recommendation (with M. Habibi)
- EMORec: emotion-based recommendation generator (with N. Pappas)
- PLACAT: robust conversational agent for information access (with G. Luthier)
- SLOG: learning similarity metrics on graphs (with M. Yazdani)
- WMIL-SGD: multiple-instance learning for sentiment analysis (with N. Pappas)
Data
- AREX: AMI requests for explanations, with relevance judgments (with M. Habibi)
- Discourse Connective Annotation (with Th. Meyer, B. Cartoni, S. Zufferey)
- HATDOC: Human Attention Scores in Document Classification (with N. Pappas)
- Tense-annotation: parallel verb tense annotation on Europarl (with Th. Meyer, C. Grisot)
- TED Ratings: ground truth human-made recommendations (with N. Pappas)
Models
- GPT2-Poetry: a gpt2-small model (137M parameters) fine-tuned on the Gutenberg Poetry Corpus to generate English poetry (with T. Ferrari)
- RoBERTa-Poetry: a series of eight RoBERTa-based models (125M parameters) fine-tuned on poetry, each for one of five topics or one of three emotions (with T. Ferrari)