With a background in Information Retrieval and Distributed Database Systems, I have in recent years conducted several projects involving data mining algorithms, big data (storage, analytics and machine learning tools) and, more recently, data pipelines and stream processing.
Working on interdisciplinary projects has always fascinated me. I am keen to learn from and communicate with professionals from other disciplines. I am especially interested in working on projects in:
- Media, Arts and Cultural Heritage
- Sustainable Energy
- Social Impact
- Grid Data Digger (Innosuisse, 2019-2021): An automated Distribution Grid Operation Assistance Tool using data-driven solutions on a big-data platform.
- Dermintel – An AI-powered Digital Care Companion (2016-2018). Technical supervision leading to the co-founding of the start-up by my Master's student.
- DermaQA – Automatic Generation of a Dermatology Question Dataset (International Internship, 2018): Automatic detection of similar dermatology-related questions from community-based question-answering forums using IR and deep learning.
- CrowdStreams (HES-SO grant, 2015-2017): Real-time analysis and monitoring of mobility in the proximity of big events using Apache Spark and its MLlib machine-learning library.
- Livre Artist (Contract for Bibliothèque nationale suisse, 2014-2015): Extraction and analysis of annotations from the complete bibliographic collection held by the Bibliothèque nationale suisse (BNS) in order to characterize artistic works.
- RT-DLP (Contract for Crossing-Tech SA, 2014-2015): Real-Time Content-based Data Loss Prevention (DLP) Technology Feasibility Study. Comparison of Spark, Storm, Hadoop and HBase for the implementation of a scalable DLP tool.
- SR-DLP (Hasler Foundation grant, 2013-2014): Efficient and Scalable Near Duplicate Detection for Content-based Data Leakage Detection. Comparison of four MapReduce algorithms.
- Ef-NDD (RCSO grant, 2012-2013): Efficient Near Duplicate Detection. Improving the efficiency of Near Duplicate Detection algorithms for security audit.
- Thesauro (RCSO grant): Thesaurus Automatic Reorganization. Association rule mining on the RTS (Radio Télévision Suisse) archive to restructure their thesaurus. The resulting prototype is currently used at the RTS.
- ClusterSITG (Contract for the État de Genève, 2012-2013): Automatic clustering and semantic linking of the geographical metadata of the terms used by the “Service des Systèmes d’Information et de Géomatique (SSIG) de l’État de Genève”.
- Health Social Media Monitoring (prototype for SwissRe Life & Health R&D, 2011-2012): A tweet classification prototype to analyse patients, diseases and medications.
- NotreHistoireMobile (Contract for RTS, 2011-2012): Mobile applications on iOS and Android developed following the Mobiwalk platform for the RTS notrehistoire.ch.
- NDD (Contract for PricewaterhouseCoopers, 2010-2011): Design and implementation of Near Duplicate Detection algorithms with high precision. The result was an audit tool tested and exclusively used for one year by PwC.
- Mobilwalk (RCSO grant, 2009-2011): A generic platform to provide multimedia-based services to users on the move.
- Walking-the-edit (Contract for ECAL, 2008-2009).
In the past, I worked on audiovisual retrieval for cultural heritage applications:
- Video annotation enrichment
- User-oriented video querying and browsing
- Audiovisual description standards, MPEG-7 querying