Main Interests:
- Scala and Java programming
- Distributed, scalable and fault-tolerant system design
- Big Data analytics (MapReduce, Hadoop, Spark, Storm)
Current projects (2012-present):
- CrowdStreams: design and implementation of a real-time traffic data monitoring system that estimates and forecasts traffic flow in the neighbourhood of a social event, such as Paleo, using the Spark Streaming framework.
- Livre Artist: extracting and normalizing data and creating a relational database for the Helvetica Catalog of the Swiss National Library.
- Realtime Data Loss Prevention: design and implementation of a content-based DLP solution for “data in motion”, used to build a real-time data leakage detection platform on the Apache HBase, Storm and Spark computing platforms.
- CITATION: design of a cloud-based data middleware infrastructure for a crowd monitoring project in the context of the Internet of Things at IICT (iNuit project), and implementation of a REST API using MongoDB and Python.
- Efficient Near Duplicate Detection: design and implementation of an efficient tool to detect similar (near-duplicate) content in large collections using the MapReduce programming model, with evaluation on the Apache Hadoop and Spark platforms.
- SITG Catalog Clustering: automatic clustering of the documents of the Système d’Information du Territoire à Genève (SITG) using Apache Mahout.
- Social Media Monitoring: design and implementation of a classification algorithm for sentiment analysis in social networks (Twitter).
- THESAURO: automatic reorganization of the Radio Télévision Suisse Romande (RTS) thesaurus using association rule mining.
- GARSV: semi-automatic generation of summaries from the videos of the Grand Conseil Neuchâtelois.
Postdoctoral projects (2011):
- Designing Semi-Synchronous Algorithms: algorithms whose progress is independent of the system timeout.
- Transforming Consensus to Fair Atomic Broadcast: algorithms whose progress is independent of the behavior of faulty processes.
- Publications: EPFL InfoScience Library
Doctoral project (2006-2010):
- Design, implementation and quantitative analysis of fault-tolerant distributed algorithms to provide reliable and highly available systems.
- Development of a decentralized Byzantine consensus algorithm that outperforms the existing centralized algorithms in the worst case.
- Simulation and performance evaluation of distributed algorithms in the context of wireless mobile ad hoc networks.
- Ph.D. Thesis: Round-Based Consensus Algorithms, Predicate Implementations and Quantitative Analysis
- Publications: EPFL InfoScience Library
Master project (2005-2006):
- Optimizing XML-based expressions in Scala using XQuery transformations and shipping.
- Proposing a formalism that maps XQuery expressions into Scala comprehensions.
- XQuery to Scala Translator
- Master Thesis: Efficient Semi-structured Queries in Scala using XQuery Shipping