About Me

I am currently a Staff Scientist with the AdScience team of Pandora Media. Pandora is the leading music streaming company in the US, providing personalized music recommendations to over 75M listeners in the US alone. As part of the AdScience team, I am involved in the following projects:
- Machine learning projects, such as building audience segments, supply forecasting, and listener feature selection.
- Listener embeddings: building and improving our embedding models to accurately represent Pandora listeners' music taste and ad preferences.
- Technologies used include Hive, Python, Scala, Spark, MLlib, Airflow, and Hadoop.
Before joining Pandora, I was a Distinguished Engineer, Data Science, with Marin Software, the worldwide leader in search advertising.
Regarding my scientific background, I obtained my Ph.D. in computer science from the Applied Mathematics and Systems (MAS) laboratory of Ecole Centrale Paris, France, in early 2013. My Ph.D. dissertation, titled "Building and Using Knowledge Models for Semantic Image Annotation", was supervised by Dr. Céline Hudelot. Afterwards, I worked as a research and development engineer at the CEA-List Vision & Content Engineering laboratory, Atomic Energy Commission, France.

Summary of Research Interests

My main research interests include image semantics modeling, image annotation, multimedia information retrieval, image and data mining, statistical machine learning, and ontological engineering. I am also interested in social media processing, semantic analysis of social content, and social marketing.

About Semantic Image Annotation

Automatic image annotation is a challenging problem dealing with the textual description of images. This process usually consists of building a computational model that associates a text description (often reduced to a set of semantic keywords) with digital images. A wide range of approaches have been proposed to address this problem and to narrow the well-known semantic gap. Most approaches rely on machine learning techniques to provide a mapping function that classifies images into semantic classes using their visual features. However, these approaches face a scalability problem when dealing with broad-content image databases: their performance decreases significantly when the number of concepts is high, and it also depends on the targeted datasets. This variability may be explained by the large intra-concept variability and the strong inter-concept similarities in visual properties, which often lead to conflicting and incoherent annotations. Therefore, machine learning alone seems insufficient to solve the image annotation problem.

Recently, some approaches have proposed using structured knowledge models, such as semantic hierarchies and ontologies, to improve image annotation accuracy. Indeed, semantic hierarchies and ontologies have been shown to be very useful in narrowing the semantic gap. They make it possible to identify, in a formal way, the dependency relationships between concepts, and therefore provide a valuable source of information for many computer vision problems. For instance, they can improve image annotation by supplying a formal framework for reasoning about the consistency of information extracted from images. They can also be used as a hierarchical framework for image classification, allowing for efficiencies in both learning and representation.
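
As a minimal illustration of the consistency reasoning described above, the following Python sketch checks predicted labels against a toy semantic hierarchy. The hierarchy, the disjointness pairs, and the function names are all hypothetical and chosen only for illustration; this is a simplified sketch, not the method developed in my dissertation.

```python
# Toy semantic hierarchy: child concept -> parent concept (hypothetical data).
PARENT = {
    "dog": "animal",
    "cat": "animal",
    "animal": "living_thing",
    "car": "vehicle",
    "vehicle": "artifact",
}

# Pairs of concepts assumed to be mutually exclusive (hypothetical data).
DISJOINT = {frozenset({"living_thing", "artifact"})}


def ancestors(concept):
    """Return the concept together with all of its ancestors in the hierarchy."""
    result = {concept}
    while concept in PARENT:
        concept = PARENT[concept]
        result.add(concept)
    return result


def is_consistent(label_a, label_b):
    """Two labels given to the same image region conflict if any pair of
    their ancestors is declared disjoint."""
    for a in ancestors(label_a):
        for b in ancestors(label_b):
            if frozenset({a, b}) in DISJOINT:
                return False
    return True


def expand_annotation(labels):
    """Add every implied ancestor concept to a set of predicted labels."""
    expanded = set()
    for label in labels:
        expanded |= ancestors(label)
    return expanded
```

In this sketch, a classifier that labels the same region as both "dog" and "car" would be flagged as inconsistent, while a "dog" prediction is expanded to also include "animal" and "living_thing".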

My research explores how to build and use multimedia knowledge models, such as semantic hierarchies and ontologies, to represent image semantics effectively. The ultimate goal is to improve the accuracy of multimedia retrieval systems and to produce semantically relevant image annotation systems.