About Me
I am currently a Staff Scientist with the AdScience team of Pandora Media. Pandora is the leading
music streaming company in the US, providing personalized music recommendations to over 75M
listeners in the US alone. As part of the AdScience team, I'm involved in the following projects:
- Machine learning projects, such as building audience segments, supply forecasting, and
listener feature selection.
- Listener embeddings: building and improving our embedding models to accurately represent
Pandora listeners' music taste and ad preferences.
- Technologies used include Hive, Python, Scala, Spark, MLlib, Airflow, and Hadoop.
Before joining Pandora, I was a Distinguished Engineer, Data Science, with Marin Software, the
worldwide leader in search advertising.
Regarding my scientific background, I obtained my Ph.D. degree in computer science from the
Applied Mathematics and Systems (MAS) laboratory of Ecole Centrale Paris, France, in early 2013.
My Ph.D. dissertation, titled "Building and Using Knowledge Models for Semantic Image Annotation",
was supervised by Dr. Céline Hudelot. Afterwards, I worked as a research and development
engineer at the CEA-List Vision & Content Engineering laboratory, Atomic Energy Commission,
France.
Summary of Research Interests
My main research interests include image semantics modeling, image annotation, multimedia
information retrieval, image and data mining, statistical machine learning, and ontological
engineering. I am also interested in social media processing, semantic analysis of social
content and social marketing.
About Semantic Image Annotation
Automatic image annotation is a challenging problem dealing with the textual description of
images. This process usually consists of building a computational model that associates a text
description (often reduced to a set of semantic keywords) with digital images. A wide range of
approaches have been proposed to address this concern and to narrow the well-known semantic gap
problem. Most approaches rely on machine learning techniques to provide a mapping function that
classifies images into semantic classes using their visual features. However, these approaches
face a scalability problem when dealing with broad-content image databases, i.e. their
performance decreases significantly when the number of concepts is high, and it also depends on
the targeted dataset. This variability may be explained by the large intra-concept variability
and the wide inter-concept similarities in visual properties, which often lead to conflicting and
incoherent annotations. Therefore, machine learning alone seems to be insufficient to solve the
image annotation problem.
Recently, some approaches have proposed using structured knowledge models, such as semantic
hierarchies and ontologies, to improve image annotation accuracy. Indeed, semantic hierarchies
and ontologies have proven very useful for narrowing the semantic gap. They make it possible to
identify, in a formal way, the dependency relationships between the different concepts, and
therefore provide a valuable source of information for many computer vision problems. For
instance, they can improve image annotation by supplying a formal framework to reason about the
consistency of the information extracted from images. They can also be used as a hierarchical
framework for image classification, and allow for efficiencies in both learning and
representation.
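As a minimal illustration of the consistency reasoning mentioned above, the sketch below uses a
small is-a hierarchy to expand an annotation with its implied ancestors and to flag predicted
concepts whose ancestors were rejected. The hierarchy, the scores, the threshold, and the function
names are all assumptions made for this example, not part of any specific annotation system.

```python
# Hypothetical is-a hierarchy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def ancestors(concept):
    """Return the ancestors of a concept, nearest first."""
    chain = []
    parent = PARENT[concept]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def expand_annotation(labels):
    """Add every ancestor implied by the given labels."""
    expanded = set(labels)
    for label in labels:
        expanded.update(ancestors(label))
    return expanded


def inconsistent_labels(scores, threshold=0.5):
    """Return accepted concepts that have at least one rejected ancestor.

    `scores` maps concept names to classifier confidences (assumed in [0, 1]).
    A concept is accepted when its score reaches the threshold; a missing
    ancestor score is treated as 0.0.
    """
    accepted = {c for c, s in scores.items() if s >= threshold}
    return sorted(
        c for c in accepted
        if any(scores.get(a, 0.0) < threshold for a in ancestors(c))
    )


if __name__ == "__main__":
    # Made-up classifier outputs for one image.
    scores = {"entity": 0.9, "animal": 0.2, "vehicle": 0.8, "dog": 0.7, "car": 0.9}
    print(expand_annotation({"dog"}))
    print(inconsistent_labels(scores))
```

Here "dog" is flagged because its parent "animal" falls below the threshold, which is the kind of
incoherent annotation that a purely visual classifier cannot detect on its own.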
My research explores how to effectively build and use multimedia knowledge models, such as
semantic hierarchies and ontologies, in order to represent image semantics in an effective way.
The ultimate goal is to improve the accuracy of multimedia retrieval systems and to produce
semantically relevant image annotation systems.