Almut Sophia Koepke


I am a post-doctoral research fellow at the University of Tübingen and a visiting scholar at UC Berkeley, advised by Alexei (Alyosha) Efros. I also lead the junior research group MuMoL at the Technical University of Munich. My research focusses on multi-modal learning problems involving sound, vision, and text.

I completed my DPhil in the Visual Geometry Group (VGG) at the University of Oxford, supervised by Andrew Zisserman. After that, I was a post-doctoral researcher in the Explainable Machine Learning (EML) group, led by Zeynep Akata. I spent a summer at Reichman University working with Yael Moses.


Email  /  LinkedIn  /  Twitter  /  Bluesky  /  Google Scholar  

Publications

Dissecting temporal understanding in text-to-audio retrieval
Andreea-Maria Oncescu, João F. Henriques, A. Sophia Koepke
ACM Multimedia (ACMMM), 2024
paper / project page / code
Fantastic gains and where to find them: On the existence and prospect of general knowledge transfer between any pretrained model
Karsten Roth*, Lukas Thede*, A. Sophia Koepke, Oriol Vinyals, Olivier J. Hénaff, Zeynep Akata
International Conference on Learning Representations (ICLR), 2024
paper / code / openreview
Spotlight.
A sound approach: Using large language models to generate audio descriptions for egocentric text-audio retrieval
Andreea-Maria Oncescu, João F. Henriques, Andrew Zisserman, Samuel Albanie, A. Sophia Koepke
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
paper / project page / code
Addressing caveats of neural persistence with deep graph persistence
Leander Girrbach, Anders Christensen, Ole Winther, Zeynep Akata, A. Sophia Koepke
Transactions on Machine Learning Research (TMLR), 2023
paper / code / openreview / video
Part of this work was also presented at the TAG-ML Workshop at ICML 2023.
Zero-shot audio captioning with audio-language model guidance and audio context keywords
Leonard Salewski, Stefan Fauth, A. Sophia Koepke, Zeynep Akata
NeurIPS Workshop on Machine Learning for Audio, 2023
paper / code
Video-adverb retrieval with compositional adverb-action embeddings
Thomas Hummel, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata
British Machine Vision Conference (BMVC), 2023
paper / project page / code
Oral presentation.
Waffling around for performance: Visual classification with random words and broad concepts
Karsten Roth, Jae Myung Kim, A. Sophia Koepke, Oriol Vinyals, Cordelia Schmid, Zeynep Akata
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
paper / code
Image-free classifier injection for zero-shot classification
Anders Christensen, Massimiliano Mancini, A. Sophia Koepke, Ole Winther, Zeynep Akata
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
paper / code
Text-to-feature diffusion for audio-visual few-shot learning
Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
DAGM German Conference on Pattern Recognition (GCPR), 2023
paper / code
Zero-shot translation of attention patterns in VQA models to natural language
Leonard Salewski, A. Sophia Koepke, Hendrik Lensch, Zeynep Akata
DAGM German Conference on Pattern Recognition (GCPR), 2023
paper / code
Exposing and mitigating spurious correlations for cross-modal retrieval
Jae Myung Kim, A. Sophia Koepke, Cordelia Schmid, Zeynep Akata
Multimodal Learning and Applications Workshop at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPRW), 2023
paper / code
PlanT: Explainable planning transformers via object-level representations
Katrin Renz, Kashyap Chitta, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata, Andreas Geiger
Conference on Robot Learning (CoRL), 2022
paper / project page / code
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea*, Thomas Hummel*, A. Sophia Koepke, Zeynep Akata
European Conference on Computer Vision (ECCV), 2022
paper / code
Audio-visual generalised zero-shot learning with cross-modal attention and language
Otniel-Bogdan Mercea, Lukas Riesch, A. Sophia Koepke, Zeynep Akata
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
paper / code
CLEVR-X: A visual reasoning dataset for natural language explanations
Leonard Salewski, A. Sophia Koepke, Hendrik Lensch, Zeynep Akata
Springer Lecture Notes in Artificial Intelligence (LNAI), 2022
paper / project page / code
This was also presented at the CVPR 2022 Workshop on Explainable AI for Computer Vision (XAI4CV).
Audio retrieval with natural language queries: A benchmark study
A. Sophia Koepke*, Andreea-Maria Oncescu*, João F. Henriques, Zeynep Akata, Samuel Albanie
IEEE Transactions on Multimedia (TMM), 2022
paper / project page / code
Extended version of the INTERSPEECH paper with a new dataset and new results.
Audio retrieval with natural language queries
Andreea-Maria Oncescu*, A. Sophia Koepke*, João F. Henriques, Zeynep Akata, Samuel Albanie
INTERSPEECH, 2021
paper / project page / code
Shortlisted for the best student paper award.
Distilling audio-visual knowledge by compositional contrastive learning
Yanbei Chen, Yongqin Xian, A. Sophia Koepke, Ying Shan, Zeynep Akata
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
paper / code
Sight to sound: An end-to-end approach for visual piano transcription
A. Sophia Koepke, Olivia Wiles, Yael Moses, Andrew Zisserman
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
paper / project page
Oral presentation.
Self-supervised learning of class embeddings from video
Olivia Wiles, A. Sophia Koepke, Andrew Zisserman
Compact and Efficient Feature Representation and Learning in Computer Vision Workshop at the IEEE/CVF International Conference on Computer Vision (ICCV Workshop), 2019
paper
Visual pitch estimation
A. Sophia Koepke, Olivia Wiles, Andrew Zisserman
Sound and Music Computation Conference (SMC), 2019
paper / project page
Self-supervised learning of a facial attribute embedding from video
Olivia Wiles*, A. Sophia Koepke*, Andrew Zisserman
British Machine Vision Conference (BMVC), 2018
paper / supplementary material / project page / code
Oral presentation.
X2Face: A network for controlling face generation by using images, audio, and pose codes
Olivia Wiles*, A. Sophia Koepke*, Andrew Zisserman
European Conference on Computer Vision (ECCV), 2018
paper / project page / code
* denotes equal contribution
Teaching

Introduction to Machine Learning (University of Tübingen, Summer semester 2023)

Community service

Area chair:


Outstanding reviewer:
Reviewer:
  • ICCV (2023)
  • NeurIPS (2023)
  • ECCV (2022)
  • CVPR (2021, 2022)
  • ACCV (2020)
  • CVPR workshop: Learning with Limited Labelled Data for Image and Video Understanding (2022)
  • ICCV workshop: Closing the loop between Vision and Language (2021)
  • NeurIPS workshop: The preregistration experiment: an alternative publication model for machine learning research (2020)
  • CVPR workshop: Women in Computer Vision (2019, 2020)
  • ICCV workshop: Neural Architects (2019)
  • IJCV
  • TPAMI
  • IEEE Access
  • Eurographics

Workshop organisation:

Misc

Tübingen AI Center
KI macht Schule (AI education for pupils)