Antoni Bigata Casademunt

PhD student at Imperial College London. Working on human avatars and generative AI. Intern @ Meta and @ Disney Research.

Research

KeySync
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Antoni Bigata, Rodrigo Mira, Stella Bounareli, Michał Stypułkowski, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
arXiv, 2025

KeySync is a two-stage framework for lip synchronization that addresses key challenges in aligning lip movements with new audio, such as temporal inconsistencies, expression leakage, and facial occlusions. Unlike previous approaches, KeySync uses a novel masking strategy to handle occlusions and reduce leakage from the input video. It achieves state-of-the-art results in lip reconstruction and cross-synchronization and reduces expression leakage, as measured by our new LipLeak metric. Its effectiveness is validated through comprehensive ablation studies.

KeyFace
KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation
Antoni Bigata, Michał Stypułkowski, Rodrigo Mira, Stella Bounareli, Konstantinos Vougioukas, Zoe Landgraf, Nikita Drobyshev, Maciej Zieba, Stavros Petridis, Maja Pantic
CVPR, 2025

Current facial animation methods struggle with consistency over long durations, leading to unnatural motion and identity drift. We introduce KeyFace, a novel two-stage diffusion-based framework that generates keyframes at low frame rates and interpolates smooth transitions, ensuring natural and coherent animation. Our model captures continuous emotions and non-speech vocalizations (NSVs) like laughter and sighs, setting a new standard for long-form facial animation.

CroissantLLM
🥐 CroissantLLM: A Truly Bilingual French-English Language Model
Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F.T. Martins, Gautier Viaud, Céline Hudelot
TMLR, 2025

CroissantLLM is a 1.3B-parameter bilingual model trained on 3T English and French tokens, designed for high performance on consumer hardware. With a 1:1 English-to-French pretraining data ratio, a custom tokenizer, and FrenchBench for evaluation, it sets a new standard for multilingual NLP. The release is fully open-source and includes datasets, checkpoints, and fine-tuned models.

EMOPortraits
EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
Nikita Drobyshev, Antoni Bigata Casademunt, Konstantinos Vougioukas, Zoe Landgraf, Stavros Petridis, Maja Pantic
CVPR, 2024

EMOPortraits is a head reenactment model that enhances realism in expressing intense, asymmetric emotions and sets new standards in emotion transfer. Additionally, we integrated a speech-driven mode for improved audio-visual animation and introduced a novel multi-view video dataset that captures a broader range of expressions, addressing a critical gap in existing data.

Laughing Matters
Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models
Antoni Bigata Casademunt, Rodrigo Mira, Nikita Drobyshev, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
BMVC, 2023

While speech-driven animation has made impressive strides, non-verbal communication, especially laughter, remains an open challenge. Our work introduces a novel model that generates realistic laughter sequences from a still portrait and an audio clip. By leveraging diffusion models and training on diverse laughter datasets, we outperform traditional facial animation methods, setting a new benchmark for laughter synthesis.