
Antoni Bigata, Rodrigo Mira, Stella Bounareli, Michał Stypułkowski, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
arXiv, 2025
KeySync is a two-stage framework for lip synchronization that addresses key challenges in aligning lip movements with new audio, such as temporal inconsistencies, expression leakage, and facial occlusions. Unlike previous approaches, KeySync uses a novel masking strategy to handle occlusions and reduce leakage from the input video. It achieves state-of-the-art performance in lip reconstruction and cross-synchronization and reduces expression leakage, as measured by our new LipLeak metric. Its effectiveness is validated through comprehensive ablation studies.