Unifying Generation and Representation: Toward Omni-Modal, Reasoning-Aware Search Systems

Posted by: 梁慧丽 | Published: 2025-11-28

Speaker

Chenghao Xiao

Durham University

Time

Tuesday, October 14, 2025

14:00-15:00

Venue

Conference Room 602

Abstract

Modern AI search systems typically treat retrieval, reasoning and generation as separate modules, leading to fragmented pipelines, error propagation and missed opportunities for deep semantic understanding. In this talk, I present a unifying vision: representation learning as the alignment with latent generative knowledge. I will show how this principle bridges the long-standing divide between generative and representation models across text, image, video and audio.

Throughout the talk, I will discuss key advances from my recent work, including: 1) RAR-b, where I proposed the “reasoning as retrieval” paradigm, leading to an early conceptualization of representation learning as the “alignment of models’ representation capabilities with their generative capabilities”; 2) MIEB, where I introduced the largest multimodal embedding benchmark, which demonstrates that multimodal generative models achieve superior representational performance with orders of magnitude less contrastive activation than the CLIP paradigm; and 3) LCO-Embedding, a language-centric training paradigm for omni-modal representation models that I led at Alibaba DAMO Academy, where I also introduced the “Generation-Representation Scaling Law”.

Finally, I outline frontier directions such as 1) reinforcement learning for representation learning (RL for RL); 2) exploring non-autoregressive generative backbones (e.g., diffusion language models) for representation learning; and 3) environment-aware AI search. Together, these steps pave the way toward truly unified omni-modal AI search systems, where retrieval emerges not as a separate component but as an intrinsic capability of generative intelligence.

Biography

Chenghao Xiao is a final-year PhD candidate at Durham University, UK. His research focuses primarily on unifying representation learning and generative models, and has resulted in 20 publications in top-tier conferences and journals such as NeurIPS, ACL, ICCV, ICLR, EMNLP, NAACL, and TACL. He proposed the “reasoning as retrieval” paradigm, which conceptualizes the training of representation models as an alignment of models’ representation capabilities with their generative capabilities. He led and was the core contributor to widely adopted embedding benchmarks such as RAR-b, MIEB, and MMTEB. He also led LCO-Embedding, a language-centric omni-modal representation model, at Alibaba DAMO Academy.
