Computationally Budgeted Online Kernel Learning and Its Application to Efficient LLM Inference

Posted by: 梁慧丽 | Posted on: 2026-05-12 | Views: 10

Speaker

Junfan Li

Harbin Institute of Technology (Shenzhen)

Time

Thursday, May 14, 2026

14:00-15:00

Venue

College Conference Room 104

Abstract

Transformer architectures rely on the self-attention mechanism, whose inference-time complexity is quadratic in the input length, which restricts deployment in long-context inference tasks. It has been shown that the self-attention mechanism is essentially a kernel model. Inspired by this connection, we investigate the problem of computationally budgeted online kernel learning, explore the fundamental trade-off between learning performance and computational budget, and characterize the conditions, together with the corresponding algorithms, under which this inherent trade-off can be broken. In particular, if the eigenvalues of the kernel matrix decay quickly, it is possible to maintain optimal or nearly optimal learning performance while significantly reducing the computational cost. Finally, we show how the proposed online kernel learning algorithms can be applied to efficient LLM inference, achieving a nearly linear acceleration in computational efficiency.
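As a rough illustration of the kernel view mentioned in the abstract (and not the speaker's algorithm), the sketch below checks numerically that single-query softmax attention equals a kernel-weighted average of the values under the exponential kernel exp(<q, k> / sqrt(d)). The function names and the toy vectors are assumptions made up for this example. Because each query sums over all stored keys, n queries cost on the order of n^2 kernel evaluations, which is the quadratic cost the abstract refers to.

```python
import math

def exp_kernel(q, k):
    """Exponential kernel exp(<q, k> / sqrt(d)); the name is illustrative."""
    d = len(q)
    return math.exp(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))

def attention_as_kernel_smoother(q, keys, values):
    """Kernel-weighted average of the values for one query."""
    weights = [exp_kernel(q, k) for k in keys]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def softmax_attention(q, keys, values):
    """Standard softmax attention for one query, written out for comparison."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    dim = len(values[0])
    return [sum((e / z) * v[i] for e, v in zip(exps, values))
            for i in range(dim)]

# Toy data, chosen arbitrarily for the illustration.
q = [0.5, -1.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

out_kernel = attention_as_kernel_smoother(q, keys, values)
out_softmax = softmax_attention(q, keys, values)
```

The two outputs agree up to floating-point error, since the softmax normalization is exactly the normalization of the kernel weights.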

Biography

Junfan Li is currently a Postdoctoral Fellow at Harbin Institute of Technology (Shenzhen), where he works with Prof. Liqiang Nie and Prof. Zenglin Xu. He received his Ph.D. and M.S. in Computer Science from Tianjin University, advised by Prof. Shizhong Liao, and his B.S. in Biomedical Engineering from the University of Electronic Science and Technology of China. His research interests lie in machine learning under resource constraints, such as constraints on computation, communication, attributes, and privacy, with a particular focus on resource-limited online (kernel) learning, federated learning, and efficient LLM inference. He has published first-author papers in COLT, ICML, NeurIPS, AAAI, ECML-PKDD, ML, and JCST.
