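Ting Cao is a Researcher/Professor at the Institute for AI Industry Research (AIR), Tsinghua University. Her research interests include edge AI, neural network inference systems, novel neural network accelerators, and foundation model algorithms. Her work has been published at top international computer systems venues, including ISCA, ASPLOS, MobiCom, MobiSys, NSDI, OSDI, PLDI, EuroSys, SC, and PPoPP, as well as top international AI venues such as ICCV, ACL, and KDD.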
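Her awards include 2012 ACM Research Highlights, 2012 IEEE Micro Top Picks, and 2021 ACM SIGMOBILE Research Highlights, as well as best paper awards at PPoPP'24, MobiSys'21, NAS'14, and ICCD'10. Her research has substantially advanced edge AI applications, enabling the industry's first deployment of complex neural network models in products on mobile phones and personal computers. The results have been integrated into products with millions of users, including Microsoft Office, Windows, Bing, and Huawei HarmonyOS.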
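Dr. Ting Cao received her Ph.D. from the Australian National University, where she was advised by Prof. Steve Blackburn and Prof. Kathryn McKinley; her doctoral research focused on energy-efficient hardware-software co-design. Turing Award laureate Prof. David Patterson wrote a dedicated article in Communications of the ACM recommending her innovative work on the quantitative analysis of hardware and software energy efficiency, and incorporated the results into his classic textbook Computer Architecture: A Quantitative Approach.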
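Dr. Ting Cao currently serves as an Associate Editor of IEEE Transactions on Computers and has served multiple times on the program committees of international conferences including MobiSys, PLDI, OOPSLA, VEE, ChinaSys, and ISMM.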
Email:
tingcao@mail.tsinghua.edu.cn
Homepage:
https://tingcao952.github.io/
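Work Experience:
July 2025 to present: Researcher/Professor, Institute for AI Industry Research (AIR), Tsinghua University
2018-2025: Principal Researcher and Research Manager, Microsoft Research Asia
2016-2018: Senior Software Engineer, Huawei Compiler and Programming Language Lab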
Research Areas:
Edge AI, foundation model algorithms, neural network inference systems, and novel neural network accelerators
Selected Recent Publications:
1. Yuxuan Yan, Shiqi Jiang, Ting Cao, Yifan Yang, Qianqian Yang, Yuanchao Shu, Qing Yang, Lili Qiu, “AVA: Towards Agentic Video Analytics Systems with Video Language Models”, USENIX Symposium on Networked Systems Design and Implementation (NSDI), May 2026
2. Xin Ding, Hao Wu, Yifan Yang, Shiqi Jiang, Qianxi Zhang, Donglin Bai, Zhibo Chen, Ting Cao, “StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition”, International Conference on Computer Vision (ICCV), Oct. 2025
3. Tuowei Wang, Ruwen Fan, Minxing Huang, Zixu Hao, Kun Li, Ting Cao, Youyou Lu, Yaoxue Zhang, Ju Ren, “Neuralink: Fast on-Device LLM Inference with Neuron Co-Activation Linking”, ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Mar. 2026
4. Tuowei Wang, Xingyu Chen, Kun Li, Ting Cao, Ju Ren, Yaoxue Zhang, “Jenga: Enhancing Long-Context Fine-tuning of LLMs with Contextual Token Sparsity”, USENIX Annual Technical Conference (ATC), July, 2025
5. Qipeng Wang, Shiqi Jiang, Yifan Yang, Ruiqi Liu, Yuanchun Li, Ting Cao, Xuanzhe Liu, “Efficient and Adaptive Diffusion Model Inference Through Lookup Table on Mobile Devices”, IEEE Transactions on Mobile Computing (TMC).
6. Zhiwen Mo, Lei Wang, Jianyu Wei, Zhiwen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang, “LUTensor: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference”, The 52nd Annual International Symposium on Computer Architecture 2025 (ISCA), June 2025
7. Shenghong Dai, Shiqi Jiang, Yifan Yang, Ting Cao, Mo Li, S. Banerjee, Lili Qiu, “Babel: A Scalable Pre-trained Model for Multi-Modal Sensing via Expandable Modality Alignment”, the 23rd ACM Conference on Embedded Networked Sensor Systems (SenSys), May 2025
8. Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang, “T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge”, The 2025 ACM European Conference on Computer Systems (EuroSys), May 2025
9. Guoyu Li, Chunyun Chen, Shengyu Ye, Yang Wang, Fan Yang, Ting Cao, Mohamed M. Sabry Aly, Cheng Liu, Mao Yang, “LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator”, 31st IEEE International Symposium on High-Performance Computer Architecture (HPCA), Mar. 2025
10. Yiwei Zhang, Kun Li, Liang Yuan, Yunquan Zhang, Ting Cao, Mao Yang, “Jigsaw: Toward Conflict-free Vectorized Stencil Computation by Tessellating Swizzled Registers”, 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP), Mar. 2025
11. Haozhi Han, Kun Li, Wei Cui, Donglin Bai, Yifeng Chen, Ting Cao, Mao Yang, “FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units”, 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP), Mar. 2025
12. Yifei Liu, Jicheng Wen, Yang Wang, Shengyu Ye, Li Lyna Zhang, Ting Cao, Cheng Li, Mao Yang, “VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models”, The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov. 2024
13. Qipeng Wang, Shiqi Jiang, Zhenpeng Chen, Xu Cao, Yuanchun Li, Aoyu Li, Yun Ma, Ting Cao, Xuanzhe Liu, “Anatomizing Deep Learning Inference in Web Browsers”, ACM Transactions on Software Engineering and Methodology (TOSEM), 2024-0102.R2
14. Hanfei Geng, Yifei Liu, Yujie Zheng, Li Lyna Zhang, Jingwei Sun, Yujing Wang, Yang Wang, Guangzhong Sun, Mao Yang, Ting Cao, Yunxin Liu, “PruneAug: Bridging DNN Pruning and Inference Latency on Diverse Sparse Platforms Using Automatic Layerwise Block Pruning”, IEEE Transactions on Computers (TC), July, 2024
15. DaYou Du, Yijia Zhang, Shijie Cao, Jiaqi Guo, Ting Cao, Xiaowen Chu, Ningyi Xu, “BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation”, 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024 Main Conference, Long paper), Aug. 2024
16. Yijia Zhang, Sicheng Zhang, Shijie Cao, DaYou Du, Jianyu Wei, Ting Cao, Ningyi Xu, “AFPQ: Asymmetric Floating Point Quantization for LLMs”, 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024 Finding short paper), Aug. 2024
17. Yiwei Zhang, Kun Li, Liang Yuan, Jiawen Cheng, Yunquan Zhang, Ting Cao, Mao Yang, “LoRAStencil: Low-Rank Adaptation of Stencil Computation on Tensor Cores”, International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’24), Nov. 2024
18. Tuowei Wang, Kun Li, Zixu Hao, Donglin Bai, Ju Ren, Yaoxue Zhang, Ting Cao, Mao Yang, “Long Exposure: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity”, International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’24), Nov. 2024
19. Yijia Zhang, Lingran Zhao, Shijie Cao, Wenqiang Wang, Ting Cao, Fan Yang, Mao Yang, Shanghang Zhang, Ningyi Xu, “Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models”, IEEE International Conference on Multimedia and Expo (ICME’24), July 2024
20. R. Hwang, J. Wei, S. Cao, C. Hwang, X. Tang, Ting Cao, M. Yang, “Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference”, The 51st Annual International Symposium on Computer Architecture 2024 (ISCA’24), June 2024. Microsoft Research Focus.
21. L. Wang, L. Ma, S. Cao, Q. Zhang, J. Xue, Y. Shi, N. Zheng, Z. Miao, F. Yang, Ting Cao, Y. Yang, M. Yang, “Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation”, The 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI'24), July 2024
22. F. Jia, S. Jiang, Ting Cao, W. Cui, T. Xia, X. Cao, Y. Li, Q. Wang, D. Zhang, J. Ren, Y. Liu, L. Qiu, M. Yang, “Empowering In-Browser Deep Learning Inference on Edge Through Just-In-Time Kernel Optimization”, The 22nd Annual International Conference on Mobile Systems, Applications and Services (MobiSys’24), Jun. 2024