Dr. Ting Cao is a Research Professor at the Institute of AI Industry Research (AIR), Tsinghua University. Her research interests include Edge AI, large foundation model algorithms, AI inference systems, and novel AI accelerators. Her research has been published at top-tier computer systems conferences such as ISCA, ASPLOS, MobiCom, MobiSys, NSDI, OSDI, PLDI, EuroSys, SC, and PPoPP, as well as leading AI algorithm venues including ICCV, ACL, and KDD.
She has received numerous awards, including the 2012 ACM Research Highlights, 2012 IEEE Micro Top Picks, 2021 ACM SIGMOBILE Research Highlights, and Best Paper Awards at PPoPP’24, MobiSys’21, NAS’14, and ICCD’10. Dr. Cao’s research has significantly advanced the field of AI on edge devices, making it possible to deploy complex deep neural networks and large language models directly on consumer devices such as smartphones and PCs, greatly lowering cloud operational costs. Her innovations have been integrated into products used by millions, including Microsoft Office, Windows, Bing, and Huawei HarmonyOS.
She received her PhD from the School of Computing at the Australian National University, where she was honoured to be supervised by [Prof. Steve Blackburn](https://www.steveblackburn.org) and [Prof. Kathryn McKinley](https://www.cs.utexas.edu/~mckinley/). Her doctoral research focused on software and hardware co-design for energy efficiency. Turing Award winner Prof. David Patterson wrote a [perspective article, "For Better or Worse, Benchmarks Shape a Field"](https://cacm.acm.org/research/technical-perspective-for-better-or-worse-benchmarks-shape-a-field/), recommending her work on [quantitative energy analysis for modern software and hardware](https://cacm.acm.org/research/looking-back-and-looking-forward/) as an ACM Research Highlight, and integrated the results into the classic textbook *Computer Architecture: A Quantitative Approach*.
Dr. Cao serves as an Associate Editor for the journal *IEEE Transactions on Computers*. She has also served on the program committees of conferences such as MobiSys, PLDI, OOPSLA, VEE, ChinaSys, and ISMM.
Email address:
tingcao@mail.tsinghua.edu.cn
Personal Academic Website:
https://tingcao952.github.io/
Career:
July 2025 - present: Research Professor, Institute of AI Industry Research (AIR), Tsinghua University
2018 - 2025: Principal Researcher / Research Manager, Microsoft Research Asia
2016 - 2018: Senior Software Engineer, Huawei Technologies
Research Fields:
Edge AI, large foundation model algorithms, AI inference systems, and novel AI accelerators
Selected Publications:
1. Yuxuan Yan, Shiqi Jiang, Ting Cao, Yifan Yang, Qianqian Yang, Yuanchao Shu, Qing Yang, Lili Qiu, “AVA: Towards Agentic Video Analytics Systems with Video Language Models”, USENIX Symposium on Networked Systems Design and Implementation (NSDI), May 2026
2. Xin Ding, Hao Wu, Yifan Yang, Shiqi Jiang, Qianxi Zhang, Donglin Bai, Zhibo Chen, Ting Cao, “StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition”, International Conference on Computer Vision (ICCV), Oct. 2025
3. Tuowei Wang, Ruwen Fan, Minxing Huang, Zixu Hao, Kun Li, Ting Cao, Youyou Lu, Yaoxue Zhang, Ju Ren, “Neuralink: Fast on-Device LLM Inference with Neuron Co-Activation Linking”, ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Mar. 2026
4. Tuowei Wang, Xingyu Chen, Kun Li, Ting Cao, Ju Ren, Yaoxue Zhang, “Jenga: Enhancing Long-Context Fine-tuning of LLMs with Contextual Token Sparsity”, USENIX Annual Technical Conference (ATC), July 2025
5. Qipeng Wang, Shiqi Jiang, Yifan Yang, Ruiqi Liu, Yuanchun Li, Ting Cao, Xuanzhe Liu, “Efficient and Adaptive Diffusion Model Inference Through Lookup Table on Mobile Devices”, IEEE Transactions on Mobile Computing (TMC).
6. Zhiwen Mo, Lei Wang, Jianyu Wei, Zhiwen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang, “LUTensor: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference”, The 52nd Annual International Symposium on Computer Architecture 2025 (ISCA), June 2025
7. Shenghong Dai, Shiqi Jiang, Yifan Yang, Ting Cao, Mo Li, S. Banerjee, Lili Qiu, “Babel: A Scalable Pre-trained Model for Multi-Modal Sensing via Expandable Modality Alignment”, The 23rd ACM Conference on Embedded Networked Sensor Systems (SenSys), May 2025
8. Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang, “T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge”, The 2025 ACM European Conference on Computer Systems (EuroSys), May 2025
9. Guoyu Li, Chunyun Chen, Shengyu Ye, Yang Wang, Fan Yang, Ting Cao, Mohamed M. Sabry Aly, Cheng Liu, Mao Yang, “LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator”, 31st IEEE International Symposium on High-Performance Computer Architecture (HPCA), Mar. 2025
10. Yiwei Zhang, Kun Li, Liang Yuan, Yunquan Zhang, Ting Cao, Mao Yang, “Jigsaw: Toward Conflict-free Vectorized Stencil Computation by Tessellating Swizzled Registers”, 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP), Mar. 2025
11. Haozhi Han, Kun Li, Wei Cui, Donglin Bai, Yifeng Chen, Ting Cao, Mao Yang, “FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units”, 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP), Mar. 2025
12. Yifei Liu, Jicheng Wen, Yang Wang, Shengyu Ye, Li Lyna Zhang, Ting Cao, Cheng Li, Mao Yang, “VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models”, The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov. 2024
13. Qipeng Wang, Shiqi Jiang, Zhenpeng Chen, Xu Cao, Yuanchun Li, Aoyu Li, Yun Ma, Ting Cao, Xuanzhe Liu, “Anatomizing Deep Learning Inference in Web Browsers”, ACM Transactions on Software Engineering and Methodology (TOSEM), 2024-0102.R2
14. Hanfei Geng, Yifei Liu, Yujie Zheng, Li Lyna Zhang, Jingwei Sun, Yujing Wang, Yang Wang, Guangzhong Sun, Mao Yang, Ting Cao, Yunxin Liu, “PruneAug: Bridging DNN Pruning and Inference Latency on Diverse Sparse Platforms Using Automatic Layerwise Block Pruning”, IEEE Transactions on Computers (TC), July 2024
15. DaYou Du, Yijia Zhang, Shijie Cao, Jiaqi Guo, Ting Cao, Xiaowen Chu, Ningyi Xu, “BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation”, 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024 Main Conference, Long paper), Aug. 2024
16. Yijia Zhang, Sicheng Zhang, Shijie Cao, DaYou Du, Jianyu Wei, Ting Cao, Ningyi Xu, “AFPQ: Asymmetric Floating Point Quantization for LLMs”, 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024 Findings, short paper), Aug. 2024
17. Yiwei Zhang, Kun Li, Liang Yuan, Jiawen Cheng, Yunquan Zhang, Ting Cao, Mao Yang, “LoRAStencil: Low-Rank Adaptation of Stencil Computation on Tensor Cores”, International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’24), Nov. 2024
18. Tuowei Wang, Kun Li, Zixu Hao, Donglin Bai, Ju Ren, Yaoxue Zhang, Ting Cao, Mao Yang, “Long Exposure: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity”, International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’24), Nov. 2024
19. Yijia Zhang, Lingran Zhao, Shijie Cao, Wenqiang Wang, Ting Cao, Fan Yang, Mao Yang, Shanghang Zhang, Ningyi Xu, “Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models”, IEEE International Conference on Multimedia and Expo (ICME’24), July 2024
20. R. Hwang, J. Wei, S. Cao, C. Hwang, X. Tang, Ting Cao, M. Yang, “Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference”, The 51st Annual International Symposium on Computer Architecture 2024 (ISCA’24), June 2024. Featured in Microsoft Research Focus.
21. L. Wang, L. Ma, S. Cao, Q. Zhang, J. Xue, Y. Shi, N. Zheng, Z. Miao, F. Yang, Ting Cao, Y. Yang, M. Yang, “Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation”, The 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI'24), July 2024
22. F. Jia, S. Jiang, Ting Cao, W. Cui, T. Xia, X. Cao, Y. Li, Q. Wang, D. Zhang, J. Ren, Y. Liu, L. Qiu, M. Yang, “Empowering In-Browser Deep Learning Inference on Edge Through Just-In-Time Kernel Optimization”, The 22nd Annual International Conference on Mobile Systems, Applications and Services (MobiSys’24), Jun. 2024