Shuhuai Ren (任抒怀)
Ph.D. Student
School of EECS
Peking University

About Me


I am a fourth-year Ph.D. student in the Language Computing and Machine Learning Group, MOE Key Laboratory of Computational Linguistics, School of EECS, Peking University, supervised by Prof. Xu Sun.

Before that, I was an undergraduate student in Software Engineering at Huazhong University of Science and Technology, under the guidance of Prof. Kun He.

My research interests lie within (1) Vision-Language Foundation Models, (2) Video Understanding and Generation, and (3) Open-Ended Visual Recognition.

I am currently seeking a job (topics: multi-modal LLMs, video understanding and generation, etc.). Feel free to reach out if you are interested!

News


  • [2024/07] One paper has been accepted by ECCV 2024.
  • [2024/05] Two papers have been accepted by ACL 2024.
  • [2024/04] One paper has been accepted by NAACL 2024.
  • [2024/02] One paper has been accepted by CVPR 2024.
  • [2023/10] One paper has been accepted by EMNLP 2023.
  • [2023/09] Two papers have been accepted by NeurIPS 2023.
  • [2023/05] One paper has been accepted by ACL 2023.
  • [2021/08] Three papers have been accepted by EMNLP 2021.
  • [2021/05] One paper has been accepted by ACL 2021 as an oral presentation.
  • [2019/05] One paper has been accepted by ACL 2019 as an oral presentation.
  • [2019/02] I attended the Artificial Intelligence Winter Camp at the University of California, Berkeley and Stanford University.

Selected Publications (Full List)


Video Understanding and Generation
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Shuhuai Ren*, Linli Yao*, Shicheng Li, Xu Sun, Lu Hou
CVPR 2024
Conference
Paper Code & Model
TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Shuhuai Ren, Sishuo Chen, Shicheng Li, Xu Sun, Lu Hou
Findings of EMNLP 2023 (Long Paper)
Conference
Paper Code & Model
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Chaoyou Fu, Yuhan Dai, Yongdong Luo, Lei Li, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun
arXiv 2024
arXiv
Paper Code & Model
FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation
Yuanxin Liu, Lei Li, Shuhuai Ren, Rundong Gao, Shicheng Li, Sishuo Chen, Xu Sun, Lu Hou
NeurIPS 2023 (Dataset & Benchmark Track)
Conference
Paper Code & Model

Vision-Language Alignment
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
Shuhuai Ren, Aston Zhang, Yi Zhu, Shuai Zhang, Shuai Zheng, Mu Li, Alex Smola, Xu Sun
NeurIPS 2023
Conference
Paper Code & Model
M3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning
Lei Li, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu Sun, Lingpeng Kong, Qi Liu
arXiv 2023
arXiv
Paper Dataset
Delving into the Openness of CLIP
Shuhuai Ren, Lei Li, Xuancheng Ren, Guangxiang Zhao, Xu Sun
Findings of ACL 2023 (Long Paper)
Conference
Paper Code & Model
Learning Relation Alignment for Calibrated Cross-modal Retrieval
Shuhuai Ren, Junyang Lin, Guangxiang Zhao, Rui Men, An Yang, Jingren Zhou, Xu Sun*, Hongxia Yang
ACL 2021 (Long Paper, Oral)
Conference
Paper Code & Model
Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency
Shuhuai Ren, Yihe Deng, Kun He*, Wanxiang Che
ACL 2019 (Long Paper, Oral)
Conference
Paper Code & Model

Education


Ph.D. in Computer Science, Peking University, Beijing, China
2020 - 2025 (expected)
B.Eng. in Software Engineering, Huazhong University of Science and Technology, Wuhan, China
GPA: 3.95/4.00; Rank: 4/180
2016 - 2020

Selected Awards and Competitions


Peking University (PKU), Sep. 2020 - Jul. 2025 (Expected)

  • NeurIPS Scholar Award, 2023
  • Third Prize of the Peking University Scholarship, 2020-21
  • Award for Scientific Research, 2020-22

Huazhong University of Science and Technology (HUST), Sep. 2016 - Jul. 2020

  • Pacemaker to Merit Student (the highest honor for undergraduate students), 2017-18
  • National Scholarship of China, 2017-18
  • Outstanding Graduates of HUST, 2020
  • Outstanding Undergraduate Thesis, 2020
  • Baosteel Scholarship, 2019
  • Hainan Airlines Scholarship, 2016-17
  • American College Students Mathematical Modeling Competition (MCM/ICM), Honorable Mention, 2018

Academic Service


  • Reviewer/Program committee member for conferences: ICLR (2023), NeurIPS (2023), ACL (2021-23; Outstanding Reviewer Award, 2021), EMNLP (2021-23), NAACL (2022-23), ARR (2021-23), COLING (2020, 2022)
  • Reviewer for journals: TPAMI, IJCV
  • Teaching assistant: Introduction to Natural Language Processing (PKU, 2021 Fall), Artificial Intelligence Frontier and Industry Trends (PKU, 2022 Spring)

Work Experience


    Research Intern
    Microsoft, Research Lab Asia
    Advised by Shuming Ma
    Beijing, China
    Applied Scientist Intern
    Amazon Web Services, AI Research
    Advised by Aston Zhang and Yi Zhu
    San Francisco Bay Area, USA (remote)
    Aug. 2022 - Aug. 2023
    Research Intern
    Alibaba, DAMO Academy
    Advised by Junyang Lin and Hongxia Yang
    Beijing, China
    Aug. 2020 - Feb. 2021
    Research Intern
    Tencent, WeChat AI (WXG-PRC)
    Advised by Jinchao Zhang
    Beijing, China
    Oct. 2019 - Jun. 2020
    Research Intern
    HUST, John Hopcroft Lab for Data Science
    Advised by Kun He
    Wuhan, China
    Mar. 2018 - Oct. 2019

Patents


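  • Data Augmentation and Service Processing Methods, Apparatus, Computer Device, and Storage Medium (数据增广、业务处理方法、装置、计算机设备和存储介质) [link]: Shuhuai Ren; Jinchao Zhang
  • Information Processing Method, Apparatus, Computer-Readable Storage Medium, and Computer Device (信息处理方法、装置、计算机可读存储介质和计算机设备) [link]: Lei Li; Yankai Lin; Shuhuai Ren; Peng Li; Jie Zhou; Xu Sun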

Invited Talks


  • [2024/06] Microsoft Research Asia. Long Video Understanding Based on Large Language Models, [slides (in Chinese)].
  • [2023/12] Department of Mathematics, University of Houston. Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition.
  • [2023/08] Noah's Ark Lab, Huawei. Opening Our Eyes to the World: Visual Interface Design for Large Language Models (talk given in Chinese).
  • [2023/04] Shanghai AI Lab. Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition, [slides].
  • [2022/07] Noah's Ark Lab, Huawei. Delving into the Openness of CLIP, [slides (in Chinese)].
  • [2022/06] DAMO Academy, Alibaba. Delving into the Openness of CLIP.
  • [2021/10] Student Forums on Frontiers of Artificial Intelligence (SFFAI). Learning Relation Alignment for Calibrated Cross-modal Retrieval, [video (in Chinese)].