Top Seed Talent Program
About the Program

The Top Seed Talent Program is an exclusive initiative launched by the ByteDance Doubao team to attract top-tier university research talent. The program includes full-time positions for recent Ph.D. graduates and research internships for outstanding current students.
We are committed to identifying and recruiting the world's leading AI researchers to join us in pushing the boundaries of AI.

What We Are Looking For

Firm technical conviction and passion
A willingness to tackle the industry's most challenging problems, explore uncharted technical paths, and conduct research that reshapes the history of AI.
Exceptional research capabilities
Demonstrate deep technical expertise in a specific area of AI, publish high-quality and impactful papers, or make significant open-source contributions.
Curiosity and Drive
Possess refined technical taste and intuition, coupled with curiosity, drive, and a proven track record. While past achievements matter, we place equal value on your research potential.
Research Internship
Candidates graduating in or after September 2025
Campus Recruitment
PhD candidates graduating between September 2024 and August 2025
01
We Prioritize the Experience of Top Seed Interns

Interns are Highly Valued
Interns can receive the same access and resources as full-time employees.
High Degree of Research Freedom
Interns can independently choose their research topics and benefit from flexible internship arrangements, including remote work and university-industry collaborations.
Open and Collaborative Culture
We encourage the publication of research findings and foster a culture of open knowledge-sharing within the team.
Competitive Compensation
We offer top-tier internship compensation far exceeding the industry average.
02
Research Topics

LLMs
Generalization of reward models and reinforcement learning models
Self-learning of reward and reinforcement learning models
Interpretability of large language models
Factuality of large language models
Next-generation reinforcement learning algorithms
Large language models based on self-play

Machine Learning Algorithms and Systems
Develop efficient LLM architectures to optimize performance while minimizing training and inference costs.
Research massive training clusters to enhance training stability and Model FLOPs Utilization (MFU), facilitating effective cross-cluster training.
Address memory-bound issues during inference, investigate multi-machine inference, and develop parallel inference strategies.
Integrate next-generation computing systems to advance model architectures, training methods, and inference techniques.
Explore algorithm innovation for LLM foundation models.

Multimodal Generation
Visual Generation Foundation Models: Research and develop highly interactive and controllable image/video generation foundation models, and explore video visual pattern modeling and multi-task applications.
3D/4D Generation and Physical World Modeling: Explore rendering and physics engines driven by generative models.
Multimodal Generation Model Optimization: Network architecture design, diffusion model acceleration, and efficient distributed training and inference techniques.

Multimodal Understanding
Multimodal Understanding Foundation Models: Develop general models integrating language, vision, and audio, and strengthen fundamental capabilities such as text, layout, and spatial relationships in images and videos.
Multimodal Reasoning and Agent Breakthroughs: Multimodal retrieval augmented generation, visual chain-of-thought reasoning, and the construction of general-purpose agents for GUI and game scenarios.
Unified Generative-Understanding Modeling: Joint representation and training methods for continuous and discrete signals to enable dynamic interaction.
Multimodal World Model Construction: Model virtual/real environments using simulation and pre-training to explore multimodal interaction.

Speech
Audio Foundation Model Research and Development: Unified modeling of speech recognition/synthesis/translation and music/sound effect generation.
Multimodal Speech Model Optimization: Innovative network architectures, lightweight diffusion models, and on-device inference acceleration.
Reinforcement Learning Applications in Speech: RL algorithms and system optimization for multimodal speech/audio tasks.
03
Collaborate with Exceptional Researchers

Yonghui Wu
Head of Seed Fundamental Research · 2023 Google Fellow
Yonghui graduated with a B.S. from Nanjing University in 2001, followed by a Ph.D. in Computer Science and an M.S. in Statistics from the University of California, Riverside. After joining Google in 2008, he spent 17 years contributing to various projects, beginning with search algorithm optimization and later becoming a core member of the Google Brain team. He was a key contributor to Google's Neural Machine Translation and RankBrain projects. In 2023, he was promoted to Google Fellow and Vice President of Research at Google DeepMind. In 2025, Yonghui joined ByteDance's Doubao team as the Head of Fundamental Research.

Chengyi Wang
Research on RLHF and Super Reasoning
Chengyi is a joint PhD graduate from Microsoft Research Asia and Nankai University, mentored by Dr. Ming Zhou. Her doctoral research focused on pre-training speech large language models, resulting in multiple publications at top international conferences such as ACL and ICML, as well as several patents. Her representative work includes WavLM and VALL-E. Chengyi's publications received nominations for Best Student Paper at Interspeech and Best Paper at IEEE SPS. VALL-E was selected as one of the top ten innovative projects of 2023 by Netexplo. Chengyi joined ByteDance's Doubao team in 2023. Her current research focuses on cutting-edge areas such as RLHF and super reasoning.
Yujia Qin
Multimodal Agents · Open-source Projects with 20K+ GitHub Stars
Yujia holds a B.S. in Electronic Engineering and a Ph.D. in Computer Science, both from Tsinghua University. Yujia has published numerous papers at top international conferences, including ICLR, NeurIPS, Nature Machine Intelligence, ACL, and EMNLP, accumulating over 3,800 Google Scholar citations. He has released several open-source projects, such as UI-TARS, XAgent, and ToolBench, garnering over 20,000 GitHub stars. Yujia joined ByteDance's Doubao team in 2024, focusing on research in multimodal agents, including computer use and game agents.

Wanjun Zhong
Super Alignment and Reasoning Research for LLMs · 2021 MSRA Fellowship
Wanjun graduated in 2023 from the joint Ph.D. program between Sun Yat-sen University and Microsoft Research Asia, earning a Ph.D. in Computer Science under the supervision of Dr. Ming Zhou, Prof. Jian Yin, and Prof. Jiahai Wang. She received the 2021 MSRA Fellowship and the National Scholarship for Doctoral Students during her doctoral studies, and won a CVPR challenge. Wanjun is currently a researcher on the LLM research team at ByteDance Doubao, focusing on super alignment and reasoning research for large language models. AGIEval, the large language model evaluation suite she led, is widely used across the industry.
Liang Xiang
Foundation · Machine Learning and Recommendation Systems
Liang graduated from the Department of Automation, University of Science and Technology of China in 2006. He was later directly admitted to the Institute of Automation, Chinese Academy of Sciences, and received his Ph.D. His research focused on machine learning and recommendation systems. He is the author of "Recommendation System Practice" and the founder of the ResysChina recommendation system community. After joining ByteDance, he researched video understanding at AI Lab and later became the head of the recommendation system team. In 2021, he joined AML and is currently the head of the ByteDance AML and Doubao Foundation team, leading the team in exploring fundamental and cutting-edge AI algorithms and engineering technologies.
Yuxuan Wang
Deep Learning and Speech Technology Research · Google Scholar Citations 18,000+
Yuxuan earned his Ph.D. from Ohio State University, focusing on deep learning and speech technology. He led several groundbreaking speech industry projects, with results widely adopted in academia and industry. Yuxuan leads the speech team at ByteDance Doubao, guiding the team in fundamental research and product development related to multimodal generation and understanding.
Jiashi Feng
Vision Research · Google Scholar Citations 69,000+
Jiashi received his Ph.D. from the National University of Singapore, specializing in computer vision and machine learning and their applications in multimedia. He leads the vision research team at ByteDance Doubao, focusing on cutting-edge research and applications in visual and multimodal foundation models, AIGC, and 3D avatar/object reconstruction and generation. Previously, he was an Assistant Professor in the Department of Electrical and Computer Engineering at the National University of Singapore, leading a machine learning and computer vision research lab and mentoring or co-mentoring nearly 20 Ph.D. students.
Zhi Tian
Visual Generative Models · Google Scholar Citations 15,000+
Zhi received his Ph.D. in Computer Science from the University of Adelaide, Australia, under the supervision of Prof. Chunhua Shen. He was a recipient of the 2019 Google Ph.D. Fellowship. He is currently a research scientist for visual generative models at ByteDance Doubao, focusing on computer vision generation and understanding algorithms. He has published numerous highly-cited papers in top conferences and journals, accumulating over 15,000 Google Scholar citations.
Zhuo Chen
Audio Generation Research · Google Scholar Citations 12,000+
Zhuo received his Ph.D. from Columbia University in 2017 and worked as a principal applied scientist at Microsoft. Currently, he leads the audio generation research team at ByteDance Doubao. He has published over 100 research papers and patents, advancing the state of the art in several speech tasks. His contributions are notable in speech generation, recognition, and translation; speech separation and enhancement; speaker recognition and diarization; speech self-supervised learning; multi-channel processing; and open-source speech datasets.
Mingxuan Wang
LLM Research · Area Chair for NeurIPS, ACL, etc.
Mingxuan received his Ph.D. from the Institute of Computing Technology, Chinese Academy of Sciences. He currently leads the LLM research team at ByteDance Doubao. In machine translation, he has published over 50 papers at top conferences and won the WMT international machine translation evaluation competition multiple times. He has contributed to several open-source projects, including LightSeq and mRASP, which are widely used in the industry. He has been an area chair and sponsorship chair for conferences such as NeurIPS, ACL, and EMNLP.
04
Interns Are Conducting Globally Impactful Research in the Doubao Team
Campus Recruitment
The 2026 Top Seed Campus Recruitment is launching soon. Stay tuned!
PhD candidates graduating between September 2024 and August 2025
01
Research Topics

LLMs
Generalization of reward models and reinforcement learning models
Self-learning of reward and reinforcement learning models
Interpretability of large language models
Factuality of large language models
Next-generation reinforcement learning algorithms
Large language models based on self-play

Machine Learning Algorithms and Systems
Develop efficient LLM architectures to optimize performance while minimizing training and inference costs.
Research massive training clusters to enhance training stability and Model FLOPs Utilization (MFU), facilitating effective cross-cluster training.
Address memory-bound issues during inference, investigate multi-machine inference, and develop parallel inference strategies.
Integrate next-generation computing systems to advance model architectures, training methods, and inference techniques.
Explore algorithm innovation for LLM foundation models.

Multi-Modal Understanding and Generation
Develop foundational models for multi-modal understanding and generation across images, audio, and video, pursuing a unified modeling approach.
Design and optimize the architecture of multi-modal models and diffusion models, emphasizing efficient large-scale distributed training and inference systems.
Explore efficient methods for representing 3D objects and scenes, learn world knowledge from video data, and construct models of the physical world.
Build unified foundation models for audio understanding and generation, including speech recognition, audio synthesis, voice conversion, music generation, and sound effects.
02
Collaborate with Exceptional Researchers

Lin Yan
Post-training · RLHF · Self-learning Models
Lin holds a graduate degree from the Institute of Computing Technology, Chinese Academy of Sciences, and currently leads the Doubao LLM post-training team at ByteDance. His research interests include instruction tuning, reward modeling, RLHF, RLAIF, and self-learning models. He is also conducting cutting-edge research on large language models' generalization, interpretability, and factuality.

Chenggang Li
Pre-training · Data Cleaning, Synthesis, and Proportioning · Scaling Capability
Chenggang holds a B.S. and M.S. in Mechanical Engineering from Tsinghua University. He served as the technical lead for web search at ByteDance's Toutiao and video search at TikTok, developing ByteDance's search system from the ground up. He made significant innovations in ranking architecture and algorithms, as well as in multilingual and multimodal relevance, achieving leading-edge results in Chinese search. He currently leads the pre-training team for ByteDance's Doubao LLM, focusing on data cleaning, synthesis, and proportioning; associative and curriculum learning; training algorithms; and scaling capability.

Liang Xiang
Foundation · Machine Learning and Recommendation Systems
Liang graduated from the Department of Automation, University of Science and Technology of China in 2006. He was later directly admitted to the Institute of Automation, Chinese Academy of Sciences, and received his Ph.D. His research focused on machine learning and recommendation systems. He is the author of "Recommendation System Practice" and the founder of the ResysChina recommendation system community. After joining ByteDance, he researched video understanding at AI Lab and later became the head of the recommendation system team. In 2021, he joined AML and is currently the head of the ByteDance AML and Doubao Foundation team, leading the team in exploring fundamental and cutting-edge AI algorithms and engineering technologies.

Yuxuan Wang
Deep Learning and Speech Technology Research · Google Scholar Citations 18,000+
Yuxuan earned his Ph.D. from Ohio State University, focusing on deep learning and speech technology. He led several groundbreaking speech industry projects, with results widely adopted in academia and industry. Yuxuan leads the speech team at ByteDance Doubao, guiding the team in fundamental research and product development related to multimodal generation and understanding.

Jiashi Feng
Vision Research · Google Scholar Citations 69,000+
Jiashi received his Ph.D. from the National University of Singapore, specializing in computer vision and machine learning and their applications in multimedia. He leads the vision research team at ByteDance Doubao, focusing on cutting-edge research and applications in visual and multimodal foundation models, AIGC, and 3D avatar/object reconstruction and generation. Previously, he was an Assistant Professor in the Department of Electrical and Computer Engineering at the National University of Singapore, leading a machine learning and computer vision research lab and mentoring or co-mentoring nearly 20 Ph.D. students.

Zhi Tian
Visual Generative Models · Google Scholar Citations 15,000+
Zhi received his Ph.D. in Computer Science from the University of Adelaide, Australia, under the supervision of Prof. Chunhua Shen. He was a recipient of the 2019 Google Ph.D. Fellowship. He is currently a research scientist for visual generative models at ByteDance Doubao, focusing on computer vision generation and understanding algorithms. He has published numerous highly-cited papers in top conferences and journals, accumulating over 15,000 Google Scholar citations.

Zhuo Chen
Audio Generation Research · Google Scholar Citations 12,000+
Zhuo received his Ph.D. from Columbia University in 2017 and worked as a principal applied scientist at Microsoft. Currently, he leads the audio generation research team at ByteDance Doubao. He has published over 100 research papers and patents, advancing the state of the art in several speech tasks. His contributions are notable in speech generation, recognition, and translation; speech separation and enhancement; speaker recognition and diarization; speech self-supervised learning; multi-channel processing; and open-source speech datasets.

Mingxuan Wang
LLM Research · Area Chair for NeurIPS, ACL, etc.
Mingxuan received his Ph.D. from the Institute of Computing Technology, Chinese Academy of Sciences. He currently leads the LLM research team at ByteDance Doubao. In machine translation, he has published over 50 papers at top conferences and won the WMT international machine translation evaluation competition multiple times. He has contributed to several open-source projects, including LightSeq and mRASP, which are widely used in the industry. He has been an area chair and sponsorship chair for conferences such as NeurIPS, ACL, and EMNLP.
Q&A
Campus Recruitment
Research Internship
Q
What is the difference between the Top Seed Talent Program and the ByteDance Soaring Star Talent Program?
A
Both talent programs are aimed at PhD candidates graduating in 2025, with different research directions. If you aspire to work in fields such as LLMs, speech, vision, world models, foundational architectures, AI infrastructure, and next-generation AI interaction, choose the Top Seed Talent Program. If you are keen on diving into fields such as AI applications, search, recommendations, advertising, AI safety, privacy and security, hardware, video architecture, and engineering architecture, go for the ByteDance Soaring Star Program.
Q
What is the application mechanism like for the two talent programs?
A
The Top Seed Talent Program recruits year-round. An applicant is granted one application opportunity, while the ByteDance Soaring Star Program allows two. The application opportunities for the two programs are independent, and we will prioritize the position applied for first. Moreover, applying to either program will not affect the regular application process for the 2025 campus recruitment. We welcome exceptional individuals to join us!
Q
If I am a candidate for the class of 2025 and have already received an internship offer from another team, can I still apply for a position under the Top Seed Talent Program?
A
A candidate can only be in the process for one position at a time. In special circumstances, please contact the HR of the current position for assistance.
Apply Now