Latest Releases
April 16, 2025
Seedream 3.0 Officially Released
It supports native 2K resolution output, responds faster, renders small text more accurately, improves text layout, and delivers stronger aesthetics, structural quality, fidelity, and detail. It has achieved leading rankings in multiple evaluations.
March 11, 2025
Seedream 2.0 Tech Report
A Native Chinese-English Bilingual Image Generation Foundation Model
It addresses critical limitations of existing image generation systems, including model bias, insufficient text rendering capabilities, and deficiencies in understanding culturally nuanced prompts.
January 22, 2025
Doubao-1.5-pro
Balance between Superior Model Performance and Optimal Inference Efficiency
It uses a Mixture-of-Experts (MoE) architecture: with only a small number of activated parameters it surpasses the performance of first-class, extremely large dense pre-trained models, and it achieves excellent results on multiple evaluation benchmarks.
Selected Papers

Apr 15, 2025
Seedream 3.0 Technical Report
We present Seedream 3.0, a high-performance Chinese-English bilingual image generation foundation model. We develop several technical improvements to address existing challenges in Seedream 2.0, including alignment with complicated prompts, fine-grained typography generation, suboptimal visual aesthetics and fidelity, and limited image resolutions. Specifically, the advancements of Seedream 3.0 stem from improvements across the entire pipeline, from data construction to model deployment. At the data stratum, we double the dataset using a defect-aware training paradigm and a dual-axis collaborative data-sampling framework. Furthermore, we adopt several effective techniques in the pre-training phase, such as mixed-resolution training, cross-modality RoPE, a representation alignment loss, and resolution-aware timestep sampling. During the post-training stage, we utilize diversified aesthetic captions in SFT and a VLM-based reward model with scaling, thereby achieving outputs that align well with human preferences. Moreover, Seedream 3.0 pioneers a novel acceleration paradigm: by employing consistent noise expectation and importance-aware timestep sampling, we achieve a 4- to 8-times speedup while maintaining image quality. Seedream 3.0 demonstrates significant improvements over Seedream 2.0: it enhances overall capabilities, in particular text rendering of complicated Chinese characters, which is important for professional typography generation. In addition, it provides native high-resolution output (up to 2K), allowing it to generate images with high visual quality.
Seed Vision Team
Vision
Computer Vision
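The resolution-aware timestep sampling mentioned in the abstract is not spelled out here; a common form in flow-matching image models shifts the timestep distribution toward noisier steps as resolution grows. A minimal sketch under that assumption (the square-root-of-pixel-count shift and the base resolution are illustrative, not Seedream 3.0's actual schedule):

```python
import math

def shifted_timestep(t: float, num_pixels: int, base_pixels: int = 256 * 256) -> float:
    """Shift a timestep t in (0, 1) toward the noisy end for larger images.

    Uses the flow-matching shift t' = s*t / (1 + (s - 1)*t) with
    s = sqrt(num_pixels / base_pixels). The schedule Seedream 3.0 actually
    uses is not given in the abstract; this is a common stand-in.
    """
    s = math.sqrt(num_pixels / base_pixels)
    return (s * t) / (1 + (s - 1) * t)

# At the base resolution the shift is the identity.
assert abs(shifted_timestep(0.5, 256 * 256) - 0.5) < 1e-9
# A 2K image (2048 x 2048) gets pushed toward t = 1 (more noise): 0.5 -> ~0.889.
assert shifted_timestep(0.5, 2048 * 2048) > 0.8
```

The intuition: at high resolution, the same noise level destroys proportionally less global structure, so training and sampling budgets are re-weighted toward noisier timesteps.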

Apr 10, 2025
Seed-Thinking-v1.5: Advancing Superb Reasoning Models with Reinforcement Learning
We introduce Seed-Thinking-v1.5, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks. Seed-Thinking-v1.5 achieves 86.7 on AIME 2024, 55.0 on Codeforces and 77.3 on GPQA, demonstrating excellent reasoning abilities in STEM and coding. Beyond reasoning tasks, the method demonstrates notable generalization across diverse domains. For instance, it surpasses DeepSeek R1 by 8% in win rate on non-reasoning tasks, indicating its broader applicability. Compared to other state-of-the-art reasoning models, Seed-Thinking-v1.5 is a Mixture-of-Experts (MoE) model with a relatively small size, featuring 20B activated and 200B total parameters. As part of our effort to assess generalized reasoning, we develop two internal benchmarks, BeyondAIME and Codeforces, both of which will be publicly released to support future research.
Jiaze Chen, TianTian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu, Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang
Seed
LLM
LLM
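Activating 20B of 200B total parameters, as described above, is the hallmark of sparse MoE routing: a router scores all experts for each token and only the top-k actually run. A minimal sketch of top-k softmax gating, with made-up expert counts and logits (this illustrates the general mechanism, not Seed-Thinking-v1.5's actual router):

```python
import math

def top_k_route(logits, k=2):
    """Pick the k experts with the highest router logits and softmax-normalize
    their weights; every other expert stays inactive (weight 0)."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in top]
    z = sum(exps)
    weights = [0.0] * len(logits)
    for i, e in zip(top, exps):
        weights[i] = e / z
    return weights

# 8 experts, 2 active: only k experts' parameters are touched per token,
# which is how activated parameters stay far below the total count.
w = top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
assert sum(1 for x in w if x > 0) == 2
assert abs(sum(w) - 1.0) < 1e-9
```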

Apr 03, 2025
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
The task of issue resolving is to modify a codebase to generate a patch that addresses a given issue. However, existing benchmarks, such as SWE-bench, focus almost exclusively on Python, making them insufficient for evaluating Large Language Models (LLMs) across diverse software ecosystems. To address this, we introduce a multilingual issue-resolving benchmark, called Multi-SWE-bench, covering Java, TypeScript, JavaScript, Go, Rust, C, and C++. It includes a total of 1,632 high-quality instances, which were carefully annotated from 2,456 candidates by 68 expert annotators, ensuring that the benchmark can provide an accurate and reliable evaluation. Based on Multi-SWE-bench, we evaluate a series of state-of-the-art models using three representative methods (Agentless, SWE-agent, and OpenHands) and present a comprehensive analysis with key empirical insights. In addition, we launch a Multi-SWE-RL open-source community, aimed at building large-scale reinforcement learning (RL) training datasets for issue-resolving tasks. As an initial contribution, we release a set of 4,723 well-structured instances spanning seven programming languages, laying a solid foundation for RL research in this domain. More importantly, we open-source our entire data production pipeline, along with detailed tutorials, encouraging the open-source community to continuously contribute and expand the dataset. We envision our Multi-SWE-bench and the ever-growing Multi-SWE-RL community as catalysts for advancing RL toward its full potential, bringing us one step closer to the dawn of AGI.
Daoguang Zan, Zhirong Huang, Wei Liu, Hanwu Chen, Linhao Zhang, Shulin Xin, Lu Chen, Qi Liu, Xiaojian Zhong, Aoyan Li, Siyao Liu, Yongsheng Xiao, Liangqiang Chen, Yuyu Zhang, Jing Su, Tianyu Liu, Rui Long, Kai Shen, Liang Xiang
Seed
LLM
LLM
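A benchmark like Multi-SWE-bench is typically scored by the fraction of issues whose generated patch resolves the issue, broken down per language. A minimal sketch of that bookkeeping, with an invented instance schema (the real benchmark's data fields differ):

```python
from collections import defaultdict

# Hypothetical evaluation records; "lang" and "resolved" are illustrative
# field names, not the benchmark's actual schema.
instances = [
    {"lang": "java", "resolved": True},
    {"lang": "java", "resolved": False},
    {"lang": "rust", "resolved": True},
    {"lang": "go",   "resolved": False},
]

def resolved_rate_by_language(instances):
    """Fraction of issues whose patch resolved the issue, per language."""
    totals, hits = defaultdict(int), defaultdict(int)
    for inst in instances:
        totals[inst["lang"]] += 1
        hits[inst["lang"]] += inst["resolved"]
    return {lang: hits[lang] / totals[lang] for lang in totals}

assert resolved_rate_by_language(instances)["java"] == 0.5
```

Per-language breakdown is the point of the benchmark: an aggregate score would hide exactly the cross-ecosystem gaps the paper sets out to measure.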

Apr 01, 2025
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
The rapid escalation in the difficulty of LLM benchmarks in recent years, from elementary school-level to frontier problems, has suggested to researchers that we are only inches away from surpassing human intelligence. However, does the LLMs' remarkable reasoning ability indeed come from true intelligence by human standards, or are they simply reciting solutions witnessed during training at an Internet scale? To study this problem, we propose RoR-Bench, a novel multi-modal benchmark for detecting LLMs' recitation behavior when they are asked simple reasoning problems whose conditions are subtly shifted, and conduct empirical analysis on our benchmark. Surprisingly, we found that existing cutting-edge LLMs unanimously exhibit extremely severe recitation behavior; by changing one phrase in the condition, top models such as OpenAI-o1 and DeepSeek-R1 can suffer a 60% performance drop.
Kai Yan, Yufei Xu, Zhengyin Du, Xuesong Yao, Zheyu Wang, Xiaowen Guo, Jiecao Chen
LLM
LLM
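RoR-Bench's core measurement is paired: the same model answers each problem in its original form and in a condition-shifted form, and the accuracy gap exposes recitation. A minimal sketch of that metric, with hypothetical per-problem result pairs (the benchmark's actual scoring protocol may differ):

```python
def recitation_gap(results):
    """results: list of (correct_on_original, correct_on_perturbed) pairs
    for one model. Returns the accuracy drop from original problems to
    their condition-shifted variants."""
    n = len(results)
    acc_orig = sum(o for o, _ in results) / n
    acc_pert = sum(p for _, p in results) / n
    return acc_orig - acc_pert

# A reciting model: solves every original, fails most shifted variants.
pairs = [(True, False), (True, False), (True, True), (True, False), (True, False)]
assert abs(recitation_gap(pairs) - 0.8) < 1e-9
```

A model that genuinely reasons would score similarly on both columns, giving a gap near zero; the large gaps the paper reports are the signature of memorized solutions.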

Mar 20, 2025
Multi-Reward as Condition for Instruction-based Image Editing
High-quality training triplets (instruction, original image, edited image) are essential for instruction-based image editing. Predominant training datasets (e.g., InsPix2Pix) are created using text-to-image generative models (e.g., Stable Diffusion, DALL-E) that are not trained for image editing. Accordingly, these datasets suffer from inaccurate instruction following, poor detail preservation, and generation artifacts. In this paper, we propose to address the training data quality issue with multi-perspective reward data instead of refining the ground-truth image quality. 1) We first design a quantitative metric system based on a best-in-class LVLM (Large Vision Language Model), i.e., GPT-4o in our case, to evaluate generation quality from three perspectives, namely instruction following, detail preserving, and generation quality. For each perspective, we collect a quantitative score from 0 to 5 and text descriptive feedback on the specific failure points in ground-truth edited images, resulting in a high-quality editing reward dataset, RewardEdit20K. 2) We further propose a novel training framework to seamlessly integrate the metric output, regarded as a multi-reward, into editing models to learn from the imperfect training triplets. During training, the reward scores and text descriptions are encoded as embeddings and fed into both the latent space and the U-Net of the editing models as auxiliary conditions. 3) We also build a challenging evaluation benchmark with real-world images/photos and diverse editing instructions, named Real-Edit. Experiments indicate that our multi-reward conditioned model outperforms its no-reward counterpart on two popular editing pipelines, i.e., InsPix2Pix and SmartEdit. Code is released at https://github.com/bytedance/Multi-Reward-Editing.
Xin Gu, Ming Li, Libo Zhang, Fan Chen, Longyin Wen, Tiejian Luo, Sijie Zhu
Vision
Computer Vision
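The framework above feeds per-perspective reward scores into the editing model as embeddings. The paper's encoder is learned end-to-end; as a stand-in, a fixed sinusoidal encoding (in the spirit of diffusion timestep embeddings) shows the shape of the idea. Everything here, including `reward_embedding` and the dimension split, is illustrative rather than the paper's implementation:

```python
import math

def reward_embedding(scores, dim=12):
    """Map per-perspective reward scores (0-5) to a fixed-size conditioning
    vector via sinusoidal features, one block per perspective. The actual
    encoder in the paper is learned; this fixed encoding is only a stand-in."""
    block = dim // len(scores)  # dimensions allotted to each perspective
    emb = []
    for s in scores:
        for j in range(block // 2):
            freq = 1.0 / (10.0 ** (2 * j / block))
            emb += [math.sin(s * freq), math.cos(s * freq)]
    return emb

# Three perspectives: instruction following, detail preserving, generation quality.
vec = reward_embedding([5.0, 3.0, 4.0], dim=12)
assert len(vec) == 12
```

In the paper's pipeline such a vector would be injected alongside the latent and the U-Net features, letting the model discount training triplets that the reward model flagged as flawed.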

Mar 20, 2025
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
Transformers have found extensive applications across various domains due to their powerful fitting capabilities. This success can be partially attributed to their inherent nonlinearity. Thus, in addition to the ReLU function employed in the original transformer architecture, researchers have explored alternative modules such as GeLU and SwishGLU to enhance nonlinearity and thereby augment representational capacity. In this paper, we propose a novel category of polynomial composition activations (PolyCom), designed to optimize the dynamics of transformers. Theoretically, we provide a comprehensive mathematical analysis of PolyCom, highlighting its enhanced expressivity and efficacy relative to other activation functions. Notably, we demonstrate that networks incorporating PolyCom achieve the optimal approximation rate.
Zhijian Zhuo, Ya Wang, Yutao Zeng, Xiaoqing Li, Xun Zhou, Jinwen Ma
Foundation
LLM
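The abstract does not reproduce PolyCom's exact definition; one natural reading of a "polynomial composition activation" is a polynomial with learnable coefficients applied to a base activation such as ReLU. A minimal sketch under that assumption (the coefficients here are fixed constants, whereas the paper's are trained with the network):

```python
def poly_relu(x, coeffs=(1.0, 0.5, 0.25)):
    """Polynomial composition of ReLU: sum_i a_i * relu(x)**i.

    The coefficients a_i would be learnable parameters in a real model;
    they are fixed here purely for illustration.
    """
    r = max(x, 0.0)
    return sum(a * r ** (i + 1) for i, a in enumerate(coeffs))

assert poly_relu(-2.0) == 0.0   # negative inputs are zeroed, like plain ReLU
assert poly_relu(1.0) == 1.75   # 1*1 + 0.5*1 + 0.25*1
```

The higher-order terms are what add expressivity beyond piecewise-linear ReLU: a single unit can now bend its response curve, which is the intuition behind the improved approximation rate the abstract claims.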