Granite 4.1 LLMs: How They’re Built
Summary (EN)
IBM’s Granite team published a technical release note for Granite 4.1, a new family of dense decoder-only language models offered in 3B, 8B, and 30B sizes under the Apache 2.0 license. The post says the models were trained from scratch on roughly 15 trillion tokens through a five-stage pipeline that moved from broad web-scale pretraining toward progressively higher-quality mixtures of math, code, technical material, instruction data, and long chain-of-thought reasoning traces. IBM also extended context length in stages, from 4K to 32K, 128K, and ultimately 512K tokens, applying a long-context extension process and merging models after each stage to preserve short-context performance. After pretraining, Granite 4.1 underwent supervised fine-tuning on about 4.1 million curated samples, with LLM-as-judge and rule-based filtering for quality control, and was then further refined through reinforcement learning using on-policy GRPO with a DAPO loss. IBM highlights that the 8B instruct model matches or exceeds the previous Granite 4.0-H-Small 32B-A9B MoE model while using a simpler dense architecture. The article frames the release as a data-quality-driven approach to building smaller, deployment-friendly models that still perform well on coding, math, instruction following, and general chat tasks, and positions Granite 4.1 as a practical open model family for teams that want permissively licensed LLMs with long context and modern post-training methods.
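The note names on-policy GRPO with a DAPO loss but does not spell out the objective. Below is a minimal sketch of the standard formulation of that combination: group-relative advantages as in GRPO, plus DAPO's asymmetric ("clip-higher") clipping and token-level loss aggregation. All tensor names and hyperparameter values are illustrative, not IBM's actual training code.

```python
import torch

def grpo_dapo_loss(logp_new, logp_old, rewards, mask,
                   eps_low=0.2, eps_high=0.28):
    """Sketch of a GRPO policy loss with DAPO-style modifications.

    logp_new, logp_old: (G, T) per-token log-probs for a group of G
        sampled responses under the current and sampling policies.
    rewards: (G,) scalar reward per response.
    mask: (G, T) 1.0 for real response tokens, 0.0 for padding.
    eps_low, eps_high: DAPO's decoupled clip range (upper bound wider
        than lower, per the DAPO paper's defaults).
    """
    # Group-relative advantage: normalize each reward within its group.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    adv = adv.unsqueeze(1)                    # (G, 1), broadcast over tokens

    ratio = torch.exp(logp_new - logp_old)    # per-token importance ratio
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high)
    per_token = -torch.minimum(ratio * adv, clipped * adv)

    # Token-level aggregation: average over all valid tokens in the
    # group rather than per-sequence means (a DAPO change vs. GRPO).
    return (per_token * mask).sum() / mask.sum()
```

DAPO also typically drops the KL penalty and skips prompt groups whose rewards are all identical (zero advantage); those details are omitted here for brevity.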
Summary (ZH)
IBM’s Granite team has released a technical note on Granite 4.1, a new family of dense decoder-only language models in three sizes (3B, 8B, and 30B), made available under the Apache 2.0 license. According to the article, the models were trained from scratch on roughly 15 trillion tokens using a five-stage training pipeline whose data mixture shifts gradually from large-scale general web corpora toward higher-quality math, code, technical documents, instruction data, and long chain-of-thought reasoning data. Granite 4.1 also applies staged long-context extension, raising the context window from 4K to 32K, then 128K, and finally 512K, with model merging after each stage to limit regression in short-context ability. After pretraining, IBM performed supervised fine-tuning on about 4.1 million quality-filtered samples, combining LLM-as-judge and rule-based filtering for data quality control, followed by further reinforcement learning based on on-policy GRPO with a DAPO loss. IBM emphasizes that the 8B instruct version matches or exceeds the earlier Granite 4.0-H-Small 32B-A9B MoE model on many tasks while having a simpler structure and being easier to deploy. The article presents Granite 4.1 as a data-quality-driven path to building small models, aiming to give enterprises and developers a practical open-model choice for code, math, instruction following, and general chat, with a friendlier license and longer context.
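Both summaries mention merging models after each context-extension stage to protect short-context quality, but neither gives the merge recipe. A simple linear interpolation of checkpoint weights is one common approach; the sketch below assumes that method and identical architectures, purely for illustration.

```python
import torch

def merge_state_dicts(short_ctx_sd, long_ctx_sd, alpha=0.5):
    """Hypothetical linear weight merge of two checkpoints.

    short_ctx_sd / long_ctx_sd: state_dicts of the pre-extension and
        post-extension models (same architecture assumed).
    alpha: weight on the long-context model; 0.5 is an arbitrary
        illustrative choice, not IBM's documented setting.
    """
    return {
        name: (1.0 - alpha) * short_ctx_sd[name] + alpha * long_ctx_sd[name]
        for name in short_ctx_sd
    }

# Usage: blend a 4K-context checkpoint with its 32K successor, then
# load the result back into a model of the same architecture.
# merged = merge_state_dicts(model_4k.state_dict(), model_32k.state_dict())
# model_32k.load_state_dict(merged)
```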
Source