
Yi-9B

Parameters

9B

Context Length

4K (4,096 tokens)

Modality

Text

Architecture

Dense

License

Apache 2.0

Release Date

6 Mar 2024

Training Data Cutoff

Jun 2023

Technical Specifications

Attention Structure

Grouped-Query Attention

Hidden Dimension Size

4096

Number of Layers

44

Attention Heads

32

Key-Value Heads

4

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)

Yi-9B

The Yi-9B model is a sophisticated dense transformer-based large language model developed by 01.AI, designed to optimize the trade-off between parameter count and reasoning depth. It serves as a performance-oriented extension of the foundational Yi-6B model, engineered through a process of architectural expansion and multi-stage incremental training. By increasing the model's depth and continuing pre-training on an additional 0.8 trillion high-quality tokens, the developers have produced a model that excels in technical domains such as mathematics and code generation while maintaining robust bilingual fluency in English and Chinese.

Technically, Yi-9B utilizes a decoder-only architecture that mirrors the established Llama framework, enabling immediate compatibility with the broader ecosystem of LLM tools and libraries. Key architectural features include Grouped-Query Attention (GQA) to improve inference throughput and reduce memory overhead, and SwiGLU activation functions within the feed-forward layers for enhanced representational capacity. The model employs Rotary Position Embedding (RoPE) to manage sequence data and utilizes Root Mean Square Layer Normalization (RMSNorm) to stabilize training dynamics across its 44 layers.
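The Grouped-Query Attention scheme described above can be illustrated with a minimal NumPy sketch. Per the specs on this card, 32 query heads share 4 key-value heads, so each group of 8 query heads attends over the same K/V projections, shrinking the KV cache by 8x. The tiny sequence length and random tensors below are purely for illustration.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, head_dim); k, v: (n_kv_heads, seq, head_dim)."""
    n_q_heads, seq, head_dim = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads  # query heads per shared KV head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # each group of query heads reuses one KV head
        scores = q[h] @ k[kv].T / np.sqrt(head_dim)
        # numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[kv]
    return out

# Yi-9B-like shapes: 32 query heads, 4 KV heads, head_dim 128 (4096 / 32)
rng = np.random.default_rng(0)
seq, head_dim = 5, 128
q = rng.standard_normal((32, seq, head_dim))
k = rng.standard_normal((4, seq, head_dim))
v = rng.standard_normal((4, seq, head_dim))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (32, 5, 128)
```

A production implementation would batch this with broadcasting rather than a per-head loop, but the loop makes the head-sharing explicit.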

Designed for computational efficiency, Yi-9B is particularly suited for deployment in resource-constrained environments, including consumer-grade hardware. Its extensive training on a total of 3.9 trillion tokens provides the model with a strong knowledge base for complex reasoning, reading comprehension, and common-sense logic. This makes it an effective choice for developers building AI-native applications that require a balance of high-performance technical reasoning and efficient local execution.

About Yi

The Yi series models are large language models trained from scratch by 01.AI. They are bilingual (English/Chinese) and deliver strong performance in language understanding, reasoning, and code generation.



Evaluation Benchmarks

No evaluation benchmarks are available for Yi-9B.


GPU Requirements

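A rough back-of-the-envelope estimate of serving memory can be derived from the specs on this card (44 layers, 4 KV heads, head_dim 128). The sketch below sums the weight storage and the KV cache; the 9e9 parameter count and 2-byte (fp16) element sizes are assumptions for illustration, and real deployments add activation buffers and framework overhead on top.

```python
def estimate_vram_gb(n_params, context_len, n_layers=44, n_kv_heads=4,
                     head_dim=128, weight_bytes=2, cache_bytes=2):
    """Naive VRAM estimate in GiB: weights plus KV cache only."""
    weights = n_params * weight_bytes  # model weights in the chosen dtype
    # KV cache: keys and values (factor 2), per layer, per KV head, per token
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context_len * cache_bytes
    return (weights + kv_cache) / 1024**3

# fp16 weights at the full 4,096-token context
print(round(estimate_vram_gb(9e9, 4096), 1))  # ~17.1 GiB
```

Note how small the KV cache is here (~0.34 GiB at full context): with only 4 KV heads, GQA keeps the cache an order of magnitude smaller than standard multi-head attention would.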