Typhoon-2-70B: Specifications and GPU VRAM Requirements

Open source: Open weights
Parameters: 70B
Context length: 128K
Modality: Text
Architecture: Dense

License: Llama 3.1 Community License

Release date: 1 Jun 2024
Training data cutoff: Dec 2023

Technical Specifications

Attention structure: Grouped-Query Attention (GQA)

Hidden size: 8192
Layers: not listed
Attention heads: not listed
Key-value heads: not listed
Activation function: SwiGLU
Normalization: RMS Normalization

Position embedding: Rotary Position Embedding (RoPE)

Typhoon-2-70B

Typhoon-2-70B is a high-capacity Thai-English large language model developed by SCB 10X, specifically architected to address the linguistic complexities of the Thai language. Built upon the Llama 3.1 70B backbone, this model undergoes extensive continual pre-training on a curated corpus of over 5 billion high-quality Thai tokens. This training process is designed to align the model with Thai cultural nuances and linguistic structures while preserving the original English reasoning capabilities of the underlying architecture. The resulting model serves as a foundation for enterprise-level applications requiring high precision in bilingual contexts.

The model is a dense, decoder-only transformer that uses Grouped-Query Attention (GQA), in which groups of query heads share key/value heads, to shrink the KV cache and speed up inference. It supports a 128K-token context window, enabling the processing of lengthy legal documents, technical manuals, and multi-turn conversational histories. Post-training combines supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to improve instruction-following accuracy and function calling, which lets the model interact with external tools and APIs in agentic workflows.
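To make the memory impact of GQA and the 128K context concrete, here is a rough sketch of the KV-cache size at full context. The layer and head counts are assumptions taken from the published Llama 3.1 70B configuration (80 layers, 64 query heads, 8 key/value heads) and are not stated on this page; a 16-bit cache is also assumed. Only the hidden size (8192) and the context length come from the specifications above.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Size of the key/value cache for one sequence.

    Each layer stores one K and one V tensor (hence the factor 2), each of
    shape [context_tokens, n_kv_heads, head_dim].
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


# Assumed values (Llama 3.1 70B configuration, not listed on this page).
N_LAYERS = 80
N_QUERY_HEADS = 64
N_KV_HEADS = 8

# Values from the specification above.
HIDDEN_SIZE = 8192
CONTEXT_TOKENS = 128 * 1024

HEAD_DIM = HIDDEN_SIZE // N_QUERY_HEADS  # 128

# GQA: only the 8 key/value heads are cached.
gqa_bytes = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT_TOKENS)
# Hypothetical MHA baseline: every query head would have its own K/V head.
mha_bytes = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, CONTEXT_TOKENS)

GIB = 1024 ** 3
print(f"GQA KV cache at 128K tokens: {gqa_bytes / GIB:.0f} GiB")
print(f"MHA KV cache at 128K tokens: {mha_bytes / GIB:.0f} GiB")
```

Under these assumptions, a single full-length sequence needs about 40 GiB of cache with GQA, versus about 320 GiB if every query head kept its own keys and values. Serving frameworks may reduce this further with cache quantization or paging, which this sketch does not model.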

Released under the Llama 3.1 Community License, Typhoon-2-70B provides a transparent path for developers to integrate sovereign AI capabilities into production environments. Its design emphasizes performance in specialized Thai domains such as legal reasoning, cultural content generation, and sophisticated data analysis. By bridging the gap between English-centric foundation models and local language requirements, Typhoon-2-70B enables the development of localized AI solutions that maintain parity with global standards of reasoning and accuracy.

About Typhoon

Typhoon is a Thai language model family developed by SCB 10X. It is specifically optimized for the Thai language, addressing complexities such as the lack of word delimiters and tonal nuances. The models are trained on Thai-centric datasets including legal, cultural, and historical documents to ensure localized context and knowledge.

Other Typhoon Models

Typhoon-2-8B

Evaluation Benchmarks

No evaluation benchmarks are available for Typhoon-2-70B.

Model Transparency

Total score: 65 / 100
Upstream: 21.0 / 30
Model: 25.0 / 40
Downstream: 18.5 / 30

GPU Requirements
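As a rough guide to the VRAM needed for the weights alone, the sketch below multiplies the parameter count by the bits stored per weight. It assumes exactly 70 billion parameters (the nominal size listed above) and three common precisions; the function name and the precision table are illustrative rather than taken from this page. The result is a lower bound: the KV cache, activations, and framework overhead come on top.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Memory for the model weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 70e9  # nominal "70B"; the exact count is an assumption

# Bits per weight for a few common formats (illustrative list).
PRECISIONS = {"FP16": 16, "INT8": 8, "INT4": 4}

estimates = {name: weight_memory_gb(N_PARAMS, bits) for name, bits in PRECISIONS.items()}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB for weights")
```

With these assumptions, the weights take roughly 140 GB at FP16, 70 GB at INT8, and 35 GB at INT4, so even the 4-bit variant needs more than a single 24 GB consumer GPU. Real quantized checkpoints are usually somewhat larger because of scales, zero points, and layers kept at higher precision.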
