Hunyuan A13B: Specifications and GPU VRAM Requirements

Hunyuan A13B

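Availability: open source, open weights
Total parameters: 80B
Context length: 256K
Modality: Text
Architecture: Mixture of Experts (MoE)
License: Apache 2.0
Release date: 25 Jun 2025
Training data cutoff: not listed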

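Technical specifications

Active parameters: 13.0B
Number of experts: not listed
Active experts: not listed
Attention structure: Grouped Query Attention (GQA)
Hidden dimension size: 4096
Layers: 32
Attention heads: not listed
Key-value heads: not listed
Activation function: SwiGLU
Normalization: RMS Normalization
Position embedding: Absolute Position Embedding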

System requirements

VRAM requirements for different quantization methods and context sizes

Hunyuan A13B

Tencent's Hunyuan A13B is a large language model built on a Mixture-of-Experts (MoE) architecture with 80 billion total parameters, of which 13 billion are active during inference. The design aims to reduce the compute cost per token while maintaining strong performance. The model is released as an open-source resource for researchers and developers who need to deploy capable models under tight resource budgets. It addresses the challenge of scaling large language models by offering large model capacity without activating every parameter for every token.
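The ratio between active and total parameters can be made concrete with a little arithmetic. The sketch below uses only the two figures quoted above; the "about 2 FLOPs per active parameter per token" figure is a common rule of thumb for a forward pass, used here as an assumption rather than a measured value for this model.

```python
# Figures quoted in the description above.
TOTAL_PARAMS = 80e9
ACTIVE_PARAMS = 13e9


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the parameters that take part in processing one token."""
    return active_params / total_params


def approx_forward_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb estimate (assumption): ~2 FLOPs per active parameter."""
    return 2.0 * active_params


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    flops = approx_forward_flops_per_token(ACTIVE_PARAMS)
    print(f"active share of parameters: {share:.2%}")
    print(f"approx. forward FLOPs per token: {flops:.2e}")
```

Under these assumptions, per-token compute scales with the 13 billion active parameters, while storage still scales with all 80 billion.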

The core of Hunyuan A13B is its sparse MoE architecture, which dynamically routes each input through a subset of specialized "expert" networks. The architecture comprises 32 layers and uses SwiGLU activation functions. It employs Grouped Query Attention (GQA) to improve inference efficiency and reduce the memory footprint during processing. A notable feature is its hybrid reasoning mode, which lets the model switch between a "fast thinking" mode for rapid responses and a "slow thinking" mode for more intricate, multi-step problem solving, depending on the complexity of the input. The model was trained on a corpus of more than 20 trillion tokens, with significant emphasis on data from science, technology, engineering, and mathematics (STEM) domains.
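The routing idea described above can be illustrated with a minimal top-k sketch in plain Python. This is a generic illustration of sparse MoE routing, not Tencent's implementation: the expert count, the value of `k`, the toy experts, and the renormalization of the selected weights are all assumptions, since the page does not list them.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def softmax(logits: Vector) -> Vector:
    """Numerically stable softmax over router logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits: Vector, k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(
    x: Vector,
    router_logits: Vector,
    experts: List[Callable[[Vector], Vector]],
    k: int,
) -> Vector:
    """Weighted sum of the outputs of the selected experts only."""
    out = [0.0] * len(x)
    for idx, weight in route_top_k(router_logits, k):
        y = experts[idx](x)  # unselected experts are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


if __name__ == "__main__":
    # Four hypothetical experts that simply scale their input.
    toy_experts = [lambda v, s=s: [s * t for t in v] for s in (1.0, 2.0, 3.0, 4.0)]
    print(moe_forward([1.0, 1.0], [0.0, 2.0, 1.0, -1.0], toy_experts, k=2))
```

Only the selected experts run for a given input, which is why the active parameter count is much smaller than the total.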

Hunyuan A13B supports a context window of up to 256,000 tokens, which allows it to process and generate content from long documents or extended conversations. The model has been optimized for agent-based tasks and demonstrates capabilities in mathematical reasoning, logical analysis, and complex instruction following. Its design emphasizes efficient inference and supports quantization formats including FP8 and INT4, so it can be deployed on a range of hardware. This makes it suitable for applications that need strong language processing under constrained compute, potentially even on a single mid-range GPU when quantized.
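A rough way to see how quantization and context length drive memory needs is sketched below. The bytes-per-parameter values are the nominal storage sizes of each format and ignore quantization overhead such as scales. The layer count (32) comes from the description above; `num_kv_heads` and `head_dim` are not listed on this page, so the values in the example are hypothetical placeholders. Runtime overhead such as activations and framework buffers is not modeled.

```python
# Nominal storage per parameter for each weight format (overheads ignored).
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}

GIB = 2**30


def weight_memory_gib(total_params: float, quant: str) -> float:
    """Memory for the weights; all experts must be resident, so use total params."""
    return total_params * BYTES_PER_PARAM[quant] / GIB


def kv_cache_gib(
    context_tokens: int,
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    bytes_per_value: float = 2.0,
) -> float:
    """KV cache size for one sequence: a key and a value tensor per layer."""
    values = 2 * num_layers * num_kv_heads * head_dim * context_tokens
    return values * bytes_per_value / GIB


if __name__ == "__main__":
    for quant in ("fp16", "fp8", "int4"):
        print(f"{quant}: ~{weight_memory_gib(80e9, quant):.1f} GiB for weights")
    # num_kv_heads=8 and head_dim=128 are assumed values, not taken from the page.
    for ctx in (1024, 125_000, 250_000):
        cache = kv_cache_gib(ctx, num_layers=32, num_kv_heads=8, head_dim=128)
        print(f"context {ctx}: ~{cache:.2f} GiB KV cache")
```

Because every expert has to be held in memory even though only some run per token, the weight footprint follows the 80B total rather than the 13B active count. With fewer key-value heads than query heads, GQA shrinks the KV cache term proportionally.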

About Hunyuan

Tencent Hunyuan is a family of large language models with various capabilities.

Other Hunyuan models

Evaluation benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for Hunyuan A13B.

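GPU requirements

Required VRAM depends on the quantization method chosen for the model weights and on the context size (the calculator on the source page ranges from 1,024 tokens to 250k tokens).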
