
Hunyuan Lite

Active Parameters

7B

Context Length

250K

Modality

Text

Architecture

Mixture of Experts (MoE)

License

Tencent Hunyuan Community License

Release Date

30 Oct 2024

Training Data Cutoff

Aug 2024

Technical Specifications

Total Expert Parameters

-

Number of Experts

-

Active Experts

-

Attention Structure

Grouped Query Attention (GQA)

Hidden Size

4096

Number of Layers

32

Attention Heads

32

Key-Value Heads

8

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Absolute Position Embedding

Hunyuan Lite

Hunyuan Lite is a text-only language model developed by Tencent that targets language understanding and reasoning within a compact computational footprint. Part of the broader Hunyuan ecosystem, it is designed for deployment on edge devices such as laptops, smartphones, and in-vehicle systems. Its goal is to handle natural language understanding, code generation, and mathematical problem-solving without the resource overhead typically associated with large-scale models. By balancing performance against latency, it suits environments where memory and power consumption are the limiting constraints.

The 7B variant uses a dense Transformer architecture, unlike the Mixture of Experts (MoE) design of larger counterparts such as Hunyuan-Large and Hunyuan-A13B. The series supports a context window of 256,000 tokens, long enough to ingest extensive documents, complete books, or lengthy conversation histories. The model uses Grouped Query Attention (GQA), in which several query heads share each key-value head, to speed up inference and reduce the memory footprint of the KV cache. It also offers two reasoning modes: a "fast-thinking" mode for immediate responses and a "slow-thinking" mode that uses chain-of-thought processing for deeper analytical tasks.
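
The effect of GQA on KV cache size can be estimated from the specification values listed on this page (32 layers, 32 attention heads, 8 key-value heads, hidden size 4096). The following is a rough sketch rather than a measurement: it assumes 16-bit cache entries, a head dimension of 4096 / 32 = 128, and the 256,000-token window quoted in the text. The function name `kv_cache_bytes` is illustrative and not part of any Hunyuan tooling.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # The factor 2 covers the separate key and value tensors in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * context_tokens * bytes_per_value


# Values taken from the specification list on this page.
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
NUM_HEADS = 32
NUM_KV_HEADS = 8

# Assumptions: equal-width heads, fp16/bf16 cache, full 256,000-token window.
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS
CONTEXT_TOKENS = 256_000

gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, CONTEXT_TOKENS)
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_HEADS, HEAD_DIM, CONTEXT_TOKENS)

GIB = 1024 ** 3
print(f"GQA (8 KV heads):  {gqa_bytes / GIB:.2f} GiB")
print(f"MHA (32 KV heads): {mha_bytes / GIB:.2f} GiB")
```

Under these assumptions the cache costs 128 KiB per token: about 31.25 GiB for a full 256,000-token sequence with 8 key-value heads, versus 125 GiB if all 32 heads kept their own keys and values.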

Hunyuan Lite is compatible with mainstream inference frameworks such as vLLM, SGLang, and TensorRT-LLM. It uses Rotary Position Embedding (RoPE) to remain stable across its long context window and SwiGLU activation in its feed-forward layers. Built with agentic workflows in mind, it is intended for tool use and structured data generation. The open weights, released under a community license, allow fine-tuning and integration into private-domain knowledge engines and automated assistant platforms.
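
To make the SwiGLU feed-forward layer concrete, here is a minimal pure-Python sketch of the standard formulation, `W_down (SiLU(W_gate x) * W_up x)`. It is not taken from the Hunyuan source code: the helper names, the toy dimensions, and the weight values are all illustrative assumptions, and a real layer would use the 4096-wide hidden size and a tensor library.

```python
import math


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block: W_down @ (silu(W_gate @ x) * (W_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]   # gating branch
    up = matvec(w_up, x)                          # linear branch
    hidden = [g * u for g, u in zip(gate, up)]    # elementwise product
    return matvec(w_down, hidden)                 # project back to model width


# Toy sizes for illustration only: model width 2, feed-forward width 3.
x = [1.0, -1.0]
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
w_down = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

y = swiglu_ffn(x, w_gate, w_up, w_down)
print(y)
```

The gating branch lets the layer scale each hidden unit by a smooth, input-dependent factor, which is the property usually credited for SwiGLU's advantage over a plain ReLU feed-forward block.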

About Hunyuan

Tencent Hunyuan large language models with various capabilities.



Evaluation Benchmarks

No evaluation benchmarks are available for Hunyuan Lite.

Rankings

Rank

-

Coding Rank

-

Model Transparency

Overall Score

B-

60 / 100

GPU Requirements
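
As a back-of-the-envelope guide, the memory needed for the model weights alone scales with the parameter count and the bytes stored per parameter. The sketch below assumes 7 billion total parameters (the page lists 7B active parameters and does not state a separate total) and covers three common precisions. It ignores KV cache, activations, and framework overhead, so actual VRAM use will be higher. The names `BYTES_PER_PARAM` and `weight_memory_gib` are illustrative.

```python
# Bytes stored per parameter for a few common weight formats (assumed values).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, precision):
    """Approximate memory for model weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024 ** 3


# Assumption: 7B total parameters, matching the active-parameter figure listed above.
NUM_PARAMS = 7_000_000_000

for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gib(NUM_PARAMS, precision):.1f} GiB")
```

Under these assumptions the weights take roughly 13.0 GiB at fp16, 6.5 GiB at int8, and 3.3 GiB at int4. Long contexts add a KV cache on top of this that grows linearly with the number of tokens.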


Hunyuan Lite: Specifications and GPU VRAM Requirements