趋近智
Active Parameters
7B
Context Length
250K
Modality
Text
Architecture
Dense Transformer
License
Tencent Hunyuan Community License
Release Date
30 Oct 2024
Training Data Cutoff
Aug 2024
Total Expert Parameters
-
Number of Experts
-
Active Experts
-
Attention Structure
Grouped Query Attention (GQA)
Hidden Dimension Size
4096
Number of Layers
32
Attention Heads
32
Key-Value Heads
8
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Rotary Position Embedding (RoPE)
Hunyuan Lite is a text-only language model developed by Tencent, designed to deliver strong language and reasoning capabilities within a compact computational footprint. Part of the broader Hunyuan ecosystem, it targets deployment on edge devices such as laptops, smartphones, and in-vehicle systems. Its goal is to handle natural language understanding, code generation, and mathematical problem-solving without the resource overhead typical of large-scale models. By balancing output quality against latency, it suits environments where memory and power consumption are critical constraints.
The 7B variant uses a dense Transformer architecture, departing from the Mixture of Experts (MoE) design of larger counterparts such as Hunyuan-Large and Hunyuan-A13B. The series supports a long context window of 256,000 tokens, enough to ingest extensive documents, complete books, or lengthy conversation histories. The model uses Grouped Query Attention (GQA), in which several query heads share each key/value head, to speed up inference and shrink the KV cache. It also offers dual-mode reasoning: users can switch between a "fast-thinking" mode for immediate responses and a "slow-thinking" mode that uses chain-of-thought processing for deeper analytical tasks.
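To make the KV cache saving concrete, the following minimal sketch estimates the cache size at the full context length using the figures listed above (32 layers, 32 attention heads, 8 key-value heads, hidden size 4096). It assumes a head dimension of hidden size divided by attention heads, 16-bit cache values, and a single sequence; none of these are confirmed for the actual implementation, and `kv_cache_bytes` is a hypothetical helper written for this illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Rough KV cache size for one sequence.

    The leading factor of 2 accounts for storing one key tensor and one
    value tensor per layer. bytes_per_value=2 assumes fp16/bf16 storage.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Figures taken from the specification list above.
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
NUM_HEADS = 32
NUM_KV_HEADS = 8
CONTEXT_TOKENS = 256_000

# Assumption: each head has hidden_size / num_heads dimensions.
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS

# Baseline: every query head keeps its own key/value head (multi-head attention).
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_HEADS, HEAD_DIM, CONTEXT_TOKENS)
# GQA: only the 8 shared key/value heads are cached.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, CONTEXT_TOKENS)

GIB = 1024 ** 3
print(f"MHA cache: {mha_bytes / GIB:.2f} GiB")
print(f"GQA cache: {gqa_bytes / GIB:.2f} GiB")
print(f"Reduction: {mha_bytes // gqa_bytes}x")
```

Under these assumptions, sharing each key/value head among four query heads cuts the cache by a factor of four, which matters most at long context lengths where the cache can exceed the size of the weights.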
Hunyuan Lite is compatible with mainstream inference frameworks including vLLM, SGLang, and TensorRT-LLM. It uses Rotary Position Embedding (RoPE) to keep attention stable across the long context window and SwiGLU activation in its feed-forward layers. The model is built for agentic workflows, with an emphasis on tool use and structured data generation. Its weights are released openly under a community license, which supports fine-tuning and integration into private-domain knowledge engines and automated assistant platforms.
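As an illustration of how RoPE encodes position, the sketch below rotates consecutive pairs of vector components by an angle proportional to the token position. It is a generic, dependency-free version of the technique, not the model's actual code: the base of 10000 and the interleaved pairing convention are assumptions, and `rope_rotate` and `dot` are hypothetical helper names.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Apply rotary position embedding to a single head vector.

    Each consecutive pair (vec[i], vec[i + 1]) is rotated by the angle
    position * base ** (-i / dim). The base value is an assumed default.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = list(vec)
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out[i] = x * cos_t - y * sin_t
        out[i + 1] = x * sin_t + y * cos_t
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.5, -0.75, 1.0]

# The attention score depends only on the relative offset (3 in both cases),
# not on the absolute positions of the query and key.
near = dot(rope_rotate(q, 5), rope_rotate(k, 2))
far = dot(rope_rotate(q, 103), rope_rotate(k, 100))
print(abs(near - far) < 1e-9)
```

Because the score between a query and a key depends only on their relative offset, the same attention pattern applies anywhere in the sequence, which is one reason rotary embeddings are commonly chosen for long context windows.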
Tencent Hunyuan large language models with various capabilities.
No evaluation benchmarks are available for Hunyuan Lite.