Hunyuan Lite: Specifications and GPU VRAM Requirements

Hunyuan Lite

Open Source

Open Weights

Active Parameters: not listed
Context Length: 250K
Modality: Text
Architecture: Mixture of Experts (MoE)
License: Tencent Hunyuan Community License
Release Date: 30 Oct 2024
Knowledge Cutoff: not listed

Technical Specifications

Total Expert Parameters: not listed
Number of Experts: not listed
Active Experts: not listed
Attention Structure: Multi-Head Attention
Hidden Dimension Size: not listed
Number of Layers: not listed
Attention Heads: not listed
Key-Value Heads: not listed
Activation Function: not listed
Normalization: not listed
Position Embedding: Absolute Position Embedding

System Requirements

VRAM requirements for different quantization methods and context sizes
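This page gives no parameter count, layer count, or head configuration for Hunyuan Lite, so no concrete VRAM figure can be derived from it. The sketch below only illustrates how such estimates are commonly put together: weight memory scales with the quantization bit width, and KV-cache memory scales with the context size. Every name and number in it (`ModelShape`, `EXAMPLE_SHAPE`, the 1.2 overhead factor) is a hypothetical placeholder, not a published Hunyuan Lite value.

```python
from dataclasses import dataclass


@dataclass
class ModelShape:
    """Hypothetical model shape used only for illustration."""
    total_params: float  # all parameters, including every expert in an MoE model
    num_layers: int
    num_kv_heads: int
    head_dim: int


def weight_bytes(total_params: float, bits_per_weight: float) -> float:
    """Memory for the weights at a given quantization width."""
    return total_params * bits_per_weight / 8


def kv_cache_bytes(shape: ModelShape, context_tokens: int, bytes_per_value: int = 2) -> int:
    """Memory for the key/value cache; the leading 2 counts keys plus values."""
    return (2 * shape.num_layers * shape.num_kv_heads * shape.head_dim
            * context_tokens * bytes_per_value)


def estimate_vram_gib(shape: ModelShape, bits_per_weight: float,
                      context_tokens: int, overhead: float = 1.2) -> float:
    """Rough total in GiB; `overhead` is an assumed margin for activations and buffers."""
    total = weight_bytes(shape.total_params, bits_per_weight)
    total += kv_cache_bytes(shape, context_tokens)
    return total * overhead / 2**30


# Placeholder values, NOT Hunyuan Lite's real configuration.
EXAMPLE_SHAPE = ModelShape(total_params=7e9, num_layers=32, num_kv_heads=32, head_dim=128)

for bits in (16, 8, 4):
    for ctx in (1_024, 125_000, 250_000):
        gib = estimate_vram_gib(EXAMPLE_SHAPE, bits, ctx)
        print(f"{bits:>2}-bit weights, {ctx:>7,} tokens: ~{gib:.1f} GiB")
```

Note that in an MoE model all experts normally have to be resident in memory, so the weight term uses the total parameter count rather than the active one, even though only a subset of experts runs per token.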

Hunyuan Lite

Hunyuan Lite is a compact, text-only language model developed by Tencent and designed for efficient deployment across a range of computing environments. Part of the larger Hunyuan family, it is optimized for resource-constrained edge devices such as laptops, smartphones, and smart cabin systems. It targets natural language processing, code generation, and mathematical reasoning in applications where computational overhead is a critical constraint.

Hunyuan Lite uses a Mixture of Experts (MoE) architecture, a design intended to improve performance while keeping computation efficient. The MoE structure was introduced in an upgrade on October 30, 2024, alongside an expanded context window. The model supports a context length of 256,000 tokens, enough to process entire documents or lengthy conversations. It also offers a fusion-reasoning capability with distinct "fast-thinking" and "slow-thinking" modes, so that the processing strategy can be matched to the complexity and depth of reasoning a task requires.
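To make the efficiency argument for MoE concrete, here is a minimal sketch of top-k expert routing in plain Python. It assumes a softmax gate over the k highest-scoring experts, which is a common MoE pattern; this page does not state the number of experts or the routing scheme Hunyuan Lite actually uses, and the functions `route_top_k` and `moe_forward` are illustrative names, not part of any Hunyuan code.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy example: four "experts" that just scale their input.
experts = [lambda x: 1.0 * x, lambda x: 2.0 * x, lambda x: 3.0 * x, lambda x: 4.0 * x]
router_logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores for one token

print(route_top_k(router_logits, k=2))
print(moe_forward(1.0, experts, router_logits, k=2))
```

With k=2 only two of the four experts are evaluated for this token, which is why the compute per token tracks the active parameters rather than the total parameter count.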

Hunyuan Lite is built for general language understanding and generation. It handles natural language queries, mathematical problems, and coding tasks. The model is released with open weights and inference code, which allows integration into existing development workflows and fine-tuning for specific industry requirements. It is a text-only model, with no built-in search or multimedia processing.

About Hunyuan

Hunyuan is Tencent's family of large language models with various capabilities.

Evaluation Benchmarks

Ranking is for Local LLMs.

No evaluation benchmarks for Hunyuan Lite available.

Resources

Official Documentation

Download Weights

Source Code