Understanding LLM Model Sizes and Hardware Requirements
Practical Quantization for Large Language Models
How to Build a Large Language Model
Deploying Quantized LLMs for Efficient Inference
Mixture of Experts: Advanced Architecture, Training, and Scaling