Prerequisites: Strong ML & Systems Background
Level:
Advanced IR Design
Analyze and use multi-level intermediate representations such as MLIR to express and optimize complex ML computations.
Graph-Level Optimization
Implement advanced graph optimization passes including operator fusion, layout transformations, and algebraic simplifications.
Tensor-Level Optimization
Apply polyhedral modeling, advanced loop transformations, and auto-vectorization techniques for tensor operations.
Heterogeneous Code Generation
Generate optimized code for diverse hardware targets, including multi-core CPUs, GPUs (CUDA/ROCm), and specialized AI accelerators.
ML Runtime Systems
Design and analyze runtime components for dynamic shape handling, efficient memory management, and heterogeneous task scheduling.
JIT Compilation for ML
Implement and analyze JIT compilation techniques for ML models, focusing on specialization and adaptive compilation.
Low-Precision Optimization
Apply compiler and runtime techniques to support and optimize models using quantization and low-precision arithmetic.
Performance Analysis
Use profiling tools to diagnose performance bottlenecks in compiled ML code.
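
As a small taste of the graph-level work listed above, the following is a minimal sketch of an algebraic simplification pass over a toy expression IR. The `Const`, `Var`, and `Op` node types and the `simplify` function are hypothetical names invented for this illustration; they are not taken from MLIR or any other real compiler, and a production pass would handle many more operators and floating-point corner cases.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    kind: str  # "add" or "mul" (the only operators in this toy IR)
    args: Tuple["Expr", "Expr"]


Expr = Union[Const, Var, Op]


def simplify(e: Expr) -> Expr:
    """Bottom-up rewrite: fold constants and drop identity operands."""
    if not isinstance(e, Op):
        return e  # leaves (constants and variables) are already simplified

    # Simplify the operands first so rewrites can cascade upward.
    a, b = (simplify(x) for x in e.args)

    # Constant folding: both operands are known at compile time.
    if isinstance(a, Const) and isinstance(b, Const):
        folded = a.value + b.value if e.kind == "add" else a.value * b.value
        return Const(folded)

    # Identity elimination: x + 0 -> x and x * 1 -> x.
    # Note: x + 0 -> x ignores the signed-zero corner case of IEEE floats.
    identity = Const(0.0 if e.kind == "add" else 1.0)
    if a == identity:
        return b
    if b == identity:
        return a

    return Op(e.kind, (a, b))


# (x * 1) + (0 + 0) simplifies to just x.
expr = Op("add", (Op("mul", (Var("x"), Const(1.0))),
                  Op("add", (Const(0.0), Const(0.0)))))
print(simplify(expr))
```

The pass is written as a recursive bottom-up traversal so that simplifying an inner node can expose a new opportunity at its parent, which is the same reason real rewrite drivers reapply patterns until nothing changes.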