Representation Engineering: A Top-Down Approach to AI Transparency, Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks, 2023, arXiv. DOI: 10.48550/arXiv.2310.01405 - Introduces representation engineering, which identifies and manipulates concept representations within LLMs to improve safety and steer model behavior. Relevant to applications of probing for alignment.