Optimizing Chunking Strategies for Diverse Data Sources
Text splitters, LangChain, 2024 (LangChain) - Official documentation explaining various text splitting strategies for different content types and their use within the LangChain framework.
Retrieval-Augmented Generation for Large Language Models: A Survey, Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, Haofen Wang, 2024 (arXiv preprint arXiv:2312.10997), DOI: 10.48550/arXiv.2312.10997 - A comprehensive survey providing an academic overview of RAG systems, including discussions on document processing, chunking, and retrieval techniques.