A Summary of Mainstream Techniques for Long-Context LLM Question Answering
This post surveys four key techniques for enhancing large language models (LLMs) in long-context question-answering scenarios. It begins with Retrieval-Augmented Generation (RAG), which retrieves relevant knowledge snippets and supplies them as context. It then discusses sparse attention mechanisms, such as those in BigBird and Longformer, which improve efficiency by attending only between selected tokens. The post also covers context compression methods such as MemoryBank, which let an LLM retain essential user information across dialogues. Finally, it highlights MemAgent, a system that recursively summarizes long inputs into a working memory used for reasoning, and is trained with reinforcement learning via GRPO; a minimal sketch of this chunked-reading idea follows below.
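To make the MemAgent-style workflow more concrete, here is a minimal Python sketch of reading a long document chunk by chunk while maintaining a bounded memory. This is an illustration under my own assumptions, not the paper's implementation: `call_llm`, the prompt wording, and the chunk size are all hypothetical placeholders.

```python
# Minimal sketch (not the paper's code) of MemAgent-style chunked reading:
# the model processes a long document piece by piece, carrying a bounded
# "memory" summary forward instead of the full context.
# `call_llm` is a hypothetical stand-in for any chat-completion API call.

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM API call (e.g. an OpenAI-compatible client)."""
    raise NotImplementedError

def chunk(text: str, size: int = 4000) -> list[str]:
    """Split a long document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def answer_with_memory(document: str, question: str) -> str:
    memory = ""  # running summary, kept short by the prompt instructions
    for piece in chunk(document):
        # Fold the new chunk into the existing memory, keeping only
        # information relevant to the question.
        memory = call_llm(
            f"Question: {question}\n"
            f"Current memory: {memory}\n"
            f"New text: {piece}\n"
            "Update the memory, keeping only information useful for the question."
        )
    # The final answer is produced from the compressed memory alone.
    return call_llm(f"Question: {question}\nMemory: {memory}\nAnswer the question.")
```

In MemAgent, the memory-update behavior itself is optimized with reinforcement learning (GRPO) so the model learns what to keep and what to discard; the sketch above only shows the inference-time loop.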