MRAgent Cuts Token Use to 118K per Query – LangMem Burns 3.26M
## NUS‑Backed MRAgent Slashes Token Footprint by Over 96% Compared to LangMem

A research team from the National University of Singapore has unveiled **MRAgent**, an agentic memory architecture that changes how large language models retrieve and process information. By reconstructing active memory on‑the‑fly, MRAgent limits token consumption to roughly **118 k per query**, a stark contrast to competing systems such as LangMem, which can burn **3.26 M tokens** on similar tasks. The approach promises to curb the context‑overload costs that have hampered retrieval‑augmented generation pipelines.

### Key Takeaways

- **Drastic token reduction**: MRAgent processes queries with ~118 k tokens versus LangMem’s 3.26 M, cutting usage by over 96 %.
- **Active memory reconstruction** lets the model adapt its queries mid‑reasoning, keeping irrelevant data out of the context window.
- Traditional retrieval pipelines **flood LLMs with noise**, leading to expensive and inefficient inference.
- The architecture **optimizes relevance**, delivering tighter, more focused context that improves reasoning accuracy.
- Lower token counts translate to **significant cost savings** and open the door to more scalable deployment of advanced LLMs.

#MRAgent #AgenticMemory #TokenEfficiency #LLM #NUSResearch #LangMem #AIOptimization #RetrievalAugmentedGeneration #ComputationalCost #newsababil360

[Read Full Article](https://news.ababil360.com/mragent-cuts-token-use-to-118k-per-query-langmem-burns-3-26m/)
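
The "over 96 %" figure follows directly from the two reported token counts. Here is a minimal sketch of that arithmetic, assuming the per‑query figures quoted above are measured on comparable tasks. The function name `token_reduction` is illustrative and not part of MRAgent or LangMem.

```python
def token_reduction(baseline_tokens: int, new_tokens: int) -> float:
    """Return the fraction of tokens saved relative to the baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1 - new_tokens / baseline_tokens


# Figures as reported in the article (the MRAgent count is approximate).
LANGMEM_TOKENS = 3_260_000
MRAGENT_TOKENS = 118_000

reduction = token_reduction(LANGMEM_TOKENS, MRAGENT_TOKENS)
ratio = LANGMEM_TOKENS / MRAGENT_TOKENS

print(f"reduction: {reduction:.1%}, ratio: {ratio:.1f}x")
# prints "reduction: 96.4%, ratio: 27.6x"
```

Put differently, the reported LangMem figure is roughly 27.6 times the MRAgent figure. Any saving in dollars would depend on per‑token pricing, which the article does not state.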














