Multitasking memory

Machine learning

Mastering long-context multi-task reasoning with transformers and recurrent memory

Long-context reasoning with language models remains computationally costly because attention scales quadratically with sequence length and contexts now grow to millions of tokens. We show that a compact recurrent-memory transformer, trained across several reasoning tasks and guided by task descriptions, can answer questions over very long texts more accurately than far larger models, while generalising to longer inputs, new tasks and input noise.