Beyond Fact Retrieval:
Episodic Memory for RAG with
Generative Semantic Workspaces

Shreyas Rajesh, Pavan Holur, Chenda Duan, David Chong, Vwani Roychowdhury

University of California, Los Angeles

AAAI 2026 Oral

NeurIPS 2025 Workshop on Language Agents and World Models ⭐ Spotlight

TL;DR

GSW achieves state-of-the-art episodic memory performance with an F1-score of 0.85 on EpBench, outperforming structured RAG baselines by up to 20% in recall while reducing context tokens by 51%.

Motivation

Large Language Models face fundamental challenges with long-context reasoning. Current RAG solutions, from semantic embeddings to knowledge graphs, are designed for fact retrieval but fail to build the space-time-anchored narrative representations needed to track entities through evolving situations.

The vast majority of texts are not lists of facts but narratives of evolving real-world situations. Crime reports, political briefings, corporate filings, and news coverage all describe actors that adopt roles and transition through states while interacting across space and time.

Our Approach: Generative Semantic Workspaces (GSW)

Brain-inspired GSW architecture

Brain-Inspired Design: GSW mirrors the neocortical-hippocampal architecture. The Operator (neocortex) extracts semantic roles and states. The Reconciler (hippocampus) binds them into coherent spatiotemporal sequences.

GSW is a neuro-inspired generative memory framework that builds structured, interpretable representations of evolving situations. It comprises two core components:

🔍 Operator

Maps incoming text to intermediate semantic structures:

  • Actors & Entities: People, places, objects, times
  • Roles: Situation-relevant descriptors
  • States: Evolving conditions of actors
  • Verbs & Valences: Actions and their arguments
  • Spatio-Temporal Links: Shared locations/times
  • Forward-Falling Questions: Predicted developments
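The Operator's output can be pictured as a small set of typed frames. The sketch below is illustrative only: the class and field names (`ActorFrame`, `VerbFrame`, `OperatorOutput`) are assumptions for exposition, not the paper's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of one Operator output; names are illustrative,
# not the paper's actual data model.

@dataclass
class ActorFrame:
    name: str                                      # actor/entity surface form
    roles: list = field(default_factory=list)      # situation-relevant descriptors
    states: list = field(default_factory=list)     # (time, condition) pairs

@dataclass
class VerbFrame:
    verb: str
    arguments: dict                                # valence slots, e.g. agent/patient
    location: Optional[str] = None                 # spatio-temporal anchors
    time: Optional[str] = None

@dataclass
class OperatorOutput:
    actors: list                                   # ActorFrame list
    events: list                                   # VerbFrame list
    questions: list                                # forward-falling questions

# Toy chunk: "Detective Reyes searched the warehouse on Monday."
frame = OperatorOutput(
    actors=[ActorFrame("Reyes", roles=["detective"], states=[("Monday", "on-site")])],
    events=[VerbFrame("searched", {"agent": "Reyes", "patient": "warehouse"},
                      location="warehouse", time="Monday")],
    questions=["What did Reyes find in the warehouse?"],
)
print(frame.events[0].verb, "@", frame.events[0].location)
```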

🔄 Reconciler

Integrates semantic structures into a persistent workspace:

  • Entity Resolution: Matches and merges entity mentions across the evolving narrative
  • Actor States: Updates each actor's state over time
  • Spatio-Temporal Coherence: Enforces consistency, grounding actors in the correct place and time
  • Predictive Questions: Resolves previously posed forward-falling questions as new information arrives

GSW Pipeline

End-to-End Pipeline: Documents are chunked and processed by the Operator to create local semantic graphs. The Reconciler integrates these into a unified global memory. At query time, entity-specific summaries are retrieved, re-ranked, and passed to an LLM.
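The integration step above can be sketched as folding each chunk's local facts into one global workspace keyed by resolved entities. Everything here is a minimal illustration under assumed names (`ALIASES`, `reconcile`); the alias-lookup resolution rule stands in for the paper's LLM-based reconciliation.

```python
# Hypothetical sketch of the Reconciler: merge per-chunk Operator facts
# into a persistent workspace keyed by the canonical (resolved) entity.

workspace = {}  # canonical entity -> list of (time, location, state) records

# Toy alias table standing in for learned entity resolution.
ALIASES = {"the detective": "Reyes", "M. Reyes": "Reyes"}

def resolve(mention: str) -> str:
    """Entity resolution: map a surface mention to its canonical entity."""
    return ALIASES.get(mention, mention)

def reconcile(chunk_facts):
    """Fold one chunk's facts into the workspace, tracking states over time."""
    for mention, time, location, state in chunk_facts:
        workspace.setdefault(resolve(mention), []).append((time, location, state))

# Two chunks refer to the same actor under different mentions:
reconcile([("Reyes", "Mon", "warehouse", "searching")])
reconcile([("the detective", "Tue", "station", "filing report")])

# Both states end up under one resolved entity, ordered in narrative time.
print(workspace["Reyes"])
```

At query time, an entity-specific summary would then be generated from `workspace["Reyes"]` rather than from raw retrieved chunks, which is where the token savings reported below come from.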

Results

  • 0.850 F1-score on EpBench-200
  • 0.773 F1-score on EpBench-2000 (10x scale)
  • +20% recall vs. HippoRAG2 (next best)
  • 51% token reduction vs. GraphRAG (next best)

Performance on EpBench-200 (F1-Score by Query Complexity)

Method          0 Cues   1 Cue   2 Cues   3-5 Cues   6+ Cues   Overall
GSW (Ours)       0.978   0.745    0.806      0.867     0.834     0.850
HippoRAG2        0.828   0.675    0.762      0.755     0.746     0.753
Embedding RAG    0.906   0.727    0.724      0.744     0.678     0.770
GraphRAG         0.950   0.625    0.624      0.658     0.607     0.714
LightRAG         0.944   0.593    0.587      0.578     0.560     0.677
Vanilla LLM      0.883   0.709    0.582      0.484     0.323     0.642

Token Efficiency

Method          Avg. Tokens/Query   Avg. Cost/Query
GSW (Ours)             ~3,587           ~$0.0090
GraphRAG               ~7,340           ~$0.0184
Embedding RAG          ~8,771           ~$0.0219
HippoRAG2              ~8,771           ~$0.0219
LightRAG              ~40,476           ~$0.1012
Vanilla LLM          ~101,120           ~$0.2528

Key Insight: GSW's entity-specific summaries provide targeted, query-relevant information—reducing hallucinations and drastically cutting inference costs.
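As a sanity check, the cost column above is consistent with a single flat per-token rate of about $2.50 per million tokens. The specific model and pricing are an assumption of this check, not stated on this page:

```python
# Assumed flat rate of $2.50 per 1M tokens; the actual model/pricing
# behind the table is an assumption, not given on this page.
RATE_PER_TOKEN = 2.50 / 1_000_000

tokens = {
    "GSW": 3_587, "GraphRAG": 7_340, "Embedding RAG": 8_771,
    "HippoRAG2": 8_771, "LightRAG": 40_476, "Vanilla LLM": 101_120,
}
for method, n in tokens.items():
    print(f"{method:>13}: ${n * RATE_PER_TOKEN:.4f}")
```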

Code

Coming soon.

Cite

If you find our work useful, please cite our paper:

@misc{rajesh2025factretrievalepisodicmemory,
  title={Beyond Fact Retrieval: Episodic Memory for RAG with Generative Semantic Workspaces},
  author={Shreyas Rajesh and Pavan Holur and Chenda Duan and David Chong and Vwani Roychowdhury},
  year={2025},
  eprint={2511.07587},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2511.07587}
}