Beyond Fact Retrieval:
Episodic Memory for RAG with
Generative Semantic Workspaces
AAAI 2026 Oral
NeurIPS 2025 Workshop on Language Agents and World Models ⭐ Spotlight
TL;DR
GSW achieves state-of-the-art episodic memory performance with an F1-score of 0.85 on EpBench, outperforming structured RAG baselines by up to 20% in recall while reducing context tokens by 51%.
Motivation
Large Language Models face fundamental challenges with long-context reasoning. Current RAG solutions—from semantic embeddings to knowledge graphs—are designed for fact retrieval but fail to build the space-time-anchored narrative representations needed for tracking entities through evolving situations.
The vast majority of texts are not lists of facts but narratives of evolving real-world situations. Crime reports, political briefings, corporate filings, and news coverage all describe actors that adopt roles and transition through states while interacting across space and time.
Our Approach: Generative Semantic Workspaces (GSW)
Brain-Inspired Design: GSW mirrors the neocortical-hippocampal architecture. The Operator (neocortex) extracts semantic roles and states. The Reconciler (hippocampus) binds them into coherent spatiotemporal sequences.
GSW is a neuro-inspired generative memory framework that builds structured, interpretable representations of evolving situations. It comprises two core components:
🔍 Operator
Maps incoming text to intermediate semantic structures:
- Actors & Entities: People, places, objects, times
- Roles: Situation-relevant descriptors
- States: Evolving conditions of actors
- Verbs & Valences: Actions and their arguments
- Spatio-Temporal Links: Shared locations/times
- Forward-Falling Questions: Predicted developments
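The official code is not yet released, so as a rough sketch only: the Operator's intermediate structure listed above might be modeled with simple dataclasses like these (all names and fields are hypothetical, inferred from the bullet list, not taken from the actual implementation):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Actor:
    """A person, place, object, or time participating in the situation."""
    name: str
    roles: List[str] = field(default_factory=list)   # situation-relevant descriptors
    states: List[str] = field(default_factory=list)  # evolving conditions, in narrative order

@dataclass
class Event:
    """A verb with its valence slots and spatio-temporal anchors."""
    verb: str
    valences: Dict[str, str]        # slot name (agent, patient, ...) -> actor name
    location: Optional[str] = None  # shared-location link
    time: Optional[str] = None      # shared-time link

@dataclass
class OperatorOutput:
    """Intermediate semantic structure the Operator emits for one text chunk."""
    actors: List[Actor]
    events: List[Event]
    forward_questions: List[str]    # forward-falling questions: predicted developments
```

One chunk of a crime report, for instance, might yield an `Actor("Dr. Smith", roles=["surgeon"], states=["on duty"])`, an `Event("operates", ...)` anchored to a hospital, and a forward-falling question about the outcome.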
🔄 Reconciler
Integrates semantic structures into a persistent workspace:
- Entity Resolution: Matches mentions of the same entity across the evolving narrative.
- Actor States: Tracks how each actor's state changes over time.
- Spatio-Temporal Coherence: Enforces consistency, grounding actors in the correct place and time.
- Predictive Questions: Answers previously posed forward-falling questions as new evidence arrives.
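The integration steps above amount to folding each chunk's local structure into a persistent workspace. A minimal dict-based sketch, assuming a simple alias table for entity resolution (the real Reconciler is an LLM-driven component; every name here is hypothetical):

```python
def reconcile(workspace: dict, chunk_actors: dict, aliases: dict) -> dict:
    """Fold one chunk's actors into the persistent workspace.

    workspace / chunk_actors: entity name -> {"roles": [...], "states": [...]}
    aliases: surface name -> canonical name (a stand-in for entity resolution)
    """
    for name, info in chunk_actors.items():
        canon = aliases.get(name, name)  # resolve mention to canonical entity
        entry = workspace.setdefault(canon, {"roles": [], "states": []})
        for role in info.get("roles", []):
            if role not in entry["roles"]:       # roles deduplicate
                entry["roles"].append(role)
        entry["states"].extend(info.get("states", []))  # states accumulate as a timeline
    return workspace
```

Feeding two chunks that mention "Dr. Smith" and later just "Smith" (with the alias registered) yields one entity whose states form an ordered history, which is the behavior the Actor States and Entity Resolution bullets describe.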
End-to-End Pipeline: Documents are chunked and processed by the Operator to create local semantic graphs. The Reconciler integrates these into a unified global memory. At query time, entity-specific summaries are retrieved, re-ranked, and passed to an LLM.
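Since the implementation is not yet public, the pipeline can only be sketched abstractly. Here the Operator, Reconciler, retriever/re-ranker, and answering LLM are injected as callables (all stand-ins, not the paper's API), which keeps the control flow visible:

```python
def gsw_pipeline(documents, query, operator, reconciler, retriever, llm,
                 chunk_size=200):
    """Sketch of the GSW flow: chunk -> Operator -> Reconciler -> retrieve -> answer.

    operator(chunk)              -> local semantic structure for one chunk
    reconciler(workspace, local) -> updated global workspace
    retriever(workspace, query)  -> entity-specific summaries, re-ranked
    llm(query, context)          -> final answer over the small retrieved context
    """
    workspace = {}  # persistent global memory built incrementally
    for doc in documents:
        for i in range(0, len(doc), chunk_size):
            local = operator(doc[i:i + chunk_size])   # per-chunk semantic graph
            workspace = reconciler(workspace, local)  # bind into global memory
    context = retriever(workspace, query)  # query-time: entity summaries, re-ranked
    return llm(query, context)
```

The query-time path is what drives the token savings reported below: only the retrieved entity summaries, not the raw documents, reach the final LLM call.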
Results
Performance on EpBench-200 (F1-Score by Query Complexity)
| Method | 0 Cues | 1 Cue | 2 Cues | 3-5 Cues | 6+ Cues | Overall |
|---|---|---|---|---|---|---|
| GSW (Ours) | 0.978 | 0.745 | 0.806 | 0.867 | 0.834 | 0.850 |
| HippoRAG2 | 0.828 | 0.675 | 0.762 | 0.755 | 0.746 | 0.753 |
| Embedding RAG | 0.906 | 0.727 | 0.724 | 0.744 | 0.678 | 0.770 |
| GraphRAG | 0.950 | 0.625 | 0.624 | 0.658 | 0.607 | 0.714 |
| LightRAG | 0.944 | 0.593 | 0.587 | 0.578 | 0.560 | 0.677 |
| Vanilla LLM | 0.883 | 0.709 | 0.582 | 0.484 | 0.323 | 0.642 |
Token Efficiency
| Method | Avg. Tokens/Query | Avg. Cost/Query |
|---|---|---|
| GSW (Ours) | ~3,587 | ~$0.0090 |
| GraphRAG | ~7,340 | ~$0.0184 |
| Embedding RAG | ~8,771 | ~$0.0219 |
| HippoRAG2 | ~8,771 | ~$0.0219 |
| LightRAG | ~40,476 | ~$0.1012 |
| Vanilla LLM | ~101,120 | ~$0.2528 |
Key Insight: GSW's entity-specific summaries provide targeted, query-relevant information—reducing hallucinations and drastically cutting inference costs.
Code
Coming soon.
Cite
If you find our work useful, please cite our paper:
@misc{rajesh2025factretrievalepisodicmemory,
  title={Beyond Fact Retrieval: Episodic Memory for RAG with Generative Semantic Workspaces},
  author={Shreyas Rajesh and Pavan Holur and Chenda Duan and David Chong and Vwani Roychowdhury},
  year={2025},
  eprint={2511.07587},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2511.07587}
}