Description
@samwillis and I wrote this
Summary
Adds automatic and configurable lifecycle management to TanStack DB collections to optimize resource usage and align with application routing behavior.
Introduction
This PRD proposes automatic lifecycle management for collections in TanStack DB. As applications grow, developers define many collections, some of which are used rarely or only in specific UI flows. Without lifecycle control, all collections persist indefinitely, increasing memory and network usage while forcing developers to manage resource cleanup manually. A simple, configurable lifecycle system (lazy initialization, garbage collection after inactivity, and minimal imperative controls) enables scalable, efficient use of collections without burdening developers with complex lifecycle logic.
Background
TanStack DB allows applications to declaratively define collections of data and interact with them through queries, subscriptions, and sync primitives. As applications scale, it is common for developers to define many collections—some used pervasively, others specific to isolated screens or components. Without lifecycle controls, these collections persist indefinitely once defined, consuming memory and maintaining sync connections even when not in use.
This leads to several problems:
- Collections may fetch data on app start even when not immediately needed.
- Collections used temporarily (e.g. during onboarding flows or modal views) may never be cleaned up, leading to unnecessary memory and network usage.
- Complex chains of dependent collections (e.g. A depends on B depends on C) are difficult to manually track or tear down.
- Defining collections in a shared file (e.g. collections.ts) creates startup pressure that contradicts the goal of lazy, demand-driven data loading.
TanStack Query solves a related problem for queries by introducing cache- and garbage-collection policies. Inspired by that design, this PRD proposes a lifecycle model for TanStack DB collections that allows automatic initialization on usage, configurable teardown after idle time, and minimal imperative control for advanced cases.
The goal is to improve performance and developer ergonomics, particularly in larger apps with dynamic navigation, staged data loading, or ephemeral UI flows.
Problem
As applications grow, developers commonly define many collections in shared modules or route-specific components. Without lifecycle management, these collections:
- Start too early — Collections fetch or initialize on app startup, even if the user never visits the associated route or component.
- Never stop — Collections remain active and in memory indefinitely, even when no part of the app is using them.
- Waste resources — Maintaining sync state, indexes, and memory for unused collections increases CPU usage, network chatter, and memory pressure.
- Are hard to manage manually — In apps where collections depend on other collections, or where multiple transient routes activate different data sets, manually tracking when to start/stop collections is error-prone and doesn’t scale.
These problems result in slower startups, higher memory usage, and more complex application logic. They are especially acute in:
- Large single-page apps that define many collections up front.
- Navigation-based apps where only a small subset of collections is active at any one time.
- Apps that use TanStack DB’s sync primitives, incurring real backend/network cost.
Automatic lifecycle management would solve this by ensuring collections only start when needed, stop when unused, and can be controlled predictably when necessary.
Personas
1. The Route-Based Developer
“I define collections for each route or screen, but they all load even if the user never visits that part of the app.”
- Builds SPAs with a file like collections.ts that registers all collections
- Wants navigation to trigger data loading lazily, not on startup
- Struggles with initial app load being slow due to eager collection start
2. The Performance-Sensitive Architect
“We want to keep memory and sync usage minimal—no reason to keep unused collections active.”
- Works on large apps with complex state and sync layers
- Wants collections to spin down when not actively used
- Prioritizes efficient memory, battery, and network use, especially on mobile
3. The Composition-Focused Engineer
“My collection A depends on B which depends on C, and I can’t track who’s using what anymore.”
- Composes collections into reusable utilities and hooks
- Doesn’t want to manually manage cascading dependencies
- Needs a system that auto-cleans unused chains without tight coupling
4. The Pragmatic Frontender
“Sometimes I know when I’m done with a collection—I just want to clean it up.”
- Builds flows like onboarding or modals where collections are temporary
- Wants a simple .cleanup() API for known teardown moments
- Appreciates predictability but avoids overly complex lifecycle APIs
Requirements and Phases
Phase 1: Collection Lifecycle Management
This phase introduces automatic and configurable lifecycle behavior for all collections in TanStack DB. It ensures that collections initialize only when needed, remain active while in use, and are garbage-collected after becoming idle. It also provides minimal imperative controls for preload and cleanup.
Requirements:
- Collections are lazy by default: creating a collection does not start it.
- A collection is started when any of the following occur:
  - A call to collection.query() or collection.subscribe()
  - A call to collection.preload()
- A collection is garbage-collected after no active usage (queries/subscribers) for a configurable gcTime (default: 5 minutes).
- Each collection exposes:
  - .status: "idle" | "loading" | "ready" | "error" | "cleaned-up"
- Collections automatically restart on next usage after being GCed or cleaned up.
- Collections expose two lifecycle methods:
  - collection.preload(): Promise<void> triggers an immediate load, resolves once the initial load completes, and shares one promise across concurrent calls.
  - collection.cleanup(): Promise<void> immediately tears down the collection (even if preload is in progress).
- gcTime is configured per collection.
- Preloading overrides GC to ensure the collection stays alive for at least gcTime.
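The lazy-start and idle-GC requirements above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the TanStack DB implementation: the LazyCollection class, the startSync and stopSync hooks, markReady, and the injectable schedule function are hypothetical names introduced here. Only status, gcTime, and subscribe are taken from the requirements.

```typescript
// Hypothetical sketch: not the TanStack DB API.
type Status = "idle" | "loading" | "ready" | "error" | "cleaned-up";
type Cancel = () => void;
type Scheduler = (fn: () => void, ms: number) => Cancel;

interface LifecycleOptions {
  startSync: () => void; // assumed hook: begin fetching or syncing
  stopSync: () => void; // assumed hook: release memory and sync connections
  gcTime?: number; // idle time before teardown, in milliseconds
  schedule?: Scheduler; // injectable so the timer can be faked in tests
}

const timerScheduler: Scheduler = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return () => clearTimeout(id);
};

class LazyCollection {
  status: Status = "idle";
  private subscribers = 0;
  private cancelGc: Cancel | null = null;
  private readonly startSync: () => void;
  private readonly stopSync: () => void;
  private readonly gcTime: number;
  private readonly schedule: Scheduler;

  constructor(opts: LifecycleOptions) {
    // Creating the collection stores configuration only; nothing starts yet.
    this.startSync = opts.startSync;
    this.stopSync = opts.stopSync;
    this.gcTime = opts.gcTime ?? 5 * 60 * 1000; // default: 5 minutes
    this.schedule = opts.schedule ?? timerScheduler;
  }

  // First usage starts the collection; any usage cancels a pending GC.
  subscribe(): () => void {
    if (this.cancelGc) {
      this.cancelGc();
      this.cancelGc = null;
    }
    if (this.status === "idle" || this.status === "cleaned-up") {
      this.status = "loading";
      this.startSync();
    }
    this.subscribers += 1;
    let active = true;
    return () => {
      if (!active) return; // unsubscribing twice is a no-op
      active = false;
      this.subscribers -= 1;
      if (this.subscribers === 0) {
        // No usage remains: schedule teardown after gcTime.
        this.cancelGc = this.schedule(() => this.teardown(), this.gcTime);
      }
    };
  }

  // Assumed to be called by the sync layer once the initial load completes.
  markReady(): void {
    if (this.status === "loading") this.status = "ready";
  }

  private teardown(): void {
    this.cancelGc = null;
    this.stopSync();
    this.status = "cleaned-up";
  }
}
```

Injecting the scheduler is a design choice made for this sketch so that GC behavior can be exercised without waiting for real timers. The sketch does not model preload(), cleanup(), or recovery from the "error" state.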
Acceptance Criteria:
- Querying or subscribing to an unused collection causes it to start (.status transitions to "loading" then "ready").
- When no usage remains, the collection is torn down after gcTime.
- After cleanup or GC, a new query causes reinitialization.
- Multiple concurrent preload() calls return the same in-progress promise.
- cleanup() removes the collection’s in-memory state and unsubscribes from sync.
- Preload does not block cleanup() but continues in background unless cancellable.
- .status reflects the current lifecycle state at all times.
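The preload and cleanup criteria above can be illustrated with a short sketch. It assumes a hypothetical PreloadableCollection class and a caller-supplied load function; neither is part of TanStack DB. A generation counter lets cleanup() proceed immediately while an in-flight load finishes in the background and has its result ignored. Retrying after a failed preload is only one of the options the PRD leaves open.

```typescript
// Hypothetical sketch: not the TanStack DB API.
type Status = "idle" | "loading" | "ready" | "error" | "cleaned-up";

class PreloadableCollection {
  status: Status = "idle";
  private inflight: Promise<void> | null = null;
  private generation = 0;
  private readonly load: () => Promise<void>;

  constructor(load: () => Promise<void>) {
    this.load = load; // assumed loader for the initial data
  }

  preload(): Promise<void> {
    // Concurrent callers share the same in-progress promise.
    if (this.inflight) return this.inflight;
    if (this.status === "ready") return Promise.resolve();

    const generation = ++this.generation;
    this.status = "loading";
    this.inflight = this.load().then(
      () => {
        // Ignore the result if cleanup() ran while the load was in flight.
        if (generation === this.generation) {
          this.status = "ready";
          this.inflight = null;
        }
      },
      (err: unknown) => {
        if (generation === this.generation) {
          this.status = "error";
          this.inflight = null; // assumption: the next preload() retries
        }
        throw err;
      },
    );
    return this.inflight;
  }

  async cleanup(): Promise<void> {
    // Bumping the generation invalidates any in-flight preload, so cleanup
    // never waits for it; the load itself keeps running in the background.
    this.generation += 1;
    this.inflight = null;
    this.status = "cleaned-up";
  }
}
```

Calling preload() after cleanup() starts a fresh load with a new promise, which matches the reinitialization criterion.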
Considerations:
- Should gcTime be reset on every access, or only when the collection transitions to “no longer used”?
- What if preload fails: do we memoize the error or try again on next call?
- Should cleanup() throw or warn if the collection is actively used?
- Do we expose GC-related events to devtools or keep this internal?
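To make the first question concrete, the two gcTime policies can be compared with a small pure function. This is an illustrative model only. The event kinds, the policy names, and the gcDeadline function are assumptions introduced here, and "access" is taken to mean a one-off read while no subscriber is attached.

```typescript
// Hypothetical model for comparing the two reset policies.
type GcPolicy = "reset-on-access" | "reset-on-unused";

interface LifecycleEvent {
  at: number; // timestamp in milliseconds
  kind: "used" | "unused" | "access";
}

// Returns the time at which the first GC would fire, or null if it never
// does. Events are assumed to be sorted by time.
function gcDeadline(
  events: LifecycleEvent[],
  gcTime: number,
  policy: GcPolicy,
): number | null {
  let deadline: number | null = null;
  for (const event of events) {
    // Once the deadline has passed, the collection is already collected.
    if (deadline !== null && event.at >= deadline) break;
    if (event.kind === "used") {
      deadline = null; // a subscriber attached: cancel the pending GC
    } else if (event.kind === "unused") {
      deadline = event.at + gcTime; // last subscriber left: start the timer
    } else if (policy === "reset-on-access" && deadline !== null) {
      deadline = event.at + gcTime; // one-off access extends the timer
    }
  }
  return deadline;
}
```

With a gcTime of 5 minutes, a collection that becomes unused at minute 0 and is read once at minute 4 would be collected at minute 9 under "reset-on-access" and at minute 5 under "reset-on-unused".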
User Research
While no formal interviews have been conducted, several recurring pain points have emerged through experience building and maintaining applications with TanStack DB:
- App startup cost due to static collection registration
  Developers often define all collections up front in a shared file (e.g. collections.ts). Without lazy initialization, every collection begins syncing or fetching immediately, even if the user never navigates to the parts of the app that use them.
- Memory and network overhead from inactive collections
  In real-world apps, many collections are only used temporarily (e.g. for a modal, onboarding flow, or admin page). Without lifecycle management, these collections remain resident indefinitely, using memory and sometimes maintaining sync connections.
- Complexity of manual lifecycle management in compositional setups
  When collections are composed (e.g. A depends on B, which depends on C), it becomes difficult to track which are still in use. Developers currently have no good way to coordinate or garbage-collect unused chains.
- Desire for minimal imperative control
  Developers building ephemeral flows often know when a collection is no longer needed and want to clean it up directly. While automation handles the general case, manual control via cleanup() is necessary in targeted flows.
These pain points suggest that collection lifecycle is a critical missing abstraction in real-world TanStack DB usage. Inspired by TanStack Query’s approach to query caching and GC, this PRD proposes a parallel design that aligns with actual developer needs.
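The PRD does not specify how dependent chains are tracked. One possible approach is reference counting, in which a collection holds a usage on each of its dependencies for as long as it is in use itself. The sketch below is illustrative only. The TrackedCollection class and its retain and release methods are hypothetical, and the gcTime delay is omitted to keep the cascade visible.

```typescript
// Hypothetical sketch: reference-counted dependency chains (A -> B -> C).
class TrackedCollection {
  readonly name: string;
  active = false;
  private users = 0;
  private readonly deps: TrackedCollection[];

  constructor(name: string, deps: TrackedCollection[] = []) {
    this.name = name;
    this.deps = deps;
  }

  // Called when a query, subscriber, or dependent collection starts using
  // this collection.
  retain(): void {
    if (this.users === 0) {
      // First user: start dependencies before activating this collection.
      for (const dep of this.deps) dep.retain();
      this.active = true;
    }
    this.users += 1;
  }

  // Called when a user goes away. When the last one leaves, this collection
  // deactivates and releases its dependencies, which may cascade.
  release(): void {
    if (this.users === 0) return; // nothing to release
    this.users -= 1;
    if (this.users === 0) {
      this.active = false;
      for (const dep of this.deps) dep.release();
    }
  }
}
```

In this model, releasing the only user of A also releases B, and C in turn unless something else still uses C directly, so no caller needs to know the full chain.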