Self-hostable, multi-layer semantic caching library for Node.js LLM apps. Runs two cache layers — an exact Redis match, then Qdrant semantic search — before falling through to your LLM, so repeated or similar queries skip the LLM call entirely. Caches embeddings to avoid re-embedding, returns a full latency breakdown and estimated cost saved, and works with any LLM provider. Named after déjà vu.
- TypeScript
- Node.js
- Redis
- Qdrant
- OpenAI Embeddings
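
The lookup flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the library's actual API: in-memory `Map`s and arrays stand in for Redis and Qdrant, and a toy deterministic `embed` function stands in for OpenAI embeddings, so every name here (`query`, `exactCache`, `semanticCache`, `threshold`) is hypothetical.

```typescript
// Two-layer lookup sketch: exact match first, then semantic search over
// cached embeddings, then the LLM as a fallback. In-memory structures
// stand in for Redis and Qdrant; all names are illustrative.

type CacheHit = { answer: string; layer: "exact" | "semantic" | "llm" };

const exactCache = new Map<string, string>();                  // Redis stand-in
const semanticCache: { vec: number[]; answer: string }[] = []; // Qdrant stand-in
const embeddingCache = new Map<string, number[]>();            // avoids re-embedding

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Toy deterministic "embedding" so the sketch runs without an API key.
function embed(text: string): number[] {
  const cached = embeddingCache.get(text);
  if (cached) return cached;
  const vec = new Array(8).fill(0);
  for (const word of text.toLowerCase().split(/\W+/)) {
    for (let i = 0; i < word.length; i++) vec[i % 8] += word.charCodeAt(i);
  }
  embeddingCache.set(text, vec);
  return vec;
}

async function query(
  prompt: string,
  callLLM: (p: string) => Promise<string>,
  threshold = 0.95,
): Promise<CacheHit> {
  // Layer 1: exact match (a Redis GET in the real thing).
  const exact = exactCache.get(prompt);
  if (exact !== undefined) return { answer: exact, layer: "exact" };

  // Layer 2: semantic search (a Qdrant nearest-neighbour query in the real thing).
  const vec = embed(prompt);
  let best: { score: number; answer: string } | null = null;
  for (const entry of semanticCache) {
    const score = cosine(vec, entry.vec);
    if (!best || score > best.score) best = { score, answer: entry.answer };
  }
  if (best && best.score >= threshold) return { answer: best.answer, layer: "semantic" };

  // Fallback: call the LLM, then populate both cache layers.
  const answer = await callLLM(prompt);
  exactCache.set(prompt, answer);
  semanticCache.push({ vec, answer });
  return { answer, layer: "llm" };
}

// Demo: the second identical query hits the exact layer with no LLM call.
async function main() {
  const llm = async (p: string) => `answer to: ${p}`;
  const first = await query("what is semantic caching?", llm);
  const second = await query("what is semantic caching?", llm);
  console.log(first.layer, second.layer); // → "llm exact"
}
main();
```

The real library would replace the stand-ins with a Redis client and a Qdrant collection, and attach timing data per layer to produce the latency breakdown and cost-saved estimate the description mentions.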

