LokiVector: An Embedded Document Vector DB with Crash-Tested Durability

1 points by rckflr 17 hours ago

Hey HN/Reddit/dev community,

I'm excited to open-source *LokiVector* - an embedded document database with vector search capabilities, built for modern AI applications.

### What Makes It Different

Most vector databases are either:

- Cloud-only services (expensive, vendor lock-in)
- Complex to deploy (require Kubernetes, lots of moving parts)
- Missing durability guarantees (what happens if it crashes?)

LokiVector solves this by being:

- *Embeddable* - Runs in Node.js or the browser, no external services
- *Crash-Safe* - Validated with automated E2E crash recovery tests
- *Simple* - JSON documents + vector search, no schema migrations
- *Fast* - In-memory performance with disk persistence

### The Durability Story

This is what I'm most proud of. We test crash recovery across:

- Documents and collections
- Vector indexes (HNSW)
- Partial writes and idempotency

All validated with 7 comprehensive E2E test scenarios. You can literally kill the process mid-write and it recovers correctly.
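For readers wondering what "kill the process mid-write and it recovers" involves in general, here is a minimal sketch of journal-based recovery. It is not LokiVector's actual implementation: the line format, the `checksum`, `appendEntry`, and `recover` helpers, and the `seq`/`op` fields are all hypothetical, chosen only to illustrate how a torn final write can be detected and how replay can be made idempotent.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const crypto = require('node:crypto');

// Hypothetical helper: short checksum used to detect torn (partial) writes.
function checksum(text) {
  return crypto.createHash('sha256').update(text).digest('hex').slice(0, 16);
}

// Append one operation as a single line: "<checksum> <json>\n".
function appendEntry(journalPath, entry) {
  const json = JSON.stringify(entry);
  const fd = fs.openSync(journalPath, 'a');
  try {
    fs.writeSync(fd, `${checksum(json)} ${json}\n`);
    fs.fsyncSync(fd); // flush to disk before the write is acknowledged
  } finally {
    fs.closeSync(fd);
  }
}

// Rebuild in-memory state by replaying the journal from the start.
function recover(journalPath) {
  const docs = new Map();
  let lastSeq = 0;
  if (!fs.existsSync(journalPath)) return { docs, lastSeq };
  for (const line of fs.readFileSync(journalPath, 'utf8').split('\n')) {
    const space = line.indexOf(' ');
    if (space < 0) continue; // empty line, or a fragment with no payload
    const sum = line.slice(0, space);
    const json = line.slice(space + 1);
    if (checksum(json) !== sum) break; // torn write: stop at the first corrupt entry
    const entry = JSON.parse(json);
    if (entry.seq <= lastSeq) continue; // already applied, so replay is idempotent
    if (entry.op === 'put') docs.set(entry.id, entry.doc);
    if (entry.op === 'delete') docs.delete(entry.id);
    lastSeq = entry.seq;
  }
  return { docs, lastSeq };
}

// Demo: two good writes, one duplicated retry, then a simulated crash mid-write.
const journalPath = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'journal-')), 'demo.journal');
appendEntry(journalPath, { seq: 1, op: 'put', id: 'a', doc: { title: 'first' } });
appendEntry(journalPath, { seq: 2, op: 'put', id: 'b', doc: { title: 'second' } });
appendEntry(journalPath, { seq: 2, op: 'put', id: 'b', doc: { title: 'second' } });
fs.appendFileSync(journalPath, 'deadbeef {"seq":3,"op":"put","id":"c","do');

const recovered = recover(journalPath);
console.log(recovered.lastSeq, [...recovered.docs.keys()]);
```

In this sketch the half-written entry for `c` fails its checksum and is discarded, and the duplicated entry with `seq: 2` is skipped, so recovery ends with documents `a` and `b` only. A real engine would also have to cover index rebuilds and journal compaction, which this example leaves out.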

*Note:* Replication recovery is tested in Commercial editions. The Community Edition focuses on core durability.

### What's Included

*Core:*

- Document store (JSON-like, flexible schema)
- Vector search (HNSW index)
- HTTP REST API + TCP server
- API key authentication
- Crash recovery & durability

*Pro/Enterprise (commercial):*

- Leader-follower replication
- Advanced caching
- Multi-tenant support
- SSO/SAML, RBAC
- 24/7 support

### Use Cases

- Semantic search in RAG applications
- Document similarity and clustering
- Recommendation systems
- Real-time analytics
- Embedded AI applications

### Tech Stack

- Node.js (works in browser too)
- HNSW algorithm for vector search
- Journal-based persistence
- Express.js for HTTP server

### Getting Started

```bash
npm install @lokivector/core
```

```javascript
const loki = require('@lokivector/core');
require('@lokivector/core/src/core/loki-vector-plugin');

const db = new loki('example.db', { autosave: true });
const items = db.addCollection('items', {
  vectorIndices: { embedding: { m: 16 } }
});

items.insert({ id: 1, embedding: [0.1, 0.2, 0.3] });
const results = items.findNearest('embedding', [0.1, 0.2, 0.3], 5);
```
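Since HNSW is an approximate index, it can help to compare its results against an exact search on a small dataset. The sketch below is a plain brute-force baseline and does not use LokiVector at all. `cosineSimilarity` and `bruteForceNearest` are hypothetical helper names, and cosine similarity is an assumption made for illustration; check the docs for the metric the index actually uses.

```javascript
// Cosine similarity between two equal-length numeric arrays.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact k-nearest-neighbor search: score every document, sort, keep the top k.
function bruteForceNearest(docs, field, query, k) {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(doc[field], query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const sampleDocs = [
  { id: 1, embedding: [0.1, 0.2, 0.3] },
  { id: 2, embedding: [0.3, 0.2, 0.1] },
  { id: 3, embedding: [-0.1, -0.2, -0.3] },
];

const nearest = bruteForceNearest(sampleDocs, 'embedding', [0.1, 0.2, 0.3], 2);
console.log(nearest.map((r) => r.doc.id));
```

Here document 1 points in the same direction as the query and document 3 in the opposite direction, so the top two results are documents 1 and 2. The scan is O(n) per query, which is fine for spot checks but is the cost an HNSW index is meant to avoid.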

### Documentation

- [Full Documentation](https://github.com/MauricioPerera/LOKIVECTOR)
- [Durability & Crash Recovery](https://github.com/MauricioPerera/LOKIVECTOR/blob/main/docs/DURABILITY.md)
- [Deployment Guide](https://github.com/MauricioPerera/LOKIVECTOR/blob/main/docs/DEPLOYMENT.md)
- [Editions Comparison](https://github.com/MauricioPerera/LOKIVECTOR/blob/main/EDITIONS.md)

### License

- Community Edition: MIT (free for any use)
- Pro/Enterprise: Commercial license available

### Why Open Source This?

I built this because I needed a crash-safe vector database that I could embed in applications without vendor lock-in. The durability testing was crucial - I've seen too many databases lose data on crashes.

I'm open-sourcing the core because:

1. I believe in open source
2. I want community feedback
3. Commercial features (replication, multi-tenant) fund development

### What's Next

- More vector distance metrics
- Graph database capabilities (in progress)
- Performance optimizations
- Community feedback and contributions

### Try It Out

```bash
git clone https://github.com/MauricioPerera/LOKIVECTOR.git
cd LOKIVECTOR
npm install
npm test
node server/core/index.js
```

I'd love to hear your feedback, use cases, and contributions!

- *GitHub:* https://github.com/MauricioPerera/LOKIVECTOR
- *Docs:* https://github.com/MauricioPerera/LOKIVECTOR/tree/main/docs
- *Issues:* https://github.com/MauricioPerera/LOKIVECTOR/issues