Show HN: OUT, an AI chat assistant for our used Tesla marketplace

onlyusedtesla.ai

1 point by adamqureshi 17 hours ago

What it is

We run a niche marketplace called Only Used Tesla. Answering dozens of buyer questions each day (“Does this VIN have FSD?”, “What’s the battery health?”) was eating our time, so we built OUT (Only-Used-Tesla AI). It’s a chat interface that:

searches our live inventory in real time and surfaces matching cars

decodes the VIN on the fly to answer spec questions (rough sketch of the decode step just after this list)

links straight to the listing so you can keep the convo flowing while you browse

handles follow-ups like range estimates, tax-credit rules, etc.
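
For the curious, the decode step is mostly table lookups: position 10 of a standard 17-character VIN encodes the model year, and Tesla puts the model letter in position 4. This is a simplified TypeScript sketch with a partial year table, not our production decoder (which also handles plant codes, drivetrain, etc.):

```ts
// Simplified, hypothetical sketch of the VIN decode step.
// Field positions follow the standard 17-char VIN layout plus Tesla's
// convention of putting the model letter in position 4.
const MODEL_BY_CODE: Record<string, string> = {
  S: 'Model S',
  '3': 'Model 3',
  X: 'Model X',
  Y: 'Model Y',
};

// Partial table of model-year codes (VIN position 10).
const YEAR_BY_CODE: Record<string, number> = {
  L: 2020, M: 2021, N: 2022, P: 2023, R: 2024, S: 2025,
};

export function decodeTeslaVin(rawVin: string) {
  const vin = rawVin.trim().toUpperCase();
  if (vin.length !== 17) throw new Error('Expected a 17-character VIN');
  return {
    wmi: vin.slice(0, 3),                 // e.g. 5YJ (Fremont), 7SA (Austin)
    model: MODEL_BY_CODE[vin[3]] ?? 'Unknown',
    year: YEAR_BY_CODE[vin[9]] ?? null,   // position 10 is the model-year code
  };
}
```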

How it works

Stack: Next.js 14 on Vercel, with the Vercel AI SDK for the streamed chat UI

Model: GPT-4o via the OpenAI API (we keep each prompt to roughly an 8k-token window)
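
The chat endpoint itself is close to the stock Vercel AI SDK pattern. A simplified sketch (helper names like toDataStreamResponse vary slightly between SDK versions, and retrieveListings is a placeholder for the retrieval step described below):

```ts
// app/api/chat/route.ts — simplified sketch of the streamed chat endpoint.
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Placeholder: fetch the top-k matching listings for the latest user
  // message and fold them into the prompt (see the retrieval sketch below).
  // const listings = await retrieveListings(messages.at(-1).content);

  const result = await streamText({
    model: openai('gpt-4o'),
    system:
      'You are OUT, the Only Used Tesla assistant. ' +
      'Answer only from the listing data you are given and link to listings by URL.',
    messages,
  });

  // Stream tokens back to the chat UI as they are generated.
  return result.toDataStreamResponse();
}
```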

Retrieval

car listings + VIN metadata sit in Postgres → pgvector

embeddings with ada-002 + hybrid SQL expansion to improve recall (rough sketch below)

Prompt: system + user query + top-k listings (JSON) → GPT-4o → markdown → user
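
In sketch form, with illustrative table and column names rather than our real schema, the retrieval + prompt-assembly step looks roughly like this:

```ts
// Hybrid retrieval sketch: embed the query, then combine pgvector similarity
// with plain SQL conditions before handing the top-k rows to the model as JSON.
// Table/column names here are illustrative only.
import OpenAI from 'openai';
import { Pool } from 'pg';

const oai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

export async function retrieveListings(query: string, k = 5) {
  // Embed the user query with the same model used to embed the listings.
  const { data } = await oai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: query,
  });
  const embedding = `[${data[0].embedding.join(',')}]`;

  // Vector similarity narrowed by ordinary SQL conditions (only "available"
  // here; the real query also expands/filters on model, year, price, etc.).
  const { rows } = await pool.query(
    `SELECT id, vin, model, year, price, url, summary
       FROM listings
      WHERE status = 'available'
      ORDER BY embedding <=> $1::vector
      LIMIT $2`,
    [embedding, k],
  );
  return rows;
}

// The prompt is then just: system instructions + the user question + the
// top-k listings serialized as JSON for GPT-4o to answer from.
export function buildContext(listings: unknown[]) {
  return `Relevant listings:\n${JSON.stringify(listings, null, 2)}`;
}
```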

Cost control: we truncate fields, switch to GPT-3.5-turbo for lightweight queries, and cache embeddings keyed on a checksum of each record (sketch below)
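
The embedding cache is the simplest piece: hash the listing record and re-embed only when the hash changes. A rough sketch, with hypothetical column names:

```ts
// Embedding cache sketch: re-embed a listing only when its content checksum
// changes. Column names (embedding, embedding_checksum) are hypothetical.
import { createHash } from 'node:crypto';
import OpenAI from 'openai';
import { Pool } from 'pg';

const oai = new OpenAI();
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

type Listing = { id: string; model?: string; year?: number; summary?: string };

// Stable-ish hash of the fields that feed the embedding text.
const checksum = (l: Listing) =>
  createHash('sha256').update(JSON.stringify(l)).digest('hex');

export async function maybeReembed(listing: Listing) {
  const sum = checksum(listing);
  const { rows } = await pool.query(
    'SELECT embedding_checksum FROM listings WHERE id = $1',
    [listing.id],
  );
  if (rows[0]?.embedding_checksum === sum) return; // unchanged: skip the API call

  const text = `${listing.year ?? ''} ${listing.model ?? ''} ${listing.summary ?? ''}`.trim();
  const { data } = await oai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: text,
  });
  await pool.query(
    'UPDATE listings SET embedding = $1::vector, embedding_checksum = $2 WHERE id = $3',
    [`[${data[0].embedding.join(',')}]`, sum, listing.id],
  );
}
```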

The repo is private for now, but we’ll open-source the retrieval layer once it stabilises.

Why post this on HN?

We’ve built marketplaces before, but this is our first serious RAG+LLM production feature. We want brutally honest feedback on:

Search relevance – do the returned cars make sense?

Latency – time to first token and to the full answer both feel OK to us, but we’re biased.

Prompt design / hallucinations – what edge cases break it?

On-ramp – is “no signup → chat” the right balance, or should we expose more filters first?

Tear it apart; the more it hurts, the better the next iteration will be.

Appreciate any feedback: technical, product, or brutal nit-picks. Thanks! contact@onlyusedtesla.com (Adam Qureshi, Han SOLO founder). I used o3 to help write this, so please forgive me :-)