Vector search over archival prose: what actually worked
Our journey through pgvector, embedding choices, and the strange semantic texture of 19th-century correspondence.
Long-form pieces from the team: the engineering behind Archively.AI, and our view on where archival practice is going.
How a specialist library moved from a CSV catalog to a published finding aid, without hiring a developer.
Thoughts on encoded archival description in 2026: still necessary, still ugly, still the only way to interop across national aggregators.
The difference between 'good enough for search' and 'good enough to cite' is visible at the word level. A tour of our transcription pipeline.

Fonds, series, items — and the many-to-many headaches that emerge when you try to store provenance faithfully.
How we separate AI output, curator review, and published snapshots — and why collapsing those layers is where most tools go wrong.
Every catalog tool claims to be 'ISAD(G) compliant'. Here's what that actually means in our data model — and what we refused to compromise.