A complete platform, not a collection of add-ons.
Archively.AI bundles cataloging, digital asset management, AI description, and publishing into one product — without the legacy baggage.
AI that drafts, never decides.
Every automated result is a proposal the curator can accept, edit, or reject — with full change history preserved across the AI, review, and published layers.
OCR for images and PDFs
Layout-aware OCR falls back to Amazon Textract for scanned pages, with word-level confidence and in-place correction.
Audio & video transcription
gpt-4o-transcribe with word timings and sentence segmentation. Edit captions in a dedicated proofreader view.
Entity extraction
Named people, organizations, places, and dates are extracted, linked to authority records, and ready for review.
Summaries & tags
Draft abstracts and suggested tags generated per item, with full track-changes over every AI suggestion.
Structure detection
Chapters, sections, and topics with page ranges for long documents — the bones of a finding aid, auto-drafted.
Custom prompts
Run your own prompts over OCR text for specialised metadata (provenance notes, condition, subject classification).
Seven modules, all standards-aligned.
Every module is tenant-scoped, track-changes enabled, and exposes the same high-quality search and detail UX.
Items
ISAD(G)-shaped records with title, dates, extent, scope, and custodial history.
Fonds
Hierarchical arrangement with series and sub-series, ready for EAD export.
People
Authority records for creators, subjects, and contributors.
Organizations
Corporate bodies with relationships and institutional history.
Accessions
Acquisition records from receipt through processing.
Subjects
Topical access with thesaurus-aware subject headings.
Events
Contextual events, exhibitions, and timelines.
The working surface for every record.
AI-drafted fields sit alongside the source asset. A persistent track-changes bar follows unsaved edits across tabs, and publishing locks an immutable snapshot for citation.
Four steps, one audit trail.
Upload, describe, review, publish. Every transition is logged with agent, timestamp, and before/after metadata.
One file detail. Six AI extractions.
Images, PDFs, audio, video, Office docs, and data — each gets the right viewer and the right pipeline. Caption, keywords, transcript, entities, locations, and sentiment land as a dual AI / Reviewed pair, with per-field accept / reject.
See how your files relate to each other.
Switch the Media Library from grid to network and the files arrange themselves into clusters — same survey, same photographer, same recurring subject. Click any file to recentre the graph; hit Find similar to open a star-graph of its closest matches.
Crop, deskew, caption — all in the catalog.
Six preset crop bounds visible at once, slider-driven adjustments, before / after compare, AI object detection, and an AI archival caption — without round-tripping to Photoshop. Export PNG, JPEG, or WebP.
Every record gets a hero image.
Most archives inherit boxes of paper records with no photograph for the catalog. Generate a documentary-style hero image from the record's own metadata — title, dates, scope, provenance. Curators steer the look by picking from a tenant-curated theme vocabulary; the result lands in the Media Library tagged, ready to refine or replace.
Plain-language search, across every module.
A floating prompt bubble lets curators ask the catalog in their own words. Items I edited this week. Fonds from the 1920s. Drafts ready for review. Each result jumps straight to the matching record. Recent prompts are remembered so routine queries are one click away.
Find what you mean, not just what you typed.
Every catalog field is indexed for fast keyword search across titles, descriptions, transcripts, OCR text, and notes. On top of that, AI embeddings let curators surface related records even when the wording is different: search for "weavers' strike" and the system also returns the photographs of the picket line and the labour-union correspondence that never mentioned the phrase.
- Keyword + prefix search across every module and field
- "Find related items" surfaces what shares meaning, not just words
- "Find similar files" clusters look-alike photographs and documents
Find related to
Photograph — Weavers' Picket Line, Bradford 1894
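As a rough illustration of how "find related" can rank by meaning rather than wording, the sketch below orders records by cosine similarity between embedding vectors. The three-dimensional vectors, record titles, and function names are made up for the example; how Archively.AI actually computes or stores embeddings is not described here.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_related(query_vec: list[float],
                 records: list[tuple[str, list[float]]],
                 top_k: int = 2) -> list[str]:
    """Return the titles of the top_k records most similar to the query vector."""
    ranked = sorted(records, key=lambda rec: cosine(query_vec, rec[1]), reverse=True)
    return [title for title, _ in ranked[:top_k]]

# Toy embeddings (hypothetical values): the first axis loosely stands for "labour dispute".
records = [
    ("Picket line photograph", [0.9, 0.1, 0.0]),
    ("Labour-union correspondence", [0.8, 0.3, 0.1]),
    ("Harvest ledger", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.2, 0.0]  # stands in for the embedding of "weavers' strike"
print(find_related(query, records))
```

Neither of the two top-ranked records needs to contain the search phrase; they rank highly only because their vectors point in a similar direction to the query.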
The blog your archive always wanted.
A Notion-style block editor with nine entity-aware blocks — heading, paragraph, quote, image, item card, entity series, external link, divider, and 2-4 column rows. Track changes included. Publish freezes a snapshot to a citable URL.
One deployment. A branded portal per tenant.
Each tenant lives at its own subdomain with a templated hero, slots for logo, dark logo, favicon, and hero image, accent and secondary colour pickers, and four feature flags. Public search across items, collections, and stories.
Four editions. Live usage. Soft landings.
Free, Standard, Pro, and Enterprise — each with metered storage, items, files, users, and AI credits. AI credits are billed per successful processing run. The platform shows usage live, fires notifications at 80 / 95 / 100%, and offers a one-click in-app upgrade. A five-state lifecycle guarantees reactivation is always possible.
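To make the 80 / 95 / 100% notifications concrete, here is a minimal sketch that works out which thresholds a usage update newly crosses. The helper name and the choice to fire each threshold only once, when it is first crossed, are assumptions for illustration and not a description of the platform's metering code.

```python
THRESHOLDS = (80, 95, 100)  # percentages of quota named in the text above

def newly_crossed(previous_used: float, current_used: float, quota: float) -> list[int]:
    """Return the thresholds crossed between two usage readings (hypothetical helper)."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    before = previous_used * 100 / quota
    after = current_used * 100 / quota
    # A threshold fires only when usage moves from below it to at or above it.
    return [t for t in THRESHOLDS if before < t <= after]

print(newly_crossed(70, 96, 100))  # [80, 95]
```

Comparing the previous and current readings, rather than only the current one, avoids re-sending the same alert on every later update.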
Your schema. No fork required.
Define your own fields on Items, Files, People, Organizations, Accessions, Fonds, Subjects, and Events. Nine field types — text, textarea, rich text, number, boolean, date, select, multi-select, multi-tag — with track-changes and publish-snapshot resolution included.
Northbridge Photographs · Founders Letters
84 items edited · 12 fonds described · 30 transcripts reviewed
A workspace each curator carries with them.
Pinned favourites, recently viewed files, contributions history, and personal preferences travel with the account — not the browser. The security basics — verified email, two-factor, sign out everywhere — sit on the same page, ready whenever they're needed.
We don't just store your files. We check on them.
The first time a file lands in your archive we compute a SHA-256 baseline. After that, a recurring worker quietly re-verifies every file on a monthly cadence and writes an append-only event log of every check. The moment a stored file diverges from its baseline, your tenant admins get an in-app and email alert — long before it surfaces as a broken download.
- SHA-256 baseline on every stored object
- Recurring re-verification — 6-hour worker cycle, 30-day window per file
- "Verify integrity now" button on any file detail page
- Append-only preservation event log — pass, fail, or skipped per check
- Notifications to tenant admins on the first sign of corruption
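The baseline-and-re-verify loop described above can be sketched with Python's standard hashlib. The function names, the in-memory list standing in for the append-only log, and the rule that an unreadable file is logged as "skipped" are all assumptions for illustration, not the platform's actual worker.

```python
import hashlib
from datetime import datetime, timezone

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large assets never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path: str, baseline: str, event_log: list[dict]) -> str:
    """Compare a file against its baseline and append one event to the log."""
    try:
        outcome = "pass" if sha256_of(path) == baseline else "fail"
    except OSError:
        # Assumption: a file that cannot be read right now is recorded as skipped.
        outcome = "skipped"
    event_log.append({
        "path": path,
        "outcome": outcome,
        "checked_at": datetime.now(timezone.utc).isoformat(),
    })
    return outcome
```

The log is only ever appended to, so a later "pass" never erases an earlier "fail"; an alert would be raised whenever `verify` returns "fail".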
Ask the archive
Which donors gave the most material in the 1920s?
The largest 1920s donations came from the Hartwell family and the Roper estate [1][2], with subsequent transfers from the Bradford Mill Workers' Union recorded in 1928 [3].
Talk to your archive. Every claim cited.
A retrieval-augmented chat that only reads from your own holdings. Each answer carries inline [N] chips that link straight to the cited item, file, or story. No training-data drift: when a tenant's archive is empty, the model is held back entirely until there's real content to retrieve from.
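One way to picture the [N] chips is as numbered pointers into the list of records retrieved for that answer. The sketch below pulls the markers out of an answer and resolves them to sources; the function name, the sample answer, and the source labels are hypothetical, and the real chat pipeline may work quite differently.

```python
import re

def resolve_citations(answer: str, sources: list[str]) -> list[str]:
    """Map inline [N] markers to sources (1-based), in order of first appearance."""
    cited: list[str] = []
    for match in re.findall(r"\[(\d+)\]", answer):
        index = int(match)
        # Ignore markers that do not point at a retrieved source.
        if 1 <= index <= len(sources) and sources[index - 1] not in cited:
            cited.append(sources[index - 1])
    return cited

answer = "The largest donations came from two families [1][2], with a later transfer in 1928 [3]."
sources = ["item:accession-register", "file:donor-ledger", "story:union-transfer"]
print(resolve_citations(answer, sources))
```

Dropping out-of-range markers means a chip is only ever shown for something that was actually retrieved from the tenant's holdings.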
The things that matter, in the bell.
A real notification feed for the events curators care about: an ingestion run finished, a fixity check flagged a file, someone mentioned you on an item. Delivered in-app and by email, with each channel controllable per user.
- Categories curators actually care about — imports finishing or failing, fixity flagging a file, mentions in comments, account-deletion confirmations
- Per-user × per-channel preferences (in-app + email independently)
- Email delivery via Resend — instant, deliverable, no SMTP plumbing to operate
Plate XII — Hayes survey · 12 min ago
northbridge-archive/2024-accession · 1,218 of 1,224 rows · 1h ago
Item — "Letter to Henrietta, 1986" · 3h ago
S3 connector. AI-drafted mapping. Manifest-aware.
Point at any S3 bucket with per-source credentials, run a sample scan, and get an AI-drafted mapping plan you can edit before committing. Drop a metadata manifest in the bucket as a sidecar — or upload one directly — and every file row inherits the matching manifest row.
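The sidecar idea can be sketched as a simple join between object keys and manifest rows. The example assumes a CSV manifest with a `filename` column matched against the last path segment of each key; the manifest format, column name, and function name are assumptions, since the actual matching rules are not spelled out here.

```python
import csv
import io

def apply_manifest(file_keys: list[str], manifest_csv: str,
                   key_column: str = "filename") -> dict[str, dict]:
    """Attach the matching manifest row to each file key; unmatched files get {}."""
    rows = {row[key_column]: row for row in csv.DictReader(io.StringIO(manifest_csv))}
    # Assumption: the manifest refers to files by base name, not by full object key.
    return {key: rows.get(key.rsplit("/", 1)[-1], {}) for key in file_keys}

manifest = "filename,title\nplate-12.tif,Plate XII\n"
result = apply_manifest(["scans/plate-12.tif", "scans/unlisted.tif"], manifest)
print(result["scans/plate-12.tif"]["title"])  # Plate XII
```

Files with no manifest row come back with empty metadata rather than being dropped, so they still show up for manual description.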
Boring infrastructure, done right.
The parts of archive software you notice only when they break.
White-label per institution
Each archive runs under its own brand at its own subdomain, with logo, colours, hero, access policies, and storage all separated cleanly.
Role-based permissions
Roles, permissions, and approval workflow are first-class — not bolted on.
Search that finds, not just filters
Fast keyword search across every word in every record, plus AI-powered "find related items" that surfaces what shares meaning — not just words.
Track changes everywhere
Pristine vs dirty vs published on all editable records, with inline diffs and a floating save bar.
Bulk actions
Select hundreds of items at once and re-assign workflow stage, move collection, change status, or remove — with a single confirmation.
Background processing
Long-running AI work runs as sequential jobs so curators never wait on uploads.
Import pipeline
Bring in existing catalogs via configurable field-mapping sources.
Want to see the whole list?
The full feature breakdown — plus roadmap and what's coming next — lives in our product docs.