LibriScan
Client project: Harvard Graduate Capstone (Amesbury Library Consortium)
Technical Deep Dive
Led stakeholder-driven system design and shipped a human-in-the-loop digitization platform with Django + Huey orchestration, role-based access control (RBAC), and immutable provenance.
Client Context
Amesbury Library needed an affordable workflow to digitize archaic handwritten manuscripts without sacrificing archival correctness.
Execution
Designed the Upload -> Textract -> Review -> Approval state transitions. Implemented the review tooling in Django + HTMX, with OpenSeadragon for manuscript inspection. Tracked every word edit with reviewer attribution in an immutable history.
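The workflow and edit tracking described above can be sketched in plain Python. This is a minimal illustration, not the project's Django code: the names `Stage`, `ALLOWED`, `WordEdit`, and `Page` are hypothetical, the transitions are assumed to be strictly forward (the source does not say whether a page can be sent back from Approval to Review), and restricting edits to the Review stage is likewise an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Stage(Enum):
    UPLOAD = "upload"
    TEXTRACT = "textract"
    REVIEW = "review"
    APPROVAL = "approval"


# Assumed forward-only transitions; anything not listed is rejected.
ALLOWED = {
    Stage.UPLOAD: {Stage.TEXTRACT},
    Stage.TEXTRACT: {Stage.REVIEW},
    Stage.REVIEW: {Stage.APPROVAL},
    Stage.APPROVAL: set(),
}


@dataclass(frozen=True)
class WordEdit:
    """One reviewer correction; frozen so a recorded edit cannot be altered."""
    word_index: int
    old_text: str
    new_text: str
    reviewer: str
    edited_at: datetime


@dataclass
class Page:
    words: list[str]
    stage: Stage = Stage.UPLOAD
    _history: list[WordEdit] = field(default_factory=list, repr=False)

    def advance(self, target: Stage) -> None:
        """Move to the next stage, refusing any transition not in ALLOWED."""
        if target not in ALLOWED[self.stage]:
            raise ValueError(
                f"illegal transition {self.stage.name} -> {target.name}"
            )
        self.stage = target

    def edit_word(self, index: int, new_text: str, reviewer: str) -> WordEdit:
        """Apply a correction and append it to the history (never rewritten)."""
        if self.stage is not Stage.REVIEW:
            raise ValueError("words can only be edited during review")
        edit = WordEdit(
            word_index=index,
            old_text=self.words[index],
            new_text=new_text,
            reviewer=reviewer,
            edited_at=datetime.now(timezone.utc),
        )
        self.words[index] = new_text
        self._history.append(edit)
        return edit

    @property
    def history(self) -> tuple[WordEdit, ...]:
        """Read-only snapshot of every edit, oldest first."""
        return tuple(self._history)


page = Page(words=["Ye", "olde", "towne"])
page.advance(Stage.TEXTRACT)
page.advance(Stage.REVIEW)
page.edit_word(1, "old", reviewer="reviewer_a")
page.advance(Stage.APPROVAL)
```

Keeping the transition table as data makes the permitted paths easy to audit, and recording the old text alongside the new text means each entry in the history can be read on its own without replaying earlier edits.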
Outcome
Reduced manual transcription effort by roughly 90% while sustaining a baseline OCR accuracy of around 85%, backed by governed human review and audit-ready provenance.
“Reduced manual transcription by ~90% with AWS Textract and achieved ~85% accuracy via human-in-the-loop review.”
Core Stack
Metrics
Reduction: 90%
Accuracy: 85%
Workflow: Human-in-the-loop