
KitabiAI transforms Arabic and English PDF books into structured, searchable digital formats, including HTML, Markdown, and JSONL, with auto-generated tables of contents. Built on a hybrid ML pipeline that combines Azure Document Intelligence with custom NLP, it achieves 100% language-routing accuracy and an 82.5% TOC F1 score across a corpus of 4,185+ pages. The underlying research has been accepted for presentation at OSACT (LREC 2026) in Geneva, Switzerland.
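To make "language routing" concrete, here is a minimal sketch of how a page's text might be routed to an Arabic or English processing path. It is not KitabiAI's actual method, which relies on Azure Document Intelligence and custom NLP. The function names `arabic_ratio` and `route_language`, and the 0.5 threshold, are assumptions chosen purely for illustration.

```python
import unicodedata


def arabic_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that are Arabic-script letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    # Unicode character names for Arabic-script letters contain "ARABIC".
    arabic = sum(1 for ch in letters if "ARABIC" in unicodedata.name(ch, ""))
    return arabic / len(letters)


def route_language(text: str, threshold: float = 0.5) -> str:
    """Hypothetical router: 'ar' when Arabic letters dominate, otherwise 'en'.

    The threshold is an illustrative assumption, not a value from KitabiAI.
    """
    return "ar" if arabic_ratio(text) >= threshold else "en"


if __name__ == "__main__":
    for sample in ["كتاب عن التاريخ", "A book about history"]:
        print(route_language(sample))
```

A character-ratio heuristic like this is cheap and works on clean text, but a production pipeline would also need to handle mixed-language pages and OCR noise.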