Text Recognition Introduction

This workshop introduces researchers to Optical Character Recognition (OCR). By the end of the lesson, you will understand how OCR software functions and how to improve OCR quality.

Prerequisites

For this workshop, we will use two software applications: Adobe Acrobat and ABBYY FineReader 14.
Adobe Acrobat is available to all current Yale students, faculty, and staff through Adobe Creative Cloud and can be downloaded from Yale’s Software Library. ABBYY FineReader is installed on select desktop machines across YUL, including 8 machines in the DHLab. Yale does not provide individual copies of ABBYY FineReader, but anyone can download a 30-day free trial from ABBYY.

Schedule

       Setup                          Download files required for the lesson
00:00  1. Introduction to OCR         What are PDFs? What is OCR?
00:10  2. OCR with Adobe Acrobat      How do I OCR a PDF using Acrobat?
00:30  3. OCR with ABBYY FineReader   How is ABBYY different from Acrobat?
                                      How can you enhance PDFs for OCR?
                                      How can you preview, edit, and export your OCR?
01:05  4. ABBYY OCR Editor            How do I OCR PDFs with tables?
                                      How do I change the language setting in ABBYY?
01:40  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.