The PDF OCR Reader node extracts text from scanned or image-based PDF documents using AI vision models. It supports advanced handwriting recognition, multilingual input, and structured output for use in automation flows.
Purpose: Convert non-text PDFs into structured, readable text
Usage: Document digitization, handwriting recognition, form extraction
Model Requirement: An AI vision model must be selected for processing
File Name
Upload or select a PDF from your MergePoint storage
Use Link
Read from a publicly accessible PDF URL
Specify Pages
Limit processing to specific page numbers or ranges (e.g., 1-3, 5, 7-10)
Split PDF Content by Page
When enabled, returns a list in which each item contains the text extracted from one page
Useful for workflows that need per-page parsing
Image Model
Choose the AI model to use for OCR
(see Available Models below)
Temperature
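Controls how much variation the model allows when transcribing (range: 0–1). Lower values keep the output closer to the literal text on the page; higher values allow looser interpretation of ambiguous or hard-to-read content.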
Cache Response
Enable to store results and avoid reprocessing the same file
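As an illustration of the page specification format accepted by Specify Pages (e.g., 1-3, 5, 7-10), the sketch below expands such a string into individual page numbers. It is not the node's own parser: the function name parse_page_spec is hypothetical, and the sketch assumes pages are numbered from 1 and that ranges are inclusive. It can be handy for validating a page specification in a previous node before passing it on.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a page specification such as "1-3, 5, 7-10" into sorted page numbers.

    Assumptions (not confirmed by the node documentation): pages are 1-based
    and ranges include both endpoints.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray or trailing commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


print(parse_page_spec("1-3, 5, 7-10"))  # [1, 2, 3, 5, 7, 8, 9, 10]
```

Duplicate and overlapping entries collapse into a single page number, so "2, 2, 1" yields [1, 2].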
Enable dynamic control from previous nodes:
Temperature
PDF Contents
Returns either:
A single string (full document)
A list of page-level strings (if "Split PDF Content by Page" is enabled)
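Because PDF Contents can be either a single string or a list of page strings, downstream code that consumes it may need to handle both shapes. The following is a minimal sketch of one way to normalize the value; the helper names as_pages and as_document are hypothetical, and the blank-line separator used when joining pages is an assumption, not the node's behavior.

```python
from typing import List, Union

PdfContents = Union[str, List[str]]


def as_pages(pdf_contents: PdfContents) -> List[str]:
    """Return the output as a list of strings, whichever mode produced it."""
    if isinstance(pdf_contents, str):
        # Split PDF Content by Page was off: treat the document as one item.
        return [pdf_contents]
    return list(pdf_contents)


def as_document(pdf_contents: PdfContents, separator: str = "\n\n") -> str:
    """Return the output as one string, joining pages if it arrived as a list."""
    if isinstance(pdf_contents, str):
        return pdf_contents
    return separator.join(pdf_contents)


# Example with made-up page text:
pages = as_pages(["Page one text", "Page two text"])
print(len(pages))                 # 2
print(as_pages("Whole document")) # ['Whole document']
```

Normalizing at the start of a downstream step means the rest of the flow does not need to know whether splitting was enabled.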
This node can:
Perform OCR on image-based PDFs
Read printed text and handwriting
Handle multi-page documents
Detect content in various layouts and languages
Accept file uploads or public URLs
Input: contracts_scanned.pdf
Setup: Advanced model, all pages
Output: Searchable legal text
Use Case: Document archiving and search
Input: expenses_receipts.pdf
Setup: Split PDF Content by Page enabled, mid-range temperature
Output: Line items per page
Use Case: Automated expense reporting
Input: meeting_notes_handwritten.pdf
Setup: Claude Sonnet or GPT-4.1 Vision
Output: Parsed, readable notes
Use Case: Digitizing notebooks and whiteboard scans