Document Read Agent

The Document Read Agent uses an LLM to extract information from documents (PDF, DOCX, TXT, PNG, JPG, Google Docs) and outputs the result as text (TXT).

API KEY

Enter an OpenAI or Google API key. The %VARIABLE% format is supported.

OpenAI API Key: Refer to https://platform.openai.com/api-keys
Google API Key: Refer to https://ai.google.dev/gemini-api/docs/api-key
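
As an illustration of the %VARIABLE% format, the sketch below shows one way a placeholder could be resolved into a real value before the key is used. The function name resolve_variables, the variable name OPENAI_KEY, and the lookup-table approach are all assumptions made for this example; the agent's actual substitution logic is not described in this document.

```python
import re

# Hypothetical resolver: the platform's real substitution logic is not documented here.
# A placeholder is assumed to be a name wrapped in percent signs, e.g. %OPENAI_KEY%.
_PLACEHOLDER = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def resolve_variables(text: str, variables: dict) -> str:
    """Replace each %NAME% placeholder in `text` with its value from `variables`."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            # Fail loudly rather than sending an unresolved placeholder as a key.
            raise KeyError(f"undefined variable: {name}")
        return variables[name]

    return _PLACEHOLDER.sub(substitute, text)


# Example: the API KEY field holds a placeholder instead of the raw key.
api_key = resolve_variables("%OPENAI_KEY%", {"OPENAI_KEY": "sk-example"})
```

Storing the key in a variable and referencing it by name keeps the raw key out of the agent configuration itself.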

MODEL

The following model types are currently supported:

Platform: Model
OpenAI: o3, gpt-4.1, gpt-4.1-mini
Google: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
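
Because the API key must match the platform of the chosen model, it can help to think of the supported models as a lookup table. The following is a minimal sketch, not part of the agent itself; the names SUPPORTED_MODELS and platform_for are hypothetical, and the model list simply mirrors the one given in this section.

```python
# Supported models as listed in this section, grouped by platform.
SUPPORTED_MODELS = {
    "OpenAI": ("o3", "gpt-4.1", "gpt-4.1-mini"),
    "Google": ("gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.0-flash"),
}


def platform_for(model: str) -> str:
    """Return the platform whose API key is needed for `model` (hypothetical helper)."""
    for platform, models in SUPPORTED_MODELS.items():
        if model in models:
            return platform
    raise ValueError(f"unsupported model: {model}")


# Example: gpt-4.1-mini requires an OpenAI API key.
required_platform = platform_for("gpt-4.1-mini")
```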

Generally, the models listed above are billed based on usage. For billing details, refer to:
OpenAI: https://openai.com/pricing
Google Gemini: https://ai.google.dev/gemini-api/docs/pricing

PDF/DOCX/TXT/PNG/GOOGLEDOC

Click the "PICK" button or use the %FILENAME% variable to select the document file.

Supported formats: PDF, DOCX, TXT, PNG, JPG, Google Docs
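
When the file name comes from a variable such as %FILENAME%, it may be worth checking the extension before the agent runs. The sketch below is one possible pre-check under stated assumptions: the helper is_supported_file is hypothetical, only the extensions named above are accepted, and Google Docs are left out because they are assumed to be referenced differently from local files.

```python
from pathlib import Path

# File extensions taken from the supported formats listed above.
# Google Docs are not included: they are assumed not to be local files with an extension.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".png", ".jpg"}


def is_supported_file(filename: str) -> bool:
    """Return True if the file's extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


# Example: a scanned invoice is accepted, a spreadsheet is not.
invoice_ok = is_supported_file("invoice_2024.PDF")
sheet_ok = is_supported_file("budget.xlsx")
```

The comparison is case-insensitive, so invoice_2024.PDF and invoice_2024.pdf are treated the same way.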

ADD PROMPT

Add a natural language prompt to guide the model.

FILENAME: Name of the output file. The output will be saved as a .txt file.
PROMPT: Natural language instruction.

Example

Goal: Extract information from the document (PDF) file below

Use the natural language prompt to guide the model

Result: The output will be a text object.

