# File download

File Download lets your AI workers fetch files from the internet and read their contents. Toolhouse adds this capability automatically when your worker needs it, or you can add it manually.

### How File Download works

When your worker encounters a URL pointing to a file — whether you provided it directly or the worker discovered it through Web Search — it hands that URL to the File Download integration. Toolhouse fetches the file on the worker's behalf and makes its contents available for the worker to read and act on.
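
As a mental model only, the fetch step resembles the sketch below. It is not Toolhouse's implementation: the function name `download_as_text`, the timeout, and the UTF-8 fallback are all assumptions made for illustration.

```python
import urllib.request


def download_as_text(url: str, timeout: float = 10.0) -> str:
    """Fetch `url` and decode the body as text on a best-effort basis.

    Hypothetical helper: it only illustrates "fetch, then hand text to the worker".
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in the response headers, else assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    # errors="replace" substitutes undecodable bytes instead of raising.
    return raw.decode(charset, errors="replace")
```

Calling `download_as_text("https://example.com/data.csv")` would return the body as a string that a worker could then read; the URL here is a placeholder.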

This makes it possible to build workers that go beyond reading web pages. For example, you can create a research worker that finds a report published as HTML or CSV, downloads it, extracts the key figures, and summarizes them in a Slack message. If the report is a PDF, use Document Parser instead (see "File format and readability").

**Example prompt for your worker:**

> "Download the Q3 earnings report from this URL and summarize the revenue and margin figures."

### What you can do with downloaded files

Once a file is downloaded, your worker can use its contents to complete a wide range of tasks:

* **Edit files**: read the content, apply changes, and re-upload or forward the result
* **Move files across locations**: download from one source and upload to another, such as from a public URL to Google Drive
* **Pass files to other integrations**: use the contents as input for another integration or API call
* **Process file contents**: extract data, summarize text, parse structured information, or feed the content into another step of a workflow

### File format and readability

Toolhouse does not check whether a file is in a readable format before downloading it. It will make a best-effort attempt to convert the file to text, but if the file is binary or non-textual (such as an image, compiled binary, or compressed archive), the worker will read it as-is — which may produce output that is not useful.

**Always prefer text-like files** — plain text, Markdown, CSV, HTML, JSON, XML, and similar formats work best.

For files like PDFs or scanned documents that require conversion to be readable, use **Document Parser** instead. Document Parser is purpose-built to extract clean, structured text from complex file types.
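
If you choose the URLs yourself, a quick local check can flag files that are unlikely to read well as text. The sketch below is a common heuristic, not something Toolhouse runs: `looks_text_like` is a hypothetical name, and the rule (no NUL bytes, valid UTF-8) is an assumption that will misjudge some files, such as text in other encodings.

```python
import codecs


def looks_text_like(sample: bytes) -> bool:
    """Heuristically decide whether the first bytes of a file look like text."""
    # NUL bytes almost never appear in plain text, CSV, JSON, HTML, or XML.
    if b"\x00" in sample:
        return False
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the sample's end.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

For example, the first bytes of a CSV file pass this check, while the header of a PNG image does not.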

### Context window limits

Toolhouse limits the size of downloaded files to ensure the contents fit within your worker's context window. This cap is in place so the worker can continue operating effectively after reading the file — a file that exceeds the context window would crowd out the rest of the conversation and prevent the worker from completing its task.
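
To picture what a size cap means for your data, consider this minimal sketch. The actual limit and cut-off behavior are not documented here, so `max_chars` and the line-boundary rule are assumptions chosen for illustration.

```python
def truncate_to_budget(text: str, max_chars: int) -> tuple[str, bool]:
    """Return (possibly shortened text, whether anything was cut).

    Hypothetical helper; `max_chars` stands in for an undocumented limit.
    """
    if len(text) <= max_chars:
        return text, False
    cut = text[:max_chars]
    # Prefer ending on a complete line so a CSV row or sentence is not split.
    last_newline = cut.rfind("\n")
    if last_newline > 0:
        cut = cut[:last_newline]
    return cut, True
```

The practical takeaway is that anything past the cut is invisible to the worker, so totals or conclusions drawn from a capped file may be incomplete.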

If you need to process a very large file, consider using File Download together with the **Virtual Computer**, which can handle heavier processing outside the context window (see below).

### Using File Download with Virtual Computer

File Download and [Virtual Computer](/toolhouse/capabilites/virtual-computer.md) work well together. File Download brings the file's contents into the worker's context, and Virtual Computer can then perform computation-heavy operations on that content (such as parsing a large CSV, running analysis, or transforming data) without exhausting the context window.

To configure this in Agent Editor, describe the two steps together:

> "Download the CSV file from the provided URL, then use the virtual computer to calculate the monthly totals per category and return a summary table."
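
For a prompt like the one above, the worker might run something along these lines on the virtual computer. This is only a sketch: the column names `date`, `category`, and `amount`, and the ISO date format, are assumptions about the CSV, not something File Download guarantees.

```python
import csv
import io
from collections import defaultdict


def monthly_totals(csv_text: str) -> dict[tuple[str, str], float]:
    """Sum `amount` per (month, category) from CSV text.

    Assumes columns named date (YYYY-MM-DD), category, and amount.
    """
    totals: defaultdict[tuple[str, str], float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        month = row["date"][:7]  # "2024-03-15" -> "2024-03"
        totals[(month, row["category"])] += float(row["amount"])
    return dict(totals)
```

Only the resulting summary needs to return to the worker's context, which is why this pairing suits larger files.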

You can also use File Download on its own, without Virtual Computer, when the file is small and the task is straightforward — such as reading a config file, extracting a few values from a JSON response, or reviewing a short document.

### Adding File Download manually

1. Go to **Agents** in your Toolhouse dashboard
2. Click on your worker to edit it
3. Select **Integrations**, then click **Add Integration**
4. Choose **File Download**
5. Click **Save changes**

### Limitations and gotchas

| Constraint                       | Detail                                                                                                                                     |
| -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| **File size cap**                | Downloaded files are truncated to fit within your worker's context window. Very large files may be cut off.                                |
| **No format validation**         | Toolhouse does not verify the file format before downloading. Binary or non-text files will be read as-is and may produce unusable output. |
| **Best-effort conversion**       | Toolhouse attempts to convert files to readable text, but conversion is not guaranteed for all file types.                                 |
| **Use Document Parser for PDFs** | PDFs and scanned documents are better handled by the Document Parser integration, which extracts clean text reliably.                      |
| **URL must be accessible**       | The file URL must be publicly reachable. Files behind authentication walls or private networks cannot be downloaded.                       |
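
Before handing a URL to a worker, you can rule out some addresses that are clearly not publicly reachable. The sketch below is a hypothetical local pre-check, not a Toolhouse feature: it only inspects the URL itself, so it cannot detect login walls or hostnames that resolve to private addresses.

```python
import ipaddress
from urllib.parse import urlsplit


def is_obviously_private(url: str) -> bool:
    """Return True when the URL clearly points at a non-public location."""
    host = urlsplit(url).hostname
    if not host or host == "localhost":
        return True  # no network host at all, or the local machine
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A regular hostname: cannot be judged without making a request.
        return False
    return address.is_private or address.is_loopback or address.is_link_local
```

A `False` result does not prove the file is downloadable; it only means the URL passed this coarse filter.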

### Frequently asked questions

**Can my worker download a file it found on its own?** Yes. If your worker uses Web Search and finds a URL pointing to a file, it can pass that URL directly to File Download without any additional input from you.

**What if the file is too large?** Toolhouse will truncate the file at the context window limit. If you need to process the full file, pair File Download with the Virtual Computer and prompt your worker to handle the content programmatically.

**When should I use Document Parser instead of File Download?** Use Document Parser whenever you're working with PDFs, scanned images, or other documents that require OCR or structured text extraction. File Download is best for files that are already in a text-friendly format.

**Can my worker download multiple files in one session?** Yes. Your worker can download files as many times as needed during a session. Each download is independent and the results are added to the worker's context as they come in.

