Split

split.run()

Splits a document into categorized sections synchronously.

client.split.run(
    input="https://example.com/document.pdf",
    split_description=[
        {"category": "summary", "description": "Executive summary sections"},
        {"category": "financials", "description": "Financial data and tables"}
    ],
    parsing={...},
    settings={...},
    split_rules="Additional rules for splitting"
)

Parameters

input

string | list[string]

required

The URL of the document to be processed. You can provide one of the following:

A publicly available URL
A presigned S3 URL
A reducto:// prefixed URL obtained from the /upload endpoint after directly uploading a document
A jobid:// prefixed URL obtained from a previous /parse invocation
A list of URLs (for multi-document pipelines, V3 API only)
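The accepted input forms can be told apart by their URL scheme. A minimal sketch, assuming a hypothetical helper named `classify_input` that is not part of the SDK; the returned labels are illustrative only, and public and presigned S3 URLs are not distinguished because both use http(s):

```python
from urllib.parse import urlparse


def classify_input(value):
    """Label the kind of `input` value (hypothetical helper, not an SDK function)."""
    if isinstance(value, (list, tuple)):
        # A list of URLs, for multi-document pipelines.
        return "url_list"
    scheme = urlparse(value).scheme.lower()
    if scheme == "reducto":
        # Obtained from the /upload endpoint.
        return "uploaded_file"
    if scheme == "jobid":
        # Obtained from a previous /parse invocation.
        return "previous_parse_job"
    if scheme in ("http", "https"):
        # Public URL or presigned S3 URL.
        return "remote_url"
    raise ValueError(f"unsupported input: {value!r}")
```

A check like this can run before the call to `client.split.run()` to catch malformed inputs early.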

split_description

Iterable[SplitCategory]

required

The categories to split the document into. Each entry pairs a category name with a description of the content that belongs to it.
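When the categories are kept in a plain mapping, the list-of-dicts shape used in the example call can be built from it. A minimal sketch, assuming a hypothetical helper named `make_split_description`; the emptiness check is an illustrative choice, not a documented requirement of the API:

```python
def make_split_description(categories):
    """Convert a {category: description} mapping into a list of
    {"category": ..., "description": ...} entries (hypothetical helper)."""
    entries = []
    for category, description in categories.items():
        # Assumption: empty names or descriptions are not useful, so reject them.
        if not category.strip() or not description.strip():
            raise ValueError("each category needs a non-empty name and description")
        entries.append({"category": category, "description": description})
    return entries


split_description = make_split_description({
    "summary": "Executive summary sections",
    "financials": "Financial data and tables",
})
```

The resulting list can be passed as the `split_description` argument.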

parsing

ParseOptions

The configuration options for parsing the document. This configuration is ignored when input is a jobid:// URL, because that URL refers to the output of a previous /parse invocation.
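Because parsing options have no effect on jobid:// inputs, a caller may prefer to leave them out in that case. A minimal sketch, assuming a hypothetical helper named `build_split_kwargs` that only assembles keyword arguments and does not call the API:

```python
def build_split_kwargs(input_url, split_description, parsing=None):
    """Assemble keyword arguments for a split call (hypothetical helper).

    `parsing` is dropped for jobid:// inputs, where it would be ignored.
    """
    kwargs = {"input": input_url, "split_description": split_description}
    is_job_ref = isinstance(input_url, str) and input_url.startswith("jobid://")
    if parsing is not None and not is_job_ref:
        kwargs["parsing"] = parsing
    return kwargs
```

The result can be unpacked into the call, for example `client.split.run(**build_split_kwargs(...))`.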

settings

object

The settings for split processing.

split_rules

string

The prompt that describes rules for splitting the document.

Response

SplitResponse

object

Returns the document split into categorized sections.

result

object

The categorized sections of the document.

split.run_job()

Splits a document into categorized sections asynchronously and returns a job ID immediately.

response = client.split.run_job(
    input="https://example.com/document.pdf",
    split_description=[
        {"category": "summary", "description": "Executive summary sections"},
        {"category": "financials", "description": "Financial data and tables"}
    ],
    async_={"webhook": {"url": "https://example.com/webhook"}},
    parsing={...},
    settings={...},
    split_rules="Additional rules for splitting"
)

print(response.job_id)  # Use this to check job status later

Parameters

input

string | list[string]

required

The URL of the document to be processed. You can provide one of the following:

A publicly available URL
A presigned S3 URL
A reducto:// prefixed URL obtained from the /upload endpoint after directly uploading a document
A jobid:// prefixed URL obtained from a previous /parse invocation
A list of URLs (for multi-document pipelines, V3 API only)

split_description

Iterable[SplitCategory]

required

The categories to split the document into. Each entry pairs a category name with a description of the content that belongs to it.

async_

ConfigV3AsyncConfig

The configuration options for asynchronous processing (default synchronous).

parsing

ParseOptions

The configuration options for parsing the document. This configuration is ignored when input is a jobid:// URL, because that URL refers to the output of a previous /parse invocation.

settings

object

The settings for split processing.

split_rules

string

The prompt that describes rules for splitting the document.

Response

SplitRunJobResponse

object

job_id

string

The ID of the asynchronous job. Use client.job.get(job_id) to retrieve the result when the job completes.
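Retrieving the result usually means polling until the job finishes. A minimal sketch, assuming the object returned by `client.job.get(job_id)` exposes a `status` attribute and that `"Completed"` and `"Failed"` are terminal values; both are assumptions to check against the job reference. The `wait_for_job` helper and the stand-in job objects are hypothetical:

```python
import time
from types import SimpleNamespace


def wait_for_job(get_job, job_id, interval=2.0, timeout=300.0,
                 done_statuses=("Completed", "Failed")):
    """Call get_job(job_id) until its status is terminal or the timeout passes.

    The `status` attribute and the status names are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = get_job(job_id)
        if getattr(job, "status", None) in done_statuses:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        time.sleep(interval)


# Demonstration with stand-in job objects instead of a real client.
_fake_responses = iter([
    SimpleNamespace(status="Pending"),
    SimpleNamespace(status="Completed"),
])
job = wait_for_job(lambda _job_id: next(_fake_responses), "job-123", interval=0)
```

With a real client, `get_job` would be `client.job.get` and `job_id` would be `response.job_id`.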
