Function Signature
Description
Saves annotated documents to a JSON Lines file. Each document is written as a separate JSON object on its own line, making it easy to process large datasets incrementally.
Parameters
- Iterator over AnnotatedDocument objects to save. These are the documents that have been processed and annotated by the LLM.
- The directory to which the JSONL file should be written. Can be a Path object or a string. If None, defaults to the test_output/ directory.
- File name for the JSONL file. The file will be created at output_dir/output_name.
- Whether to show a progress bar during the saving operation. Useful for tracking progress with large datasets.
Returns
None. The function writes data to disk and displays progress information if enabled.
Exceptions
- IOError: If the output directory cannot be created.
- InvalidDatasetError: If no valid documents are produced (all documents have empty document_id).
Usage Example
Notes
- The output directory is created automatically if it doesn’t exist (including parent directories).
- Documents with empty document_id fields are skipped.
- Files are written with UTF-8 encoding and ensure_ascii=False for proper internationalization support.
- Progress information includes the number of documents saved and the output file path.
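
The behavior described in these notes can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: `Doc`, `save_docs_jsonl`, and `read_jsonl` are hypothetical names, the record fields are assumed, and `ValueError` stands in for the `InvalidDatasetError` mentioned above. It only shows the general pattern of creating the directory, skipping documents with an empty `document_id`, and writing one UTF-8 JSON object per line.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, Iterator, Optional, Union


@dataclass
class Doc:
    """Hypothetical stand-in for an annotated document (fields are assumed)."""
    document_id: str
    text: str


def save_docs_jsonl(
    docs: Iterable[Doc],
    output_dir: Optional[Union[str, Path]],
    output_name: str,
) -> None:
    """Illustrative writer: one JSON object per line, UTF-8, no ASCII escaping."""
    # Fall back to test_output/ when no directory is given, as the page describes.
    out_dir = Path(output_dir) if output_dir is not None else Path("test_output")
    # Create the directory and any missing parents.
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / output_name

    saved = 0
    with out_path.open("w", encoding="utf-8") as f:
        for doc in docs:
            if not doc.document_id:
                continue  # documents with an empty document_id are skipped
            record = {"document_id": doc.document_id, "text": doc.text}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            saved += 1

    if saved == 0:
        # Assumption: ValueError replaces the library's InvalidDatasetError here.
        raise ValueError("no valid documents were produced")
    print(f"Saved {saved} documents to {out_path}")


def read_jsonl(path: Union[str, Path]) -> Iterator[dict]:
    """Read the file back lazily, one record at a time."""
    with Path(path).open("r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Because each line is an independent JSON object, a reader such as `read_jsonl` can stream a large file record by record instead of loading it all into memory.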