Documentation Index
Fetch the complete documentation index at: https://mintlify.com/terrafloww/rasteret/llms.txt
Use this file to discover all available pages before exploring further.
Function Signature
Description
Build a Collection from an external Parquet/GeoParquet record table. A record table is a Parquet dataset where each row is a raster item (satellite scene, drone image, derived product, etc.) with at minimumid, datetime, geometry, assets, or columns that can be normalized into them via column_map and href_column.
This is the heavy ingest path: it can normalize schema and optionally enrich COG headers. For in-memory tables that are already read-ready, use as_collection().
When enrich_cog=True, COG headers are parsed from the asset URLs and cached as {band}_metadata struct columns in the Parquet index, enabling fast tiled reads and TorchGeo integration.
Parameters
Path/URI to a Parquet/GeoParquet file or dataset directory, or an in-memory Arrow object (
pyarrow.Table or pyarrow.dataset.Dataset).Optional collection name. When given without workspace_dir, the collection is cached in the default workspace.
Data source identifier for band mapping and URL policy.
Persist the collection as partitioned Parquet at this path. Defaults to
~/rasteret_workspace/{name}_records/ when name is provided.{source_name: contract_name} alias map. Source columns are preserved; contract-name columns are added as zero-copy aliases.Column containing COG URLs. When set and
assets is absent after aliasing, the normalization layer constructs the assets struct from this column and band_index_map.{band_code: sample_index} for multi-band COGs.{source_prefix: target_prefix} for URL rewriting during assets construction.PyArrow filesystem for reading remote URIs (e.g.,
S3FileSystem(anonymous=True)).Scan-time column projection.
Scan-time predicate pushdown.
Parse COG headers and add per-band metadata columns.
Bands to enrich. Defaults to all bands in
assets.Cloud configuration for URL rewriting.
Maximum concurrent HTTP connections for COG header parsing.
Rebuild even if a cached collection already exists at the resolved workspace path.
I/O backend for authenticated range reads during COG header parsing. See
create_backend().Returns
A Collection object ready for spatial queries and pixel reads.
Usage Example
Related Functions
- build() - Build from a registered dataset
- build_from_stac() - Build directly from STAC API
- as_collection() - Wrap an in-memory Arrow object without enrichment
- Collection - Collection class reference