The extractor component provides a uniform interface for reading the contents of archive files and returning them as open file descriptor objects. The abstract base class Extractor defines a single read(**kwargs) contract that all implementations must satisfy. The concrete RawZipExtractor class fulfils this contract for ZIP archives: it uses Python’s standard zipfile.ZipFile to open every file inside the archive and returns a list of readable descriptors, one per archived file.
Extractor ABC
Extractor is a simple abstract base class with a single abstract method.
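The contract can be sketched as follows. This is a reconstruction that assumes the class is built on Python's standard abc module; the library's actual definition may differ in detail.

```python
from abc import ABC, abstractmethod


class Extractor(ABC):
    """Sketch of the base contract: a single abstract read(**kwargs)."""

    @abstractmethod
    def read(self, **kwargs) -> object:
        """Read from a source described entirely by keyword arguments."""
        raise NotImplementedError
```

With this shape, instantiating Extractor directly fails with a TypeError, and a subclass becomes instantiable only once it overrides read.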
Subclasses must implement read(**kwargs) -> object. All arguments are passed as keyword arguments so that each implementation can declare exactly the parameters it needs without changing the call signature. The return type is implementation-defined; RawZipExtractor, for example, returns a list of file descriptor objects.
RawZipExtractor
RawZipExtractor reads a ZIP archive and returns a list of open file descriptor objects — one for each file contained in the archive. Each descriptor is produced by ZipFile.open(name) and supports standard read operations.
Required Keyword Arguments
filename: Path to the ZIP archive file. This key is validated by ExceptionsUtils.raise_exception_if_key_not_in_dict before the archive is opened. Omitting it raises a plain Exception.
Example
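A minimal usage sketch, written against any object that honours the read(filename=...) contract. The helper name read_all is hypothetical and not part of the library. Constructing the RawZipExtractor instance is left to the caller, since the import path and constructor are not shown here.

```python
def read_all(extractor, archive_path):
    """Return {member name: raw bytes} for every file in the archive.

    `extractor` is assumed to be an already-constructed RawZipExtractor,
    or any object whose read(filename=...) returns open descriptors.
    """
    descriptors = extractor.read(filename=archive_path)
    try:
        # Descriptors produced by ZipFile.open() expose the member name
        # as .name and return raw bytes from .read().
        return {fd.name: fd.read() for fd in descriptors}
    finally:
        # The caller owns the descriptors, so close them when done.
        for fd in descriptors:
            fd.close()
```

Calling read_all(extractor, "data.zip"), where "data.zip" is a placeholder path, would yield a dictionary keyed by the names reported by ZipFile.namelist().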
Return Value
A list of open file-like objects corresponding to every file listed in ZipFile.namelist(). Each element supports .read() to retrieve the file’s raw bytes. The list order matches the order in which files appear in the ZIP’s central directory.
The filename kwarg is validated by ExceptionsUtils.raise_exception_if_key_not_in_dict('filename', kwargs) before ZipFile is constructed. If filename is not provided, a plain Exception is raised immediately, before any filesystem access occurs.
How It Works Internally
The _file_descriptors static method opens the archive using ZipFile(**kwargs) inside a with block, then builds the descriptor list with a list comprehension over z.namelist().
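These steps can be sketched as a standalone function. This is an illustrative reconstruction, not the library's source: the name file_descriptors and the inline key check are stand-ins for the real static method and the ExceptionsUtils helper. One assumption to flag: the standard-library ZipFile constructor names its path parameter file, not filename, so this sketch passes kwargs["filename"] positionally instead of unpacking kwargs directly.

```python
from zipfile import ZipFile


def file_descriptors(**kwargs):
    """Open every member of a ZIP archive and return the descriptors."""
    # Stand-in for ExceptionsUtils.raise_exception_if_key_not_in_dict:
    # fail with a plain Exception before touching the filesystem.
    if "filename" not in kwargs:
        raise Exception("Missing required keyword argument: 'filename'")

    # Assumption: pass the path positionally, because ZipFile's first
    # parameter is called `file`.
    with ZipFile(kwargs["filename"]) as z:
        # One descriptor per archived name, in namelist() order.
        return [z.open(name) for name in z.namelist()]
```

In current CPython versions the returned descriptors stay readable after the with block exits, because ZipFile keeps the underlying file handle open until the last member descriptor is closed. Callers should still close each descriptor once they have finished reading.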
Extending Extractor
Implement Extractor to add support for other file formats or data sources. The only requirement is a read(**kwargs) method that returns some object; the type and shape are up to you.
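As an illustration, the sketch below adds a hypothetical RawDirectoryExtractor that opens every regular file in a directory. Neither this class nor its path keyword exists in the library; both are invented for the example. The Extractor base class is re-declared locally, assuming an abc-based definition, so that the snippet runs on its own.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class Extractor(ABC):
    """Local stand-in for the library's base class (assumed shape)."""

    @abstractmethod
    def read(self, **kwargs) -> object:
        ...


class RawDirectoryExtractor(Extractor):
    """Hypothetical extractor: one open binary descriptor per file."""

    def read(self, **kwargs) -> object:
        # Mirror the ZIP extractor's convention of failing early with a
        # plain Exception when a required key is missing.
        if "path" not in kwargs:
            raise Exception("Missing required keyword argument: 'path'")

        root = Path(kwargs["path"])
        # Sort for a deterministic order and skip subdirectories.
        files = sorted(p for p in root.iterdir() if p.is_file())
        return [p.open("rb") for p in files]
```

Because the call site only ever uses read(**kwargs), code written against RawZipExtractor could accept this class unchanged, apart from passing path instead of filename.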