plateau.core.factory module
- class plateau.core.factory.DatasetFactory(dataset_uuid: str, store_factory: str | KeyValueStore | Callable[[], KeyValueStore], load_schema: bool = True, load_all_indices: bool = False)
Bases: DatasetMetadataBase
Container that holds dataset metadata and caches storage access.
- property dataset_metadata: DatasetMetadata
- load_all_indices(store: Any = None) -> T
Load all registered indices into memory.
Note: External indices need to be preloaded before they can be queried.
- Parameters:
store – Object that implements the .get method for file/object loading.
- Returns:
dataset_metadata – Mutated metadata object with the loaded indices.
- load_index(column, store=None) -> T
Load an index into memory.
Note: External indices need to be preloaded before they can be queried.
- Parameters:
column – Name of the column for which the index should be loaded.
store – Object that implements the .get method for file/object loading.
- Returns:
dataset_metadata – Mutated metadata object with the loaded index.
- load_partition_indices() -> T
Load all filename-encoded indices into RAM. Filename-encoded indices can be extracted from datasets whose partitions are stored in a format like:
`dataset_uuid/table/IndexCol=IndexValue/SecondIndexCol=Value/partition_label.parquet`
This results in an in-memory index holding the information:
{ "IndexCol": { IndexValue: ["partition_label"] }, "SecondIndexCol": { Value: ["partition_label"] } }
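The mapping from storage keys to an in-memory index can be illustrated with a small standalone sketch. This is not plateau's implementation: `parse_partition_indices` is a hypothetical helper, the example keys are made up, and index values are kept as plain strings, whereas the library may convert them to typed values.

```python
from collections import defaultdict


def parse_partition_indices(keys):
    """Build {column: {value: [partition_label, ...]}} from storage keys.

    Hypothetical helper. Assumes keys follow the layout
    dataset_uuid/table/Col=Value/.../partition_label.parquet
    """
    indices = defaultdict(lambda: defaultdict(list))
    for key in keys:
        parts = key.split("/")
        if len(parts) < 3:
            # Not a partition file under dataset_uuid/table/
            continue
        # The file name without its extension serves as the partition label.
        label = parts[-1].rsplit(".", 1)[0]
        # Everything between the table name and the file name is Col=Value.
        for segment in parts[2:-1]:
            if "=" not in segment:
                continue
            column, value = segment.split("=", 1)
            if label not in indices[column][value]:
                indices[column][value].append(label)
    # Convert nested defaultdicts into plain dicts for easier inspection.
    return {column: dict(values) for column, values in indices.items()}


example_keys = [
    "ds/table/IndexCol=A/SecondIndexCol=1/part_0.parquet",
    "ds/table/IndexCol=A/SecondIndexCol=2/part_1.parquet",
    "ds/table/IndexCol=B/SecondIndexCol=1/part_2.parquet",
]
partition_indices = parse_partition_indices(example_keys)
```

Because the index is derived from key names alone, no Parquet file has to be opened to build it.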
- property store: KeyValueStore
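The idea of caching storage access behind a `store` property can be sketched as follows. This is a minimal illustration of the general pattern only, assuming the store factory is a zero-argument callable as in the class signature; `LazyStoreHolder` is a hypothetical class, not part of plateau, and the real `DatasetFactory` may behave differently.

```python
from typing import Any, Callable, Optional


class LazyStoreHolder:
    """Hypothetical holder that creates its store on first access."""

    def __init__(self, store_factory: Callable[[], Any]) -> None:
        self._store_factory = store_factory
        self._cached_store: Optional[Any] = None

    @property
    def store(self) -> Any:
        # Call the factory only once, then reuse the resulting object.
        if self._cached_store is None:
            self._cached_store = self._store_factory()
        return self._cached_store
```

Deferring creation this way means an object holding only the factory stays cheap to construct and pass around, and the store connection is opened only where it is actually needed.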