plateau.io_components.write module

plateau.io_components.write.coerce_schema_timestamps(wrapper: SchemaWrapper) → SchemaWrapper[source]
plateau.io_components.write.persist_common_metadata(schemas: Iterable[SchemaWrapper], update_dataset: DatasetFactory | None, store: KeyValueStore, dataset_uuid: str, table_name: str)[source]
plateau.io_components.write.persist_indices(store: str | KeyValueStore | Callable[[], KeyValueStore], dataset_uuid: str, indices: Dict[str, IndexBase]) → Dict[str, str][source]
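
A minimal, hedged sketch of persisting a secondary index; the in-memory DictStore, the dataset UUID, and the index contents are illustrative assumptions, not values prescribed by plateau:

    from minimalkv.memory import DictStore
    from plateau.core.index import ExplicitSecondaryIndex
    from plateau.io_components.write import persist_indices

    store = DictStore()  # assumed in-memory minimalkv store, for illustration only

    # Hypothetical index contents: index value -> list of partition labels.
    index = ExplicitSecondaryIndex(
        column="country",
        index_dct={"DE": ["part_1"], "US": ["part_2"]},
    )

    # Returns a mapping from column name to the store key the index was written under.
    index_keys = persist_indices(
        store=store, dataset_uuid="my_dataset", indices={"country": index}
    )
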
plateau.io_components.write.raise_if_dataset_exists(dataset_uuid, store)[source]
plateau.io_components.write.store_dataset_from_partitions(partition_list, store: str | KeyValueStore | Callable[[], KeyValueStore], dataset_uuid, dataset_metadata=None, metadata_merger=None, update_dataset=None, remove_partitions=None, metadata_storage_format='json')[source]
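
A hedged sketch of the commit step: given MetaPartitions whose payload has already been written, store_dataset_from_partitions persists the dataset metadata that describes them. The MetaPartition construction and its store_dataframes call are assumptions for illustration; in the higher-level flow, write_partition (below) produces such MetaPartitions for you:

    import pandas as pd
    from minimalkv.memory import DictStore
    from plateau.io_components.metapartition import MetaPartition
    from plateau.io_components.write import store_dataset_from_partitions

    store = DictStore()  # assumed in-memory minimalkv store

    # Write one partition's payload first (assumed API: MetaPartition.store_dataframes).
    mp = MetaPartition(
        label="part_1",
        data=pd.DataFrame({"country": ["DE"], "value": [1]}),
        metadata_version=4,
    ).store_dataframes(store=store, dataset_uuid="my_dataset")

    # Commit: build and persist the dataset metadata for the written partitions.
    dataset = store_dataset_from_partitions(
        partition_list=[mp],
        store=store,
        dataset_uuid="my_dataset",
        dataset_metadata={"creator": "example"},  # optional user metadata
    )
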
plateau.io_components.write.update_indices(dataset_builder, store, add_partitions, remove_partitions)[source]
plateau.io_components.write.update_metadata(dataset_builder, metadata_merger, dataset_metadata)[source]
plateau.io_components.write.update_partitions(dataset_builder, add_partitions, remove_partitions)[source]
plateau.io_components.write.write_partition(partition_df: DataFrame | Sequence | MetaPartition | None, secondary_indices: List[str], sort_partitions_by: List[str], dataset_uuid: str, partition_on: List[str], store_factory: Callable[[], KeyValueStore], df_serializer: DataFrameSerializer | None, metadata_version: int, dataset_table_name: str = 'table') → MetaPartition[source]

Write a dataframe to the store, performing all necessary preprocessing tasks, such as partitioning, bucketing (not implemented), and indexing, in the correct order.
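
A hedged end-to-end sketch of write_partition followed by the commit step; the shared in-memory store, the lambda store factory, the column names, and metadata_version=4 (plateau's default) are illustrative assumptions:

    import pandas as pd
    from minimalkv.memory import DictStore
    from plateau.io_components.write import (
        store_dataset_from_partitions,
        write_partition,
    )

    store = DictStore()
    store_factory = lambda: store  # share one in-memory store across calls

    df = pd.DataFrame({"country": ["DE", "US"], "city": ["Berlin", "NYC"]})

    # Partition on "country", index "city", and write the payload to the store.
    # The index is built on a non-partition column because partition columns are
    # encoded in the storage keys rather than kept in the written dataframes.
    mp = write_partition(
        partition_df=df,
        secondary_indices=["city"],
        sort_partitions_by=[],
        dataset_uuid="my_dataset",
        partition_on=["country"],
        store_factory=store_factory,
        df_serializer=None,  # None falls back to the default serializer
        metadata_version=4,
    )

    # Commit the written partition(s) as a dataset.
    dataset = store_dataset_from_partitions(
        partition_list=[mp],
        store=store_factory,
        dataset_uuid="my_dataset",
    )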