Aggregation
aggregate_to_shapes(ds, gdf, variable='DNB_BRDF-Corrected_NTL', agg_type='mean', is_valid_pct=False, valid_pct_threshold=None, geo_id_col='geonameid')
High-level helper to aggregate an xarray dataset to vector shapes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | The xarray Dataset containing the variable to aggregate. | required |
| `gdf` | `GeoDataFrame \| DataFrame` | The GeoDataFrame containing the vector shapes. | required |
| `variable` | `str` | The name of the variable to aggregate. | `'DNB_BRDF-Corrected_NTL'` |
| `agg_type` | `Literal['mean', 'median']` | Aggregation type (`'mean'` or `'median'`). | `'mean'` |
| `is_valid_pct` | `bool` | Whether to calculate the percentage of non-NaN pixels. | `False` |
| `valid_pct_threshold` | `Optional[float]` | Fraction of valid pixels (0-1) below which aggregated values are set to `np.nan`. | `None` |
| `geo_id_col` | `str` | The column in the GeoDataFrame identifying the shapes. | `'geonameid'` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset containing the aggregated spatial values. |
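To make the parameter semantics concrete, here is a minimal NumPy sketch of per-shape aggregation over an integer ID grid. It is not the library's implementation: `aggregate_by_label` is a hypothetical stand-in, the ID `0` is assumed to mean "outside every shape", and plain arrays replace the xarray Dataset and GeoDataFrame. It illustrates how `agg_type` and `valid_pct_threshold` plausibly interact, based only on the descriptions above.

```python
import numpy as np


def aggregate_by_label(values, labels, agg_type="mean", valid_pct_threshold=None):
    """Aggregate `values` per integer label, ignoring NaN pixels.

    Hypothetical helper for illustration only.
    Returns {label: (aggregate, valid_fraction)}.
    """
    reducer = {"mean": np.mean, "median": np.median}[agg_type]
    out = {}
    for label in np.unique(labels):
        if label == 0:  # assumption: 0 marks pixels outside every shape
            continue
        pixels = values[labels == label]
        valid = ~np.isnan(pixels)
        valid_fraction = valid.mean()  # share of non-NaN pixels in the shape
        agg = reducer(pixels[valid]) if valid.any() else np.nan
        # Discard aggregates that rest on too few valid pixels.
        if valid_pct_threshold is not None and valid_fraction < valid_pct_threshold:
            agg = np.nan
        out[int(label)] = (float(agg), float(valid_fraction))
    return out


values = np.array([[1.0, 2.0, np.nan],
                   [3.0, np.nan, np.nan],
                   [5.0, 6.0, 10.0]])
labels = np.array([[1, 1, 2],
                   [1, 2, 2],
                   [0, 2, 2]])

result = aggregate_by_label(values, labels, "mean", valid_pct_threshold=0.5)
```

In this toy grid, shape 1 is fully valid and keeps its mean, while shape 2 has only two valid pixels out of five, so its aggregate is set to NaN once the threshold is 0.5.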
get_agg_per_shape(ds, mask, variable, agg_type='mean', is_valid_pct=False, valid_pct_threshold=None, geo_id_col='geonameid')
Memory-safe aggregation using Dask and Zarr.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Dataset containing the input variable. | required |
| `mask` | `Dataset` | Dataset containing the shape mappings. | required |
| `variable` | `str` | Name of the variable to aggregate. | required |
| `agg_type` | `Literal['mean', 'median']` | Aggregation type to apply (`'mean'` or `'median'`). | `'mean'` |
| `is_valid_pct` | `bool` | Whether to calculate the percentage of non-NaN pixels. | `False` |
| `valid_pct_threshold` | `float \| None` | Fraction of valid pixels (0-1) below which aggregated values are set to `np.nan`. | `None` |
| `geo_id_col` | `str` | Column name containing shape IDs. | `'geonameid'` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset containing the aggregated spatial values and optionally the percentage of valid pixels. |
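The page does not describe how the memory-safe path works internally. As a rough illustration of the general idea behind chunk-wise aggregation, the sketch below computes a per-shape mean by accumulating sums and counts one block of rows at a time, so the full raster never has to be reduced in a single step. Plain NumPy slices stand in for Dask chunks, `chunked_mean_by_label` is a hypothetical name, and the ID `0` is again assumed to mean "outside every shape".

```python
import numpy as np


def chunked_mean_by_label(values, labels, n_labels, rows_per_chunk=2):
    """Per-label mean and valid fraction, computed block by block.

    Illustration only: row slices stand in for Dask chunks.
    """
    sums = np.zeros(n_labels + 1)
    valid_counts = np.zeros(n_labels + 1)
    total_counts = np.zeros(n_labels + 1)
    for start in range(0, values.shape[0], rows_per_chunk):
        val = values[start:start + rows_per_chunk].ravel()
        lab = labels[start:start + rows_per_chunk].ravel()
        ok = ~np.isnan(val)
        # bincount adds up the weights (pixel values) that share a label.
        sums += np.bincount(lab[ok], weights=val[ok], minlength=n_labels + 1)
        valid_counts += np.bincount(lab[ok], minlength=n_labels + 1)
        total_counts += np.bincount(lab, minlength=n_labels + 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / valid_counts          # NaN where a shape has no valid pixel
        valid_pct = valid_counts / total_counts
    return means[1:], valid_pct[1:]          # drop the assumed background label 0


values = np.array([[1.0, 2.0, np.nan],
                   [3.0, np.nan, 4.0],
                   [np.nan, 8.0, 6.0],
                   [7.0, np.nan, np.nan]])
labels = np.array([[1, 1, 2],
                   [1, 2, 2],
                   [1, 2, 2],
                   [0, 2, 2]])

means, valid_pct = chunked_mean_by_label(values, labels, n_labels=2)
```

Sums and counts can be merged across chunks, which is why the mean fits this pattern. An exact median cannot be accumulated this way, since it needs all of a shape's values at once.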
get_gdf_mask_for_ds(ds, gdf, geo_id_col='geonameid')
Creates a spatial mask for an xarray Dataset based on a given GeoDataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | The reference dataset to match the spatial grid. | required |
| `gdf` | `GeoDataFrame \| DataFrame` | The GeoDataFrame containing the vector shapes. | required |
| `geo_id_col` | `str` | The column name in the GeoDataFrame that uniquely identifies each shape. | `'geonameid'` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | A rasterized dataset mask where pixel values correspond to the geometry IDs. |
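To show what "pixel values correspond to the geometry IDs" means, here is a deliberately simplified sketch that burns axis-aligned boxes onto a grid by testing pixel centres. It is not how the library rasterizes shapes: real geometries are arbitrary polygons, `rasterize_boxes` is a hypothetical helper, and the fill value `0` for "no shape" is an assumption.

```python
import numpy as np


def rasterize_boxes(x_coords, y_coords, boxes):
    """Burn axis-aligned boxes into an integer ID grid.

    `boxes` maps a shape ID to (xmin, ymin, xmax, ymax). A pixel receives an
    ID when its centre lies inside the box; 0 means "no shape" (assumption).
    Where boxes overlap, the box listed later wins.
    """
    xx, yy = np.meshgrid(x_coords, y_coords)  # both have shape (len(y), len(x))
    mask = np.zeros(xx.shape, dtype=np.int64)
    for shape_id, (xmin, ymin, xmax, ymax) in boxes.items():
        inside = (xx >= xmin) & (xx <= xmax) & (yy >= ymin) & (yy <= ymax)
        mask[inside] = shape_id
    return mask


# Pixel-centre coordinates of a 3 x 4 grid; y decreases downwards (north-up).
x = np.array([0.5, 1.5, 2.5, 3.5])
y = np.array([2.5, 1.5, 0.5])
boxes = {101: (0.0, 1.0, 2.0, 3.0), 202: (2.0, 0.0, 4.0, 2.0)}

mask = rasterize_boxes(x, y, boxes)
# mask:
# [[101 101   0   0]
#  [101 101 202 202]
#  [  0   0 202 202]]
```

The resulting grid has the same shape as the reference coordinates, which is what lets a mask like this be aligned pixel by pixel with the data variable during aggregation.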