holoviews.core.data.interface module#
- exception holoviews.core.data.interface.DataError(msg, interface=None)[source]#
Bases: ValueError
DataError is raised when the data cannot be interpreted.
- class holoviews.core.data.interface.Interface(*, name)[source]#
Bases: Parameterized
- Attributes:
- datatype
Methods
- add_dimension(dataset, dimension, dim_pos, ...): Returns a copy of the data with the dimension values added.
- applies(obj): Indicates whether the interface is designed specifically to handle the supplied object's type.
- array(dataset, dimensions): Returns the data as a numpy.ndarray containing the selected dimensions.
- as_dframe(dataset): Returns the data of a Dataset as a dataframe, avoiding copying if it is already a dataframe type.
- assign(dataset, new_data): Adds a dictionary containing data for multiple new dimensions to a copy of the dataset.data.
- cast(datasets[, datatype, cast_type]): Given a list of Dataset objects, cast them to the specified datatype (by default the format matching the current interface) with the given cast_type (if specified).
- columns(dataset, dimensions): Returns the data as a dictionary of 1D arrays indexed by column name.
- compute(dataset): Converts a lazy Dataset to a non-lazy, in-memory format.
- concatenate(datasets[, datatype, new_type]): Utility function to concatenate an NdMapping of Dataset objects.
- dframe(dataset, dimensions): Returns the data as a pandas.DataFrame containing the selected dimensions.
- dtype(dataset, dimension): Returns the dtype for the selected dimension.
- error(): Error message raised if the interface could not resolve the data.
- has_holes(dataset): Whether the Dataset contains geometries with holes.
- histogram(array, bins[, density, weights]): Computes the histogram on the dimension values with support for specific bins, normalization and weighting.
- holes(dataset): Returns a list of lists of arrays containing the holes for each geometry in the Dataset.
- iloc(dataset, index): Implements integer indexing on the rows and columns of the data.
- indexed(dataset, selection): Given a Dataset object and a selection to be applied, returns a boolean indicating whether a scalar value has been indexed.
- isscalar(dataset, dim): Whether the selected dimension is a scalar value.
- isunique(dataset, dim[, per_geom]): Whether the selected dimension has only a single unique value.
- length(dataset): Returns the number of rows in the Dataset.
- loaded(): Indicates whether the required dependencies are loaded.
- nonzero(dataset): Returns a boolean indicating whether the Dataset contains any data.
- persist(dataset): Persists the data backing the Dataset in memory.
- range(dataset, dimension): Computes the minimum and maximum value along a dimension.
- redim(dataset, dimensions): Renames dimensions in the data.
- reduce(dataset, reduce_dims, function, **kwargs): Reduces one or more dimensions using the supplied reduction function.
- register(interface): Registers a new Interface.
- reindex(dataset, kdims, vdims): Reindexes data given new key and value dimensions.
- replace_value(data, nodata): Replaces the nodata value in data with NaN.
- select_mask(dataset, selection): Given a Dataset object and a dictionary with dimension keys and selection keys (i.e. tuple ranges, slices, sets, lists, or literals), returns a boolean mask over the rows in the Dataset object that have been selected.
- shape(dataset): Returns the shape of the data.
- validate(dataset[, vdims]): Validation runs after the Dataset has been constructed and should validate that the Dataset is correctly formed and contains all declared dimensions.
- values(dataset, dimension[, expanded, flat, ...]): Returns the values along a dimension of the dataset.
- aggregate
- expanded
- geom_type
- groupby
- initialize
- mask
- sample
- select
- sort
- classmethod add_dimension(dataset, dimension, dim_pos, values, vdim)[source]#
Returns a copy of the data with the dimension values added.
- Parameters:
- dataset (Dataset): The Dataset to add the dimension to
- dimension (Dimension): The dimension to add
- dim_pos (int): The position in the data to add it to
- values (array_like): The array of values to add
- vdim (bool): Whether the data is a value dimension
- Returns:
data: A copy of the data with the new dimension
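The following minimal sketch shows how this is typically reached through the public Dataset.add_dimension method, which delegates to the active interface; the dimension name and values are illustrative:

```python
import numpy as np
import holoviews as hv

# A small tabular Dataset with one key dimension and one value dimension
ds = hv.Dataset({'x': np.arange(5), 'y': np.random.rand(5)}, kdims=['x'], vdims=['y'])

# Add a new value dimension 'weight' at position 0 of the value dimensions;
# this dispatches to Interface.add_dimension on the active interface
ds2 = ds.add_dimension('weight', 0, np.ones(5), vdim=True)
print(ds2.vdims)  # expected to include the new 'weight' dimension
```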
- classmethod applies(obj)[source]#
Indicates whether the interface is designed specifically to handle the supplied object’s type. By default it simply checks whether the object is one of the types declared on the class; however, if the type is expensive to import at load time the method may be overridden.
- classmethod array(dataset, dimensions)[source]#
Returns the data as a numpy.ndarray containing the selected dimensions.
- Parameters:
- Returns:
np.ndarray: A Numpy ndarray containing the selected dimensions
- classmethod as_dframe(dataset)[source]#
Returns the data of a Dataset as a dataframe, avoiding copying if it is already a dataframe type.
- Parameters:
- dataset (Dataset): The dataset to convert
- Returns:
DataFrame: DataFrame representation of the data
- classmethod assign(dataset, new_data)[source]#
Adds a dictionary containing data for multiple new dimensions to a copy of the dataset.data.
- Parameters:
- dataset (Dataset): The Dataset to add the dimension to
- new_data (dict): Dictionary containing new data to add to the Dataset
- Returns:
data: A copy of the data with the new data dimensions added
- classmethod cast(datasets, datatype=None, cast_type=None)[source]#
Given a list of Dataset objects, cast them to the specified datatype (by default the format matching the current interface) with the given cast_type (if specified).
- classmethod columns(dataset, dimensions)[source]#
Returns the data as a dictionary of 1D arrays indexed by column name.
- Parameters:
- Returns:
dict[str, np.ndarray]: Dictionary mapping column names to arrays
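As an illustrative sketch, the corresponding public Dataset.columns and Dataset.array methods delegate to these interface methods:

```python
import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(3), 'y': np.array([10., 20., 30.])}, ['x'], ['y'])

cols = ds.columns()           # dict of 1D arrays keyed by dimension name
arr = ds.array(['x', 'y'])    # 2D numpy array with the selected dimensions as columns
print(list(cols), arr.shape)  # expected: ['x', 'y'] (3, 2)
```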
- classmethod compute(dataset)[source]#
Converts a lazy Dataset to a non-lazy, in-memory format.
- Parameters:
- dataset (Dataset): The dataset to compute
- Returns:
Dataset: Dataset with non-lazy data
Notes
This is a no-op if the data is already non-lazy.
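A hedged sketch of computing a lazy Dataset through the public Dataset.compute method, assuming the optional dask dependency is installed:

```python
import numpy as np
import pandas as pd
import dask.dataframe as dd  # assumption: dask is installed
import holoviews as hv

df = pd.DataFrame({'x': np.arange(10), 'y': np.random.rand(10)})
lazy_ds = hv.Dataset(dd.from_pandas(df, npartitions=2), ['x'], ['y'])

# Delegates to Interface.compute on the dask-backed interface;
# a no-op if the data is already in memory
eager_ds = lazy_ds.compute()
print(type(eager_ds.data))  # expected to be an in-memory (pandas) object
```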
- classmethod concatenate(datasets, datatype=None, new_type=None)[source]#
Utility function to concatenate an NdMapping of Dataset objects.
- classmethod dframe(dataset, dimensions)[source]#
Returns the data as a pandas.DataFrame containing the selected dimensions.
- classmethod dtype(dataset, dimension)[source]#
Returns the dtype for the selected dimension.
- Parameters:
- dataset (Dataset): The dataset to query
- dimension (str or Dimension): Dimension to return the dtype for
- Returns:
numpy.dtype: The dtype of the selected dimension
- classmethod has_holes(dataset)[source]#
Whether the Dataset contains geometries with holes.
- Parameters:
- dataset (Dataset): The dataset to check
- Returns:
bool: Whether the Dataset contains geometries with holes
Notes
Only meaningful to implement on Interfaces that support geometry data.
- classmethod histogram(array, bins, density=True, weights=None)[source]#
Computes the histogram on the dimension values with support for specific bins, normalization and weighting.
- Parameters:
- array (array_like): In-memory representation of the dimension values
- bins (np.ndarray | int): An array of bins or the number of bins
- density (bool, default True): Whether to normalize the histogram
- weights (array_like, optional): In-memory representation of the weighting
- Returns:
tuple[np.ndarray, np.ndarray]: Tuple of (histogram values, bin edges)
Notes
Usually the dimension_values and weights are assumed to be arrays but each interface should support data stored in whatever format it uses to store dimensions internally.
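An illustrative sketch of calling the base implementation on a plain NumPy array (assuming the default base Interface behaviour; most user code would instead use the histogram operation):

```python
import numpy as np
from holoviews.core.data.interface import Interface

values = np.random.normal(size=1000)
# Compute a normalized 20-bin histogram of the values
hist, edges = Interface.histogram(values, bins=20, density=True)
print(hist.shape, edges.shape)  # expected: (20,) (21,)
```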
- classmethod holes(dataset)[source]#
Returns a list of lists of arrays containing the holes for each geometry in the Dataset.
- Parameters:
- dataset (Dataset): The dataset to extract holes from
- Returns:
list[list[np.ndarray]]: List of lists of arrays representing geometry holes
Notes
Only meaningful to implement on Interfaces that support geometry data.
- classmethod iloc(dataset, index)[source]#
Implements integer indexing on the rows and columns of the data.
- Parameters:
- Returns:
data: Indexed data
Notes
Only implement for tabular interfaces.
- classmethod indexed(dataset, selection)[source]#
Given a Dataset object and a selection to be applied, returns a boolean indicating whether a scalar value has been indexed.
- classmethod isunique(dataset, dim, per_geom=False)[source]#
Whether the selected dimension has only a single unique value.
Compatibility method introduced for v1.13.0 to smooth over the addition of the per_geom kwarg to the isscalar method.
- classmethod length(dataset)[source]#
Returns the number of rows in the Dataset.
- Parameters:
- dataset (Dataset): The dataset to get the length from
- Returns:
int: Length of the data
- classmethod nonzero(dataset)[source]#
Returns a boolean indicating whether the Dataset contains any data.
- Parameters:
- dataset (Dataset): The dataset to check
- Returns:
bool: Whether the dataset is not empty
- classmethod persist(dataset)[source]#
Persists the data backing the Dataset in memory.
- Parameters:
- dataset (Dataset): The dataset to persist
- Returns:
Dataset: Dataset with the data persisted to memory
Notes
This is a no-op if the data is already in memory.
- classmethod range(dataset, dimension)[source]#
Computes the minimum and maximum value along a dimension.
- Parameters:
- dataset (Dataset): The dataset to query
- dimension (str or Dimension): Dimension to compute the range on
- Returns:
tuple[Any, Any]: Tuple of (min, max) values
Notes
In the past categorical and string columns were handled by sorting the values and taking the first and last value. This behavior is deprecated and will be removed in 2.0. In future the range for these columns will be returned as (None, None).
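For illustration, the public Dataset.range method exposes this per-dimension range computation:

```python
import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.array([3, 1, 2]), 'y': np.array([0.5, 1.5, 1.0])}, ['x'], ['y'])
print(ds.range('x'))  # expected: (1, 3)
print(ds.range('y'))  # expected: (0.5, 1.5)
```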
- classmethod redim(dataset, dimensions)[source]#
Renames dimensions in the data.
- Parameters:
- Returns:
data: Data after the dimension names have been transformed
Notes
Only meaningful for data formats that store dimension names.
- classmethod reduce(dataset, reduce_dims, function, **kwargs)[source]#
Reduces one or more dimensions using the supplied reduction function.
- classmethod reindex(dataset, kdims, vdims)[source]#
Reindexes data given new key and value dimensions.
- classmethod replace_value(data, nodata)[source]#
Replace nodata value in data with NaN.
- Parameters:
- data (np.ndarray): The data array
- nodata (number): The nodata value to replace
- Returns:
np.ndarray: Array with the nodata value replaced with NaN
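A small illustrative sketch, assuming a plain NumPy integer array as input:

```python
import numpy as np
from holoviews.core.data.interface import Interface

data = np.array([0, 1, 2, 0, 4])
cleaned = Interface.replace_value(data, 0)
# Expected: NaN where the array equalled the nodata value (0); the integer
# input is assumed to be promoted to float so NaN can be represented
print(cleaned)
```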
- classmethod select_mask(dataset, selection)[source]#
Given a Dataset object and a dictionary with dimension keys and selection keys (i.e. tuple ranges, slices, sets, lists, or literals) return a boolean mask over the rows in the Dataset object that have been selected.
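An illustrative sketch of obtaining such a mask via the Dataset's active interface; the dimension name and range are made up for the example:

```python
import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(10), 'y': np.random.rand(10)}, ['x'], ['y'])

# Boolean mask over the rows selected by the range on 'x'
mask = ds.interface.select_mask(ds, {'x': (2, 5)})
print(mask.dtype, len(mask))  # expected: bool 10

# The same selection expressed through the public API
subset = ds.select(x=(2, 5))
```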
- classmethod validate(dataset, vdims=True)[source]#
Validation runs after the Dataset has been constructed and should validate that the Dataset is correctly formed and contains all declared dimensions.
- classmethod values(dataset, dimension, expanded=True, flat=True, compute=True, keep_index=False)[source]#
Returns the values along a dimension of the dataset.
- Parameters:
- dataset (Dataset): The dataset to query
- dimension (str or Dimension): Dimension to return the values for
- expanded (bool, default True): When False, returns unique values along the dimension
- flat (bool, default True): Whether to flatten the array
- compute (bool, default True): Whether to load lazy data into memory as a NumPy array
- keep_index (bool, default False): Whether to return the data with an index (if present)
- Returns:
array_like: Dimension values in the requested format
Notes
The expanded keyword has different behavior for gridded interfaces where it determines whether 1D coordinates are expanded into a multi-dimensional array.
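For illustration, the public Dataset.dimension_values method exposes this behaviour, including the expanded keyword:

```python
import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.array([1, 1, 2, 2]), 'y': np.arange(4)}, ['x'], ['y'])

print(ds.dimension_values('x'))                  # expected: [1 1 2 2]
print(ds.dimension_values('x', expanded=False))  # expected: [1 2] (unique values only)
```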
- class holoviews.core.data.interface.iloc(dataset)[source]#
Bases: Accessor
iloc is a small wrapper object that allows row- and column-based indexing into a Dataset using the .iloc property. It supports the usual numpy and pandas iloc indexing semantics including integer indices, slices, lists and arrays of values. For more information see the Dataset.iloc property docstring.
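An illustrative sketch of the accessor on a small tabular Dataset (column positions and rows are arbitrary examples):

```python
import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(5), 'y': np.random.rand(5)}, ['x'], ['y'])

print(ds.iloc[1:3])        # rows 1-2 as a new Dataset
print(ds.iloc[[0, 4], 1])  # rows 0 and 4 of the second column
print(ds.iloc[0, 0])       # a single scalar value
```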
- class holoviews.core.data.interface.ndloc(dataset)[source]#
Bases: Accessor
ndloc is a small wrapper object that allows ndarray-like indexing for gridded Datasets using the .ndloc property. It supports the standard NumPy ndarray indexing semantics including integer indices, slices, lists and arrays of values. For more information see the Dataset.ndloc property docstring.
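An illustrative sketch on a small gridded Dataset; the coordinate arrays are made up for the example and indexing is assumed to follow the underlying array's (y, x) order:

```python
import numpy as np
import holoviews as hv

xs = np.linspace(0, 1, 4)
ys = np.linspace(0, 1, 3)
zz = np.random.rand(3, 4)  # gridded values shaped (len(ys), len(xs))
grid = hv.Dataset((xs, ys, zz), ['x', 'y'], 'z')

print(grid.ndloc[0, 0])     # value at the first array index along each axis
print(grid.ndloc[:2, 1:3])  # a sub-grid selected by array index, not coordinate value
```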