holoviews.core.data.interface module#

exception holoviews.core.data.interface.DataError(msg, interface=None)[source]#

Bases: ValueError

DataError is raised when the data cannot be interpreted

class holoviews.core.data.interface.Interface(*, name)[source]#

Bases: Parameterized

Attributes:
datatype

Methods

add_dimension(dataset, dimension, dim_pos, ...)

Returns a copy of the data with the dimension values added.

applies(obj)

Indicates whether the interface is designed specifically to handle the supplied object's type.

array(dataset, dimensions)

Returns the data as a numpy.ndarray containing the selected dimensions.

as_dframe(dataset)

Returns the data of a Dataset as a dataframe, avoiding a copy if it is already a dataframe type.

assign(dataset, new_data)

Adds a dictionary containing data for multiple new dimensions to a copy of the dataset.data.

cast(datasets[, datatype, cast_type])

Given a list of Dataset objects, casts them to the specified datatype (by default the format matching the current interface) with the given cast_type (if specified).

columns(dataset, dimensions)

Returns the data as a dictionary of 1D arrays indexed by column name.

compute(dataset)

Converts a lazy Dataset to a non-lazy, in-memory format.

concatenate(datasets[, datatype, new_type])

Utility function to concatenate an NdMapping of Dataset objects.

dframe(dataset, dimensions)

Returns the data as a pandas.DataFrame containing the selected dimensions.

dtype(dataset, dimension)

Returns the dtype for the selected dimension.

error()

Error message raised if the interface could not resolve the data.

has_holes(dataset)

Whether the Dataset contains geometries with holes.

histogram(array, bins[, density, weights])

Computes the histogram on the dimension values with support for specific bins, normalization and weighting.

holes(dataset)

Returns a list of lists of arrays containing the holes for each geometry in the Dataset.

iloc(dataset, index)

Implements integer indexing on the rows and columns of the data.

indexed(dataset, selection)

Given a Dataset object and a selection to apply, returns a boolean indicating whether a scalar value has been indexed.

isscalar(dataset, dim)

Whether the selected dimension is a scalar value.

isunique(dataset, dim[, per_geom])

Whether the selected dimension has only a single unique value.

length(dataset)

Returns the number of rows in the Dataset.

loaded()

Indicates whether the required dependencies are loaded.

nonzero(dataset)

Returns a boolean indicating whether the Dataset contains any data.

persist(dataset)

Persists the data backing the Dataset in memory.

range(dataset, dimension)

Computes the minimum and maximum value along a dimension.

redim(dataset, dimensions)

Renames dimensions in the data.

reduce(dataset, reduce_dims, function, **kwargs)

Reduces one or more dimensions using the supplied reduction function.

register(interface)

Registers a new Interface.

reindex(dataset, kdims, vdims)

Reindexes data given new key and value dimensions.

replace_value(data, nodata)

Replaces the nodata value in the data with NaN.

select_mask(dataset, selection)

Given a Dataset object and a dictionary with dimension keys and selection keys (i.e. tuple ranges, slices, sets, lists, or literals), returns a boolean mask over the rows in the Dataset object that have been selected.

shape(dataset)

Returns the shape of the data.

validate(dataset[, vdims])

Validation runs after the Dataset has been constructed and should validate that the Dataset is correctly formed and contains all declared dimensions.

values(dataset, dimension[, expanded, flat, ...])

Returns the values along a dimension of the dataset.

aggregate

expanded

geom_type

groupby

initialize

mask

sample

select

sort

Parameter Definitions


classmethod add_dimension(dataset, dimension, dim_pos, values, vdim)[source]#

Returns a copy of the data with the dimension values added.

Parameters:
dataset : Dataset

The Dataset to add the dimension to

dimension : Dimension

The dimension to add

dim_pos : int

The position in the data to add it to

values : array_like

The array of values to add

vdim : bool

Whether the data is a value dimension

Returns:
data

A copy of the data with the new dimension
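
The example below is a minimal sketch (with made-up data) showing the equivalent Dataset-level add_dimension call, which delegates to this interface method:

import numpy as np
import holoviews as hv

# Hypothetical tabular dataset with one key and one value dimension
ds = hv.Dataset({'x': np.arange(5), 'y': np.arange(5) * 2.0}, kdims=['x'], vdims=['y'])
# Insert a new value dimension 'z' at position 1 among the vdims
ds2 = ds.add_dimension('z', 1, np.ones(5), vdim=True)
print(ds2.vdims)  # [Dimension('y'), Dimension('z')]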

classmethod applies(obj)[source]#

Indicates whether the interface is designed specifically to handle the supplied object's type. By default this simply checks whether the object is one of the types declared on the class; however, if the type is expensive to import at load time the method may be overridden.
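
As a schematic sketch only (MyListInterface and its datatype name are hypothetical, and a working interface must implement many more of the classmethods documented on this page), a subclass might override applies like this:

from holoviews.core.data.interface import Interface

class MyListInterface(Interface):
    """Illustrative only; not a complete interface implementation."""

    datatype = 'mylist'   # hypothetical datatype name
    types = (list,)

    @classmethod
    def applies(cls, obj):
        # Claim plain Python lists without importing any heavy dependency
        return isinstance(obj, list)

# Registering it would make the datatype available to Dataset constructors:
# Interface.register(MyListInterface)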

classmethod array(dataset, dimensions)[source]#

Returns the data as a numpy.ndarray containing the selected dimensions.

Parameters:
dataset : Dataset

The dataset to convert

dimensions : list[str]

List of dimensions to include

Returns:
np.ndarray

A Numpy ndarray containing the selected dimensions

classmethod as_dframe(dataset)[source]#

Returns the data of a Dataset as a dataframe, avoiding a copy if it is already a dataframe type.

Parameters:
dataset : Dataset

The dataset to convert

Returns:
DataFrame

DataFrame representation of the data

classmethod assign(dataset, new_data)[source]#

Adds a dictionary containing data for multiple new dimensions to a copy of the dataset.data.

Parameters:
dataset : Dataset

The Dataset to add the dimension to

new_data : dict

Dictionary containing new data to add to the Dataset

Returns:
data

A copy of the data with the new data dimensions added

classmethod cast(datasets, datatype=None, cast_type=None)[source]#

Given a list of Dataset objects, casts them to the specified datatype (by default the format matching the current interface) with the given cast_type (if specified).

classmethod columns(dataset, dimensions)[source]#

Returns the data as a dictionary of 1D arrays indexed by column name.

Parameters:
dataset : Dataset

The dataset to convert

dimensions : list[str]

List of dimensions to include

Returns:
dict[str, np.ndarray]

Dictionary mapping column names to arrays
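
A small sketch (hypothetical data) of the Dataset-level export methods that dispatch to array, columns and dframe on the active interface:

import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(3), 'y': np.arange(3) * 2.0}, kdims=['x'], vdims=['y'])
arr = ds.array(['x', 'y'])   # numpy.ndarray with one column per dimension
cols = ds.columns(['y'])     # dict mapping 'y' to a 1D array
df = ds.dframe()             # pandas.DataFrame with columns 'x' and 'y'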

classmethod compute(dataset)[source]#

Converts a lazy Dataset to a non-lazy, in-memory format.

Parameters:
dataset : Dataset

The dataset to compute

Returns:
Dataset

Dataset with non-lazy data

Notes

This is a no-op if the data is already non-lazy.
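
A minimal sketch, assuming an in-memory dataset (where the call is a no-op); with a lazy backend such as a dask DataFrame the same call would materialize the data:

import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(3), 'y': np.arange(3.0)}, kdims=['x'], vdims=['y'])
computed = ds.interface.compute(ds)  # no-op here, returns an in-memory Dataset
print(type(computed.data))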

classmethod concatenate(datasets, datatype=None, new_type=None)[source]#

Utility function to concatenate an NdMapping of Dataset objects.

classmethod dframe(dataset, dimensions)[source]#

Returns the data as a pandas.DataFrame containing the selected dimensions.

Parameters:
dataset : Dataset

The dataset to convert

dimensions : list[str]

List of dimensions to include

Returns:
DataFrame

A pandas DataFrame containing the selected dimensions

classmethod dtype(dataset, dimension)[source]#

Returns the dtype for the selected dimension.

Parameters:
dataset : Dataset

The dataset to query

dimension : str or Dimension

Dimension to return the dtype for

Returns:
numpy.dtype

The dtype of the selected dimension

classmethod error()[source]#

Error message raised if the interface could not resolve the data.

classmethod has_holes(dataset)[source]#

Whether the Dataset contains geometries with holes.

Parameters:
dataset : Dataset

The dataset to check

Returns:
bool

Whether the Dataset contains geometries with holes

Notes

Only meaningful to implement on Interfaces that support geometry data.

classmethod histogram(array, bins, density=True, weights=None)[source]#

Computes the histogram on the dimension values with support for specific bins, normalization and weighting.

Parameters:
array : array_like

In memory representation of the dimension values

bins : np.ndarray | int

An array of bins or the number of bins

density : bool, default True

Whether to normalize the histogram

weights : array_like, optional

In memory representation of the weighting

Returns:
tuple[np.ndarray, np.ndarray]

Tuple of (histogram values, bin edges)

Notes

Usually the dimension values and weights are assumed to be arrays, but each interface should support data stored in whatever format it uses to store dimensions internally.
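
A minimal sketch, assuming the default NumPy-backed behavior of the base classmethod and using randomly generated values:

import numpy as np
from holoviews.core.data.interface import Interface

values = np.random.default_rng(0).normal(size=1000)
hist, edges = Interface.histogram(values, 20, density=True)
print(hist.shape, edges.shape)  # 20 bin counts and 21 bin edges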

classmethod holes(dataset)[source]#

Returns a list of lists of arrays containing the holes for each geometry in the Dataset.

Parameters:
dataset : Dataset

The dataset to extract holes from

Returns:
list[list[np.ndarray]]

List of list of arrays representing geometry holes

Notes

Only meaningful to implement on Interfaces that support geometry data.

classmethod iloc(dataset, index)[source]#

Implements integer indexing on the rows and columns of the data.

Parameters:
dataset : Dataset

The dataset to apply the indexing operation on

index : tuple or int

Index specification (row_index, col_index) or row_index

Returns:
data

Indexed data

Notes

Only implement for tabular interfaces.

classmethod indexed(dataset, selection)[source]#

Given a Dataset object and a selection to apply, returns a boolean indicating whether a scalar value has been indexed.

classmethod isscalar(dataset, dim)[source]#

Whether the selected dimension is a scalar value.

Parameters:
dataset : Dataset

The dataset to query

dim : str or Dimension

Dimension to check for scalar value

Returns:
bool

Whether the dimension is scalar

classmethod isunique(dataset, dim, per_geom=False)[source]#

Whether the selected dimension has only a single unique value.

Compatibility method introduced in v1.13.0 to smooth over the addition of the per_geom kwarg to the isscalar method.

Parameters:
dataset : Dataset

The dataset to query

dim : str or Dimension

Dimension to check for a single unique value

per_geom : bool, default False

Whether to check per geometry

Returns:
bool

Whether the dimension has only a single unique value

classmethod length(dataset)[source]#

Returns the number of rows in the Dataset.

Parameters:
dataset : Dataset

The dataset to get the length from

Returns:
int

Length of the data

classmethod loaded()[source]#

Indicates whether the required dependencies are loaded.

classmethod nonzero(dataset)[source]#

Returns a boolean indicating whether the Dataset contains any data.

Parameters:
dataset : Dataset

The dataset to check

Returns:
bool

Whether the dataset is not empty

classmethod persist(dataset)[source]#

Persists the data backing the Dataset in memory.

Parameters:
dataset : Dataset

The dataset to persist

Returns:
Dataset

Dataset with the data persisted to memory

Notes

This is a no-op if the data is already in memory.

classmethod range(dataset, dimension)[source]#

Computes the minimum and maximum value along a dimension.

Parameters:
dataset : Dataset

The dataset to query

dimension : str or Dimension

Dimension to compute the range on

Returns:
tuple[Any, Any]

Tuple of (min, max) values

Notes

In the past categorical and string columns were handled by sorting the values and taking the first and last value. This behavior is deprecated and will be removed in 2.0. In future the range for these columns will be returned as (None, None).
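
A minimal sketch (hypothetical data) using the Dataset-level range method, which dispatches to this classmethod:

import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(10), 'y': np.sin(np.arange(10))}, kdims=['x'], vdims=['y'])
print(ds.range('x'))  # (minimum, maximum) of the 'x' dimension, here (0, 9)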

classmethod redim(dataset, dimensions)[source]#

Renames dimensions in the data.

Parameters:
dataset : Dataset

The dataset to transform

dimensions : dict[str, str]

Dictionary mapping from old to new dimension names

Returns:
data

Data after the dimension names have been transformed

Notes

Only meaningful for data formats that store dimension names.
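
A minimal sketch (hypothetical data) of renaming a dimension through the Dataset-level redim accessor, which delegates to this classmethod:

import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(3), 'y': np.arange(3.0)}, kdims=['x'], vdims=['y'])
renamed = ds.redim(x='time')  # rename key dimension 'x' to 'time'
print(renamed.kdims)          # [Dimension('time')]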

classmethod reduce(dataset, reduce_dims, function, **kwargs)[source]#

Reduces one or more dimensions using the supplied reduction function.

Parameters:
dataset : Dataset

The dataset to reduce

reduce_dims : list

List of dimensions to reduce

function : str or ufunc

Reduction operation to apply

**kwargs

Additional keyword arguments

Returns:
Dataset

Dataset containing the reduced (or aggregated) data

classmethod register(interface)[source]#

Registers a new Interface.

classmethod reindex(dataset, kdims, vdims)[source]#

Reindexes data given new key and value dimensions.

classmethod replace_value(data, nodata)[source]#

Replaces the nodata value in the data with NaN.

Parameters:
data : np.ndarray

The data array

nodata : number

The nodata value to replace

Returns:
np.ndarray

Array with the nodata value replaced with NaN
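
A minimal sketch calling the classmethod directly on a small float array with a made-up nodata sentinel:

import numpy as np
from holoviews.core.data.interface import Interface

data = np.array([1.0, -9999.0, 3.0])
cleaned = Interface.replace_value(data, -9999)
print(cleaned)  # the -9999 entry is replaced with NaN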

classmethod select_mask(dataset, selection)[source]#

Given a Dataset object and a dictionary with dimension keys and selection keys (i.e. tuple ranges, slices, sets, lists, or literals), returns a boolean mask over the rows in the Dataset object that have been selected.

Parameters:
dataset : Dataset

The dataset to select from

selection : dict

Dictionary containing selections for each column

Returns:
ndarray of bool

Boolean array representing the selection mask
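
A minimal sketch (hypothetical data) showing the interface-level mask alongside the equivalent Dataset-level select call:

import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(5), 'y': np.arange(5) * 2.0}, kdims=['x'], vdims=['y'])
mask = ds.interface.select_mask(ds, {'x': (1, 3)})  # boolean mask over the rows
subset = ds.select(x=(1, 3))                        # same range selection at the Dataset level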

classmethod shape(dataset)[source]#

Returns the shape of the data.

Parameters:
dataset : Dataset

The dataset to get the shape from

Returns:
tuple[int, int]

The shape of the data (rows, cols)

classmethod validate(dataset, vdims=True)[source]#

Validation runs after the Dataset has been constructed and should validate that the Dataset is correctly formed and contains all declared dimensions.

classmethod values(dataset, dimension, expanded=True, flat=True, compute=True, keep_index=False)[source]#

Returns the values along a dimension of the dataset.

Parameters:
dataset : Dataset

The dataset to query

dimension : str or Dimension

Dimension to return the values for

expanded : bool, default True

When False, returns only the unique values along the dimension

flat : bool, default True

Whether to flatten the array

compute : bool, default True

Whether to load lazy data into memory as a NumPy array

keep_index : bool, default False

Whether to return the data with an index (if present)

Returns:
array_like

Dimension values in the requested format

Notes

The expanded keyword has different behavior for gridded interfaces where it determines whether 1D coordinates are expanded into a multi-dimensional array.
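
A minimal sketch (hypothetical data) of the Dataset-level dimension_values method, which forwards the expanded and flat keywords to this classmethod:

import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.array([0, 0, 1, 1]), 'y': np.arange(4.0)}, kdims=['x'], vdims=['y'])
ds.dimension_values('x')                  # expanded values: [0, 0, 1, 1]
ds.dimension_values('x', expanded=False)  # unique values: [0, 1]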

class holoviews.core.data.interface.iloc(dataset)[source]#

Bases: Accessor

iloc is a small wrapper object that allows row- and column-based indexing into a Dataset using the .iloc property. It supports the usual NumPy and pandas iloc indexing semantics, including integer indices, slices, lists and arrays of values. For more information see the Dataset.iloc property docstring.
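
A minimal sketch (hypothetical data) of the supported indexing styles:

import numpy as np
import holoviews as hv

ds = hv.Dataset({'x': np.arange(5), 'y': np.arange(5) * 2.0}, kdims=['x'], vdims=['y'])
ds.iloc[0]          # first row
ds.iloc[:3]         # first three rows
ds.iloc[[0, 2], 1]  # rows 0 and 2 of the second column ('y')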

class holoviews.core.data.interface.ndloc(dataset)[source]#

Bases: Accessor

ndloc is a small wrapper object that allows ndarray-like indexing for gridded Datasets using the .ndloc property. It supports the standard NumPy ndarray indexing semantics including integer indices, slices, lists and arrays of values. For more information see the Dataset.ndloc property docstring.
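
A minimal sketch (hypothetical data) of ndarray-style indexing on a gridded Dataset:

import numpy as np
import holoviews as hv

xs = np.linspace(0, 1, 4)
ys = np.linspace(0, 1, 3)
zz = np.random.default_rng(0).random((3, 4))  # rows correspond to y, columns to x
grid = hv.Dataset((xs, ys, zz), kdims=['x', 'y'], vdims=['z'])
grid.ndloc[0, :]     # the first row of samples along y, all x samples
grid.ndloc[:2, 1:3]  # a 2x2 sub-grid selected by array index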