holoviews.core.data.cudf module

class holoviews.core.data.cudf.cuDFInterface(*, name)

Bases: PandasInterface

The cuDFInterface allows a Dataset object to wrap a cuDF DataFrame. Using cuDF allows working with columnar data on a GPU. Most operations leave the data in GPU memory; however, to plot the data, it has to be copied into host (CPU) memory.

The cuDFInterface covers almost the complete API exposed by the PandasInterface with two notable exceptions:

  1. Aggregation and groupby do not have a consistent sort order (see rapidsai/cudf#4237)

  2. Not all functions can be easily applied to a cuDF DataFrame, so some functions passed to aggregate and reduce will not work.
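Because of the first exception, code that compares or displays aggregated results may want to sort them explicitly. The following is a minimal pure-Python sketch of that idea only; `aggregate_mean` is a hypothetical helper, not part of HoloViews or cuDF, and its output order is merely treated as unspecified to mimic the GPU behavior.

```python
from collections import defaultdict


def aggregate_mean(rows, key, value):
    """Group dict rows by `key` and average `value` (hypothetical helper).

    The order of the returned groups is treated as unspecified.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return [(k, sum(v) / len(v)) for k, v in groups.items()]


rows = [
    {"kind": "b", "score": 2.0},
    {"kind": "a", "score": 1.0},
    {"kind": "b", "score": 4.0},
]

# Sort by the grouping key so downstream code sees a deterministic order.
result = sorted(aggregate_mean(rows, "kind", "score"))
```

Sorting after the aggregation keeps the expensive grouping step unchanged and only normalizes the order of the (usually small) result.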

Methods

add_dimension(dataset, dimension, dim_pos, ...)

Returns a copy of the data with the dimension values added.

applies(obj)

Indicates whether the interface is designed specifically to handle the supplied object's type.

dframe(dataset, dimensions)

Returns the data as a pandas.DataFrame containing the selected dimensions.

iloc(dataset, index)

Implements integer indexing on the rows and columns of the data.

loaded()

Indicates whether the required dependencies are loaded.

range(dataset, dimension)

Computes the minimum and maximum value along a dimension.

select_mask(dataset, selection)

Given a Dataset object and a dictionary mapping dimension keys to selection specifications (i.e. tuple ranges, slices, sets, lists, or literals), returns a boolean mask over the rows of the Dataset object that have been selected.

values(dataset, dim[, expanded, flat, ...])

Returns the values along a dimension of the dataset.

aggregate

concat_fn

groupby

init

select

sort


classmethod add_dimension(dataset, dimension, dim_pos, values, vdim)

Returns a copy of the data with the dimension values added.

Parameters:
dataset : Dataset

The Dataset to add the dimension to

dimension : Dimension

The dimension to add

dim_pos : int

The position in the data to add it to

values : array_like

The array of values to add

vdim : bool

Whether the data is a value dimension

Returns:
data

A copy of the data with the new dimension
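The copy-and-insert behavior described above can be illustrated with a small pure-Python sketch. It is not the HoloViews implementation: the column-mapping representation and the `add_column` helper are assumptions made for illustration, and the `vdim` flag is omitted because the sketch does not track key versus value dimensions.

```python
def add_column(columns, name, dim_pos, values):
    """Return a new column mapping with `name` inserted at `dim_pos`.

    `columns` maps column names to lists; the input is left unmodified.
    """
    n_rows = len(next(iter(columns.values()), values))
    if len(values) != n_rows:
        raise ValueError("values must match the number of rows")
    items = [(k, list(v)) for k, v in columns.items()]  # copy existing columns
    items.insert(dim_pos, (name, list(values)))
    return dict(items)


data = {"x": [0, 1, 2], "y": [10, 20, 30]}
new_data = add_column(data, "label", 1, ["a", "b", "c"])
```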

classmethod applies(obj)

Indicates whether the interface is designed specifically to handle the supplied object's type. By default, this simply checks whether the object is one of the types declared on the class; if the type is expensive to import at load time, the method may be overridden.
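One common way to override such a check without importing an expensive dependency is to test whether the module has already been imported. The sketch below shows that pattern under stated assumptions: `LazyInterface` and `MissingInterface` are hypothetical classes, and the stdlib `fractions` module stands in for the heavy backend. It is not claimed to be the exact code used by cuDFInterface.

```python
import sys
from fractions import Fraction  # user code importing the "backend"


class LazyInterface:
    """Hypothetical interface that never imports its backend itself."""

    module_name = "fractions"  # stand-in for an expensive dependency
    type_name = "Fraction"

    @classmethod
    def loaded(cls):
        # True only if the backend module has already been imported.
        return cls.module_name in sys.modules

    @classmethod
    def applies(cls, obj):
        if not cls.loaded():
            return False  # no instance of the backend type can exist yet
        backend = sys.modules[cls.module_name]
        return isinstance(obj, getattr(backend, cls.type_name))


class MissingInterface(LazyInterface):
    module_name = "no_such_backend_module"  # never imported
```

With this pattern, `applies` can return False cheaply whenever the backend has not been imported, since no object of the backend's type could have been created.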

classmethod dframe(dataset, dimensions)

Returns the data as a pandas.DataFrame containing the selected dimensions.

Parameters:
dataset : Dataset

The dataset to convert

dimensions : list[str]

List of dimensions to include

Returns:
DataFrame

A pandas DataFrame containing the selected dimensions

classmethod iloc(dataset, index)

Implements integer indexing on the rows and columns of the data.

Parameters:
dataset : Dataset

The dataset to apply the indexing operation on

index : tuple or int

Index specification (row_index, col_index) or row_index

Returns:
data

Indexed data

Notes

Implemented only for tabular interfaces.
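The two accepted index forms, `row_index` alone or `(row_index, col_index)`, can be illustrated on a plain list of row tuples. This is a simplified sketch rather than the library's code: the `iloc_rows` helper is hypothetical, and it always returns a list of row tuples, even when a single cell is requested.

```python
def iloc_rows(rows, index):
    """Index a list of row tuples by row_index or (row_index, col_index)."""
    if isinstance(index, tuple):
        row_idx, col_idx = index
    else:
        row_idx, col_idx = index, slice(None)  # no column spec: keep all columns
    selected = rows[row_idx] if isinstance(row_idx, slice) else [rows[row_idx]]
    if isinstance(col_idx, slice):
        return [tuple(row[col_idx]) for row in selected]
    return [(row[col_idx],) for row in selected]


table = [(0, "a", 1.5), (1, "b", 2.5), (2, "c", 3.5)]
first_two = iloc_rows(table, slice(0, 2))   # rows 0 and 1, all columns
cell = iloc_rows(table, (1, 2))             # row 1, column 2
```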

classmethod loaded()

Indicates whether the required dependencies are loaded.

classmethod range(dataset, dimension)

Computes the minimum and maximum value along a dimension.

Parameters:
dataset : Dataset

The dataset to query

dimension : str or Dimension

Dimension to compute the range on

Returns:
tuple[Any, Any]

Tuple of (min, max) values

Notes

In the past, categorical and string columns were handled by sorting the values and taking the first and last values. This behavior is deprecated and will be removed in 2.0. In the future, the range for these columns will be returned as (None, None).
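A minimal pure-Python sketch of the planned behavior described in the note follows. The `dimension_range` helper is hypothetical, and returning (None, None) for an empty column is an assumption of this sketch, not documented behavior.

```python
from numbers import Real


def dimension_range(values):
    """Return (min, max) for numeric values, (None, None) otherwise."""
    values = list(values)
    numeric = all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    if not values or not numeric:
        # Non-numeric (e.g. string) columns have no meaningful range here;
        # the empty case is an assumption of this sketch.
        return (None, None)
    return (min(values), max(values))
```

For example, `dimension_range([3, 1, 2])` yields the tuple of the smallest and largest values, while a column of strings yields (None, None) instead of its first and last sorted entries.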

classmethod select_mask(dataset, selection)

Given a Dataset object and a dictionary mapping dimension keys to selection specifications (i.e. tuple ranges, slices, sets, lists, or literals), returns a boolean mask over the rows of the Dataset object that have been selected.
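The sketch below shows how each kind of selection specification could contribute to a row mask, using plain Python lists in place of GPU columns. The `row_mask` helper is hypothetical. Treating tuple ranges and slices as half-open (lower bound inclusive, upper bound exclusive) and combining the per-dimension masks with a logical AND are assumptions of this sketch that should be checked against the library.

```python
def row_mask(columns, selection):
    """Build a boolean row mask from {dimension: selection spec}."""
    n_rows = len(next(iter(columns.values())))
    mask = [True] * n_rows
    for dim, sel in selection.items():
        if isinstance(sel, tuple):
            sel = slice(*sel)  # treat (low, high) like a slice
        for i, value in enumerate(columns[dim]):
            if isinstance(sel, slice):
                ok = ((sel.start is None or sel.start <= value)
                      and (sel.stop is None or value < sel.stop))
            elif isinstance(sel, (set, list)):
                ok = value in sel       # membership test
            else:
                ok = value == sel       # literal match
            mask[i] = mask[i] and ok    # AND across dimensions
    return mask


columns = {"x": [0, 1, 2, 3], "kind": ["a", "b", "a", "b"]}
range_mask = row_mask(columns, {"x": (1, 3)})
combined_mask = row_mask(columns, {"x": slice(1, None), "kind": "b"})
```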

classmethod values(dataset, dim, expanded=True, flat=True, compute=True, keep_index=False)

Returns the values along a dimension of the dataset.

Parameters:
dataset : Dataset

The dataset to query

dim : str or Dimension

Dimension to return the values for

expanded : bool, default True

When False, returns only the unique values along the dimension

flat : bool, default True

Whether to flatten the array

compute : bool, default True

Whether to load lazy data into memory as a NumPy array

keep_index : bool, default False

Whether to return the data with an index (if present)

Returns:
array_like

Dimension values in the requested format

Notes

The expanded keyword has different behavior for gridded interfaces where it determines whether 1D coordinates are expanded into a multi-dimensional array.
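For tabular data, the effect of `expanded` can be sketched in a few lines of pure Python. The `column_values` helper is hypothetical, and returning the unique values in first-seen order is an assumption of this sketch; the library may order them differently.

```python
def column_values(columns, dim, expanded=True):
    """Return all values of a column, or only its unique values."""
    column = columns[dim]
    if expanded:
        return list(column)  # one value per row
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(column))


columns = {"kind": ["a", "b", "a"], "x": [0, 1, 2]}
all_kinds = column_values(columns, "kind")
unique_kinds = column_values(columns, "kind", expanded=False)
```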