hub / github.com/dask/dask / histogram

Function histogram

dask/array/routines.py:861–1067  ·  view source on GitHub ↗


(a, bins=None, range=None, normed=False, weights=None, density=None)
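Although the signature defaults both ``bins`` and ``range`` to ``None``, the docstring below states that an integer ``bins`` needs an explicit ``range``, because computing the minimum and maximum of a blocked array is expensive and has to be requested by the caller. The following is a minimal sketch of that pattern, not taken from the dask sources; the sample values and the names ``values``, ``lo``, and ``hi`` are illustrative assumptions.

```python
import numpy as np
import dask
import dask.array as da

# Illustrative data (an assumption for this sketch), split into two chunks.
values = np.array([2.0, 3.5, 5.0, 7.25, 8.0, 10.0])
x = da.from_array(values, chunks=3)

# Compute the data bounds explicitly, in a single pass over the chunks.
lo, hi = dask.compute(x.min(), x.max())

# Pass the bounds as ``range`` together with an integer bin count.
hist, edges = da.histogram(x, bins=4, range=(lo, hi))

# ``dask.compute`` evaluates lazy inputs and passes concrete ones through.
counts, edges_np = dask.compute(hist, edges)
print(counts)
print(edges_np)
```

Computing ``min`` and ``max`` in one ``dask.compute`` call lets the two reductions share a single read of the data.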


def histogram(a, bins=None, range=None, normed=False, weights=None, density=None):
    """
    Blocked variant of :func:`numpy.histogram`.

    Parameters
    ----------
    a : dask.array.Array
        Input data; the histogram is computed over the flattened
        array. If the ``weights`` argument is used, the chunks of
        ``a`` are accessed to check chunking compatibility between
        ``a`` and ``weights``. If ``weights`` is ``None``, a
        :py:class:`dask.dataframe.Series` object can be passed as
        input data.
    bins : int or sequence of scalars, optional
        Either an iterable specifying the ``bins`` or the number of ``bins``
        and a ``range`` argument is required as computing ``min`` and ``max``
        over blocked arrays is an expensive operation that must be performed
        explicitly.
        If `bins` is an int, it defines the number of equal-width
        bins in the given range (10, by default). If `bins` is a
        sequence, it defines a monotonically increasing array of bin edges,
        including the rightmost edge, allowing for non-uniform bin widths.
    range : (float, float), optional
        The lower and upper range of the bins. If not provided, range
        is simply ``(a.min(), a.max())``. Values outside the range are
        ignored. The first element of the range must be less than or
        equal to the second. `range` affects the automatic bin
        computation as well. While bin width is computed to be optimal
        based on the actual data within `range`, the bin count will fill
        the entire range including portions containing no data.
    normed : bool, optional
        This is equivalent to the ``density`` argument, but produces incorrect
        results for unequal bin widths. It should not be used.
    weights : dask.array.Array, optional
        A dask.array.Array of weights, of the same block structure as ``a``. Each value in
        ``a`` only contributes its associated weight towards the bin count
        (instead of 1). If ``density`` is True, the weights are
        normalized, so that the integral of the density over the range
        remains 1.
    density : bool, optional
        If ``False``, the result will contain the number of samples in
        each bin. If ``True``, the result is the value of the
        probability *density* function at the bin, normalized such that
        the *integral* over the range is 1. Note that the sum of the
        histogram values will not be equal to 1 unless bins of unity
        width are chosen; it is not a probability *mass* function.
        Overrides the ``normed`` keyword if given.
        If ``density`` is True, ``bins`` cannot be a single-number delayed
        value. It must be a concrete number, or a (possibly-delayed)
        array/sequence of the bin edges.

    Returns
    -------
    hist : dask Array
        The values of the histogram. See `density` and `weights` for a
        description of the possible semantics.
    bin_edges : dask Array of dtype float
        Return the bin edges ``(length(hist)+1)``.
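
The parameters described above can be exercised with a short usage sketch. It is a hedged illustration rather than an excerpt from the dask test suite: the sample data, the weights, and every variable name are assumptions chosen so the results are easy to check by hand. It shows plain counts with a fixed ``range``, weighted counts where ``weights`` shares the chunking of ``a``, and a density computed over explicit bin edges.

```python
import numpy as np
import dask
import dask.array as da

# Illustrative inputs (assumptions for this sketch). The value 1.5 lies
# outside the range used below and is therefore ignored.
data = np.array([0.1, 0.2, 0.2, 0.6, 0.9, 1.5])
wts = np.array([1.0, 2.0, 1.0, 1.0, 1.0, 5.0])

# ``weights`` must have the same block structure as ``a``,
# so both arrays use identical chunks.
x = da.from_array(data, chunks=2)
w = da.from_array(wts, chunks=2)

# Integer bin count plus an explicit range.
hist, edges = da.histogram(x, bins=4, range=(0.0, 1.0))

# Each sample contributes its weight instead of 1.
hist_w, _ = da.histogram(x, bins=4, range=(0.0, 1.0), weights=w)

# Explicit bin edges; density=True normalizes the integral to 1.
edge_list = [0.0, 0.5, 1.0]
dens, _ = da.histogram(x, bins=edge_list, density=True)

# Nothing is computed until this point; one call evaluates all results.
counts, edges_np, weighted, density = dask.compute(hist, edges, hist_w, dens)

print(counts)
print(weighted)
print(density)
```

Because the density is a probability density rather than a mass function, it is ``density * np.diff(edge_list)`` that sums to 1, not ``density`` itself.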

Callers

nothing calls this directly

Calls 14

sum · Method · 0.95
is_dask_collection · Function · 0.90
unpack_collections · Function · 0.90
Task · Class · 0.90
Array · Class · 0.90
asarray · Function · 0.90
TaskRef · Class · 0.90
flatten · Function · 0.90
from_collections · Method · 0.80
diff · Method · 0.80
tokenize · Function · 0.50
ndim · Method · 0.45

Tested by

no test coverage detected
