hub / github.com/dask/dask / from_dask_array

Function from_dask_array

dask/dataframe/io/io.py:102–225

Create a Dask DataFrame from a Dask Array. Converts a 2d array into a DataFrame and a 1d array into a Series.

(x, columns=None, index=None, meta=None)

Source from the content-addressed store, hash-verified

def from_dask_array(x, columns=None, index=None, meta=None):
    """Create a Dask DataFrame from a Dask Array.

    Converts a 2d array into a DataFrame and a 1d array into a Series.

    Parameters
    ----------
    x : da.Array
    columns : list or string
        list of column names if DataFrame, single string if Series
    index : dask.dataframe.Index, optional
        An optional *dask* Index to use for the output Series or DataFrame.

        The default output index depends on whether `x` has any unknown
        chunks. If there are any unknown chunks, the output has ``None``
        for all the divisions (one per chunk). If all the chunks are known,
        a default index with known divisions is created.

        Specifying `index` can be useful if you're conforming a Dask Array
        to an existing dask Series or DataFrame, and you would like the
        indices to match.
    meta : object, optional
        An optional `meta` parameter can be passed for dask
        to specify the concrete dataframe type to be returned.
        By default, pandas DataFrame is used.

    Examples
    --------
    >>> import dask.array as da
    >>> import dask.dataframe as dd
    >>> x = da.ones((4, 2), chunks=(2, 2))
    >>> df = dd.io.from_dask_array(x, columns=['a', 'b'])
    >>> df.compute()
         a    b
    0  1.0  1.0
    1  1.0  1.0
    2  1.0  1.0
    3  1.0  1.0

    See Also
    --------
    dask.bag.to_dataframe: from dask.bag
    """
    meta = _meta_from_array(x, columns, index, meta=meta)

    name = "from-dask-array-" + tokenize(x, columns)
    graph_dependencies = [x]
    arrays_and_indices = [x.name, "ij" if x.ndim == 2 else "i"]
    numblocks = {x.name: x.numblocks}

    if index is not None:
        # An index is explicitly given by the caller, so we can pass it through to the
        # initializer after a few checks.
        if index.npartitions != x.numblocks[0]:
            msg = (
                "The index and array have different numbers of blocks. "
                f"({index.npartitions} != {x.numblocks[0]})"
            )

Callers 1

to_dask_dataframe · Method · 0.90

Calls 14

tokenize · Function · 0.90
BlockwiseDepDict · Class · 0.90
is_series_like · Function · 0.90
blockwise · Function · 0.90
from_graph · Function · 0.90
optimize · Function · 0.90
ensure_dict · Function · 0.90
key_split · Function · 0.90
pyarrow_strings_enabled · Function · 0.90
new_collection · Function · 0.90
_meta_from_array · Function · 0.85

Tested by

no test coverage detected
