Compute the multi-dimensional non-normalized (i.e., without z-normalization) matrix profile with a `dask` cluster. This is a highly distributed implementation around the Numba JIT-compiled parallelized `_maamp` function, which computes the multi-dimensional matrix profile according to STOMP. Note that only self-joins are supported.
(
dask_client,
T_A,
T_B,
m,
excl_zone,
T_A_subseq_isfinite,
T_B_subseq_isfinite,
p,
include,
discords,
)
| 12 | |
| 13 | |
| 14 | def _dask_maamped( |
| 15 | dask_client, |
| 16 | T_A, |
| 17 | T_B, |
| 18 | m, |
| 19 | excl_zone, |
| 20 | T_A_subseq_isfinite, |
| 21 | T_B_subseq_isfinite, |
| 22 | p, |
| 23 | include, |
| 24 | discords, |
| 25 | ): |
| 26 | """ |
| 27 | Compute the multi-dimensional non-normalized (i.e., without z-normalization) matrix |
| 28 | profile with a `dask` cluster |
| 29 | |
| 30 | This is a highly distributed implementation around the Numba JIT-compiled |
| 31 | parallelized `_maamp` function which computes the multi-dimensional matrix |
| 32 | profile according to STOMP. Note that only self-joins are supported. |
| 33 | |
| 34 | Parameters |
| 35 | ---------- |
| 36 | dask_client : client |
| 37 | A `dask` client. Setting up a cluster is beyond the scope of this library. |
| 38 | Please refer to the `dask` documentation. |
| 39 | |
| 40 | T_A : numpy.ndarray |
| 41 | The time series or sequence for which to compute the multi-dimensional |
| 42 | matrix profile. Each row in `T_A` holds the data for a single dimension, |
| 43 | while each column in `T_A` holds the values of all dimensions at a single |
| 44 | time point. |
| 45 | |
| 46 | T_B : numpy.ndarray |
| 47 | The time series or sequence that will be used to annotate `T_A`. For every |
| 48 | subsequence in `T_A`, its nearest neighbor in `T_B` will be recorded. |
| 49 | |
| 50 | m : int |
| 51 | Window size |
| 52 | |
| 53 | excl_zone : int |
| 54 | The half width for the exclusion zone relative to the current |
| 55 | sliding window |
| 56 | |
| 57 | T_A_subseq_isfinite : numpy.ndarray |
| 58 | A boolean array that is `False` where a subsequence in `T_A` contains a |
| 59 | `np.nan`/`np.inf` value and `True` otherwise |
| 60 | |
| 61 | T_B_subseq_isfinite : numpy.ndarray |
| 62 | A boolean array that is `False` where a subsequence in `T_B` contains a |
| 63 | `np.nan`/`np.inf` value and `True` otherwise |
| 64 | |
| 65 | p : float |
| 66 | The p-norm to apply for computing the Minkowski distance. Minkowski distance is |
| 67 | typically used with `p` being 1 or 2, which correspond to the Manhattan distance |
| 68 | and the Euclidean distance, respectively. |
| 69 | |
| 70 | include : numpy.ndarray |
| 71 | A list of (zero-based) indices corresponding to the dimensions in `T` that |
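
The roles of `m`, `excl_zone`, `p`, and the `*_subseq_isfinite` flags described in the docstring can be illustrated with a brute-force, single-dimensional sketch. This is not how `_maamp` or `_dask_maamped` is implemented. The helper names `subseq_isfinite` and `distance_profile` are hypothetical, and NumPy 1.20 or later is assumed for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def subseq_isfinite(T, m):
    """True where the length-m window starting at each index holds only
    finite values; False where it contains any np.nan/np.inf.
    (Hypothetical helper, for illustration only.)"""
    return sliding_window_view(np.isfinite(T), m).all(axis=1)


def distance_profile(T, i, m, excl_zone, p=2.0):
    """Brute-force, non-normalized Minkowski (p-norm) distances from the
    subsequence starting at index i to every subsequence of T (self-join).
    (Hypothetical helper, for illustration only.)"""
    windows = sliding_window_view(T, m)
    ok = subseq_isfinite(T, m)

    # Subsequences containing nan/inf keep an infinite distance.
    D = np.full(len(windows), np.inf)
    if ok[i]:
        diffs = np.abs(windows[ok] - windows[i])
        D[ok] = np.sum(diffs**p, axis=1) ** (1.0 / p)

    # Exclusion zone: ignore trivial matches within excl_zone of index i.
    start = max(0, i - excl_zone)
    stop = min(len(D), i + excl_zone + 1)
    D[start:stop] = np.inf
    return D


T = np.array([0.0, 1.0, 2.0, np.nan, 4.0, 5.0, 6.0, 7.0])
D_manhattan = distance_profile(T, i=0, m=2, excl_zone=1, p=1.0)
D_euclidean = distance_profile(T, i=0, m=2, excl_zone=1, p=2.0)
print(subseq_isfinite(T, 2))
print(D_manhattan)
print(D_euclidean)
```

With `p=1.0` the distances are Manhattan distances and with `p=2.0` they are Euclidean distances. No z-normalization is applied to the subsequences before they are compared, which is what "non-normalized" refers to.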