Compute the non-normalized (i.e., without z-normalization) matrix profile with a `dask`/`ray` cluster. This is a highly distributed implementation around the Numba JIT-compiled, parallelized `_aamp` function, which computes the non-normalized matrix profile according to AAMP.
(client, T_A, m, T_B=None, ignore_trivial=True, p=2.0, k=1)
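To make concrete what a "non-normalized matrix profile" contains, the following is a minimal brute-force sketch of a self-join in plain Python. It is not the AAMP algorithm and not the distributed implementation described here: the function name `naive_aamp` is hypothetical, the exclusion-zone radius of `ceil(m / 4)` is an assumption, and it returns only the top-1 distances and indices (no left/right indices, no `k > 1`).

```python
import math


def naive_aamp(T, m, p=2.0):
    """Brute-force, non-normalized self-join matrix profile (illustration only)."""
    n_subs = len(T) - m + 1
    # Assumed exclusion-zone radius used to ignore trivial matches near i
    excl_zone = math.ceil(m / 4)
    P = [math.inf] * n_subs  # nearest-neighbor Minkowski distances
    I = [-1] * n_subs  # nearest-neighbor subsequence indices
    for i in range(n_subs):
        for j in range(n_subs):
            if abs(i - j) <= excl_zone:
                continue  # skip trivial matches
            # Minkowski (p-norm) distance on the raw, un-normalized values
            d = sum(abs(T[i + t] - T[j + t]) ** p for t in range(m)) ** (1.0 / p)
            if d < P[i]:
                P[i], I[i] = d, j
    return P, I


T = [0.0, 1.0, 3.0, 2.0, 9.0, 1.0, 3.0, 2.0, 7.0, 5.0]
P, I = naive_aamp(T, m=3)
```

In this toy series the subsequences starting at indices 1 and 5 are identical (`[1, 3, 2]`), so each is the other's nearest neighbor at distance zero. The quadratic loop is only practical for very short inputs, which is the cost the compiled and distributed implementation is meant to avoid.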

def aamped(client, T_A, m, T_B=None, ignore_trivial=True, p=2.0, k=1):
    """
    Compute the non-normalized (i.e., without z-normalization) matrix profile
    with a `dask`/`ray` cluster.

    This is a highly distributed implementation around the Numba JIT-compiled,
    parallelized `_aamp` function, which computes the non-normalized matrix profile
    according to AAMP.

    Parameters
    ----------
    client : client
        A `dask`/`ray` client. Setting up a cluster is beyond the scope of this library.
        Please refer to the `dask`/`ray` documentation.

    T_A : numpy.ndarray
        The time series or sequence for which to compute the matrix profile

    m : int
        Window size

    T_B : numpy.ndarray, default None
        The time series or sequence that will be used to annotate T_A. For every
        subsequence in T_A, its nearest neighbor in T_B will be recorded. Default is
        `None` which corresponds to a self-join.

    ignore_trivial : bool, default True
        Set to `True` if this is a self-join. Otherwise, for AB-join, set this
        to `False`. Default is `True`.

    p : float, default 2.0
        The p-norm to apply for computing the Minkowski distance. Minkowski distance is
        typically used with `p` being 1 or 2, which correspond to the Manhattan distance
        and the Euclidean distance, respectively.

    k : int, default 1
        The number of top `k` smallest distances used to construct the matrix profile.
        Note that this will increase the total computational time and memory usage
        when k > 1.

    Returns
    -------
    out : numpy.ndarray
        When k = 1 (default), the first column consists of the matrix profile,
        the second column consists of the matrix profile indices, the third column
        consists of the left matrix profile indices, and the fourth column consists
        of the right matrix profile indices. However, when k > 1, the output array
        will contain exactly 2 * k + 2 columns. The first k columns (i.e., out[:, :k])
        consist of the top-k matrix profile, the next set of k columns
        (i.e., out[:, k:2k]) consists of the corresponding top-k matrix profile
        indices, and the last two columns (i.e., out[:, 2k] and out[:, 2k+1] or,
        equivalently, out[:, -2] and out[:, -1]) correspond to the top-1 left
        matrix profile indices and the top-1 right matrix profile indices, respectively.

        For convenience, the matrix profile (distances) and matrix profile indices can
        also be accessed via their corresponding named array attributes, `.P_` and
        `.I_`, respectively. Similarly, the corresponding left matrix profile indices
        and right matrix profile indices may also be accessed via the `.left_I_` and