Compute the multi-dimensional z-normalized matrix profile.

This is a convenience wrapper around the Numba JIT-compiled parallelized ``_mstump`` function, which computes the multi-dimensional matrix profile and multi-dimensional matrix profile index according to mSTOMP, a variant of mSTAMP.

mstump(T, m, include=None, discords=False, normalize=True, p=2.0, T_subseq_isconstant=None)
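To make the quantity being computed concrete, here is a brute-force sketch of a multi-dimensional z-normalized matrix profile for a self-join. It is not the library's implementation and does not use Numba; ``znorm`` and ``naive_mstump`` are hypothetical names chosen for illustration. It assumes an exclusion zone of ``ceil(m / 4)`` around each subsequence, maps constant subsequences to all zeros, and ignores ``include``, ``discords``, ``p``, and ``T_subseq_isconstant``. Row ``k`` of the result averages the ``k + 1`` smallest per-dimension distances, which is how mSTAMP defines the profile as we understand it.

```python
import math


def znorm(x):
    """Z-normalize a sequence; constant input maps to zeros (sketch assumption)."""
    mu = sum(x) / len(x)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    if sigma == 0.0:
        return [0.0] * len(x)
    return [(v - mu) / sigma for v in x]


def naive_mstump(T, m):
    """Brute-force multi-dimensional matrix profile for a self-join.

    T is a list of d rows (one per dimension), each of length n.
    Returns (P, I), each with d rows and n - m + 1 columns.
    """
    d, n = len(T), len(T[0])
    n_sub = n - m + 1
    excl = math.ceil(m / 4)  # assumed exclusion-zone half-width
    # Z-normalize every subsequence of every dimension up front.
    Z = [[znorm(row[i:i + m]) for i in range(n_sub)] for row in T]
    P = [[math.inf] * n_sub for _ in range(d)]
    I = [[-1] * n_sub for _ in range(d)]
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) <= excl:
                continue  # skip trivial matches near i
            # Per-dimension distances between subsequences i and j, ascending.
            dists = sorted(math.dist(Z[k][i], Z[k][j]) for k in range(d))
            running = 0.0
            for k in range(d):
                running += dists[k]
                avg = running / (k + 1)  # mean of the k + 1 smallest distances
                if avg < P[k][i]:
                    P[k][i], I[k][i] = avg, j
    return P, I


# Two dimensions; the pattern [0, 1, 3, 2] appears twice in the first one.
T = [
    [0, 1, 3, 2, 9, 4, 7, 1, 0, 1, 3, 2, 5, 8],
    [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4, 5, 9, 0],
]
P, I = naive_mstump(T, m=4)
print(P[0][0], I[0][0])  # 1-dimensional profile value and neighbor for subsequence 0
```

The nested loops make this quadratic in the number of subsequences, so it is only suitable for checking intuition on tiny inputs.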
@core.non_normalized(maamp, exclude=["normalize", "T_subseq_isconstant"])
def mstump(
    T, m, include=None, discords=False, normalize=True, p=2.0, T_subseq_isconstant=None
):
    """
    Compute the multi-dimensional z-normalized matrix profile

    This is a convenience wrapper around the Numba JIT-compiled parallelized
    ``_mstump`` function which computes the multi-dimensional matrix profile and
    multi-dimensional matrix profile index according to mSTOMP, a variant of
    mSTAMP. Note that only self-joins are supported.

    Parameters
    ----------
    T : numpy.ndarray
        The time series or sequence for which to compute the multi-dimensional
        matrix profile. Each row in ``T`` holds the data for one dimension, so
        each column holds the values of all dimensions at a single time point.

    m : int
        Window size (i.e., the subsequence length).

    include : list, numpy.ndarray, default None
        A list of (zero-based) indices corresponding to the dimensions in ``T`` that
        must be included in the constrained multidimensional motif search.
        For more information, see Section IV D in:

        `DOI: 10.1109/ICDM.2017.66 \
        <https://www.cs.ucr.edu/~eamonn/Motif_Discovery_ICDM.pdf>`__

    discords : bool, default False
        When set to ``True``, this reverses the distance matrix which results in a
        multi-dimensional matrix profile that favors larger matrix profile values
        (i.e., discords) rather than smaller values (i.e., motifs). Note that indices
        in ``include`` are still maintained and respected.

    normalize : bool, default True
        When set to ``True``, this z-normalizes subsequences prior to computing
        distances. Otherwise, this function gets re-routed to its complementary
        non-normalized equivalent set in the ``@core.non_normalized`` function
        decorator.

    p : float, default 2.0
        The p-norm to apply for computing the Minkowski distance. Minkowski distance is
        typically used with ``p`` being ``1`` or ``2``, which correspond to the
        Manhattan distance and the Euclidean distance, respectively. This parameter is
        ignored when ``normalize == True``.

    T_subseq_isconstant : numpy.ndarray, function, or list, default None
        Indicates whether a subsequence of a time series in ``T`` is constant
        (``True``) or not (``False``). ``T_subseq_isconstant`` can be a 2D boolean
        ``numpy.ndarray`` or a function that can be applied to each time series in
        ``T``. Alternatively, for maximum flexibility, a list (with length equal to the
        total number of time series) may also be used. In this case,
        ``T_subseq_isconstant[i]`` corresponds to the ``i``-th time series ``T[i]``
        and each element in the list can either be a 1D boolean ``numpy.ndarray``, a
        function, or ``None``.
