Transforms a pandas DataFrame into a NumPy array with binarized text columns. This function converts a single-level DataFrame to an array so it can be plotted with HyperTools. It also uses the pandas.get_dummies function to turn text columns into binary vectors, or 'dummy variables'.
(data, return_labels=False)
import pandas as pd


def df2mat(data, return_labels=False):
    """
    Transforms a Pandas DataFrame into a Numpy array with binarized text columns

    This function transforms a single-level DataFrame into an array so it can
    be plotted with HyperTools. Additionally, it uses the pandas.get_dummies
    function to transform text columns into binary vectors, or
    'dummy variables'.

    Parameters
    ----------
    data : A single-level Pandas DataFrame
        The df that you want to convert. Note that this currently only works
        with single-level (not multi-level) indices.

    return_labels : bool
        If True, also return the list of column labels (default: False).

    Returns
    -------
    plot_data : Numpy array
        A Numpy array where text columns are turned into binary vectors.

    labels : list (optional)
        A list of column labels for the numpy array. To return this, set
        return_labels=True.

    """

    # Split the columns into text (object dtype) and everything else
    df_str = data.select_dtypes(include=['object'])
    df_num = data.select_dtypes(exclude=['object'])

    # Append one binary indicator column per distinct value of each text column
    for colname in df_str.columns:
        df_num = df_num.join(pd.get_dummies(data[colname], prefix=colname))

    plot_data = df_num.values

    labels = list(df_num.columns.values)

    if return_labels:
        return plot_data, labels
    else:
        return plot_data
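
To make the conversion concrete, here is a minimal sketch that walks through the same steps as the function body on a small DataFrame. The column names and values are hypothetical and chosen only for illustration. The text column is created with an explicit object dtype so that `select_dtypes(include=["object"])` picks it up regardless of how the installed pandas version infers string columns.

```python
import pandas as pd

# Hypothetical toy data: one numeric column and one text column.
# dtype=object is set explicitly so the text column is matched by 'object'.
data = pd.DataFrame({
    "height": [1.5, 2.0, 1.8],
    "color": pd.Series(["red", "blue", "red"], dtype=object),
})

# Same steps as the function body: split by dtype, then binarize text columns.
df_str = data.select_dtypes(include=["object"])
df_num = data.select_dtypes(exclude=["object"])
for colname in df_str.columns:
    df_num = df_num.join(pd.get_dummies(data[colname], prefix=colname))

plot_data = df_num.values             # 3 rows x 3 columns
labels = list(df_num.columns.values)  # numeric columns first, then dummies

print(labels)
print(plot_data.shape)
```

Each distinct value in the text column becomes its own indicator column, named with the original column name as a prefix. Depending on the pandas version, the indicator columns may be boolean rather than integer, in which case the combined array may come back with an object dtype; passing `dtype=float` to `pd.get_dummies` is one possible way to keep the result numeric.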