class DecisionTreeClassifier(ClassifierMixin, BaseDecisionTree):
    """A decision tree classifier.

    Read more in the :ref:`User Guide <tree>`.

    Parameters
    ----------
    criterion : {"gini", "entropy", "log_loss"}, default="gini"
        The function to measure the quality of a split. Supported criteria are
        "gini" for the Gini impurity and "log_loss" and "entropy" both for the
        Shannon information gain, see :ref:`tree_mathematical_formulation`.

    splitter : {"best", "random"}, default="best"
        The strategy used to choose the split at each node. Supported
        strategies are "best" to choose the best split and "random" to choose
        the best random split.

    max_depth : int, default=None
        The maximum depth of the tree. If None, then nodes are expanded until
        all leaves are pure or until all leaves contain fewer than
        min_samples_split samples.

    min_samples_split : int or float, default=2
        The minimum number of samples required to split an internal node:

        - If int, then consider `min_samples_split` as the minimum number.
        - If float, then `min_samples_split` is a fraction and
          `ceil(min_samples_split * n_samples)` is the minimum
          number of samples for each split.

        .. versionchanged:: 0.18
           Added float values for fractions.

    min_samples_leaf : int or float, default=1
        The minimum number of samples required to be at a leaf node.
        A split point at any depth will only be considered if it leaves at
        least ``min_samples_leaf`` training samples in each of the left and
        right branches. This may have the effect of smoothing the model,
        especially in regression.

        - If int, then consider `min_samples_leaf` as the minimum number.
        - If float, then `min_samples_leaf` is a fraction and
          `ceil(min_samples_leaf * n_samples)` is the minimum
          number of samples for each node.

        .. versionchanged:: 0.18
           Added float values for fractions.

    min_weight_fraction_leaf : float, default=0.0
        The minimum weighted fraction of the sum total of weights (of all
        the input samples) required to be at a leaf node. Samples have
        equal weight when sample_weight is not provided.

    max_features : int, float or {"sqrt", "log2"}, default=None
        The number of features to consider when looking for the best split:

        - If int, then consider `max_features` features at each split.
        - If float, then `max_features` is a fraction and