Calculate the best split of a node for each group of input variables when building a CARTGV tree.
split_cartgv(node, group, label, maxdepth = 2, penalty = "No")
node | a data frame containing the observations in the node. The first column is the response vector, named "Y", with labels "0" and "1". The p-1 other variables are continuous; categorical variables must be coded as sets of dummy variables. |
---|---|
group | a vector with the group number of each variable. |
label | an integer indicating the label of the node (the majority class) |
maxdepth | an integer indicating the maximal depth for a split-tree. The default value is 2. |
penalty | a character string indicating whether the decrease in node impurity must take the group size into account. Four penalties are available: "No", "Size", "Root.size" or "Log". The default value is "No". |
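The penalty options can be read as rescaling each group's impurity decrease by a function of the group size, so that large groups are not favored merely because they contain more variables. The sketch below illustrates this idea only; the exact divisors used by `split_cartgv` are an assumption, not taken from its source.

```r
# Hypothetical illustration of a size penalty on the impurity gain.
# The divisors chosen for "Size", "Root.size" and "Log" are assumptions.
penalized_gain <- function(gain, group_size,
                           penalty = c("No", "Size", "Root.size", "Log")) {
  penalty <- match.arg(penalty)
  divisor <- switch(penalty,
    "No"        = 1,                   # raw impurity decrease
    "Size"      = group_size,          # strongest penalty: linear in group size
    "Root.size" = sqrt(group_size),    # milder penalty
    "Log"       = log(group_size + 1)) # mildest; +1 keeps it defined for size 1
  gain / divisor
}

penalized_gain(0.12, group_size = 4, penalty = "Root.size")  # 0.12 / 2 = 0.06
```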
a list with the following elements:
- Gain_Gini: a vector containing the reduction of the Gini impurity in the node from splitting on each group,
- Gain_Ent: a vector containing the reduction of the entropy in the node from splitting on each group,
- Gain_Mis: a vector containing the reduction of the number of misclassified observations in the node from splitting on each group,
- carts: a list containing for each group the CART object which summarizes the splitting tree,
- pred: a matrix with nrow(node) rows and length(unique(group)) columns containing, for each group, the predictions resulting from the splitting tree.
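A minimal illustrative call might look as follows. The data frame, group layout, and column names here are invented for the example, and the snippet assumes `split_cartgv` is available in the session (e.g. the package is loaded).

```r
# Toy node: binary response "Y" in the first column, then three predictors
# split into two groups (x1, x2 in group 1; x3 in group 2). All values invented.
node <- data.frame(
  Y  = factor(c("0", "1", "0", "1", "0", "0")),
  x1 = c(1.2, 3.4, 0.5, 2.8, 3.1, 0.9),   # group 1
  x2 = c(0.7, 1.9, 0.2, 1.5, 2.0, 0.4),   # group 1
  x3 = c(5.1, 4.2, 6.0, 3.9, 4.5, 5.8))   # group 2
group <- c(1, 1, 2)                        # one entry per predictor column

# label = 0 since "0" is the majority class of this node
res <- split_cartgv(node, group, label = 0, maxdepth = 2, penalty = "No")

res$Gain_Gini   # impurity reduction obtained by splitting on each group
res$pred        # nrow(node) x length(unique(group)) matrix of predictions
```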