Build a splitting tree for the modified CARTGV trees contained in an RFGV forest.

cartgv_split(data, group, crit = 1, case_min = 1, maxdepth = 2,
  p = floor(sqrt(length(unique(group[!is.na(group)])))),
  penalty = "No")

Arguments

data

a data frame containing the response variable and the predictors used to grow the tree. The response variable must be the first variable of the data frame, must be named "Y", and must be coded with the two levels "0" and "1".

group

a vector giving the group number of each variable. WARNING: if there are "p" groups, the groups must be numbered from "1" to "p" in increasing order. The group label of the response variable must be missing (i.e. NA).
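The constraints on "data" and "group" can be checked before growing a tree. The following is a minimal sketch in base R on made-up toy data; the object names (toy_data, toy_group) are hypothetical, and coding the response as a factor is an assumption based on the wording "two levels". The toy columns are also kept ordered by group, since "in increasing order" could be read that way.

```r
# Hypothetical toy data: 6 observations, 4 predictors split into 2 groups.
set.seed(1)
n <- 6
toy_data <- data.frame(
  Y  = factor(c(0, 1, 0, 1, 1, 0), levels = c("0", "1")),  # response comes first
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rnorm(n),
  x4 = rnorm(n)
)

# One group label per column of toy_data; NA for the response "Y".
toy_group <- c(NA, 1, 1, 2, 2)

# Sanity checks mirroring the documented requirements.
group_labels <- sort(unique(toy_group[!is.na(toy_group)]))
stopifnot(
  names(toy_data)[1] == "Y",
  identical(levels(toy_data$Y), c("0", "1")),
  length(toy_group) == ncol(toy_data),
  is.na(toy_group[1]),
  # labels must be exactly 1, 2, ..., number of groups
  identical(as.numeric(group_labels), as.numeric(seq_along(group_labels)))
)
```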

crit

an integer indicating the impurity function used (1 = Gini index / 2 = Entropy / 3 = Misclassification rate). The default is 1.
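For reference, the three impurity measures can be written for a binary response as below. This is an illustrative sketch using the textbook definitions, not the package's internal code; the helper name node_impurity is hypothetical, and the natural logarithm for the entropy is an assumption (the package may use another base).

```r
# Impurity of a node, given the 0/1 class labels of its observations.
# crit follows the documented coding: 1 = Gini, 2 = entropy, 3 = misclassification.
node_impurity <- function(y, crit = 1) {
  p1 <- mean(y == 1)          # proportion of cases (class "1")
  probs <- c(1 - p1, p1)      # class proportions
  switch(as.character(crit),
    "1" = 1 - sum(probs^2),
    # treat 0 * log(0) as 0 so pure nodes get zero entropy
    "2" = -sum(ifelse(probs > 0, probs * log(probs), 0)),
    "3" = 1 - max(probs),
    stop("crit must be 1, 2 or 3")
  )
}

balanced <- c(0, 0, 1, 1)
gini_balanced <- node_impurity(balanced, crit = 1)
entropy_balanced <- node_impurity(balanced, crit = 2)
misclass_balanced <- node_impurity(balanced, crit = 3)
```

All three measures are zero for a pure node and maximal for a perfectly balanced one, which is why each can serve as a splitting criterion.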

case_min

an integer indicating the minimum number of cases/non-cases in a terminal node. The default is 1.

maxdepth

the maximum depth of a splitting tree. The default is 2.

p

an integer indicating the number of variables randomly sampled as candidates at each split. The default is the floor of the square root of the number of distinct non-missing group labels.
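The default value of "p" shown in the usage can be reproduced by hand. A small sketch on a made-up group vector (the names example_group, n_groups and p_default are hypothetical):

```r
# Hypothetical group vector: response (NA) followed by 8 predictors in 4 groups.
example_group <- c(NA, 1, 1, 2, 2, 2, 3, 3, 4)

# Number of distinct groups, ignoring the NA label of the response.
n_groups <- length(unique(example_group[!is.na(example_group)]))

# Same expression as the default in the usage: floor(sqrt(number of groups)).
p_default <- floor(sqrt(n_groups))
```

With four groups, this gives a default of 2.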

penalty

a character string indicating whether the decrease in node impurity must take the group size into account. Four penalties are available: "No", "Size", "Root.size" or "Log". The default is "No".

Value

a list with elements

  • tree : a data frame which summarizes the resulting splitting tree.

  • carts : a list containing all the CART objects used to build the splitting tree. (Note that each split in the splitting tree is a CART object.)

  • splits : a list containing information about the splits. Each element is an object returned by the function "split_cartgv".

  • pop : a list containing the indices (rownames) of the observations which belong to the nodes.

  • tables_coupures : a list containing data frames that summarize the splits.

  • groups_selec : a matrix containing, for each splitting tree, the indices of the sampled groups. Precisely, the i-th row corresponds to the i-th splitting tree.