utils
- tf2xgb.utils.gen_random_dataset(n, n_subgrp, n_grp, beta, sigma)
Generate a random Pandas DataFrame with n observations split into n_subgrp distinct subgroups, identified by column 'subgrp_id', and n_grp distinct groups, identified by column 'grp_id'. The target in column 'y' is a linear combination of the feature vector in column 'X' with true coefficient vector beta and standard error sigma. The intercept is zero.
- Parameters
n – number of observations
n_subgrp – number of subgroups
n_grp – number of groups
beta – true coefficients
sigma – standard error
- Returns
random dataset
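The data-generating process described above can be illustrated with a minimal, library-independent sketch. It is not the tf2xgb implementation: the function name sketch_random_dataset, the seed argument, the standard-normal features, and the rule that assigns each subgroup to a group (subgrp_id % n_grp) are all assumptions made for illustration, and the result is a plain list of dicts rather than a Pandas DataFrame.

```python
import random


def sketch_random_dataset(n, n_subgrp, n_grp, beta, sigma, seed=0):
    """Hypothetical stand-in for the process described above (not the library code)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        subgrp_id = rng.randrange(n_subgrp)
        # Assumption: every subgroup belongs to exactly one group.
        grp_id = subgrp_id % n_grp
        # Assumption: features are drawn from a standard normal distribution.
        x = [rng.gauss(0.0, 1.0) for _ in beta]
        noise = rng.gauss(0.0, sigma)
        # Linear combination of the features with zero intercept, plus noise.
        y = sum(b * xi for b, xi in zip(beta, x)) + noise
        rows.append({"grp_id": grp_id, "subgrp_id": subgrp_id, "X": x, "y": y})
    return rows


rows = sketch_random_dataset(n=6, n_subgrp=3, n_grp=2, beta=[1.0, -2.0], sigma=0.1)
for row in rows:
    print(row["grp_id"], row["subgrp_id"], round(row["y"], 3))
```

With sigma set to 0 the target reduces to the exact dot product of X and beta, which is a quick way to check that the intercept is zero.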
- tf2xgb.utils.get_ragged_nested_index_lists(df, id_col_list)
Get the ragged nested lists of indices (= row numbers of df). The nesting hierarchy is defined by the df columns named in id_col_list.
- Parameters
df – Pandas DataFrame with the sample. It has to contain the columns listed in id_col_list.
id_col_list – list of columns of df which correspond to the levels of nesting in the resulting index list. Higher-level groups have to be mentioned first, e.g. ['grp_id', 'subgrp_id'].
- Returns
Pandas DataFrame with two columns: a copy of df[id_col_list[0]] and a column '_row_' containing the nested list of row numbers, which is the input to the decorator xgb_tf_loss().
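The nesting described above can be sketched on plain Python lists for the two-level case ['grp_id', 'subgrp_id']. This is only an illustration of the idea: the helper sketch_nested_index_lists is hypothetical, and it returns a dict keyed by group id instead of the two-column DataFrame the library function returns.

```python
def sketch_nested_index_lists(grp_ids, subgrp_ids):
    """Group row numbers by group, then by subgroup within each group."""
    nested = {}
    for row, (g, s) in enumerate(zip(grp_ids, subgrp_ids)):
        # Outer level: group id; inner level: subgroup id; leaves: row numbers.
        nested.setdefault(g, {}).setdefault(s, []).append(row)
    # Drop the subgroup keys so each group maps to a ragged list of lists.
    return {g: list(subs.values()) for g, subs in nested.items()}


grp_ids = [0, 0, 1, 0, 1]
subgrp_ids = [0, 1, 2, 0, 2]
print(sketch_nested_index_lists(grp_ids, subgrp_ids))
# {0: [[0, 3], [1]], 1: [[2, 4]]}
```

The lists are ragged because groups may contain different numbers of subgroups, and subgroups different numbers of rows: here group 0 has two subgroups of sizes 2 and 1, while group 1 has a single subgroup of size 2.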