Class XGBClassifier (0.14.1)

XGBClassifier(
    num_parallel_tree: int = 1,
    booster: typing.Literal["gbtree", "dart"] = "gbtree",
    dart_normalized_type: typing.Literal["tree", "forest"] = "tree",
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 1.0,
    gamma: float = 0.0,
    max_depth: int = 6,
    subsample: float = 1.0,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    early_stop: bool = True,
    learning_rate: float = 0.3,
    max_iterations: int = 20,
    min_rel_progress: float = 0.01,
    enable_global_explain: bool = False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9",
)
 

XGBoost classifier model.

Parameters

Name
Description
num_parallel_tree
Optional[int]

Number of parallel trees constructed during each iteration. Default to 1.

booster
Optional[str]

Specify which booster to use: gbtree or dart. Default to "gbtree".

dart_normalized_type
Optional[str]

Type of normalization algorithm for DART booster. Possible values: "tree", "forest". Default to "tree".

tree_method
Optional[str]

Specify which tree method to use. Default to "auto". If this parameter is set to default, XGBoost will choose the most conservative option available. Possible values: "auto", "exact", "approx", "hist".

min_tree_child_weight
Optional[float]

Minimum sum of instance weight (hessian) needed in a child. Default to 1.

colsample_bytree
Optional[float]

Subsample ratio of columns when constructing each tree. Default to 1.0.

colsample_bylevel
Optional[float]

Subsample ratio of columns for each level. Default to 1.0.

colsample_bynode
Optional[float]

Subsample ratio of columns for each split. Default to 1.0.

gamma
Optional[float]

(min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.

max_depth
Optional[int]

Maximum tree depth for base learners. Default to 6.

subsample
Optional[float]

Subsample ratio of the training instance. Default to 1.0.

reg_alpha
Optional[float]

L1 regularization term on weights (xgb's alpha). Default to 0.0.

reg_lambda
Optional[float]

L2 regularization term on weights (xgb's lambda). Default to 1.0.

early_stop
Optional[bool]

Whether training should stop after the first iteration in which the relative loss improvement is less than the value specified for min_rel_progress. Default to True.

learning_rate
Optional[float]

Boosting learning rate (xgb's "eta"). Default to 0.3.

max_iterations
Optional[int]

Maximum number of rounds for boosting. Default to 20.

min_rel_progress
Optional[float]

Minimum relative loss improvement necessary to continue training when early_stop is set to True. Default to 0.01.

enable_global_explain
Optional[bool]

Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.

xgboost_version
Optional[str]

Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
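
The interaction between early_stop, min_rel_progress, and max_iterations can be illustrated with a small pure-Python sketch. It is not the BigQuery ML implementation: the helper name rounds_run is hypothetical, and the definition of relative improvement as (previous loss - current loss) / previous loss is an assumption made for illustration.

```python
from typing import Sequence


def rounds_run(
    losses: Sequence[float],
    early_stop: bool = True,
    min_rel_progress: float = 0.01,
    max_iterations: int = 20,
) -> int:
    """Return how many boosting rounds run for a given per-round loss curve.

    Hypothetical helper. Relative improvement is ASSUMED to be
    (previous_loss - current_loss) / previous_loss.
    """
    limit = min(len(losses), max_iterations)
    for i in range(1, limit):
        prev, cur = losses[i - 1], losses[i]
        rel_improvement = (prev - cur) / prev
        if early_stop and rel_improvement < min_rel_progress:
            # Stop after this round: it is the first one whose relative
            # improvement falls below the threshold.
            return i + 1
    return limit


losses = [1.00, 0.80, 0.70, 0.698, 0.60]
print(rounds_run(losses))                    # 4
print(rounds_run(losses, early_stop=False))  # 5
```

With early_stop left at True, the fourth round improves the loss by less than 1% and training ends there; with early_stop=False, only max_iterations bounds the number of rounds.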

Methods

__repr__

__repr__()
 

Print the estimator's constructor with all non-default parameter values
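
One way such a representation could be built is to compare each stored value with the constructor default. The sketch below uses a hypothetical ToyEstimator class, not bigframes code, and assumes each constructor argument is stored under an attribute of the same name.

```python
import inspect


class ToyEstimator:
    """Hypothetical stand-in for an estimator; not part of bigframes."""

    def __init__(
        self,
        max_depth: int = 6,
        learning_rate: float = 0.3,
        booster: str = "gbtree",
    ):
        self.max_depth = max_depth
        self.learning_rate = learning_rate
        self.booster = booster

    def __repr__(self) -> str:
        # Keep only the parameters whose current value differs from the
        # default declared in the constructor signature.
        params = inspect.signature(type(self).__init__).parameters
        changed = [
            f"{name}={getattr(self, name)!r}"
            for name, param in params.items()
            if name != "self" and getattr(self, name) != param.default
        ]
        return f"{type(self).__name__}({', '.join(changed)})"


print(repr(ToyEstimator()))                               # ToyEstimator()
print(repr(ToyEstimator(max_depth=3, booster="dart")))    # ToyEstimator(max_depth=3, booster='dart')
```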

fit

fit(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> bigframes.ml.base._T
 

Fit gradient boosting model.

Note that calling fit() multiple times will cause the model object to be re-fit from scratch. To resume training from a previous checkpoint, explicitly pass xgb_model argument.

Parameters
Name
Description
X
bigframes.dataframe.DataFrame or bigframes.series.Series

Series or DataFrame of shape (n_samples, n_features). Training data.

y
bigframes.dataframe.DataFrame or bigframes.series.Series

DataFrame of shape (n_samples,) or (n_samples, n_targets). Target values. Will be cast to X's dtype if necessary.

Returns
Type
Description
XGBModel
Fitted Estimator.

get_params

get_params(deep: bool = True) -> typing.Dict[str, typing.Any]
 

Get parameters for this estimator.

Parameter
Name
Description
deep
bool, default True

Default to True. If True, return the parameters for this estimator and for contained subobjects that are estimators.

Returns
Type
Description
Dictionary
A dictionary of parameter names mapped to their values.
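
The effect of deep is easiest to see on an estimator that contains another estimator. The sketch below uses two hypothetical classes, Inner and Outer, and borrows the scikit-learn "<name>__<param>" key convention for nested parameters; whether bigframes uses the same key format is an assumption.

```python
from typing import Any, Dict


class Inner:
    """Hypothetical contained estimator."""

    def __init__(self, alpha: float = 0.0):
        self.alpha = alpha

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        return {"alpha": self.alpha}


class Outer:
    """Hypothetical estimator that holds another estimator as a parameter."""

    def __init__(self, step: Inner, max_depth: int = 6):
        self.step = step
        self.max_depth = max_depth

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        params: Dict[str, Any] = {"step": self.step, "max_depth": self.max_depth}
        if deep:
            # Also expose the parameters of contained estimators, keyed as
            # "<name>__<param>" (scikit-learn style; assumed here).
            for name, value in list(params.items()):
                if hasattr(value, "get_params"):
                    for sub_name, sub_value in value.get_params().items():
                        params[f"{name}__{sub_name}"] = sub_value
        return params


outer = Outer(Inner(alpha=0.5))
print(sorted(outer.get_params(deep=False)))  # ['max_depth', 'step']
print(sorted(outer.get_params(deep=True)))   # ['max_depth', 'step', 'step__alpha']
```

All constructor parameters listed for XGBClassifier are plain numbers, strings, or booleans, so deep is unlikely to change the result for this class.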

predict

predict(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]
) -> bigframes.dataframe.DataFrame
 

Predict using the XGB model.

Parameter
Name
Description
X
bigframes.dataframe.DataFrame or bigframes.series.Series

Series or DataFrame of shape (n_samples, n_features). Samples.

Returns
Type
Description
DataFrame of shape (n_samples,)
Returns predicted values.

register

register(vertex_ai_model_id: typing.Optional[str] = None) -> bigframes.ml.base._T
 

Register the model to Vertex AI.

After registering, go to the Google Cloud Console (https://console.cloud.google.com/vertex-ai/models) to manage the model registry. Refer to https://cloud.google.com/vertex-ai/docs/model-registry/introduction for more options.

Parameter
Name
Description
vertex_ai_model_id
Optional[str], default None

Optional string ID to use as the model ID in Vertex AI. If not set, defaults to 'bigframes_{bq_model_id}'. The Vertex AI model ID is truncated to 63 characters because of a Vertex AI length limit.
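
The default and the 63-character limit described for vertex_ai_model_id can be sketched as follows. The helper resolve_vertex_model_id is hypothetical and not part of bigframes, and it assumes that truncation keeps the first 63 characters.

```python
from typing import Optional

MAX_VERTEX_ID_LENGTH = 63  # limit stated in the parameter description


def resolve_vertex_model_id(
    bq_model_id: str, vertex_ai_model_id: Optional[str] = None
) -> str:
    """Hypothetical helper mirroring the documented default and truncation."""
    if vertex_ai_model_id is None:
        vertex_ai_model_id = f"bigframes_{bq_model_id}"
    # ASSUMPTION: the leading characters are kept when truncating.
    return vertex_ai_model_id[:MAX_VERTEX_ID_LENGTH]


print(resolve_vertex_model_id("abc"))             # bigframes_abc
print(resolve_vertex_model_id("abc", "my_model")) # my_model
print(len(resolve_vertex_model_id("x" * 100)))    # 63
```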

score

score(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
)
 

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
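
To make the metric concrete, the following pure-Python sketch computes mean accuracy and shows why subset accuracy is harsh in the multi-label case. The function mean_accuracy is a hypothetical illustration; score itself returns a DataFrame of evaluation results and is not implemented this way.

```python
from typing import Hashable, Sequence


def mean_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose prediction equals the true label exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Single-label case: 3 of 4 predictions match.
print(mean_accuracy(["a", "b", "a", "c"], ["a", "b", "b", "c"]))  # 0.75

# Multi-label case: each sample's whole label set must match.
y_true = [(1, 0, 1), (0, 1, 0)]
y_pred = [(1, 0, 1), (0, 1, 1)]  # second sample has one wrong label
print(mean_accuracy(y_true, y_pred))  # 0.5
```

In the multi-label example, five of the six individual labels are right, yet the score is 0.5 because the second sample counts as entirely wrong.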

Parameters
Name
Description
X
bigframes.dataframe.DataFrame or bigframes.series.Series

DataFrame of shape (n_samples, n_features). Test samples.

y
bigframes.dataframe.DataFrame or bigframes.series.Series

DataFrame of shape (n_samples,) or (n_samples, n_outputs). True labels for X.

Returns
Type
Description
A DataFrame of the evaluation result.

to_gbq

to_gbq(model_name: str, replace: bool = False) -> bigframes.ml.ensemble.XGBClassifier
 

Save the model to BigQuery.

Parameters
Name
Description
model_name
str

The name of the model.

replace
bool, default False

Whether to replace the model if it already exists. Default to False.

Returns
Type
Description
XGBClassifier
The saved model.