sparkxgb
R interface for XGBoost on Spark
R
sparkxgb is a sparklyr extension that provides an R interface to XGBoost on Spark, so you can train XGBoost models on distributed data. It supports both formula-based model specification and the Spark ML Pipelines API for building machine learning workflows.
The package integrates XGBoost with Spark’s distributed computing, so you can train gradient boosting models on datasets too large to fit in a single machine’s memory. It provides classifier and regressor implementations that work with sparklyr’s data manipulation functions and ML pipeline components. It also supports hyperparameter tuning through cross-validation and integrates with Spark’s model evaluation framework.
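The two ways of specifying a model described above can be sketched as follows. This is a minimal illustration, not a definitive recipe. It assumes a local Spark installation that sparklyr can connect to, and it uses the built-in iris data. The parameter values, the three-fold setting, and the "xgboost" key in the tuning grid (which is assumed to match the pipeline stage by name) are illustrative choices and may need adjusting for your versions of sparklyr and sparkxgb.

```r
library(sparklyr)
library(sparkxgb)
library(dplyr)

# Assumption: a local Spark installation is available to sparklyr.
sc <- spark_connect(master = "local")

# Copy the built-in iris data to Spark
# (dots in column names are replaced with underscores).
iris_tbl <- sdf_copy_to(sc, iris, overwrite = TRUE)

# 1. Formula-based specification: fit a classifier directly on a Spark table.
xgb_model <- xgboost_classifier(
  iris_tbl,
  Species ~ .,
  num_class = 3,   # iris has three species
  num_round = 50,  # illustrative value
  max_depth = 4    # illustrative value
)

# Score the same table and look at a few predictions.
ml_predict(xgb_model, iris_tbl) %>%
  select(Species, predicted_label) %>%
  head()

# 2. Pipelines API: use the classifier as a stage in an ML pipeline.
pipeline <- ml_pipeline(sc) %>%
  ft_r_formula(Species ~ .) %>%
  xgboost_classifier(num_class = 3)

# Tune hyperparameters with cross-validation.
# Assumption: the "xgboost" key identifies the classifier stage in the pipeline.
cv <- ml_cross_validator(
  sc,
  estimator = pipeline,
  estimator_param_maps = list(
    xgboost = list(
      max_depth = c(1, 5),
      num_round = c(10, 50)
    )
  ),
  evaluator = ml_multiclass_classification_evaluator(sc, label_col = "label"),
  num_folds = 3
)

cv_model <- ml_fit(cv, iris_tbl)
summary(cv_model)

spark_disconnect(sc)
```

The formula interface is convenient for a single fit, while the pipeline form lets the same estimator be reused inside cross-validation and other sparklyr ML components. A regressor can be sketched the same way with xgboost_regressor and a numeric response.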

