diff --git a/README.md b/README.md
index 3092f3a..38b54c1 100755
--- a/README.md
+++ b/README.md
@@ -72,10 +72,6 @@ After installing the dependencies for one or both of these algorithms, you can
 use them as model types in training and model exploration. You can read more
 about these models in the hlink documentation [here](https://hlink.docs.ipums.org/models.html).
 
-*Note: The XGBoost-PySpark integration provided by the xgboost Python package is
-currently unstable. So the hlink xgboost support is experimental and may change
-in the future.*
-
 ## Docs
 
 The documentation site can be found at [hlink.docs.ipums.org](https://hlink.docs.ipums.org).
diff --git a/hlink/linking/core/classifier.py b/hlink/linking/core/classifier.py
index bb27123..b58780a 100644
--- a/hlink/linking/core/classifier.py
+++ b/hlink/linking/core/classifier.py
@@ -134,7 +134,7 @@ def choose_classifier(model_type: str, params: dict[str, Any], dep_var: str):
     elif model_type == "xgboost":
         if not _xgboost_available:
             raise ModuleNotFoundError(
-                "To use the experimental 'xgboost' model type, you need to install "
+                "To use the 'xgboost' model type, you need to install "
                 "the xgboost library and its dependencies. Try installing hlink with "
                 "the xgboost extra:\n\n pip install hlink[xgboost]"
             )
diff --git a/sphinx-docs/changelog.md b/sphinx-docs/changelog.md
index 77814e9..a5f261d 100644
--- a/sphinx-docs/changelog.md
+++ b/sphinx-docs/changelog.md
@@ -21,6 +21,11 @@ Hlink adheres to semantic versioning as much as possible.
   invoked by `select_column_mapping` when the configuration calls for them.
   [PR #207][pr207]
 
+### Changed
+
+* Stabilized the XGBoost feature, since the integration provided by the xgboost
+  Python package is no longer unstable. [PR #219][pr219]
+
 ### Deprecated
 
 * The `hlink.linking.core.transforms.apply_transform` function, which applies
@@ -422,6 +427,7 @@ and false negative data in model exploration.
[PR #1][pr1]
 [pr207]: https://github.com/ipums/hlink/pull/207
 [pr212]: https://github.com/ipums/hlink/pull/212
 [pr213]: https://github.com/ipums/hlink/pull/213
+[pr219]: https://github.com/ipums/hlink/pull/219
 
 [household-matching-docs]: config.html#household-matching
 [household-training-docs]: config.html#household-training-and-model-exploration
diff --git a/sphinx-docs/models.md b/sphinx-docs/models.md
index 31c9eb6..ad4739a 100644
--- a/sphinx-docs/models.md
+++ b/sphinx-docs/models.md
@@ -121,8 +121,8 @@ maxBins = 6
 XGBoost is an alternate, high-performance implementation of gradient boosting.
 It uses [xgboost.spark.SparkXGBClassifier](https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.spark.SparkXGBClassifier).
 
-Since the XGBoost-PySpark integration which the xgboost Python package provides
-is currently unstable, support for the xgboost model type is disabled in hlink
+Since the XGBoost-PySpark integration requires some additional Python packages,
+support for the xgboost model type is disabled in hlink
 by default. hlink will stop with an error if you try to use this model type
 without enabling support for it. To enable support for xgboost, install hlink
 with the `xgboost` extra.