Commit f5a2ff7

3.1.0 news (#11694)
1 parent 07ecf3d commit f5a2ff7

2 files changed, +164 -0 lines changed

doc/changes/index.rst

Lines changed: 1 addition & 0 deletions
@@ -8,5 +8,6 @@ For release notes prior to the 2.1 release, please see `news <https://github.com
   :maxdepth: 1
   :caption: Contents:

+  v3.1.0
   v3.0.0
   v2.1.0

doc/changes/v3.1.0.rst

Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
###################
3.1.0 (2025 Sep 22)
###################

We are delighted to share the latest 3.1.0 update for XGBoost.

********************
Categorical Re-coder
********************

This release features a major update to categorical data support by introducing a
re-coder. This re-coder saves categories in the trained model and re-codes the data during
inference, to keep the categorical encoding consistent. Aside from primitive types like
integers, it also supports string-based categories. The implementation works with all
supported Python DataFrame implementations. (:pr:`11609`, :pr:`11665`, :pr:`11605`,
:pr:`11628`, :pr:`11598`, :pr:`11591`, :pr:`11568`, :pr:`11561`, :pr:`11650`, :pr:`11621`,
:pr:`11611`, :pr:`11313`, :pr:`11311`, :pr:`11310`, :pr:`11315`, :pr:`11303`, :pr:`11612`,
:pr:`11098`, :pr:`11347`) See :ref:`cat-recode` for more information. (:pr:`11297`)

In addition, categorical support for Polars data frames is now available (:pr:`11565`).

Lastly, we removed the experimental tag for categorical feature support in this
release. (:pr:`11690`)
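
The idea behind re-coding can be illustrated with a small, library-free sketch. It is a conceptual toy and not XGBoost's implementation: the ``CategoryRecoder`` class and its methods are hypothetical names, and mapping unseen categories to ``None`` is an assumption made only for this example.

```python
from typing import Dict, List, Optional, Sequence


class CategoryRecoder:
    """Toy re-coder: remembers training categories and maps new data onto them."""

    def __init__(self, train_values: Sequence[str]) -> None:
        # Categories are stored in first-seen order; the list index is the code.
        self.categories: List[str] = list(dict.fromkeys(train_values))
        self._code_of: Dict[str, int] = {c: i for i, c in enumerate(self.categories)}

    def encode(self, values: Sequence[str]) -> List[Optional[int]]:
        # Unseen categories map to None in this sketch (an assumption); a real
        # library may raise an error or treat them differently.
        return [self._code_of.get(v) for v in values]


train = ["red", "green", "blue", "green"]
recoder = CategoryRecoder(train)

# At inference time the categories may appear in a different order, so encoding
# the new frame on its own would assign different integer codes.
inference = ["blue", "red", "purple"]
naive_codes = {c: i for i, c in enumerate(dict.fromkeys(inference))}

# Re-coding against the stored categories keeps the codes consistent with training.
recoded = recoder.encode(inference)
```

Here ``naive_codes`` assigns ``blue`` the code 0, while the stored training categories assign it 2; re-coding resolves that mismatch.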

***************
External Memory
***************

We continue the work on external memory support in 3.1. In this release, XGBoost features
an adaptive cache for CUDA external memory. The improved cache can split the data between
CPU memory and GPU memory according to the underlying hardware and data
size. (:pr:`11556`, :pr:`11465`, :pr:`11664`, :pr:`11594`, :pr:`11469`, :pr:`11547`,
:pr:`11339`, :pr:`11477`, :pr:`11453`, :pr:`11446`, :pr:`11458`, :pr:`11426`, :pr:`11566`,
:pr:`11497`)

Also, there is opt-in support for using ``nvcomp`` and the GB200 decompression engine to
handle sparse data (requires nvcomp as a plugin) (:pr:`11451`, :pr:`11464`, :pr:`11460`,
:pr:`11512`, :pr:`11520`). We improved the memory usage of quantile sketching with
external memory (:pr:`11641`) and optimized the predictor for training (:pr:`11548`). To
help ensure training performance, the latest XGBoost detects NUMA (Non-Uniform Memory
Access) nodes (:pr:`11538`, :pr:`11576`) to check for cross-socket data access. We are
working on additional tooling to enhance NUMA node performance. Aside from features, we
have also added various documentation improvements. (:pr:`11412`, :pr:`11631`)

Lastly, external memory support with text file input has been removed
(:pr:`11562`). Moving forward, we will focus on iterator inputs.


****************************
Multi-Target/Class Intercept
****************************

Starting with 3.1, the base-score (intercept) is estimated and stored as a vector when the
model has multiple outputs, be it multi-target regression or multi-class
classification. This change enhances the initial estimation for multi-output models and
will be the starting point for future work on vector-leaf. (:pr:`11277`, :pr:`11651`,
:pr:`11625`, :pr:`11649`, :pr:`11630`, :pr:`11647`, :pr:`11656`, :pr:`11663`)
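
As a rough illustration of what a vector intercept means, the sketch below computes one starting value per output column. It assumes a squared-error style objective, where the per-target mean is a natural estimate; the estimate XGBoost actually uses depends on the objective, and ``estimate_vector_intercept`` is a hypothetical helper, not a library function.

```python
from typing import List, Sequence


def estimate_vector_intercept(targets: Sequence[Sequence[float]]) -> List[float]:
    """Return one intercept per output column (the column mean)."""
    if not targets:
        raise ValueError("targets must contain at least one row")
    n_targets = len(targets[0])
    sums = [0.0] * n_targets
    for row in targets:
        if len(row) != n_targets:
            raise ValueError("all rows must have the same number of targets")
        for j, value in enumerate(row):
            sums[j] += value
    return [s / len(targets) for s in sums]


# Two regression targets on very different scales.
y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]

# One intercept per target.
vector_intercept = estimate_vector_intercept(y)

# A single scalar shared by all targets, shown only for contrast.
scalar_intercept = sum(sum(row) for row in y) / (len(y) * len(y[0]))
```

With targets on different scales, a shared scalar starts both outputs far from their typical values, whereas the per-target vector starts each output at its own mean.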

********
Features
********

- Support leaf prediction with QDM on CPU. (:pr:`11620`)
- Improve seed with mean sampling for the first iteration. (:pr:`11639`)
- Optionally include git hash in CMake build. (:pr:`11587`)

****************************
Removing Deprecated Features
****************************

This version removes some deprecated features, notably the binary IO format, along with
features deprecated in 2.0.

- Binary serialization format has been removed in 3.1. The format was formally
  deprecated in `1.6 <https://github.com/dmlc/xgboost/issues/7547>`__. (:pr:`11307`,
  :pr:`11553`, :pr:`11552`, :pr:`11602`)

- Removed old GPU-related parameters including ``use_gpu`` (pyspark), ``gpu_id``,
  ``gpu_hist``, and ``gpu_coord_descent``. These parameters were deprecated in
  2.0. Use the ``device`` parameter instead. (:pr:`11395`, :pr:`11554`, :pr:`11549`,
  :pr:`11543`, :pr:`11539`, :pr:`11402`)

- Remove deprecated C functions: ``XGDMatrixCreateFromCSREx``,
  ``XGDMatrixCreateFromCSCEx``. (:pr:`11514`, :pr:`11513`)

- XGBoost now emits a warning for text inputs. (:pr:`11590`)
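
For the removed GPU parameters, a parameter dictionary written for older releases can be translated to the ``device`` form. The helper below is a minimal sketch of that translation: ``migrate_gpu_params`` is a hypothetical name, not part of XGBoost, and the sketch treats ``use_gpu`` as a dictionary key purely for illustration (in PySpark it was an estimator argument).

```python
from typing import Any, Dict


def migrate_gpu_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of ``params`` with removed GPU options mapped to ``device``."""
    new = dict(params)
    use_cuda = False

    # ``gpu_id`` only selected the device ordinal; it did not enable GPU training.
    ordinal = new.pop("gpu_id", None)

    # ``use_gpu`` (PySpark estimator argument) requested GPU training.
    if new.pop("use_gpu", False):
        use_cuda = True

    # ``gpu_hist`` becomes the ``hist`` tree method running on a CUDA device.
    if new.get("tree_method") == "gpu_hist":
        new["tree_method"] = "hist"
        use_cuda = True

    # ``gpu_coord_descent`` becomes ``coord_descent`` running on a CUDA device.
    if new.get("updater") == "gpu_coord_descent":
        new["updater"] = "coord_descent"
        use_cuda = True

    # Respect an explicit ``device`` if the caller already set one.
    if use_cuda and "device" not in new:
        if ordinal is None or int(ordinal) < 0:
            new["device"] = "cuda"
        else:
            new["device"] = f"cuda:{int(ordinal)}"
    return new


old_tree = {"tree_method": "gpu_hist", "gpu_id": 1, "max_depth": 6}
new_tree = migrate_gpu_params(old_tree)

old_linear = {"booster": "gblinear", "updater": "gpu_coord_descent"}
new_linear = migrate_gpu_params(old_linear)
```

The input dictionary is left unmodified, so the helper can be applied to stored configurations without side effects.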


*************
Optimizations
*************

- Optimize CPU inference with Array-Based Tree Traversal (:pr:`11519`)
- Specialize for GPU dense histogram. (:pr:`11443`)
- [sycl] Improve L1 cache locality for histogram building. (:pr:`11555`)
- [sycl] Reduce predictor memory consumption and improve L2 locality (:pr:`11603`)
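
To give a sense of what array-based tree traversal refers to in general, the sketch below stores a small decision tree as parallel flat arrays and walks it with an index instead of following node objects. This is a generic illustration of the technique under assumed conventions (``-1`` marks a leaf, missing values are ignored); it does not claim to reproduce the layout used in the referenced change.

```python
from typing import List, Sequence

# Structure-of-arrays layout: entry i of every list describes node i.
# Node 0 is the root; a left child of -1 marks a leaf (assumed convention).
FEATURE: List[int] = [0, 1, -1, -1, -1]
THRESHOLD: List[float] = [0.5, 2.0, 0.0, 0.0, 0.0]
LEFT: List[int] = [1, 3, -1, -1, -1]
RIGHT: List[int] = [2, 4, -1, -1, -1]
VALUE: List[float] = [0.0, 0.0, 1.5, -1.0, 0.25]


def predict_one(x: Sequence[float]) -> float:
    """Walk the tree by index until a leaf is reached, then return its value."""
    node = 0
    while LEFT[node] != -1:
        # Go left when the feature value is below the split threshold.
        if x[FEATURE[node]] < THRESHOLD[node]:
            node = LEFT[node]
        else:
            node = RIGHT[node]
    return VALUE[node]


def predict_batch(rows: Sequence[Sequence[float]]) -> List[float]:
    """Apply the same traversal to every row."""
    return [predict_one(row) for row in rows]


predictions = predict_batch([[0.2, 1.0], [0.2, 3.0], [0.9, 0.0]])
```

Keeping node attributes in contiguous arrays tends to be friendlier to caches and vectorization than chasing pointers between node objects, which is the usual motivation for this kind of layout.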

*****
Fixes
*****

- Fix static linking C++ libraries on macOS (:pr:`11522`)
- Rename param.hh/cc to hist_param.hh/cc to fix Xcode build (:pr:`11378`)
- [sycl] Fix build with updated compiler (:pr:`11618`)
- [sycl] Various fixes for fp32-only devices. (:pr:`11527`, :pr:`11524`)
- Fix compilation on Android older than API 26 (:pr:`11366`)
- Fix loading Gamma model from 1.3. (:pr:`11377`)

**************
Python Package
**************

- Support mixing Python metrics and built-in metrics for the skl interface. (:pr:`11536`)
- CUDA 13 Support for PyPI with the new ``xgboost-cu13`` package. (:pr:`11677`, :pr:`11662`)
- Remove wheels for manylinux2014. (:pr:`11673`)
- Initial support for building variant wheels (:pr:`11531`, :pr:`11645`, :pr:`11294`)
- Minimum PySpark version is now set to 3.4 (:pr:`11364`). In addition, the PySpark
  interface now checks the validation indicator column type and has a fix for None column
  input. (:pr:`11535`, :pr:`11523`)
- [dask] Small cleanup for the predict function. (:pr:`11423`)

*********
R Package
*********

Now that most of the deprecated features have been removed in this release, we will try to
bring the latest R package back to CRAN.

- Implement Booster reset. (:pr:`11357`)
- Improvements for documentation, including code examples on XGBoost's Sphinx
  documentation site, and notes for R-universe release. (:pr:`11369`, :pr:`11410`,
  :pr:`11685`, :pr:`11316`)

************
JVM Packages
************

- Support columnar inputs for CPU pipeline (:pr:`11352`)
- Rewrite the ``LabeledPoint`` as a Java class (:pr:`11545`)
- Various fixes and document updates. (:pr:`11525`, :pr:`11508`, :pr:`11489`, :pr:`11682`)

*********
Documents
*********

Changes for general documentation:

- Update notes about GPU memory usage. (:pr:`11375`)
- Various fixes and updates. (:pr:`11503`, :pr:`11532`, :pr:`11328`, :pr:`11344`, :pr:`11626`)


******************
CI and Maintenance
******************

- Code cleanups. (:pr:`11367`, :pr:`11342`, :pr:`11658`, :pr:`11528`, :pr:`11585`,
  :pr:`11672`, :pr:`11642`, :pr:`11667`, :pr:`11495`, :pr:`11567`)
- Various cleanup and fixes for tests. (:pr:`11405`, :pr:`11389`, :pr:`11396`, :pr:`11456`)
- Support CMake 4.0 (:pr:`11382`)
- Various CI updates and fixes (:pr:`11318`, :pr:`11349`, :pr:`11653`, :pr:`11637`,
  :pr:`11683`, :pr:`11638`, :pr:`11644`, :pr:`11306`, :pr:`11560`, :pr:`11323`, :pr:`11617`,
  :pr:`11341`, :pr:`11693`)
