Skip to content

Commit bc73147

Browse files
committed
Introduce TMath::KNNDensity and deprecate TH1K
The `TH1K` class is deprecated and will be removed in 6.40. It did not implement the `TH1` interface consistently, and limited the usability of the k-neighbors method it implemented by closely coupling the algorithm with the histogram class. Instead, one should use the new `TMath::KNNDensity` function that implements the same mathematical logic.
1 parent a8ef8b0 commit bc73147

File tree

8 files changed

+93
-114
lines changed

8 files changed

+93
-114
lines changed

README/ReleaseNotes/v638/index.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
* The `rpath` build option is deprecated. It is now without effect.
1111
Relative RPATHs to the main ROOT libraries are unconditionally appended to all ROOT executables and libraries if the operating system supports it.
1212
If you want a ROOT build without RPATHs, use the canonical CMake variable `CMAKE_SKIP_INSTALL_RPATH=TRUE`.
13+
* The `TH1K` class is deprecated and will be removed in 6.40. It did not implement the `TH1` interface consistently, and limited the usability of the k-neighbors method it implemented by closely coupling the algorithm with the histogram class. Please use the new `TMath::KNNDensity` function that implements the same mathematical logic.
1314

1415
## Core Libraries
1516
* Behavior change: when selecting a template instantiation for a dictionary, all the template arguments have to be fully defined - the forward declarations are not enough any more. The error prompted by the dictionary generator will be `Warning: Unused class rule: MyTemplate<MyFwdDeclaredClass>`.

hist/hist/inc/TH1K.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,6 @@ class TH1K : public TH1, public TArrayF {
6868
void SetKOrd(Int_t k){fKOrd=k;}
6969

7070
ClassDefOverride(TH1K,2) //1-Dim Nearest Kth neighbour method
71-
};
71+
} R__DEPRECATED(6, 40, "Use TMath::KNNDensity");
7272

7373
#endif

math/mathcore/inc/TMath.h

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,8 @@
1515
#include "TMathBase.h"
1616

1717
#include "TError.h"
18+
#include "ROOT/RSpan.hxx"
19+
1820
#include <algorithm>
1921
#include <limits>
2022
#include <cmath>
@@ -574,6 +576,9 @@ struct Limits {
574576
Double_t Gamma(Double_t a,Double_t x);
575577
Double_t GammaDist(Double_t x, Double_t gamma, Double_t mu=0, Double_t beta=1);
576578
Double_t LnGamma(Double_t z);
579+
580+
void KNNDensity(std::span<const double> observations, std::span<const double> queries, std::span<double> result,
581+
int k, double dmin = 0.0);
577582
}
578583

579584
////////////////////////////////////////////////////////////////////////////////

math/mathcore/src/TMath.cxx

Lines changed: 86 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3196,6 +3196,92 @@ Double_t TMath::VavilovDenEval(Double_t rlam, Double_t *AC, Double_t *HC, Int_t
31963196
}
31973197

31983198

3199+
/// \brief Computes a 1D k-nearest neighbor density estimate for a set of query points.
3200+
///
3201+
/// For each query point, this function estimates the local density based on
3202+
/// the distance to the k-th nearest neighbor. A minimum distance threshold
3203+
/// `dmin` is used to avoid division by zero and numerical instability when a
3204+
/// query point coincides with an observation or neighbors are extremely close.
3205+
///
3206+
/// \param observations Data points used for density estimation.
3207+
/// \param queries Points at which to estimate the density.
3208+
/// \param result Output for the density estimates. Must have the same size as `queries`.
3209+
/// \param k Number of nearest neighbors to consider. If k >= number of observations,
3210+
/// it is automatically reduced to n-1.
3211+
/// \param dmin Minimum allowed neighbor distance to prevent division by zero. Default is zero.
3212+
///
3213+
/// The density estimate for each query point x is:
3214+
///
3215+
/// \f[
3216+
/// \hat{p}(x) = \frac{n}{2(n+1)} \frac{k_{\text{actual}}}{d_\(k_{\text{actual}}\)(x)},
3217+
/// \f]
3218+
///
3219+
/// where \(d_\(k_{\text{actual}}\)(x)\) is the distance and \(k_{\text{actual}}\) may be different from \(k\)
3220+
/// when it needs to be greater than \(k\) to keep the \(d > d_{\text{min}} constraint.
3221+
/// If no neighbor is outside `dmin`, the minimum allowed distance will be used as the distance itself.
3222+
///
3223+
/// \note This function implements the math of the former `TH1K` class. There are some differences:
3224+
///
3225+
/// 1. The `TH1K` class additionally multiplied all density estimates with the bin width.
3226+
/// 2. If the `k` parameter was not specified by the user, the `TH1K` class used the bin width as `dmin`.
3227+
/// 3. If the `k` parameter was explicitly specified, the `dmin` parameter was set to 1.e-6.
3228+
void KNNDensity(std::span<const double> observations, std::span<const double> queries, std::span<double> result, int k,
3229+
double dmin)
3230+
{
3231+
int n = observations.size();
3232+
if (n == 0) {
3233+
Error("KNNDensity", "No observations provided");
3234+
return;
3235+
}
3236+
if (k < 0) {
3237+
Error("KNNDensity", "k must be positive");
3238+
return;
3239+
}
3240+
if (k >= static_cast<int>(n))
3241+
k = static_cast<int>(n - 1);
3242+
3243+
if (result.size() != queries.size()) {
3244+
Error("KNNDensity", "Result span size must match query span size");
3245+
return;
3246+
}
3247+
3248+
// Make a sorted copy of the observations for binary search
3249+
std::vector<double> sorted_obs(observations.begin(), observations.end());
3250+
std::sort(sorted_obs.begin(), sorted_obs.end());
3251+
3252+
// constant factor in the output
3253+
const double factor = n * 0.5 / (n + 1.0);
3254+
3255+
for (std::size_t i = 0; i < queries.size(); ++i) {
3256+
double x = queries[i];
3257+
3258+
// Find index in sorted_obs
3259+
int jr = std::distance(sorted_obs.begin(), std::lower_bound(sorted_obs.begin(), sorted_obs.end(), x));
3260+
int jl = jr - 1;
3261+
3262+
double ff = 0.0;
3263+
int k_actual = 0;
3264+
3265+
for (; k_actual + 1 <= k || ff <= dmin; ++k_actual) {
3266+
double fl = (jl >= 0) ? std::abs(sorted_obs[jl] - x) : std::numeric_limits<double>::max();
3267+
double fr = (jr < n) ? std::abs(sorted_obs[jr] - x) : std::numeric_limits<double>::max();
3268+
3269+
if (jl < 0 && jr >= n)
3270+
break;
3271+
3272+
ff = std::min(fl, fr);
3273+
fl < fr ? --jl : ++jr;
3274+
}
3275+
3276+
if (ff == 0.0)
3277+
ff = dmin; // avoid division by zero
3278+
3279+
// Modified kNN density formula
3280+
result[i] = factor * k_actual / ff;
3281+
}
3282+
}
3283+
3284+
31993285
//explicitly instantiate template functions from VecCore
32003286
#ifdef R__HAS_VECCORE
32013287
#include <Math/Types.h>

roottest/graphics/CMakeLists.txt

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -112,7 +112,6 @@ set(TEST_DESCRIPTIONS
112112
"h2_cut a hist"
113113
"h2proj a hist"
114114
"histpalettecolor a hist"
115-
"hksimple a hist"
116115
"hlHisto1 a hist"
117116
"hlHisto2 a hist"
118117
#"hlHisto4 a hist" TODO: Fails on all platforms!

roottest/graphics/macros/hist/hksimple.C

Lines changed: 0 additions & 55 deletions
This file was deleted.

tutorials/hist/hist012_TH1_hksimple.C

Lines changed: 0 additions & 56 deletions
This file was deleted.

tutorials/hist/index.md

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,6 @@ The examples below showcase the same functionalities through 3 different impleme
8080
| hist009_TH1_normalize.C | | | Normalizing a Histogram. |
8181
| hist010_TH1_two_scales.C | hist010_TH1_two_scales.py | hist010_TH1_two_scales_uhi.py|Draw two histograms on one canvas using different y-axis scales. |
8282
| hist011_TH1_legend_autoplaced.C | | | Automatic placing of the legend. |
83-
| hist012_TH1_hksimple.C | | | Dynamic filling of TH1K histograms. |
8483
| hist013_TH1_rebin.C | | | Create a variable bin-width histogram and change bin sizes. |
8584
| hist014_TH1_cumulative.C | | | Illustrate use of the TH1::GetCumulative method. |
8685
| hist015_TH1_read_and_draw.C | hist015_TH1_read_and_draw.py | hist015_TH1_read_and_draw_uhi.py|Read a 1D histogram from a ROOT File and draw it. |

0 commit comments

Comments
 (0)