`<cmath>`: Calling UCRT `::pow(x, 2)` is less accurate than `x * x`

This is a regression caused by #903 which we merged in June 2020 and shipped in Nov 2020 as part of VS 2019 16.8.

That PR correctly fixed the type of `std::pow(float, int)` (it previously returned `float` and now returns `double` as required by the Standard). But it also removed special-case logic that detected `std::pow(x, 2)` and directly returned `x * x`. While reviewing the PR at the time, in https://github.com/microsoft/STL/pull/903#pullrequestreview-436222996 I thought about whether always calling UCRT `::pow()` would be problematic:

> I thought about the previous special-casing for `pow(floating, 2)`; this shouldn't cause correctness issues (I expect the UCRT `pow` to do the right thing).

Why would I *ever* have expected that? I guess I was distracted in June 2020. After receiving a customer report (DevCom-10968831), I've confirmed that UCRT `::pow(x, 2)` returns an incorrect result for some inputs, when `x * x` is exactly correct:

```
C:\Temp>type meow.cpp
```
```cpp
#include <bit>
#include <cmath>
#include <cstdint>
#include <limits>
#include <print>
#include <random>
using namespace std;

int main() {
    mt19937_64 urbg{};
    uniform_real_distribution<double> dist{0.0, 1.0};

    const int Trials = 100'000'000;
    int differences  = 0;

    for (int i = 0; i < Trials; ++i) {
        const double dbl_val = dist(urbg);
        const double sqr_pow = pow(dbl_val, 2);
        const double sqr_mul = dbl_val * dbl_val;
        if (sqr_pow != sqr_mul) {
            ++differences;

            const uint64_t bits_pow = bit_cast<uint64_t>(sqr_pow);
            const uint64_t bits_mul = bit_cast<uint64_t>(sqr_mul);
            const uint64_t ulp_diff = bits_pow < bits_mul ? bits_mul - bits_pow : bits_pow - bits_mul;

            if (differences < 4 || ulp_diff > 1) {
                println("dbl_val: {0:a} {0:.1000g}", dbl_val);
                println("sqr_pow: {0:a} {0:.1000g}", sqr_pow);
                println("sqr_mul: {0:a} {0:.1000g}", sqr_mul);
                println("ulp_diff: {}", ulp_diff);
                println();
            }
        }
    }

    println("Ran {} trials, found {} differences, {}% of trials.", Trials, differences, differences * 100.0 / Trials);
}
```
```
C:\Temp>cl /EHsc /nologo /W4 /std:c++latest /MT /O2 meow.cpp && meow
meow.cpp
dbl_val: 1.ec9a50154a6f9p-1 0.96211481342217475276612503876094706356525421142578125
sqr_pow: 1.d9f0c06b2463dp-1 0.92566491420638608023097049226635135710239410400390625
sqr_mul: 1.d9f0c06b2463ep-1 0.9256649142063861912532729547820053994655609130859375
ulp_diff: 1

dbl_val: 1.12814d2dd432cp-1 0.536142742008541173248659106320701539516448974609375
sqr_pow: 1.26590a84f9b11p-2 0.287449039808437112331063190140412189066410064697265625
sqr_mul: 1.26590a84f9b12p-2 0.28744903980843716784221442139823921024799346923828125
ulp_diff: 1

dbl_val: 1.33994b0b751ccp-3 0.15019472721782134438939237952581606805324554443359375
sqr_pow: 1.719905c84494bp-6 0.0225584560840357654931676023579711909405887126922607421875
sqr_mul: 1.719905c84494ap-6 0.022558456084035762023720650404357002116739749908447265625
ulp_diff: 1

Ran 100000000 trials, found 42347 differences, 0.042347% of trials.
```

This is consistent if I try a distribution of -1e10 to 1e10. (It vanishes if I try numeric_limits min/max, probably because most results squared are infinity.) I haven't observed inaccuracy above 1 ULP, but even 1 is bad.

Using [Wolfram Alpha](https://www.wolframalpha.com/) for high-precision computation, I manually checked the rounding for the first reported case.

```
Wolfram Alpha dbl_val^2: 0.9256649142063861358000347706773514725967013405547974870554600612150564220570458928705193102359771728515625
Wolfram Alpha dbl_val^2 - sqr_pow:  5.55690642784110001154943072365508912370554600612150564220570458928705193102359771728515625 * 10^-17
Wolfram Alpha dbl_val^2 - sqr_mul: -5.54532381841046539268688595725311400129445399387849435779429541071294806897640228271484375 * 10^-17
```

Here, the mathematically exact answer for the square of `dbl_val` occurs between two representable `double`s (as is the case for the vast majority of inputs). `sqr_pow` is a bit smaller than the mathematically exact answer. `sqr_mul` is a bit larger. The mathematically exact answer is *almost* at the midpoint of the two `double`s, but not quite. `sqr_mul` is closer (smaller absolute value difference), therefore it is correctly rounded.

(The full rule is "round to nearest, tiebreak to even" which always says what the correctly rounded result should be, even when the value to round is at the midpoint between representable `double`s, but the tiebreaker doesn't come into effect here.)

I don't know if the UCRT is technically allowed (e.g. by the IEEE spec) to be 1 ULP off in `pow()`. However, the behavioral regression is problematic for at least one customer, was unplanned, and is not mathematically correct. (I would be unrepentant if we had changed behavior in the pursuit of increased correctness.) While special-casing every integer power (e.g. 3, 4, etc.) would be unwise, the case of 2 is simple, we likely shipped it for a couple of decades, and should be feasible to return to (heh).

I'll work on a fix and test coverage.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`<cmath>`: Calling UCRT `::pow(x, 2)` is less accurate than `x * x` #5768

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

<cmath>: Calling UCRT ::pow(x, 2) is less accurate than x * x #5768

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`<cmath>`: Calling UCRT `::pow(x, 2)` is less accurate than `x * x` #5768