Add Makefile infrastructure and `long double' real input data for
targets using the IEEE 754 binary128 format.
Keep input data disabled and referring to BZ #12701 for entries that are
are currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.
Reviewed-by: Joseph Myers <josmyers@redhat.com>
Add Makefile infrastructure and `double' real input data for targets
using the IEEE 754 binary64 format.
Keep input data disabled and referring to BZ #12701 for entries that are
are currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.
Reviewed-by: Joseph Myers <josmyers@redhat.com>
Add Makefile infrastructure and `float' real input data for targets
using the IEEE 754 binary32 format.
Keep input data disabled and referring to BZ #12701 for entries that are
are currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.
Reviewed-by: Joseph Myers <josmyers@redhat.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the powr functions, which are like pow, but with simpler
handling of special cases (based on exp(y*log(x)), so negative x and
0^0 are domain errors, powers of -0 are always +0 or +Inf never -0 or
-Inf, and 1^+-Inf and Inf^0 are also domain errors, while NaN^0 and
1^NaN are NaN). The test inputs are taken from those for pow, with
appropriate adjustments (including removing all tests that would be
domain errors from those in auto-libm-test-in and adding some more
such tests in libm-test-powr.inc).
The underlying implementation uses __ieee754_pow functions after
dealing with all special cases that need to be handled differently.
It might be a little faster (avoiding a wrapper and redundant checks
for special cases) to have an underlying implementation built
separately for both pow and powr with compile-time conditionals for
special-case handling, but I expect the benefit of that would be
limited given that both functions will end up needing to use the same
logic for computing pow outside of special cases.
My understanding is that powr(negative, qNaN) should raise "invalid":
that the rule on "invalid" for an argument outside the domain of the
function takes precedence over a quiet NaN argument producing a quiet
NaN result with no exceptions raised (for rootn it's explicit that the
0th root of qNaN raises "invalid"). I've raised this on the WG14
reflector to confirm the intent.
Tested for x86_64 and x86, and with build-many-glibcs.py.
On SPR, it improves atanh bench performance by:
Before After Improvement
reciprocal-throughput 15.1715 14.8628 2%
latency 57.1941 56.1883 2%
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
On SPR, it improves sinh bench performance by:
Before After Improvement
reciprocal-throughput 14.2017 11.815 17%
latency 36.4917 35.2114 4%
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
On Skylake, it improves tanh bench performance by:
Before After Improvement
max 110.89 95.826 14%
min 20.966 20.157 4%
mean 30.9601 29.8431 4%
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The current approach tracks math maximum supported errors by explicitly
setting them per function and architecture. On newer implementations or
new compiler versions, the file is updated with newer values if it
shows higher results. The idea is to track the maximum known error, to
update the manual with the obtained values.
The constant libm-test-ulps shows little value, where it is usually a
mechanical change done by the maintainer, for past releases it is
usually ignored whether the ulp change resulted from a compiler
regression, and the math tests already have a maximum ulp error that
triggers a regression.
It was shown by a recent update after the new acosf [1] implementation
that is correctly rounded, where the libm-test-ulps was indeed from a
compiler issue.
This patch removes all arch-specific libm-test-ulps, adds system generic
libm-test-ulps where applicable, and changes its semantics. The generic
files now track specific implementation constraints, like if it is
expected to be correctly rounded, or if the system-specific has
different error expectations.
Now multiple libm-test-ulps can be defined, and system-specific
overrides generic implementation. This is for the case where
arch-specific implementation might show worse precision than generic
implementation, for instance, the cbrtf on i686.
Regressions are only reported if the implementation shows larger errors
than 9 ulps (13 for IBM long double) unless it is overridden by
libm-test-ulps and the maximum error is not printed at the end of tests.
The regen-ulps rule is also removed since it does not make sense to
update the libm-test-ulps automatically.
The manual error table is also removed, Paul Zimmermann and others have
been tracking libm precision with a more comprehensive analysis for some
releases; so link to his work instead.
[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=9cc9f8e11e8fb8f54f1e84d9f024917634a78201
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the rsqrt functions (1/sqrt(x)). The test inputs are
taken from those for sqrt.
Tested for x86_64 and x86, and with build-many-glibcs.py.
Single-precision remainderf() and quad-precision remainderl()
implementation derived from Sun is affected by an issue when the result
is +-0. IEEE754 requires that if remainder(x, y) = 0, its sign shall be
that of x regardless of the rounding direction.
The implementation seems to have assumed that x - x = +0 in all
rounding modes, which is not the case. When rounding direction is
roundTowardNegative the sign of an exact zero sum (or difference) is −0.
Regression tests that triggered this erroneous behavior are added to
math/libm-test-remainder.inc.
Tested for cross riscv64 and powerpc.
Original fix by: Bruce Evans <bde@FreeBSD.org> in FreeBSD's
a2ddfa5ea726c56dbf825763ad371c261b89b7c7.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
A number of fma tests started to fail on hppa when gcc was changed to
use Ranger rather than EVRP. Eventually I found that the value of
a1 + u.d in this is block of code was being computed in FE_TOWARDZERO
mode and not the original rounding mode:
if (TININESS_AFTER_ROUNDING)
{
w.d = a1 + u.d;
if (w.ieee.exponent == 109)
return w.d * 0x1p-108;
}
This caused the exponent value to be wrong and the wrong return path
to be used.
Here we add an optimization barrier after the rounding mode is reset
to ensure that the previous value of a1 + u.d is not reused.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
GCC aligns global data to 16 bytes if their size is >= 16 bytes. This patch
changes the exp_data struct slightly so that the fields are better aligned
and without gaps. As a result on targets that support them, more load-pair
instructions are used in exp. Exp10 is improved by moving invlog10_2N later
so that neglog10_2hiN and neglog10_2loN can be loaded using load-pair.
The exp benchmark improves 2.5%, "144bits" by 7.2%, "768bits" by 12.7% on
Neoverse V2. Exp10 improves by 1.5%.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The libm size improvement built with "--enable-stack-protector=strong
--enable-bind-now=yes --enable-fortify-source=2":
Before:
text data bss dec hex filename
585192 860 12 586064 8f150 aarch64-linux-gnu/math/libm.so
960775 1068 12 961855 ead3f x86_64-linux-gnu/math/libm.so
1189174 5544 368 1195086 123c4e powerpc64le-linux-gnu/math/libm.so
After:
text data bss dec hex filename
584952 860 12 585824 8f060 aarch64-linux-gnu/math/libm.so
960615 1068 12 961695 eac9f x86_64-linux-gnu/math/libm.so
1189078 5544 368 1194990 123bee powerpc64le-linux-gnu/math/libm.so
The are small code changes for x86_64 and powerpc64le, which do not
affect performance; but on aarch64 with gcc-14 I see a slight better
code generation due the usage of ldq for floating point constant loading.
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
The libm size improvement built with "--enable-stack-protector=strong
--enable-bind-now=yes --enable-fortify-source=2":
Before:
text data bss dec hex filename
587304 860 12 588176 8f990 aarch64-linux-gnu-master/math/libm.so
962855 1068 12 963935 eb55f x86_64-linux-gnu-master/math/libm.so
1191222 5544 368 1197134 12444e powerpc64le-linux-gnu-master/math/libm.so
After:
text data bss dec hex filename
585192 860 12 586064 8f150 aarch64-linux-gnu/math/libm.so
960775 1068 12 961855 ead3f x86_64-linux-gnu/math/libm.so
1189174 5544 368 1195086 123c4e powerpc64le-linux-gnu/math/libm.so
The are small code changes for x86_64 and powerpc64le, which do not
affect performance; but on aarch64 with gcc-14 I see a slight better
code generation due the usage of ldq for floating point constant loading.
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
The CORE-MATH implementation is correctly rounded (for any rounding mode),
although it should worse performance than current one. The current
implementation performance comes mainly from the internal usage of
the optimize expf implementation, and shows a maximum ULPs of 2 for
FE_TONEAREST and 3 for other rounding modes.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency master patched improvement
x86_64 40.6995 49.0737 -20.58%
x86_64v2 40.5841 44.3604 -9.30%
x86_64v3 39.3879 39.7502 -0.92%
i686 112.3380 129.8570 -15.59%
aarch64 (Neoverse) 18.6914 17.0946 8.54%
power10 11.1343 9.3245 16.25%
reciprocal-throughput master patched improvement
x86_64 18.6471 24.1077 -29.28%
x86_64v2 17.7501 20.2946 -14.34%
x86_64v3 17.8262 17.1877 3.58%
i686 64.1454 86.5645 -34.95%
aarch64 (Neoverse) 9.77226 12.2314 -25.16%
power10 4.0200 5.3316 -32.63%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
ieee_long_double_shape_type has
typedef union
{
long double value;
struct
{
...
int sign_exponent:16;
...
} parts;
} ieee_long_double_shape_type;
Clang issues an error:
../sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c:49:2: error: implicit truncation from 'int' to bit-field changes value from 65535 to -1 [-Werror,-Wbitfield-constant-conversion]
49 | SET_LDOUBLE_WORDS (ldnx, 0xffff,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50 | tests[i] >> 32, tests[i] & 0xffffffffULL);
|
Use -1, instead of 0xffff, to silence Clang.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atan2pi functions (atan2(y,x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atanpi functions (atan(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the asinpi functions (asin(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the acospi functions (acos(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the tanpi functions (tan(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the sinpi functions (sin(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the cospi functions (cos(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Some CORE-MATH routines uses roundeven and most of ISA do not have
an specific instruction for the operation. In this case, the call
will be routed to generic implementation.
However, if the ISA does support round() and ctz() there is a better
alternative (as used by CORE-MATH).
This patch adds such optimization and also enables it on powerpc.
On a power10 it shows the following improvement:
expm1f master patched improvement
latency 9.8574 7.0139 28.85%
reciprocal-throughput 4.3742 2.6592 39.21%
Checked on powerpc64le-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The k>>31 in signgam = 1 - (((k&(k>>31))&1)<<1); is not portable:
* The ISO C standard says "If E1 has a signed type and a negative
value, the resulting value is implementation-defined." (this is
still in C23).
* If the int type is larger than 32 bits (e.g. a 64-bit type),
then k = INT_MAX; line 144 will make k>>31 put 1 in bit 0
(thus signgam will be -1) while 0 is expected.
Moreover, instead of the fx >= 0x1p31f condition, testing fx >= 0
is probably better for 2 reasons:
The signgam expression has more or less a condition on the sign
of fx (the goal of k>>31, which can be dropped with this new
condition). Since fx ≥ 0 should be the most common case, one can
get signgam directly in this case (value 1). And this simplifies
the expression for the other case (fx < 0).
This new condition may be easier/faster to test on the processor
(e.g. by avoiding a load of a constant from the memory).
This is commit d41459c731865516318f813cf4c966dafa0eecbf from CORE-MATH.
Checked on x86_64-linux-gnu.
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance to the generic tanf.
The code was adapted to glibc style, to use the definition of
math_config.h, to remove errno handling, and to use a generic
128 bit routine for ABIs that do not support it natively.
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (neoverse1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):
latency master patched improvement
x86_64 82.3961 54.8052 33.49%
x86_64v2 82.3415 54.8052 33.44%
x86_64v3 69.3661 50.4864 27.22%
i686 219.271 45.5396 79.23%
aarch64 29.2127 19.1951 34.29%
power10 19.5060 16.2760 16.56%
reciprocal-throughput master patched improvement
x86_64 28.3976 19.7334 30.51%
x86_64v2 28.4568 19.7334 30.65%
x86_64v3 21.1815 16.1811 23.61%
i686 105.016 15.1426 85.58%
aarch64 18.1573 10.7681 40.70%
power10 8.7207 8.7097 0.13%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
The commit 9247f53219 triggered some regressions on loongarch and
riscv:
math/test-float-log10
math/test-float32-log10
And it is due a wrong sync with CORE-MATH for special 0.0/-0.0
inputs.
Checked on aarch64-linux-gnu and loongarch64-linux-gnu-lp64d.
On GCC 11 (x86-64), the previous code produced test failures like
this one:
Failure: Test: exp10m1_towardzero (-0x1.1p+4)
Result:
is: -1.00000000e+00 -0x1.000000p+0
should be: -9.99999940e-01 -0x1.fffffep-1
difference: 5.96046447e-08 0x1.000000p-24
ulp : 1.0000
max.ulp : 0.0000
Apply a similar fix to exp2m1f.
Co-authored-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The CORE-MATH exp2m1f implementation showed slight worse latency
when using x86_64 baseline ABI. This patch adds a ifunc variant
with similar performance for x86_64-v3.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
The CORE-MATH exp10m1f implementation showed slight worse latency
when using x86_64 baseline ABI. This patch adds a ifunc variant
with similar performance for x86_64-v3.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance compared to the generic exp2m1f.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow). The
only change is to handle FLT_MAX_EXP for FE_DOWNWARD or FE_TOWARDZERO.
The benchmark inputs are based on exp2f ones.
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency master patched improvement
x86_64 40.6042 48.7104 -19.96%
x86_64v2 40.7506 35.9032 11.90%
x86_64v3 35.2301 31.7956 9.75%
i686 102.094 94.6657 7.28%
aarch64 18.2704 15.1387 17.14%
power10 11.9444 8.2402 31.01%
reciprocal-throughput master patched improvement
x86_64 20.8683 16.1428 22.64%
x86_64v2 19.5076 10.4474 46.44%
x86_64v3 19.2106 10.4014 45.86%
i686 56.4054 59.3004 -5.13%
aarch64 12.0781 7.3953 38.77%
power10 6.5306 5.9388 9.06%
The generic implementation calls __ieee754_exp2f and x86_64 provides
an optimized ifunc version (built with -mfma -mavx2, not correctly
rounded). This explains the performance difference for x86_64.
Same for i686, where the ABI provides an optimized __ieee754_exp2f
version built with '-msse2 -mfpmath=sse'. When built wth same
flags, the new algorithm shows a better performance:
master patched improvement
latency 102.094 91.2823 10.59%
reciprocal-throughput 56.4054 52.7984 6.39%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance compared to the generic exp10m1f.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow). I mostly
fixed some small issues in corner cases (sNaN handling, -INFINITY,
a specific overflow check).
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency master patched improvement
x86_64 45.4690 49.5845 -9.05%
x86_64v2 46.1604 36.2665 21.43%
x86_64v3 37.8442 31.0359 17.99%
i686 121.367 93.0079 23.37%
aarch64 21.1126 15.0165 28.87%
power10 12.7426 8.4929 33.35%
reciprocal-throughput master patched improvement
x86_64 19.6005 17.4005 11.22%
x86_64v2 19.6008 11.1977 42.87%
x86_64v3 17.5427 10.2898 41.34%
i686 59.4215 60.9675 -2.60%
aarch64 13.9814 7.9173 43.37%
power10 6.7814 6.4258 5.24%
The generic implementation calls __ieee754_exp10f which has an
optimized version, although it is not correctly rounded, which is
the main culprit of the the latency difference for x86_64 and
throughp for i686.
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
Also remove the use of builtins in favor of standard names, compiler
already inline them (if supported) with current compiler options.
It also fixes and issue where __builtin_roundeven is not support on
gcc older than version 10.
Checked on x86_64-linux-gnu and i686-linux_gnu.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
The CORE-MATH implementation is correctly rounded (for any rounding mode).
This can be checked by exhaustive tests in a few minutes since there are
less than 2^32 values to check against for example GNU MPFR.
This patch also adds some bench values for tgammaf.
Tested on x86_64 and x86 (cfarm26).
With the initial GNU libc code it gave on an Intel(R) Core(TM) i7-8700:
"tgammaf": {
"": {
"duration": 3.50188e+09,
"iterations": 2e+07,
"max": 602.891,
"min": 65.1415,
"mean": 175.094
}
}
With the new code:
"tgammaf": {
"": {
"duration": 3.30825e+09,
"iterations": 5e+07,
"max": 211.592,
"min": 32.0325,
"mean": 66.1649
}
}
With the initial GNU libc code it gave on cfarm26 (i686):
"tgammaf": {
"": {
"duration": 3.70505e+09,
"iterations": 6e+06,
"max": 2420.23,
"min": 243.154,
"mean": 617.509
}
}
With the new code:
"tgammaf": {
"": {
"duration": 3.24497e+09,
"iterations": 1.8e+07,
"max": 1238.15,
"min": 101.155,
"mean": 180.276
}
}
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Changes in v2:
- include <math.h> (fix the linknamespace failures)
- restored original benchtests/strcoll-inputs/filelist#en_US.UTF-8 file
- restored original wrapper code (math/w_tgammaf_compat.c),
except for the dealing with the sign
- removed the tgammaf/float entries in all libm-test-ulps files
- address other comments from Joseph Myers
(https://sourceware.org/pipermail/libc-alpha/2024-July/158736.html)
Changes in v3:
- pass NULL argument for signgam from w_tgammaf_compat.c
- use of math_narrow_eval
- added more comments
Changes in v4:
- initialize local_signgam to 0 in math/w_tgamma_template.c
- replace sysdeps/ieee754/dbl-64/gamma_productf.c by dummy file
Changes in v5:
- do not mention local_signgam any more in math/w_tgammaf_compat.c
- initialize local_signgam to 1 instead of 0 in w_tgamma_template.c
and added comment
Changes in v6:
- pass NULL as 2nd argument of __ieee754_gammaf_r in
w_tgammaf_compat.c, and check for NULL in e_gammaf_r.c
Changes in v7:
- added Signed-off-by line for Alexei Sibidanov (author of the code)
Changes in v8:
- added Signed-off-by line for Paul Zimmermann (submitted of the patch)
Changes in v9:
- address comments from review by Adhemerval Zanella
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
GCC aligns global data to 16 bytes if their size is >= 16 bytes. This patch
changes the exp2f_data struct slightly so that the fields are better aligned.
As a result on targets that support them, load-pair instructions accessing
poly_scaled and invln2_scaled are now 16-byte aligned.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
As discussed at the patch review meeting
Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Reviewed-by: Simon Chopin <simon.chopin@canonical.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and
exp10(x)-1, like expm1).
As with other such functions, these use type-generic templates that
could be replaced with faster and more accurate type-specific
implementations in future. Test inputs are copied from those for
expm1, plus some additions close to the overflow threshold (copied
from exp2 and exp10) and also some near the underflow threshold.
exp2m1 has the unusual property of having an input (M_MAX_EXP) where
whether the function overflows (under IEEE semantics) depends on the
rounding mode. Although these could reasonably be XFAILed in the
testsuite (as we do in some cases for arguments very close to a
function's overflow threshold when an error of a few ulps in the
implementation can result in the implementation not agreeing with an
ideal one on whether overflow takes place - the testsuite isn't smart
enough to handle this automatically), since these functions aren't
required to be correctly rounding, I made the implementation check for
and handle this case specially.
The Makefile ordering expected by lint-makefiles for the new functions
is a bit peculiar, but I implemented it in this patch so that the test
passes; I don't know why log2 also needed moving in one Makefile
variable setting when it didn't in my previous patches, but the
failure showed a different place was expected for that function as
well.
The powerpc64le IFUNC setup seems not to be as self-contained as one
might hope; it shouldn't be necessary to add IFUNCs for new functions
such as these simply to get them building, but without setting up
IFUNCs for the new functions, there were undefined references to
__GI___expm1f128 (that IFUNC machinery results in no such function
being defined, but doesn't stop include/math.h from doing the
redirection resulting in the exp2m1f128 and exp10m1f128
implementations expecting to call it).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for
base-10 logarithms).
This is directly analogous to the log2p1 implementation (except that
whereas log2p1 has a smaller underflow range than log1p, log10p1 has a
larger underflow range). The test inputs are copied from those for
log1p and log2p1, plus a few more inputs in that wider underflow
range.
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the logp1 functions (aliases for log1p functions - the
name is intended to be more consistent with the new log2p1 and
log10p1, where clearly it would have been very confusing to name those
functions log21p and log101p). As aliases rather than new functions,
the content of this patch is somewhat different from those actually
adding new functions.
Tests are shared with log1p, so this patch *does* mechanically update
all affected libm-test-ulps files to expect the same errors for both
functions.
The vector versions of log1p on aarch64 and x86_64 are *not* updated
to have logp1 aliases (and thus there are no corresponding header,
tests, abilist or ulps changes for vector functions either). It would
be reasonable for such vector aliases and corresponding changes to
other files to be made separately. For now, the log1p tests instead
avoid testing logp1 in the vector case (a Makefile change is needed to
avoid problems with grep, used in generating the .c files for vector
function tests, matching more than one ALL_RM_TEST line in a file
testing multiple functions with the same inputs, when it assumes that
the .inc file only has a single such line).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Left shift of ki is undefined when ki<0, copy the logic from exp,
which uses unsigned arithmetics, to fix it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The libc.a for alpha, s390, and sparcv9 does not provide
copysignf64x, copysignf128, frexpf64x, frexpf128, modff64x, and
modff128.
Checked with a static build for the affected ABIs.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Don't provide __nexttowardf128_do_not_use, nexttowardf128_do_not_use,
finitef128_do_not_use, isinff128_do_not_use and isnanf128_do_not_use.
This fixes BZ #31757.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Some static implementation of float128 routines might call __isnanf128,
which is not provided by the static object.
Checked on x86_64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The commit 16439f419b removed the static fmod/fmodf on i386 and m68k
with and empty w_fmod.c (required for the ABIs that uses the newly
implementation). This patch fixes by adding the required symbols on
the arch-specific w_fmod{f}_compat.c implementation.
To statically build fmod fails on some ABI (alpha, s390, sparc) because
it does not export the ldexpf128, this is also fixed by this patch.
Checked on i686-linux-gnu and with a build for m68k-linux-gnu.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the log2p1 functions (log2(1+x): like log1p, but for
base-2 logarithms).
This illustrates the intended structure of implementations of all
these function families: define them initially with a type-generic
template implementation. If someone wishes to add type-specific
implementations, it is likely such implementations can be both faster
and more accurate than the type-generic one and can then override it
for types for which they are implemented (adding benchmarks would be
desirable in such cases to demonstrate that a new implementation is
indeed faster).
The test inputs are copied from those for log1p. Note that these
changes make gen-auto-libm-tests depend on MPFR 4.2 (or later).
The bulk of the changes are fairly generic for any such new function.
(sysdeps/powerpc/nofpu/Makefile only needs changing for those
type-generic templates that use fabs.)
Tested for x86_64 and x86, and with build-many-glibcs.py.
Fix BZ #31759 by not defining nearbyint aliases when used in IFUNC.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Complete the internal renaming from "C2X" and related names in GCC by
renaming *-c2x and *-gnu2x tests to *-c23 and *-gnu23.
Tested for x86_64, and with build-many-glibcs.py for powerpc64le.
WG14 decided to use the name C23 as the informal name of the next
revision of the C standard (notwithstanding the publication date in
2024). Update references to C2X in glibc to use the C23 name.
This is intended to update everything *except* where it involves
renaming files (the changes involving renaming tests are intended to
be done separately). In the case of the _ISOC2X_SOURCE feature test
macro - the only user-visible interface involved - support for that
macro is kept for backwards compatibility, while adding
_ISOC23_SOURCE.
Tested for x86_64.
Remove the error handling wrapper from exp10. This is very similar to
the changes done to exp and exp2, except that we also need to handle
pow10 and pow10l.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
New implementation is based on the existing exp/exp2, with different
reduction constants and polynomial. Worst-case error in round-to-
nearest is 0.513 ULP.
The exp/exp2 shared table is reused for exp10 - .rodata size of
e_exp_data increases by 64 bytes.
As for exp/exp2, targets with single-instruction rounding/conversion
intrinsics can use them by toggling TOINT_INTRINSICS=1 and adding the
necessary code to their math_private.h.
Improvements on Neoverse V1 compared to current GLIBC master:
exp10 thruput: 3.3x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]
exp10 latency: 1.8x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8]
Tested on:
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction)
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
With GCC 14 on 32-bit x86 the compiler emits a maybe-uninitialized
warning:
../sysdeps/ieee754/dbl-64/k_rem_pio2.c: In function '__kernel_rem_pio2':
../sysdeps/ieee754/dbl-64/k_rem_pio2.c:364:20: error: 'fq' may be used uninitialized [-Werror=maybe-uninitialized]
364 | y[0] = fq[0]; y[1] = fq[1]; y[2] = fw;
| ~~^~~
This is similar to the warning that is suppressed in the other branch of
the switch. Help the compiler knowing that the variable is always
initialized, which also makes the suppression obsolete.
On Skylake, it changes log1p bench performance by:
Before After Improvement
max 63.349 58.347 8%
min 4.448 5.651 -30%
mean 12.0674 10.336 14%
The minimum code path is
if (hx < 0x3FDA827A) /* x < 0.41422 */
{
if (__glibc_unlikely (ax >= 0x3ff00000)) /* x <= -1.0 */
{
...
}
if (__glibc_unlikely (ax < 0x3e200000)) /* |x| < 2**-29 */
{
math_force_eval (two54 + x); /* raise inexact */
if (ax < 0x3c900000) /* |x| < 2**-54 */
{
...
}
else
return x - x * x * 0.5;
FMA and non-FMA code sequences look similar. Non-FMA version is slightly
faster. Since log1p is called by asinh and atanh, it improves asinh
performance by:
Before After Improvement
max 75.645 63.135 16%
min 10.074 10.071 0%
mean 15.9483 14.9089 6%
and improves atanh performance by:
Before After Improvement
max 91.768 75.081 18%
min 15.548 13.883 10%
mean 18.3713 16.8011 8%
On Skylake, it improves expm1 bench performance by:
Before After Improvement
max 70.204 68.054 3%
min 20.709 16.2 22%
mean 22.1221 16.7367 24%
NB: Add
extern long double __expm1l (long double);
extern long double __expm1f128 (long double);
for __typeof (__expm1l) and __typeof (__expm1f128) when __expm1 is
defined since __expm1 may be expanded in their declarations which
causes the build failure.
Bump autoconf requirement to 2.71 to allow regenerating configure on
more recent distributions. autoconf 2.71 has been in Fedora since F36
and is the current version in Debian stable (bookworm). It appears to
be current in Gentoo as well.
All sysdeps configure and preconfigure scripts have also been
regenerated; all changes are trivial transformations that do not affect
functionality.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Return value from *scanf and *asprintf routines are now properly checked
in test-scanf-ldbl-compat-template.c and test-printf-ldbl-compat.c.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This allows to include bits/syslog-decl.h in include/sys/syslog.h and
therefore be able to create the libc_hidden_builtin_proto (__syslog_chk)
prototype.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
If libc_hidden_builtin_{def,proto} isn't properly set for *_chk routines,
there are unwanted PLT entries in libc.so.
There is a special case with __asprintf_chk:
If ldbl_* macros are used for asprintf, ABI gets broken on s390x,
if it isn't, ppc64le isn't building due to multiple asm redirections.
This is due to the inclusion of bits/stdio-lbdl.h for ppc64le whereas it
isn't for s390x. This header creates redirections, which are not
compatible with the ones generated using libc_hidden_def.
Yet, we can't use libc_hidden_ldbl_proto on s390x since it will not
create a simple strong alias (e.g. as done on x86_64), but a versioned
alias, leading to ABI breakage.
This results in errors on s390x:
/usr/bin/ld: glibc/iconv/../libio/bits/stdio2.h:137: undefined reference
to `__asprintf_chk'
Original __asprintf_chk symbols:
00000000001395b0 T __asprintf_chk
0000000000177e90 T __nldbl___asprintf_chk
__asprintf_chk symbols with ldbl_* macros:
000000000012d590 t ___asprintf_chk
000000000012d590 t __asprintf_chk@@GLIBC_2.4
000000000012d590 t __GI___asprintf_chk
000000000012d590 t __GL____asprintf_chk___asprintf_chk
0000000000172240 T __nldbl___asprintf_chk
__asprintf_chk symbols with the patch:
000000000012d590 t ___asprintf_chk
000000000012d590 T __asprintf_chk
000000000012d590 t __GI___asprintf_chk
0000000000172240 T __nldbl___asprintf_chk
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The *_chk routines naming doesn't match the name that would be generated
using libc_hidden_ldbl_proto. Since the macro is needed for some of
these *_chk functions for _FORTIFY_SOURCE to be enabled, that needed to
be fixed.
While at it, all the *_chk function get renamed appropriately for
consistency, even if not strictly necessary.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Since the _FORTIFY_SOURCE feature uses some routines of Glibc, they need to
be excluded from the fortification.
On top of that:
- some tests explicitly verify that some level of fortification works
appropriately, we therefore shouldn't modify the level set for them.
- some objects need to be build with optimization disabled, which
prevents _FORTIFY_SOURCE to be used for them.
Assembler files that implement architecture specific versions of the
fortified routines were not excluded from _FORTIFY_SOURCE as there is no
C header included that would impact their behavior.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Replace the loop-over-scalar placeholder routines with optimised
implementations from Arm Optimized Routines (AOR).
Also add some headers containing utilities for aarch64 libmvec
routines, and update libm-test-ulps.
Data tables for new routines are used via a pointer with a
barrier on it, in order to prevent overly aggressive constant
inlining in GCC. This allows a single adrp, combined with offset
loads, to be used for every constant in the table.
Special-case handlers are marked NOINLINE in order to confine the
save/restore overhead of switching from vector to normal calling
standard. This way we only incur the extra memory access in the
exceptional cases. NOINLINE definitions have been moved to
math_private.h in order to reduce duplication.
AOR exposes a config option, WANT_SIMD_EXCEPT, to enable
selective masking (and later fixing up) of invalid lanes, in
order to trigger fp exceptions correctly (AdvSIMD only). This is
tested and maintained in AOR, however it is configured off at
source level here for performance reasons. We keep the
WANT_SIMD_EXCEPT blocks in routine sources to greatly simplify
the upstreaming process from AOR to glibc.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This patch redirects the error functions to the appropriate
longdouble variants which enables the compiler to optimize
for the abi ieeelongdouble.
Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Optimize the fast paths (x < y) and (x/y < 2^12). Delay handling of special
cases to reduce the number of instructions executed before the fast paths.
Performance improvements for fmod:
Skylake Zen2 Neoverse V1
subnormals 11.8% 4.2% 11.5%
normal 3.9% 0.01% -0.5%
close-exponents 6.3% 5.6% 19.4%
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper
with SVID error handling around the new code. There is no new symbol
version nor compatibility code on !LIBM_SVID_COMPAT targets
(e.g. riscv).
The ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation. For both i686 and m68k,
which provive arch specific implementation, wrappers are added so
no new symbol are added (which would require to change the
implementations).
It shows an small improvement, the results for fmod:
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992
x86_64 (Ryzen 9) | normal | 296.939 | 296.738
x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119
aarch64 (N1) | subnormal | 6.81778 | 4.33313
aarch64 (N1) | normal | 155.620 | 152.915
aarch64 (N1) | close-exponents | 8.21306 | 5.76138
armhf (N1) | subnormal | 15.1083 | 14.5746
armhf (N1) | normal | 244.833 | 241.738
armhf (N1) | close-exponents | 21.8182 | 22.457
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 8 (which is sizeof of uint32_t minus
MANTISSA_WIDTH plus the signal bit):
while (ex > ey)
{
mx << 8;
ex -= 8;
mx %= my;
} */
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles. Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 17.2549 | 12.0318
x86_64 (Ryzen 9) | normal | 85.4096 | 49.9641
x86_64 (Ryzen 9) | close-exponents | 19.1072 | 15.8224
aarch64 (N1) | subnormal | 10.2182 | 6.81778
aarch64 (N1) | normal | 60.0616 | 20.3667
aarch64 (N1) | close-exponents | 11.5256 | 8.39685
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 11.6662 | 10.8955
armhf (N1) | normal | 69.2759 | 34.1524
armhf (N1) | close-exponents | 13.6472 | 18.2131
Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
This uses a new algorithm similar to already proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:
mx * 2^ex == 2 * mx * 2^(ex - 1)
while (ex > ey)
{
mx *= 2;
--ex;
mx %= my;
}
With mx/my being mantissa of double floating pointer, on each step the
argument reduction can be improved 11 (which is sizeo of uint64_t minus
MANTISSA_WIDTH plus the signal bit):
while (ex > ey)
{
mx << 11;
ex -= 11;
mx %= my;
} */
The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles. Different than the original patch,
this path assume modulo/divide operation is slow, so use multiplication
with invert values.
I see the following performance improvements using fmod benchtests
(result only show the 'mean' result):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 19.1584 | 12.5049
x86_64 (Ryzen 9) | normal | 1016.51 | 296.939
x86_64 (Ryzen 9) | close-exponents | 18.4428 | 16.0244
aarch64 (N1) | subnormal | 11.153 | 6.81778
aarch64 (N1) | normal | 528.649 | 155.62
aarch64 (N1) | close-exponents | 11.4517 | 8.21306
I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it a lot of soft-fp implementation (for
modulo, clz, ctz, and multiplication):
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
armhf (N1) | subnormal | 15.908 | 15.1083
armhf (N1) | normal | 837.525 | 244.833
armhf (N1) | close-exponents | 16.2111 | 21.8182
Instead of using the math_private.h definitions, I used the
math_config.h instead which is used on newer math implementations.
Co-authored-by: kirill <kirill.okhotnikov@gmail.com>
[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
They are both used by __libc_freeres to free all library malloc
allocated resources to help tooling like mtrace or valgrind with
memory leak tracking.
The current scheme uses assembly markers and linker script entries
to consolidate the free routine function pointers in the RELRO segment
and to be freed buffers in BSS.
This patch changes it to use specific free functions for
libc_freeres_ptrs buffers and call the function pointer array directly
with call_function_static_weak.
It allows the removal of both the internal macros and the linker
script sections.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants for the %i scanf format (in addition to the %b format,
which isn't yet implemented for scanf in glibc). Implement that scanf
support for glibc.
As with the strtol support, this is incompatible with previous C
standard versions, in that such an input string starting with 0b or 0B
was previously required to be parsed as 0 (with the rest of the input
potentially matching subsequent parts of the scanf format string).
Thus this patch adds 12 new __isoc23_* functions per long double
format (12, 24 or 36 depending on how many long double formats the
glibc configuration supports), with appropriate header redirection
support (generally very closely following that for the __isoc99_*
scanf functions - note that __GLIBC_USE (DEPRECATED_SCANF) takes
precedence over __GLIBC_USE (C2X_STRTOL), so the case of GNU
extensions to C89 continues to get old-style GNU %a and does not get
this new feature). The function names would remain as __isoc23_* even
if C2x ends up published in 2024 rather than 2023.
When scanf %b support is added, I think it will be appropriate for all
versions of scanf to follow C2x rules for inputs to the %b format
(given that there are no compatibility concerns for a new format).
Tested for x86_64 (full glibc testsuite). The first version was also
tested for powerpc (32-bit) and powerpc64le (stdio-common/ and wcsmbs/
tests), and with build-many-glibcs.py.
The patch suppress the same warnings from 87c266d758,
that shows issues for microblaze, mips soft-fp, nios2, and or1k.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Fix the following issues with built-in function use in
sysdeps/ieee754/ldbl-128 and sysdeps/ieee754/float128:
* fabsl used __builtin_fabsf128 unconditionally, breaking the build
with GCC 6 for several architectures; it should use __builtin_fabsl
with an appropriate redirection in float128_private.h. (I'm not
particularly concerned with building glibc with GCC 6; rather, I
want to be able to run the tgmath.h tests with GCC 6, which is a
significantly different case for tgmath.h compared to GCC 7 and
later because of the lack of _FloatN / _FloatNx support in the
compiler, and at present running the tests with a compiler means
building glibc with that compiler.)
* Some (conditional) uses of built-in functions had been added to
ldbl-128 without appropriate float128_private.h remapping (there was
remapping for the macros controlling whether the built-in functions
are used, just not for the functions themselves).
* s_llrintl.c called __builtin_round not __builtin_llrintl, which is
obviously wrong.
Tested with build-many-glibcs.py for aarch64-linux-gnu, GCC 6 (where
it fixes the glibc build) and GCC 12, and with the glibc testsuite for
x86_64.
vfprintf is entangled with vfwprintf (of course), __printf_fp,
__printf_fphex, __vstrfmon_l_internal, and the strfrom family of
functions. The latter use the internal snprintf functionality,
so vsnprintf is converted as well.
The simples conversion is __printf_fphex, followed by
__vstrfmon_l_internal and __printf_fp, and finally
__vfprintf_internal and __vfwprintf_internal. __vsnprintf_internal
and strfrom* are mostly consuming the new interfaces, so they
are comparatively simple.
__printf_fp is a public symbol, so the FILE *-based interface
had to preserved.
The __printf_fp rewrite does not change the actual binary-to-decimal
conversion algorithm, and digits are still not emitted directly to
the target buffer. However, the staging buffer now uses bytes
instead of wide characters, and one buffer copy is eliminated.
The changes are at least performance-neutral in my testing.
Floating point printing and snprintf improved measurably, so that
this Lua script
for i=1,5000000 do
print(i, i * math.pi)
end
runs about 5% faster for me. To preserve fprintf performance for
a simple "%d" format, this commit has some logic changes under
LABEL (unsigned_number) to avoid additional function calls. There
are certainly some very easy performance improvements here: binary,
octal and hexadecimal formatting can easily avoid the temporary work
buffer (the number of digits can be computed ahead-of-time using one
of the __builtin_clz* built-ins). Decimal formatting can use a
specialized version of _itoa_word for base 10.
The existing (inconsistent) width handling between strfmon and printf
is preserved here. __print_fp_buffer_1 would have to use
__translated_number_width to achieve ISO conformance for printf.
Test expectations in libio/tst-vtables-common.c are adjusted because
the internal staging buffer merges all virtual function calls into
one.
In general, stack buffer usage is greatly reduced, particularly for
unbuffered input streams. __printf_fp can still use a large buffer
in binary128 mode for %g, though.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch is using the corresponding GCC builtin for logbf, logb,
logbl and logbf128 if the USE_FUNCTION_BUILTIN macros are defined to one
in math-use-builtins-function.h.
Co-Authored-By: Xi Ruoyao <xry111@xry111.site>
This patch is using the corresponding GCC builtin for llrintf, llrint,
llrintl and llrintf128 if the USE_FUNCTION_BUILTIN macros are defined to one
in math-use-builtins-function.h.
Co-Authored-By: Xi Ruoyao <xry111@xry111.site>
This patch is using the corresponding GCC builtin for lrintf, lrint,
lrintl and lrintf128 if the USE_FUNCTION_BUILTIN macros are defined to one
in math-use-builtins-function.h.
Co-Authored-By: Xi Ruoyao <xry111@xry111.site>
GCC 13 has added more _FloatN and _FloatNx versions of existing
<math.h> and <complex.h> built-in functions, for use in libstdc++-v3.
This breaks the glibc build because of how those functions are defined
as aliases to functions with the same ABI but different types. Add
appropriate -fno-builtin-* options for compiling relevant files, as
already done for the case of long double functions aliasing double
ones and based on the list of files used there.
I fixed some mistakes in that list of double files that I noticed
while implementing this fix, but there may well be more such
(harmless) cases, in this list or the new one (files that don't
actually exist or don't define the named functions as aliases so don't
need the options). I did try to exclude cases where glibc doesn't
define certain functions for _FloatN or _FloatNx types at all from the
new uses of -fno-builtin-* options. As with the options for double
files (see the commit message for commit
49348beafe, "Fix build with GCC 10 when
long double = double."), it's deliberate that the options are used
even if GCC currently doesn't have a built-in version of a given
functions, so providing some level of future-proofing against more
such built-in functions being added in future.
Tested with build-many-glibcs.py for aarch64-linux-gnu
powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu (compilers
and glibcs builds) with GCC mainline.
Detecting an overflow edge case depended on signed overflow of a long
long. Replace the additions and the overflow checks by
__builtin_add_overflow().
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Avoid moving code across SET_RESTORE_ROUNDL in order to fix
[BZ #29463].
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
This works around a gcc issue where it const folded inf/inf into nan,
preventing the invalid exception to be signalled.
(x-x)/(x-x) is more robust against optimizations and works for all
out of bounds values including x==nan.
The gcc issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95115
should be fixed on release branches starting from gcc-10, but it is
better to change the code in case glibc is built with older gcc.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
GCC 13 adds support for _FloatN and _FloatNx types in C++, so breaking
the installed glibc headers that assume such support is not present.
GCC mostly works around this with fixincludes, but that doesn't help
for building glibc and its tests (glibc doesn't itself contain C++
code, but there's C++ code built for tests). Update glibc's
bits/floatn-common.h and bits/floatn.h headers to handle the GCC 13
support directly.
In general the changes match those made by fixincludes, though I think
the ones in sysdeps/powerpc/bits/floatn.h, where the header tests
__LDBL_MANT_DIG__ == 113 or uses #elif, wouldn't match the existing
fixincludes patterns.
Some places involving special C++ handling in relation to _FloatN
support are not changed. There's no need to change the
__HAVE_FLOATN_NOT_TYPEDEF definition (also in a form that wouldn't be
matched by the fixincludes fixes) because it's only used in relation
to macro definitions using features not supported for C++
(__builtin_types_compatible_p and _Generic). And there's no need to
change the inline function overloads for issignaling, iszero and
iscanonical in C++ because cases where types have the same format but
are no longer compatible types are handled automatically by the C++
overload resolution rules.
This patch also does not change the overload handling for iseqsig, and
there I think changes *are* needed, beyond those in this patch or made
by fixincludes. The way that overload is defined, via a template
parameter to a structure type, requires overloads whenever the types
are incompatible, even if they have the same format. So I think we
need to add overloads with GCC 13 for every supported _FloatN and
_FloatNx type, rather than just having one for _Float128 when it has a
different ABI to long double as at present (but for older GCC, such
overloads must not be defined for types that end up defined as
typedefs for another type).
Tested with build-many-glibcs.py: compilers build for
aarch64-linux-gnu ia64-linux-gnu mips64-linux-gnu powerpc-linux-gnu
powerpc64le-linux-gnu x86_64-linux-gnu; glibcs build for
aarch64-linux-gnu ia64-linux-gnu i686-linux-gnu mips-linux-gnu
mips64-linux-gnu-n32 powerpc-linux-gnu powerpc64le-linux-gnu
x86_64-linux-gnu.
math/test-float128-y1 fails on x86_64 and ppc64el with gcc 12 and -O3,
because code inside a block guarded by SET_RESTORE_ROUNDL is being moved
after the rounding mode has been restored. Use math_force_eval to
prevent this (and insert some math_opt_barrier calls to prevent code
from being moved before the rounding mode is set).
Fixes#29463
Reviewed-By: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The compiler may substitute calls to sin or cos with calls to sincos, thus
we should have the same optimized implementations for sincos. The
optimized implementations may produce results that differ, that also makes
sure that the sincos call aggrees with the sin and cos calls.
Both float, double, and _Float128 are assumed to be supported
(float and double already only uses builtins). Only long double
is parametrized due GCC bug 29253 which prevents its usage on
powerpc.
It allows to remove i686, ia64, x86_64, powerpc, and sparc arch
specific implementation.
On ia64 it also fixes the sNAN handling:
math/test-float64x-fabs
math/test-ldouble-fabs
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
powerpc64-linux-gnu, sparc64-linux-gnu, and ia64-linux-gnu.
Converting double precision constants to float is now affected by the
runtime dynamic rounding mode instead of being evaluated at compile
time with default rounding mode (except static object initializers).
This can change the computed result and cause performance regression.
The known correctness issues (increased ulp errors) are already fixed,
this patch fixes remaining cases of unnecessary runtime conversions.
Add float M_* macros to math.h as new GNU extension API. To avoid
conversions the new M_* macros are used and instead of casting double
literals to float, use float literals (only required if the conversion
is inexact).
The patch was tested on aarch64 where the following symbols had new
spurious conversion instructions that got fixed:
__clog10f
__gammaf_r_finite@GLIBC_2.17
__j0f_finite@GLIBC_2.17
__j1f_finite@GLIBC_2.17
__jnf_finite@GLIBC_2.17
__kernel_casinhf
__lgamma_negf
__log1pf
__y0f_finite@GLIBC_2.17
__y1f_finite@GLIBC_2.17
cacosf
cacoshf
casinhf
catanf
catanhf
clogf
gammaf_positive
Fixes bug 28713.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
s_cosf.c and s_sinf.c have
if (abstop12 (y) < abstop12 (pio4))
where abstop12 takes a float argument, but pio4 is static const double.
pio4 is used only in calls to abstop12 and never in arithmetic. Apply
-static const double pio4 = 0x1.921FB54442D18p-1;
+static const float pio4 = 0x1.921FB6p-1f;
to fix:
FAIL: math/test-float-cos
FAIL: math/test-float-sin
FAIL: math/test-float-sincos
FAIL: math/test-float32-cos
FAIL: math/test-float32-sin
FAIL: math/test-float32-sincos
when compiling with GCC 12.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
The macro TAYLOR_SIN adds the term `-0.5*da*a^2 + da` in hopes
of regaining some precision as a function of da. However the
comment says we add the term `-0.5*da*a^2 + 0.5*da` which is
different. This fix updates the comment to reflect the
code and also simplifies the calculation by replacing `a` with `x`
because they always have the same value.
Signed-off-by: Akila Welihinda <akilawelihinda@ucla.edu>
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper with
SVID error handling around the new code. There is no new symbol version
nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).
Only ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
This implementation is based on 'An Improved Algorithm for hypot(a,b)'
by Carlos F. Borges [1] using the MyHypot3 with the following changes:
- Handle qNaN and sNaN.
- Tune the 'widely varying operands' to avoid spurious underflow
due the multiplication and fix the return value for upwards
rounding mode.
- Handle required underflow exception for subnormal results.
The main advantage of the new algorithm is its precision. With a
random 1e9 input pairs in the range of [LDBL_MIN, LDBL_MAX], glibc
current implementation shows around 0.05% results with an error of
1 ulp (453266 results) while the new implementation only shows
0.0001% of total (1280).
Checked on aarch64-linux-gnu and x86_64-linux-gnu.
[1] https://arxiv.org/pdf/1904.09481.pdf
This implementation is based on 'An Improved Algorithm for hypot(a,b)'
by Carlos F. Borges [1] using the MyHypot3 with the following changes:
- Handle qNaN and sNaN.
- Tune the 'widely varying operands' to avoid spurious underflow
due the multiplication and fix the return value for upwards
rounding mode.
- Handle required underflow exception for subnormal results.
The main advantage of the new algorithm is its precision. With a
random 1e8 input pairs in the range of [LDBL_MIN, LDBL_MAX], glibc
current implementation shows around 0.02% results with an error of
1 ulp (23158 results) while the new implementation only shows
0.0001% of total (111).
[1] https://arxiv.org/pdf/1904.09481.pdf
Improve hypot performance significantly by using fma when available. The
fma version has twice the throughput of the previous version and 70% of
the latency. The non-fma version has 30% higher throughput and 10%
higher latency.
Max ULP error is 0.949 with fma and 0.792 without fma.
Passes GLIBC testsuite.
This implementation is based on the 'An Improved Algorithm for
hypot(a,b)' by Carlos F. Borges [1] using the MyHypot3 with the
following changes:
- Handle qNaN and sNaN.
- Tune the 'widely varying operands' to avoid spurious underflow
due the multiplication and fix the return value for upwards
rounding mode.
- Handle required underflow exception for denormal results.
The main advantage of the new algorithm is its precision: with a
random 1e9 input pairs in the range of [DBL_MIN, DBL_MAX], glibc
current implementation shows around 0.34% results with an error of
1 ulp (3424869 results) while the new implementation only shows
0.002% of total (18851).
The performance result are also only slight worse than current
implementation. On x86_64 (Ryzen 5900X) with gcc 12:
Before:
"hypot": {
"workload-random": {
"duration": 3.73319e+09,
"iterations": 1.12e+08,
"reciprocal-throughput": 22.8737,
"latency": 43.7904,
"max-throughput": 4.37184e+07,
"min-throughput": 2.28361e+07
}
}
After:
"hypot": {
"workload-random": {
"duration": 3.7597e+09,
"iterations": 9.8e+07,
"reciprocal-throughput": 23.7547,
"latency": 52.9739,
"max-throughput": 4.2097e+07,
"min-throughput": 1.88772e+07
}
}
Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
[1] https://arxiv.org/pdf/1904.09481.pdf
Use a more optimized comparison for check for NaN and infinite and
add an inlined issignaling implementation for float. With gcc it
results in 2 FP comparisons.
The file Copyright is also changed to use GPL, the implementation was
completely changed by 7c10fd3515 to use double precision instead of
scaling and this change removes all the GET_FLOAT_WORD usage.
Checked on x86_64-linux-gnu.
The largest errors over the full binary32 range are after this
patch (on x86_64):
RNDN: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6
RNDZ: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6
RNDU: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6
RNDD: libm wrong by up to 8.98e+00 ulp(s) [9] for x=0x1.4b7066p+7
Inputs that were yielding huge errors have been added to "make check".
Reviewed-by: Adhemeral Zanella <adhemerval.zanella@linaro.org>
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs. fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN). fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN. fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value. All these functions also treat +0 as greater than -0. There
are also corresponding <tgmath.h> type-generic macros.
Add these functions to glibc. The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results. The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.
Tested for x86_64 and x86, and with build-many-glibcs.py.
Avoid defining f64xfmaf128 twice when building s_fmaf128.c.
This can be reproduced on powerpc64le whenever f128 functions do not
have IFUNC enabled, e.g. using "--with-cpu=power8 --disable-multi-arch", or
when using "-with-cpu=power9".
Fixes: b3f27d8150 ("Add narrowing fma functions")
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well. As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma. The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.
The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.
This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done). (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
As described in bug 28358, the round-to-odd computations used in the
libm functions that round their results to a narrower format can yield
spurious underflow exceptions in the following circumstances: the
narrowing only narrows the precision of the type and not the exponent
range (i.e., it's narrowing _Float128 to _Float64x on x86_64, x86 or
ia64), the architecture does after-rounding tininess detection (which
applies to all those architectures), the result is inexact, tiny
before rounding but not tiny after rounding (with the chosen rounding
mode) for _Float64x (which is possible for narrowing mul, div and fma,
not for narrowing add, sub or sqrt), so the underflow exception
resulting from the toward-zero computation in _Float128 is spurious
for _Float64x.
Fixed by making ROUND_TO_ODD call feclearexcept (FE_UNDERFLOW) in the
problem cases (as indicated by an extra argument to the macro); there
is never any need to preserve underflow exceptions from this part of
the computation, because the conversion of the round-to-odd value to
the narrower type will underflow in exactly the cases in which the
function should raise that exception, but it may be more efficient to
avoid the extra manipulation of the floating-point environment when
not needed.
Tested for x86_64 and x86, and with build-many-glibcs.py.
include/math.h has a mechanism to redirect internal calls to various
libm functions, that can often be inlined by the compiler, to call
non-exported __* names for those functions in the case when the calls
aren't inlined, with the redirection being disabled when
NO_MATH_REDIRECT. Add fma to the functions to which this mechanism is
applied.
At present, libm-internal fma calls (generally to __builtin_fma*
functions) are only done when it's known the call will be inlined,
with alternative code not relying on an fma operation being used in
the caller otherwise. This patch is in preparation for adding the TS
18661 / C2X narrowing fma functions to glibc; it will be natural for
the narrowing function implementations to call the underlying fma
functions unconditionally, with this either being inlined or resulting
in an __fma* call. (Using two levels of round-to-odd computation like
that, in the case where there isn't an fma hardware instruction, isn't
optimal but is certainly a lot simpler for the initial implementation
than writing different narrowing fma implementations for all the
various pairs of formats.)
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by the patch (using
<https://sourceware.org/pipermail/libc-alpha/2021-September/130991.html>
to fix installed library stripping in build-many-glibcs.py). Also
tested for x86_64.
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well. However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div. Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.
The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.
I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dchttps://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch is using the corresponding GCC builtin for roundevenf,
roundeven and roundevenl if the USE_FUNCTION_BUILTIN macros are defined
to one in math-use-builtins.h.
These builtin functions is supported since GCC 10.
The code of the generic implementation is not changed.
Signed-off-by: Shen-Ta Hsieh <ibmibmibm.tw@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This patch redirect roundeven function for futhermore changes.
Signed-off-by: Shen-Ta Hsieh <ibmibmibm.tw@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This patch replaced obsolete AC_TRY_COMPILE to AC_COMPILE_IFELSE or
AC_PREPROC_IFELSE.
It has been confirmed that GNU 'autoconf' 2.69 suppressed obsolete
warnings, updated the following files:
- configure
- sysdeps/mach/configure
- sysdeps/mach/hurd/configure
- sysdeps/s390/configure
- sysdeps/unix/sysv/linux/configure
and didn't change the following files:
- sysdeps/ieee754/ldbl-opt/configure
- sysdeps/unix/sysv/linux/powerpc/configure
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The symbol has never been exported, so no compatibility symbol is
needed. Removing this file prevents ld from creation an exported
symbol in case GLIBC_2_0 expands to a symbol version which
does not have a local: *; directive in the symbol version map file.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>