C23 adds various <math.h> function families originally defined in TS
18661-4. Add the pown functions, which are like pow but with an
integer exponent. That exponent has type long long int in C23; it was
intmax_t in TS 18661-4, and as with other interfaces changed after
their initial appearance in the TS, I don't think we need to support
the original version of the interface. The test inputs are based on
the subset of test inputs for pow that use integer exponents that fit
in long long.
As the first such template implementation that saves and restores the
rounding mode internally (to avoid possible issues with directed
rounding and intermediate overflows or underflows in the wrong
rounding mode), support also needed to be added for using
SET_RESTORE_ROUND* in such template function implementations. This
required math-type-macros-float128.h to include <fenv_private.h>, so
it can tell whether SET_RESTORE_ROUNDF128 is defined. In turn, the
include order with <fenv_private.h> included before <math_private.h>
broke loongarch builds, showing up that
sysdeps/loongarch/math_private.h is really a fenv_private.h file
(maybe implemented internally before the consistent split of those
headers in 2018?) and needed to be renamed to fenv_private.h to avoid
errors with duplicate macro definitions if <math_private.h> is
included after <fenv_private.h>.
The underlying implementation uses __ieee754_pow functions (called
more than once in some cases, where the exponent does not fit in the
floating type). I expect a custom implementation for a given format,
that only handles integer exponents but handles larger exponents
directly, could be faster and more accurate in some cases.
I encourage searching for worst cases for ulps error for these
implementations (necessarily non-exhaustively, given the size of the
input space).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the powr functions, which are like pow, but with simpler
handling of special cases (based on exp(y*log(x)), so negative x and
0^0 are domain errors, powers of -0 are always +0 or +Inf never -0 or
-Inf, and 1^+-Inf and Inf^0 are also domain errors, while NaN^0 and
1^NaN are NaN). The test inputs are taken from those for pow, with
appropriate adjustments (including removing all tests that would be
domain errors from those in auto-libm-test-in and adding some more
such tests in libm-test-powr.inc).
The underlying implementation uses __ieee754_pow functions after
dealing with all special cases that need to be handled differently.
It might be a little faster (avoiding a wrapper and redundant checks
for special cases) to have an underlying implementation built
separately for both pow and powr with compile-time conditionals for
special-case handling, but I expect the benefit of that would be
limited given that both functions will end up needing to use the same
logic for computing pow outside of special cases.
My understanding is that powr(negative, qNaN) should raise "invalid":
that the rule on "invalid" for an argument outside the domain of the
function takes precedence over a quiet NaN argument producing a quiet
NaN result with no exceptions raised (for rootn it's explicit that the
0th root of qNaN raises "invalid"). I've raised this on the WG14
reflector to confirm the intent.
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the rsqrt functions (1/sqrt(x)). The test inputs are
taken from those for sqrt.
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atan2pi functions (atan2(y,x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atanpi functions (atan(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the asinpi functions (asin(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the acospi functions (acos(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the tanpi functions (tan(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the sinpi functions (sin(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the cospi functions (cos(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and
exp10(x)-1, like expm1).
As with other such functions, these use type-generic templates that
could be replaced with faster and more accurate type-specific
implementations in future. Test inputs are copied from those for
expm1, plus some additions close to the overflow threshold (copied
from exp2 and exp10) and also some near the underflow threshold.
exp2m1 has the unusual property of having an input (M_MAX_EXP) where
whether the function overflows (under IEEE semantics) depends on the
rounding mode. Although these could reasonably be XFAILed in the
testsuite (as we do in some cases for arguments very close to a
function's overflow threshold when an error of a few ulps in the
implementation can result in the implementation not agreeing with an
ideal one on whether overflow takes place - the testsuite isn't smart
enough to handle this automatically), since these functions aren't
required to be correctly rounding, I made the implementation check for
and handle this case specially.
The Makefile ordering expected by lint-makefiles for the new functions
is a bit peculiar, but I implemented it in this patch so that the test
passes; I don't know why log2 also needed moving in one Makefile
variable setting when it didn't in my previous patches, but the
failure showed a different place was expected for that function as
well.
The powerpc64le IFUNC setup seems not to be as self-contained as one
might hope; it shouldn't be necessary to add IFUNCs for new functions
such as these simply to get them building, but without setting up
IFUNCs for the new functions, there were undefined references to
__GI___expm1f128 (that IFUNC machinery results in no such function
being defined, but doesn't stop include/math.h from doing the
redirection resulting in the exp2m1f128 and exp10m1f128
implementations expecting to call it).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for
base-10 logarithms).
This is directly analogous to the log2p1 implementation (except that
whereas log2p1 has a smaller underflow range than log1p, log10p1 has a
larger underflow range). The test inputs are copied from those for
log1p and log2p1, plus a few more inputs in that wider underflow
range.
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the logp1 functions (aliases for log1p functions - the
name is intended to be more consistent with the new log2p1 and
log10p1, where clearly it would have been very confusing to name those
functions log21p and log101p). As aliases rather than new functions,
the content of this patch is somewhat different from those actually
adding new functions.
Tests are shared with log1p, so this patch *does* mechanically update
all affected libm-test-ulps files to expect the same errors for both
functions.
The vector versions of log1p on aarch64 and x86_64 are *not* updated
to have logp1 aliases (and thus there are no corresponding header,
tests, abilist or ulps changes for vector functions either). It would
be reasonable for such vector aliases and corresponding changes to
other files to be made separately. For now, the log1p tests instead
avoid testing logp1 in the vector case (a Makefile change is needed to
avoid problems with grep, used in generating the .c files for vector
function tests, matching more than one ALL_RM_TEST line in a file
testing multiple functions with the same inputs, when it assumes that
the .inc file only has a single such line).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the log2p1 functions (log2(1+x): like log1p, but for
base-2 logarithms).
This illustrates the intended structure of implementations of all
these function families: define them initially with a type-generic
template implementation. If someone wishes to add type-specific
implementations, it is likely such implementations can be both faster
and more accurate than the type-generic one and can then override it
for types for which they are implemented (adding benchmarks would be
desirable in such cases to demonstrate that a new implementation is
indeed faster).
The test inputs are copied from those for log1p. Note that these
changes make gen-auto-libm-tests depend on MPFR 4.2 (or later).
The bulk of the changes are fairly generic for any such new function.
(sysdeps/powerpc/nofpu/Makefile only needs changing for those
type-generic templates that use fabs.)
Tested for x86_64 and x86, and with build-many-glibcs.py.
Remove the error handling wrapper from exp10. This is very similar to
the changes done to exp and exp2, except that we also need to handle
pow10 and pow10l.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper
with SVID error handling around the new code. There is no new symbol
version nor compatibility code on !LIBM_SVID_COMPAT targets
(e.g. riscv).
The ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation. For both i686 and m68k,
which provive arch specific implementation, wrappers are added so
no new symbol are added (which would require to change the
implementations).
It shows an small improvement, the results for fmod:
Architecture | Input | master | patch
-----------------|-----------------|----------|--------
x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992
x86_64 (Ryzen 9) | normal | 296.939 | 296.738
x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119
aarch64 (N1) | subnormal | 6.81778 | 4.33313
aarch64 (N1) | normal | 155.620 | 152.915
aarch64 (N1) | close-exponents | 8.21306 | 5.76138
armhf (N1) | subnormal | 15.1083 | 14.5746
armhf (N1) | normal | 244.833 | 241.738
armhf (N1) | close-exponents | 21.8182 | 22.457
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The error handling is moved to sysdeps/ieee754 version with no SVID
support. The compatibility symbol versions still use the wrapper with
SVID error handling around the new code. There is no new symbol version
nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).
Only ia64 is unchanged, since it still uses the arch specific
__libm_error_region on its implementation.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs. fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN). fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN. fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value. All these functions also treat +0 as greater than -0. There
are also corresponding <tgmath.h> type-generic macros.
Add these functions to glibc. The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results. The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.
Tested for x86_64 and x86, and with build-many-glibcs.py.
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well. As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma. The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.
The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.
This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done). (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well. However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div. Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.
The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.
I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
This patch changes the exp10f error handling semantics to only set
errno according to POSIX rules. New symbol version is introduced at
GLIBC_2.32. The old wrappers are kept for compat symbols.
There are some outliers that need special handling:
- ia64 provides an optimized implementation of exp10f that uses ia64
specific routines to set SVID compatibility. The new symbol version
is aliased to the exp10f one.
- m68k also provides an optimized implementation, and the new version
uses it instead of the sysdeps/ieee754/flt32 one.
- riscv and csky uses the generic template implementation that
does not provide SVID support. For both cases a new exp10f
version is not added, but rather the symbols version of the
generic sysdeps/ieee754/flt32 is adjusted instead.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
powerpc64le-linux-gnu.
The fix for BZ#18231 requires new symbols only for microblaze. This patch
adds the required folder and files for both BE and LE abilist. No semantic
changes are expected.
Checked with check-abi for microblaze-linux-gnueabihf and
microblazeel-linux-gnueabihf.
* sysdeps/microblaze/preconfigure.ac: New file.
* sysdeps/microblaze/preconfigure: Regenerate.
* sysdeps/microblaze/be/implies: New file.
* sysdeps/microblaze/le/implies: Likewise.
* sysdeps/unix/sysv/linux/microblaze/be/implies: Likewise.
* sysdeps/unix/sysv/linux/microblaze/le/implies: Likewise.
* sysdeps/unix/sysv/linux/microblaze/*.abilist. Move to
sysdeps/unix/sysv/linux/microblaze/be/*.abilist.
* sysdeps/unix/sysv/linux/microblaze/le/*.abilist: New files.