glibc/sysdeps/arc
Adhemerval Zanella 9583836785 math: Use coshf from CORE-MATH
The CORE-MATH implementation is correctly rounded (for any rounding mode),
although it shows worse performance than the current one.  The current
implementation's performance comes mainly from its internal use of
the optimized expf implementation, and it shows a maximum error of 2 ULPs
for FE_TONEAREST and 3 ULPs for other rounding modes.

The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                      master        patched   improvement
x86_64                      40.6995        49.0737       -20.58%
x86_64v2                    40.5841        44.3604        -9.30%
x86_64v3                    39.3879        39.7502        -0.92%
i686                       112.3380       129.8570       -15.59%
aarch64 (Neoverse)          18.6914        17.0946         8.54%
power10                     11.1343         9.3245        16.25%

reciprocal-throughput        master        patched   improvement
x86_64                      18.6471        24.1077       -29.28%
x86_64v2                    17.7501        20.2946       -14.34%
x86_64v3                    17.8262        17.1877         3.58%
i686                        64.1454        86.5645       -34.95%
aarch64 (Neoverse)          9.77226        12.2314       -25.16%
power10                      4.0200         5.3316       -32.63%
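
The improvement column in both tables is consistent with the relative change
(master - patched) / master, expressed as a percentage, so a negative value
means the patched build is slower.  The formula is inferred from the numbers
rather than stated in the commit, and the function name below is
hypothetical:

```c
/* Relative change from master to patched, in percent.  Lower timings are
   better, so a positive result means the patched build is faster.
   Formula inferred from the table values, not taken from the benchtests. */
double improvement_pct(double master, double patched)
{
    return (master - patched) / master * 100.0;
}
```

With the x86_64 latency row, improvement_pct(40.6995, 49.0737) evaluates to
about -20.58, matching the table.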

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18 17:24:43 -03:00
..
bits Fix femode_t conditionals for arc and or1k 2024-11-19 22:25:39 +00:00
fpu math: Use coshf from CORE-MATH 2024-12-18 17:24:43 -03:00
nofpu math: Use coshf from CORE-MATH 2024-12-18 17:24:43 -03:00
nptl
Implies
Makefile
Versions
__longjmp.S
abort-instr.h
atomic-machine.h
bsd-_setjmp.S
bsd-setjmp.S
configure arc: Cleanup arcbe 2024-09-25 15:54:07 +01:00
configure.ac arc: Cleanup arcbe 2024-09-25 15:54:07 +01:00
dl-machine.h
dl-runtime.h
dl-tls.h
dl-trampoline.S
entry.h
fpu_control.h
gccframe.h
get-rounding-mode.h
jmpbuf-offsets.h
jmpbuf-unwind.h
ldsodefs.h
libc-tls.c
machine-gmon.h
math-tests-trap.h
math-use-builtins-ffs.h string: Use builtins for ffs and ffsll 2024-02-01 09:31:33 -03:00
preconfigure
setjmp.S
sfp-machine.h
sotruss-lib.c
start.S
sysdep.h
tininess.h
tst-audit.h
utmp-size.h login: Check default sizes of structs utmp, utmpx, lastlog 2024-04-19 14:38:17 +02:00