glibc

Commit Graph

Author	SHA1	Message	Date
Adhemerval Zanella	8ae9e51376	math: Use log1pf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic log1pf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1, gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 71.8142 38.9668 45.74% x86_64v2 71.9094 39.1321 45.58% x86_64v3 60.1000 32.4016 46.09% i686 147.105 104.258 29.13% aarch64 26.4439 14.0050 47.04% power10 19.4874 9.4146 51.69% powerpc 17.6145 8.00736 54.54% reciprocal-throughput master patched improvement x86_64 19.7604 12.7254 35.60% x86_64v2 19.0039 11.9455 37.14% x86_64v3 16.8559 11.9317 29.21% i686 82.3426 73.9718 10.17% aarch64 14.4665 7.9614 44.97% power10 11.9974 8.4117 29.89% powerpc 7.15222 6.0914 14.83% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-01 11:27:39 -03:00
Adhemerval Zanella	c369580814	math: Use log2p1f from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows better performance compared to the generic log2p1f. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 70.1462 47.0090 32.98% x86_64v2 70.2513 47.6160 32.22% x86_64v3 60.4840 39.9443 33.96% i686 164.068 122.909 25.09% aarch64 25.9169 16.9207 34.71% power10 18.1261 9.8592 45.61% powerpc 17.2683 9.38665 45.64% reciprocal-throughput master patched improvement x86_64 26.2240 16.4082 37.43% x86_64v2 25.0911 15.7480 37.24% x86_64v3 20.9371 11.7264 43.99% i686 90.4209 95.3073 -5.40% aarch64 16.8537 8.9561 46.86% power10 12.9401 6.5555 49.34% powerpc 9.01763 7.54745 16.30% The performance decrease for i686 is mostly due to the use of the x87 FPU; when building with '-msse2 -mfpmath=sse': master patched improvement latency 164.068 102.982 37.23% reciprocal-throughput 89.1968 82.5117 7.49% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-01 11:27:39 -03:00
Adhemerval Zanella	bbd578b38d	math: Use expm1f from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows better performance compared to the generic expm1f. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 96.7402 36.4026 62.37% x86_64v2 97.5391 33.4625 65.69% x86_64v3 82.1778 30.8668 62.44% i686 120.58 94.8302 21.35% aarch64 32.3558 12.8881 60.17% power10 23.5087 9.8574 58.07% powerpc 23.4776 9.06325 61.40% reciprocal-throughput master patched improvement x86_64 27.8224 15.9255 42.76% x86_64v2 27.8364 9.6438 65.36% x86_64v3 20.3227 9.6146 52.69% i686 63.5629 59.4718 6.44% aarch64 17.4838 7.1082 59.34% power10 12.4644 8.7829 29.54% powerpc 14.2152 5.94765 58.16% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-01 11:27:35 -03:00
Adhemerval Zanella	5c22fd25c1	math: Use exp2m1f from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows better performance compared to the generic exp2m1f. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). The only change is to handle FLT_MAX_EXP for FE_DOWNWARD or FE_TOWARDZERO. The benchmark inputs are based on exp2f ones. Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 40.6042 48.7104 -19.96% x86_64v2 40.7506 35.9032 11.90% x86_64v3 35.2301 31.7956 9.75% i686 102.094 94.6657 7.28% aarch64 18.2704 15.1387 17.14% power10 11.9444 8.2402 31.01% reciprocal-throughput master patched improvement x86_64 20.8683 16.1428 22.64% x86_64v2 19.5076 10.4474 46.44% x86_64v3 19.2106 10.4014 45.86% i686 56.4054 59.3004 -5.13% aarch64 12.0781 7.3953 38.77% power10 6.5306 5.9388 9.06% The generic implementation calls __ieee754_exp2f and x86_64 provides an optimized ifunc version (built with -mfma -mavx2, not correctly rounded). This explains the performance difference for x86_64. Same for i686, where the ABI provides an optimized __ieee754_exp2f version built with '-msse2 -mfpmath=sse'. When built with the same flags, the new algorithm shows better performance: master patched improvement latency 102.094 91.2823 10.59% reciprocal-throughput 56.4054 52.7984 6.39% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-01 11:27:35 -03:00
Adhemerval Zanella	5fa89852fa	math: Use exp10m1f from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows better performance compared to the generic exp10m1f. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). I mostly fixed some small issues in corner cases (sNaN handling, -INFINITY, a specific overflow check). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 45.4690 49.5845 -9.05% x86_64v2 46.1604 36.2665 21.43% x86_64v3 37.8442 31.0359 17.99% i686 121.367 93.0079 23.37% aarch64 21.1126 15.0165 28.87% power10 12.7426 8.4929 33.35% reciprocal-throughput master patched improvement x86_64 19.6005 17.4005 11.22% x86_64v2 19.6008 11.1977 42.87% x86_64v3 17.5427 10.2898 41.34% i686 59.4215 60.9675 -2.60% aarch64 13.9814 7.9173 43.37% power10 6.7814 6.4258 5.24% The generic implementation calls __ieee754_exp10f, which has an optimized version, although it is not correctly rounded; this is the main cause of the latency difference for x86_64 and the throughput difference for i686. Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-11-01 11:27:26 -03:00
Paul Zimmermann	392b3f0971	replace tgammaf by the CORE-MATH implementation The CORE-MATH implementation is correctly rounded (for any rounding mode). This can be checked by exhaustive tests in a few minutes since there are less than 2^32 values to check against for example GNU MPFR. This patch also adds some bench values for tgammaf. Tested on x86_64 and x86 (cfarm26). With the initial GNU libc code it gave on an Intel(R) Core(TM) i7-8700: "tgammaf": { "": { "duration": 3.50188e+09, "iterations": 2e+07, "max": 602.891, "min": 65.1415, "mean": 175.094 } } With the new code: "tgammaf": { "": { "duration": 3.30825e+09, "iterations": 5e+07, "max": 211.592, "min": 32.0325, "mean": 66.1649 } } With the initial GNU libc code it gave on cfarm26 (i686): "tgammaf": { "": { "duration": 3.70505e+09, "iterations": 6e+06, "max": 2420.23, "min": 243.154, "mean": 617.509 } } With the new code: "tgammaf": { "": { "duration": 3.24497e+09, "iterations": 1.8e+07, "max": 1238.15, "min": 101.155, "mean": 180.276 } } Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Changes in v2: - include <math.h> (fix the linknamespace failures) - restored original benchtests/strcoll-inputs/filelist#en_US.UTF-8 file - restored original wrapper code (math/w_tgammaf_compat.c), except for the dealing with the sign - removed the tgammaf/float entries in all libm-test-ulps files - address other comments from Joseph Myers (https://sourceware.org/pipermail/libc-alpha/2024-July/158736.html) Changes in v3: - pass NULL argument for signgam from w_tgammaf_compat.c - use of math_narrow_eval - added more comments Changes in v4: - initialize local_signgam to 0 in math/w_tgamma_template.c - replace sysdeps/ieee754/dbl-64/gamma_productf.c by dummy file Changes in v5: - do not mention local_signgam any more in math/w_tgammaf_compat.c - initialize local_signgam to 1 instead of 0 in w_tgamma_template.c and added comment Changes in v6: - pass NULL as 2nd argument of 
__ieee754_gammaf_r in w_tgammaf_compat.c, and check for NULL in e_gammaf_r.c Changes in v7: - added Signed-off-by line for Alexei Sibidanov (author of the code) Changes in v8: - added Signed-off-by line for Paul Zimmermann (submitter of the patch) Changes in v9: - address comments from review by Adhemerval Zanella Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-10-11 11:12:32 +02:00
Florian Weimer	9446351dac	powerpc64le: Update ulps Based on results from a POWER8 system with a GCC 8 build.	2024-08-08 13:42:12 +02:00
Adhemerval Zanella	6411dba836	powerpc: Update soft-fp ulps From new tests added by `0797283910`.	2024-08-07 11:02:03 -03:00
jeevitha	4e40c8104f	powerpc: Update ulps for fpu Adjust the ULPs for the log2p1 implementation.	2024-07-25 10:28:47 -03:00
Florian Weimer	71dafdf5f1	powerpc: Update ulps Results based on POWER8 and POWER9 machines running powerpc64-linux-gnu, with and without --disable-multi-arch.	2024-06-20 12:15:31 +02:00
Adhemerval Zanella	52b397bafa	powerpc: Update ulps For the exp10m1, exp2m1, and log10p1 implementations.	2024-06-18 17:31:10 -03:00
Joseph Myers	bb014f50c4	Implement C23 logp1 C23 adds various <math.h> function families originally defined in TS 18661-4. Add the logp1 functions (aliases for log1p functions - the name is intended to be more consistent with the new log2p1 and log10p1, where clearly it would have been very confusing to name those functions log21p and log101p). As aliases rather than new functions, the content of this patch is somewhat different from those actually adding new functions. Tests are shared with log1p, so this patch does mechanically update all affected libm-test-ulps files to expect the same errors for both functions. The vector versions of log1p on aarch64 and x86_64 are not updated to have logp1 aliases (and thus there are no corresponding header, tests, abilist or ulps changes for vector functions either). It would be reasonable for such vector aliases and corresponding changes to other files to be made separately. For now, the log1p tests instead avoid testing logp1 in the vector case (a Makefile change is needed to avoid problems with grep, used in generating the .c files for vector function tests, matching more than one ALL_RM_TEST line in a file testing multiple functions with the same inputs, when it assumes that the .inc file only has a single such line). Tested for x86_64 and x86, and with build-many-glibcs.py.	2024-06-17 13:47:09 +00:00
Adhemerval Zanella	f83e461f10	powerpc: Update ulps For the log2p1 implementation.	2024-05-20 13:12:23 -03:00
Adhemerval Zanella	ae515ba530	powerpc: Fix __fesetround_inline_nocheck on POWER9+ (BZ 31682) The `e68b1151f7` commit changed the __fesetround_inline_nocheck implementation to use mffscrni (through __fe_mffscrn) instead of mtfsfi. For generic powerpc ceil/floor/trunc, the function is supposed to disable the floating-point inexact exception enable bit; however, mffscrni does not change any exception enable bits. This patch fixes this by reverting the optimization for __fesetround_inline_nocheck. Checked on powerpc-linux-gnu. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2024-05-09 08:59:30 -03:00
Paul Eggert	dff8da6b3e	Update copyright dates with scripts/update-copyrights	2024-01-01 10:53:40 -08:00
Adhemerval Zanella	ecb1e7220d	powerpc: Do not raise exception traps for fesetexcept/fesetexceptflag (BZ 30988) According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). This is a side-effect of how we implement the GNU extension feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending on the argument. And on PR_FP_EXC_PRECISE, setting a floating-point exception flag triggers a trap. To make both functions follow C23, fesetexcept and fesetexceptflag now fail if the argument may trigger a trap. The math tests now check for a value different from 0, instead of bailing out as unsupported for EXCEPTION_SET_FORCES_TRAP. Checked on powerpc64le-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:34 -03:00
Manjunath Matti	4eac1825ed	fegetenv_and_set_rn now uses the builtins provided by GCC. On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the prologue get/save/set rounding mode operations for POWER9 and later by using 'mffscrn' where possible, this was introduced by commit `f1c56cdff0`. GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn which now returns the FPSCR fields in a double. This feature is available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro is defined. GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds this feature. Changes are done to use __builtin_set_fpscr_rn instead of mffscrn or mffscrni in __fe_mffscrn(rn). Suggested-by: Carl Love <cel@us.ibm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-09-27 13:55:36 -03:00
Frederic Berat	d636339306	sysdeps/powerpc/fpu/tst-setcontext-fpscr.c: Fix warn unused result The fread routine return value needs to be checked when fortification is enabled, hence use xfread helper. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-06-22 00:21:17 -04:00
Mahesh Bodapati	56fc4b45c0	powerpc:Regenerate ulps for hypot For new inputs added in commit `3efbf11fdf`, regenerate the ulps of hypot from 0(default) to 1	2023-02-23 22:06:03 -06:00
Joseph Myers	6d7e8eda9b	Update copyright dates with scripts/update-copyrights	2023-01-06 21:14:39 +00:00
Adhemerval Zanella	efeb2bd1ab	math: Add math-use-builtins-fabs (BZ#29027) float, double, and _Float128 are all assumed to be supported (float and double already use only builtins). Only long double is parametrized, due to GCC bug 29253, which prevents its usage on powerpc. This allows removing the i686, ia64, x86_64, powerpc, and sparc arch-specific implementations. On ia64 it also fixes the sNaN handling: math/test-float64x-fabs math/test-ldouble-fabs Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, powerpc64-linux-gnu, sparc64-linux-gnu, and ia64-linux-gnu.	2022-05-23 17:49:18 -03:00
Adhemerval Zanella	2a45807e73	powerpc: Remove fcopysign{f} implementation The builtin and the generic implementation from the generic files suffice. Checked on powerpc64-linux-gnu and powerpc-linux-gnu.	2022-04-07 12:00:16 -03:00
Paul Eggert	581c785bf3	Update copyright dates with scripts/update-copyrights I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: * 912-#endif remote: * 913: remote: * 914- remote: * error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines	2022-01-01 11:40:24 -08:00
Adhemerval Zanella	2eb1cd2f47	math: Remove powerpc e_hypot The generic implementation shows only slightly worse performance: POWER10 reciprocal-throughput latency master 8.28478 13.7253 new hypot 7.21945 13.1933 POWER9 reciprocal-throughput latency master 13.4024 14.0967 new hypot 14.8479 15.8061 POWER8 reciprocal-throughput latency master 15.5767 16.8885 new hypot 16.5371 18.4057 One way to improve might be to make gcc generate xsmaxdp/xsmindp for fmax/fmin (it only does so for -ffast-math; clang does so with default options). Checked on powerpc64-linux-gnu (power8) and powerpc64le-linux-gnu (power9).	2021-12-13 09:08:07 -03:00
Paul A. Clarke	9fea0f1a2a	[powerpc] Tighten contraints for asm constant parameters There are a few places where only known numeric values are acceptable for `asm` parameters, yet the constraint "i" is used. "i" can include "symbolic constants whose values will be known only at assembly time or later." Use "n" instead of "i" where known numeric values are required. Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2021-11-03 09:17:28 -05:00
Adhemerval Zanella	260d3032ad	powerpc: update libm test ulps Update after commit `6bbf729832` (Fixed inaccuracy of j0f (BZ #28185)).	2021-10-06 10:50:33 -03:00
Joseph Myers	b3f27d8150	Add narrowing fma functions This patch adds the narrowing fused multiply-add functions from TS 18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal, f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x, f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128, f64xfmaf128 for configurations with _Float64x and _Float128; __f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case (for calls to ffmal and dfmal when long double is IEEE binary128). Corresponding tgmath.h macro support is also added. The changes are mostly similar to those for the other narrowing functions previously added, especially that for sqrt, so the description of those generally applies to this patch as well. As with sqrt, I reused the same test inputs in auto-libm-test-in as for non-narrowing fma rather than adding extra or separate inputs for narrowing fma. The tests in libm-test-narrow-fma.inc also follow those for non-narrowing fma. The non-narrowing fma has a known bug (bug 6801) that it does not set errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather than fixing this or having narrowing fma check for errors when non-narrowing does not (complicating the cases when narrowing fma can otherwise be an alias for a non-narrowing function), this patch does not attempt to check for errors from narrowing fma and set errno; the CHECK_NARROW_FMA macro is still present, but as a placeholder that does nothing, and this missing errno setting is considered to be covered by the existing bug rather than needing a separate open bug. missing-errno annotations are duly added to many of the auto-libm-test-in test inputs for fma. 
This completes adding all the new functions from TS 18661-1 to glibc, so will be followed by corresponding stdc-predef.h changes to define __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support for TS 18661-1 will be at a similar level to that for C standard floating-point facilities up to C11 (pragmas not implemented, but library functions done). (There are still further changes to be done to implement changes to the types of fromfp functions from N2548.) Tested as follows: natively with the full glibc testsuite for x86_64 (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard float, mips64 (all three ABIs, both hard and soft float). The different GCC versions are to cover the different cases in tgmath.h and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC 7 has proper _Float* support, GCC 8 adds __builtin_tgmath).	2021-09-22 21:25:31 +00:00
Joseph Myers	4eff749e8f	Adjust new narrowing div/mul tests for IBM long double, update powerpc ULPs Testing for powerpc shows some of the new narrowing div/mul tests need XFAILing for IBM long double and some ULPs updates are needed for those tests.	2021-09-22 12:35:44 +00:00
Joseph Myers	abd383584b	Add narrowing square root functions This patch adds the narrowing square root functions from TS 18661-1 / TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64, f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x, f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128, f64xsqrtf128 for configurations with _Float64x and _Float128; __f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case (for calls to fsqrtl and dsqrtl when long double is IEEE binary128). Corresponding tgmath.h macro support is also added. The changes are mostly similar to those for the other narrowing functions previously added, so the description of those generally applies to this patch as well. However, the not-actually-narrowing cases (where the two types involved in the function have the same floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather than needing a separately built not-actually-narrowing function such as was needed for add / sub / mul / div. Thus, there is no __nldbl_dsqrtl name for ldbl-opt because no such name was needed (whereas the other functions needed such a name since the only other name for that entry point was e.g. f32xaddf64, not reserved by TS 18661-1); the headers are made to arrange for sqrt to be called in that case instead. The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because they were observed to be needed in GCC 7 testing of riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/ files added didn't need such DIAG_* in any configuration I tested with build-many-glibcs.py, but if they do turn out to be needed in more files with some other configuration / GCC version, they can always be added there. I reused the same test inputs in auto-libm-test-in as for non-narrowing sqrt rather than adding extra or separate inputs for narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow those for non-narrowing sqrt. 
Tested as follows: natively with the full glibc testsuite for x86_64 (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard float, mips64 (all three ABIs, both hard and soft float). The different GCC versions are to cover the different cases in tgmath.h and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC 7 has proper _Float* support, GCC 8 adds __builtin_tgmath).	2021-09-10 20:56:22 +00:00
Siddhesh Poyarekar	30891f35fa	Remove "Contributed by" lines We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-09-03 22:06:44 +05:30
Tulio Magno Quites Machado Filho	667d9c8d55	powerpc: Update libm test ulps Update after commit `43576de04a`.	2021-04-09 17:41:22 -03:00
Paul Zimmermann	9acda61d94	Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469 , #14470 , #14471 , #14472 ] For j0f/j1f/y0f/y1f, the largest error for all binary32 inputs is reduced to at most 9 ulps for all rounding modes. The new code is enabled only when there is a cancellation at the very end of the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not give any visible slowdown on average. Two different algorithms are used: * around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of degree 3 are used, computed using the Sollya tool (https://www.sollya.org/) * for large inputs, an asymptotic formula from [1] is used [1] Fast and Accurate Bessel Function Computation, John Harrison, Proceedings of Arith 19, 2009. Inputs yielding the new largest errors are added to auto-libm-test-in, and ulps are regenerated for various targets (thanks Adhemerval Zanella). Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-04-02 06:15:48 +02:00
Raphael Moreira Zinsly	56c81132cc	powerpc: Add optimized ilogb* for POWER9 The instructions xsxexpdp and xsxexpqp introduced on POWER9 extract the exponent from a double-precision and quad-precision floating-point respectively, thus they can be used to improve ilogb, ilogbf and ilogbf128.	2021-03-16 12:19:09 -03:00
Matheus Castanho	c82e691c56	powerpc: Update libm-test-ulps Generated with 'make regen-ulps' on POWER8. Tested on powerpc, powerpc64, and powerpc64le	2021-03-16 09:23:41 -03:00
Florian Weimer	82215c1e25	powerpc: Regenerate ulps This time on a POWER8 machine.	2021-03-03 18:39:17 +01:00
Matheus Castanho	40d055a2dd	powerpc: Update libm-test-ulps Generated with 'make regen-ulps' Tested on powerpc, powerpc64, and powerpc64le	2021-03-02 10:08:07 -03:00
Paul Eggert	2b778ceb40	Update copyright dates with scripts/update-copyrights I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: * pre-commit check failed ... remote: * error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master	2021-01-02 12:17:34 -08:00
Florian Weimer	2aa8ec7dd7	powerpc: Regenerate ulps For new inputs added in commit `cad5ad81d2`, as seen on a POWER8 system.	2020-12-22 19:22:44 +01:00
Matheus Castanho	c71d13a098	Update powerpc libm-test-ulps Before this patch, the following tests were failing: ppc and ppc64: FAIL: math/test-ldouble-j0 ppc64le: FAIL: math/test-float128-j0 FAIL: math/test-float64x-j0 FAIL: math/test-ibm128-j0 FAIL: math/test-ldouble-j0	2020-09-10 15:52:01 -03:00
Adhemerval Zanella	169ea8f928	powerpc: Use sqrt{f} builtin The powerpc sqrt implementation is also simplified: - the static constants are open coded within the implementation. - for !USE_SQRT_BUILTIN the function is implemented directly on __ieee754_sqrt (it avoids a superfluous extra jump). Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.	2020-06-22 11:09:49 -03:00
Adhemerval Zanella	e80501a5c9	math: Decompose math-use-builtins.h Each symbol definition is moved to a separate file that covers all of the symbol's type definitions (float, double, long double, and float128). This allows setting support for architectures without the boilerplate of copying default values. Checked with a build on the affected ABIs.	2020-06-22 11:09:45 -03:00
Paul E. Murphy	6ef4227509	powerpc64le: use common fmaf128 implementation This defines the macro such that it should behave best on all supported powerpc targets. Likewise, this allows us to remove the ppc64le specific s_fmaf128.c. I have verified powerpc64le multiarch and powerpc64le power9 no-multiarch builds continue to generate optimized fmaf128.	2020-06-05 15:29:44 -05:00
Adhemerval Zanella	6f10ff02cb	powerpc: Fix powerpc64le due `a7a3435c9a` The build uses an undefined macro evaluation for fmaf128 build. For now set USE_FMAL_BUILTIN and USE_FMAF128_BUILTIN to 0. Checked with a build for: powerpc64le-linux-gnu-power9-disable-multi-arch powerpc64le-linux-gnu-power9 powerpc64le-linux-gnu powerpc64-linux-gnu-power8 powerpc64-linux-gnu powerpc-linux-gnu-power4 powerpc-linux-gnu	2020-06-04 09:05:41 -03:00
Vineet Gupta	a7a3435c9a	powerpc/fpu: use generic fma functions Tested with build-many-glibcs for powerpc-linux-gnu This is a non functional change and powerpc libm before/after was byte invariant as compared below: \| cd /SCRATCH/vgupta/gnu/install-glibc-A-baseline \| for i in `find . -name libm-2.31.9000.so`; do \| echo $i; diff $i /SCRATCH/vgupta/gnu/install-glibc-C-reduce-scope/$i ; \| echo $?; \| done \| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so \| 0 \| ./arm-linux-gnueabi/lib/libm-2.31.9000.so \| 0 \| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so \| 0 \| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so \| 0 \| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so \| 0 \| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so \| 0 \| ./powerpc-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./microblaze-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./nios2-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./hppa-linux-gnu/lib/libm-2.31.9000.so \| 0 \| ./s390x-linux-gnu/lib64/libm-2.31.9000.so \| 0 Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-06-03 10:23:33 -07:00
Gabriel F. T. Gomes	051be01f6b	powerpc64le: Enable support for IEEE long double On platforms where long double may have two different formats, i.e.: the same format as double (64-bits) or something else (128-bits), building with -mlong-double-128 is the default and function calls in the user program match the name of the function in Glibc. When building with -mlong-double-64, Glibc installed headers redirect such calls to the appropriate function. Likewise, the internals of glibc are now built against IEEE long double. However, the only (minimally) notable usage of long double is difftime. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2020-04-30 08:52:08 -05:00
Tulio Magno Quites Machado Filho	bd6cdfc18c	powerpc: Update ULPs and xfail more ibm128 outputs There are 2 new input values that require to be marked as xfail-rounding:ibm128-libgcc as they're known to fail because of libgcc issues with different rounding modes. Otherwise, the other tests just need an increase in ULP.	2020-04-07 11:41:29 -03:00
Adhemerval Zanella	5f34491510	math: Remove fenvinline.h Similar to string2.h (`18b10de7ce`) and string3.h (`09a596cc2c`) this patch removes the fenvinline.h on all architectures. Currently only powerpc implements some optimizations. This kind of optimization is better implemented by the compiler (which handles the architecture ISA transparently). Also, for the specific optimized powerpc implementation the code is becoming convoluted, and these micro-optimizations are hardly widely used, and even less likely to be a hotspot in real-world cases (non-default rounding is used only in specific cases and exception handling is most likely done only on error paths). That only x86 implements a similar optimization (in fenv.h) also indicates that these should not be in libc. The math/test-fenv already covers all math/test-fenvinline tests, so it is safe to remove it. The powerpc fegetround optimization is moved to internal fenv_libc.h. BZ#94193 [1] is the corresponding GCC bug for adding replacements for these on powerpc. Checked on x86_64-linux-gnu and powerpc64le-linux-gnu. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94193	2020-03-30 10:52:25 -03:00
Adhemerval Zanella	1c15464ca0	math: Remove inline math tests With mathinline removal there is no need to keep building and testing inline math tests. The gen-libm-tests.py support to generate ULP_I_* is removed and all libm-test-ulps files are updated to no longer have the i{float,double,ldouble} entries. The support for no-test-inline is also removed from gen-auto-libm-tests and the auto-libm-test-out-* files were regenerated. Checked on x86_64-linux-gnu and i686-linux-gnu.	2020-03-19 11:45:44 -03:00
Wilco Dijkstra	220622dde5	Add libm_alias_finite for _finite symbols This patch adds a new macro, libm_alias_finite, to define all _finite symbols. It sets each _finite symbol as a compat symbol based on its first version (obtained from the definition in the build-generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redefinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes build-many-glibcs.py. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:04 -03:00
Joseph Myers	d614a75396	Update copyright dates with scripts/update-copyrights.	2020-01-01 00:14:33 +00:00
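
The "improvement" columns in the CORE-MATH benchmark tables above are consistent with the relative reduction (master - patched) / master, so a negative value means the patched code was slower. The commit messages do not state the formula; the sketch below assumes that definition, and the helper name `improvement_percent` is hypothetical (it is not part of glibc or its benchtests).

```python
def improvement_percent(master: float, patched: float) -> float:
    """Relative reduction of a timing figure, in percent.

    Assumed definition: (master - patched) / master * 100.
    Positive means the patched build is faster; negative means slower.
    """
    if master <= 0:
        raise ValueError("master timing must be positive")
    return (master - patched) / master * 100.0


# Values copied from the log1pf (8ae9e51376) and exp2m1f (5c22fd25c1) rows.
log1pf_x86_64_latency = improvement_percent(71.8142, 38.9668)
exp2m1f_x86_64_latency = improvement_percent(40.6042, 48.7104)

print(f"log1pf  x86_64 latency: {log1pf_x86_64_latency:.2f}%")
print(f"exp2m1f x86_64 latency: {exp2m1f_x86_64_latency:.2f}%")
```

Under this assumption the two rows reproduce the reported 45.74% and -19.96%, which is a quick way to sanity-check the remaining rows in each table.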

382 Commits