Commit Graph

41643 Commits

Author SHA1 Message Date
Andreas K. Hüttel 3a9b4b4aeb
math: Add sinpi,cospi,tanpi sparc64 ulps
Linux catbus 6.1.112 #1 SMP Sun Oct 13 10:52:08 PDT 2024 sparc64 sun4v UltraSparc T5 (Niagara5) GNU/Linux

gcc (Gentoo 13.3.1_p20240614 p17) 13.3.1 20240614

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-12-08 22:01:51 +01:00
Andreas K. Hüttel 80d1e63e90
math: Add tanpi aarch64 ulps
Linux dola 5.15.169-gentoo-dist #1 SMP Wed Oct 23 06:25:30 -00 2024 aarch64 GNU/Linux
Vendor ID:                ARM
  Model name:             Neoverse-N1

gcc (Gentoo Hardened 13.3.1_p20241025 p1) 13.3.1 20241024

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2024-12-08 18:25:05 +01:00
H.J. Lu 5df09b4448 math: Exclude internal math symbols for tests [BZ #32414]
Since internal tests don't have access to internal symbols in libm,
exclude them for internal tests.  Also make tst-strtod5 and tst-strtod5i
depend on $(libm) to support older versions of GCC which can't inline
copysign family functions.  This fixes BZ #32414.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2024-12-07 13:43:01 -08:00
H.J. Lu 77c7c44174 Remove AC_SUBST(libc_cv_mtls_descriptor)
Remove

AC_SUBST(libc_cv_mtls_descriptor)

since there is no @libc_cv_mtls_descriptor@ and there is

LIBC_CONFIG_VAR([have-mtls-descriptor], [$libc_cv_mtls_descriptor])

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
2024-12-06 17:42:06 +08:00
Joseph Myers f9e90e4b4c Implement C23 tanpi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the tanpi functions (tan(pi*x)).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-05 21:42:10 +00:00
Joseph Myers 062257c5d9 Fix typo in elf/Makefile:postclean-generated
The postclean-generated setting in elf/Makefile lists
$(objpfx)/dso-sort-tests-2.generated-makefile twice and
$(objpfx)/dso-sort-tests-1.generated-makefile not at all, which looks
like a typo; fix it to list each once.

Tested for x86_64.
2024-12-05 21:40:57 +00:00
Adhemerval Zanella dae2e746b7 math: xfail some sinpi tests for ibm128-libgcc
On powerpc math/test-ibm128-sinpi shows:

testing long double (without inline functions)
Failure: sinpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set
Failure: sinpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set
Failure: sinpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged)
Failure: Test: sinpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020)
Result:
 is:         qNaN
 should be:  -0.00000000000000000000000000000000e+00  -0x0.000000000000000000000000000p+0
Failure: Test: sinpi_downward (0x3.fffffffffffffffcp+108)
Result:
 is:          2.97479253223185882765417834495004e-15   0x1.acb679186c7b49a36c9ec63e110p-49
 should be:   0.00000000000000000000000000000000e+00   0x0.000000000000000000000000000p+0
 difference:  2.97479253223185882765417834495004e-15   0x1.acb679186c7b49a36c9ec63e110p-49
 ulp       :  179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321
 max.ulp   :  4.0000
Failure: Test: sinpi_downward (0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:          2.63250110604328276654475674742669e-15   0x1.7b6225fa8503a5a8c514f5c0208p-49
 should be:   0.00000000000000000000000000000000e+00   0x0.000000000000000000000000000p+0
 difference:  2.63250110604328276654475674742669e-15   0x1.7b6225fa8503a5a8c514f5c0208p-49
 ulp       :  179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321
 max.ulp   :  4.0000
Failure: Test: sinpi_towardzero (-0x3.fffffffffffffffcp+108)
Result:
 is:         -1.71856472474338625450766636956702e-14  -0x1.3596cf230d8f69346d93d8c3100p-46
 should be:  -0.00000000000000000000000000000000e+00  -0x0.000000000000000000000000000p+0
 difference:  1.71856472474338625450766636956702e-14   0x1.3596cf230d8f69346d93d8c3100p-46
 ulp       :  179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321
 max.ulp   :  3.0000
Failure: Test: sinpi_towardzero (-0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:         -9.73792846364428462525599942305655e-15  -0x1.5ed8897ea140e96a31453d6e580p-47
 should be:  -0.00000000000000000000000000000000e+00  -0x0.000000000000000000000000000p+0
 difference:  9.73792846364428462525599942305655e-15   0x1.5ed8897ea140e96a31453d6e580p-47
 ulp       :  179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321
 max.ulp   :  3.0000
Failure: Test: sinpi_towardzero (0x3.fffffffffffffffcp+108)
Result:
 is:          1.71856472474338625450766636956702e-14   0x1.3596cf230d8f69346d93d8c3100p-46
 should be:   0.00000000000000000000000000000000e+00   0x0.000000000000000000000000000p+0
 difference:  1.71856472474338625450766636956702e-14   0x1.3596cf230d8f69346d93d8c3100p-46
 ulp       :  179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321
 max.ulp   :  3.0000
Failure: Test: sinpi_towardzero (0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:          9.73792846364428462525599942305655e-15   0x1.5ed8897ea140e96a31453d6e580p-47
 should be:   0.00000000000000000000000000000000e+00   0x0.000000000000000000000000000p+0
 difference:  9.73792846364428462525599942305655e-15   0x1.5ed8897ea140e96a31453d6e580p-47
 ulp       :  179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321
 max.ulp   :  3.0000
Failure: Test: sinpi_upward (-0x3.fffffffffffffffcp+108)
Result:
 is:         -1.71856472474338625450766636956709e-14  -0x1.3596cf230d8f69346d93d8c3110p-46
 should be:  -0.00000000000000000000000000000000e+00  -0x0.000000000000000000000000000p+0
 difference:  1.71856472474338625450766636956710e-14   0x1.3596cf230d8f69346d93d8c3110p-46
 ulp       :  inf
 max.ulp   :  4.0000
Failure: Test: sinpi_upward (-0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:         -9.73792846364428462525599942305708e-15  -0x1.5ed8897ea140e96a31453d6e598p-47
 should be:  -0.00000000000000000000000000000000e+00  -0x0.000000000000000000000000000p+0
 difference:  9.73792846364428462525599942305709e-15   0x1.5ed8897ea140e96a31453d6e598p-47
 ulp       :  inf
 max.ulp   :  4.0000
Failure: sinpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set
Failure: sinpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set
Failure: sinpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged)
Failure: Test: sinpi_upward (0xf.ffffffffffffbffffffffffffcp+1020)
Result:
 is:         qNaN
 should be:   0.00000000000000000000000000000000e+00   0x0.000000000000000000000000000p+0
2024-12-05 13:48:01 -03:00
Adhemerval Zanella b14224fb57 math: xfail some cospi tests for ibm128-libgcc
On powerpc math/test-ibm128-cospi shows:

testing long double (without inline functions)
Failure: cospi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set
Failure: cospi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set
Failure: cospi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged)
Failure: Test: cospi_downward (-0xf.ffffffffffffbffffffffffffcp+1020)
Result:
 is:         qNaN
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
Failure: Test: cospi_downward (0x3.fffffffffffffffcp+108)
Result:
 is:          9.99999999999999999999999999995574e-01   0x1.ffffffffffffffffffffffff4c8p-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  4.42501664022411309598141492088312e-30   0x1.670000000000000000000000000p-98
 ulp       :  179.5000
 max.ulp   :  4.0000
Failure: Test: cospi_downward (0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:          9.99999999999999999999999999996524e-01   0x1.ffffffffffffffffffffffff730p-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  3.47591836363008326759542899077727e-30   0x1.1a0000000000000000000000000p-98
 ulp       :  141.0000
 max.ulp   :  4.0000
Failure: Test: cospi_towardzero (-0x3.fffffffffffffffcp+108)
Result:
 is:          9.99999999999999999999999999852310e-01   0x1.ffffffffffffffffffffffe8990p-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  1.47689552599346303944427057331536e-28   0x1.767000000000000000000000000p-93
 ulp       :  5991.0000
 max.ulp   :  4.0000
Failure: Test: cospi_towardzero (-0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:          9.99999999999999999999999999952569e-01   0x1.fffffffffffffffffffffff87c0p-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  4.74302619264133348003801799876275e-29   0x1.e10000000000000000000000000p-95
 ulp       :  1924.0000
 max.ulp   :  4.0000
Failure: Test: cospi_towardzero (0x3.fffffffffffffffcp+108)
Result:
 is:          9.99999999999999999999999999852310e-01   0x1.ffffffffffffffffffffffe8990p-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  1.47689552599346303944427057331536e-28   0x1.767000000000000000000000000p-93
 ulp       :  5991.0000
 max.ulp   :  4.0000
Failure: Test: cospi_towardzero (0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:          9.99999999999999999999999999952569e-01   0x1.fffffffffffffffffffffff87c0p-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  4.74302619264133348003801799876275e-29   0x1.e10000000000000000000000000p-95
 ulp       :  1924.0000
 max.ulp   :  4.0000
Failure: Test: cospi_upward (-0x3.fffffffffffffffcp+108)
Result:
 is:          9.99999999999999999999999999852323e-01   0x1.ffffffffffffffffffffffe899bp-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  1.47673235656615530277812119019587e-28   0x1.766568e20369c00000000000000p-93
 ulp       :  5990.3382
 max.ulp   :  4.0000
Failure: Test: cospi_upward (-0x3.ffffffffffffffffffffffffffp+108)
Result:
 is:          9.99999999999999999999999999952583e-01   0x1.fffffffffffffffffffffff87cbp-1
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
 difference:  4.74136253815267677203679334037676e-29   0x1.e0d4cf1e9076600000000000000p-95
 ulp       :  1923.3252
 max.ulp   :  4.0000
Failure: cospi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set
Failure: cospi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set
Failure: cospi_upward (0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged)
Failure: Test: cospi_upward (0xf.ffffffffffffbffffffffffffcp+1020)
Result:
 is:         qNaN
 should be:   1.00000000000000000000000000000000e+00   0x1.000000000000000000000000000p+0
2024-12-05 13:47:52 -03:00
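The "ulp" figures in the two ibm128 reports above are the absolute error divided by the spacing of representable values near the expected result. A minimal sketch of that arithmetic, assuming a spacing of 2^-105 near 1.0 (inferred from the reported cospi numbers, not stated in the commits); `error_in_ulps` is a hypothetical helper, not part of the libm test harness:

```c
#include <assert.h>

/* Hypothetical helper (not from glibc's test code): express an absolute
   error as a multiple of one unit in the last place.  */
double error_in_ulps(double difference, double ulp)
{
    assert(ulp > 0.0);              /* the spacing must be positive */
    return difference / ulp;
}
```

With the difference reported for cospi_downward (0x3.fffffffffffffffcp+108), `error_in_ulps(0x1.67p-98, 0x1p-105)` evaluates to 179.5, which matches the log; the cospi_towardzero difference 0x1.767p-93 gives 5991 in the same way.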
Adhemerval Zanella c8d3220e64 powerpc: Update ulps
From 'Implement C23 cospi' (0ae0af68d8)
and 'Implement C23 sinpi' (776938e8b8).
2024-12-05 13:35:24 -03:00
Wilco Dijkstra fa16523c48 AArch64: Update libm-test-ulps
Add sinpi/cospi.
2024-12-05 16:19:37 +00:00
H.J. Lu 09d07f16a7 i686: Update libm-test-ulps
Update i686 libm-test-ulps to fix

FAIL: math/test-float64x-cospi
FAIL: math/test-float64x-sinpi
FAIL: math/test-ldouble-cospi
FAIL: math/test-ldouble-sinpi

when building glibc with GCC 7.4.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-12-05 20:10:58 +08:00
H.J. Lu 0003605a54 x86-64: Update libm-test-ulps
Update x86-64 libm-test-ulps to fix

FAIL: math/test-float64x-cospi
FAIL: math/test-float64x-exp2m1
FAIL: math/test-float64x-sinpi
FAIL: math/test-ldouble-cospi
FAIL: math/test-ldouble-exp2m1
FAIL: math/test-ldouble-sinpi

when building glibc with GCC 7.4.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-12-05 20:08:36 +08:00
Joseph Myers 30ad01a3cf Use M_LIT in place of M_MLIT for literals
This should fix the reported issue building cospi and sinpi with GCC 6.

Tested for x86_64 (not with GCC 6).
2024-12-05 10:12:09 +00:00
Joseph Myers 9b5f2eb9fc Add further test of TLS
Add an additional test of TLS variables, with different alignment,
accessed from different modules.  The idea of the alignment test is
similar to tst-tlsalign and the same code is shared for setting up
test variables, but unlike the tst-tlsalign code, there are multiple
threads and variables are accessed from multiple objects to verify
that they get a consistent notion of the address of an object within a
thread.  Threads are repeatedly created and shut down to verify proper
initialization in each new thread.  The test is also repeated with TLS
descriptors when supported.  (However, only initial-exec TLS is
covered in this test.)

Tested for x86_64.
2024-12-05 09:53:47 +00:00
Sergey Bugaev 8cbab3b729 hurd: Protect against servers returning bogus read/write lengths
There already was a branch checking for this case in _hurd_fd_read ()
when the data is returned out-of-line. Do the same for inline data, as
well as for _hurd_fd_write (). It's also not possible for the length to
be negative, since it's stored in an unsigned integer.

Not verifying the returned length can confuse the callers who assume
the returned length is always reasonable. This manifested as libzstd
test suite failing on writes to /dev/zero, even though the write () call
appeared to succeed. In fact, the zero store backing /dev/zero was
returning a larger written length than the size actually submitted to
it, which is a separate bug to be fixed on the Hurd side. With this
patch, EGRATUITOUS is now propagated to the caller.

Reported-by: Diego Nieto Cid <dnietoc@gmail.com>
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20241204112915.540032-1-bugaevc@gmail.com>
2024-12-05 08:49:35 +01:00
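The check described in the Hurd commit above amounts to rejecting a reply whose length exceeds what was requested. A minimal sketch, assuming a hypothetical helper `validate_io_length` (not the actual _hurd_fd_read/_hurd_fd_write code); outside the Hurd, where EGRATUITOUS does not exist, EIO is used as a stand-in:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* EGRATUITOUS is Hurd-specific; fall back to EIO elsewhere (assumption
   made only so that the sketch compiles on other systems).  */
#ifdef EGRATUITOUS
# define SKETCH_BOGUS_LENGTH_ERROR EGRATUITOUS
#else
# define SKETCH_BOGUS_LENGTH_ERROR EIO
#endif

/* Accept the length a server reported only if it does not exceed the
   length that was requested.  Returns 0 on success, an errno value
   otherwise.  The length is unsigned, so it cannot be negative.  */
int validate_io_length(size_t requested, size_t reported, size_t *accepted)
{
    assert(accepted != NULL);
    if (reported > requested)
        return SKETCH_BOGUS_LENGTH_ERROR;   /* server misbehaved */
    *accepted = reported;
    return 0;
}
```

A caller would propagate the non-zero return value instead of trusting the reported length, which is the behavior the commit describes for writes to /dev/zero.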
H.J. Lu 00de38e531 Fix and sort variables in Makefiles
Fix variables in Makefiles:

1. There is a tab, not a space, between "variable" and =, +=, :=.
2. The last entry doesn't have a trailing \.

and sort them.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-12-05 15:36:23 +08:00
Joseph Myers 776938e8b8 Implement C23 sinpi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the sinpi functions (sin(pi*x)).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-04 20:04:04 +00:00
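One reason to provide sinpi rather than have callers write sin(M_PI * x) is that M_PI is only an approximation of pi, so the product form generally does not return exactly 0 for integer x. A minimal sketch of the exact cases only, assuming a hypothetical helper `sinpi_exact_case`; it is not the glibc implementation, it ignores the sign of zero results, and a real sinpi must also approximate every other argument:

```c
#include <assert.h>

/* Return codes of the hypothetical helper below.  */
enum {
    SINPI_ZERO = 0,
    SINPI_PLUS_ONE = 1,
    SINPI_MINUS_ONE = -1,
    SINPI_NOT_EXACT = 2
};

/* Report whether sin(pi*x) is exactly 0, +1 or -1, i.e. whether x is an
   integer or a half-integer.  */
int sinpi_exact_case(double x)
{
    if (x - x != 0.0)                   /* NaN or infinity */
        return SINPI_NOT_EXACT;
    if (x >= 0x1p52 || x <= -0x1p52)    /* all such doubles are integers */
        return SINPI_ZERO;
    double twice = 2.0 * x;             /* exact: only the exponent changes */
    long long m = (long long) twice;    /* |twice| < 2^53, so this fits */
    if ((double) m != twice)
        return SINPI_NOT_EXACT;         /* neither integer nor half-integer */
    if (m % 2 == 0)
        return SINPI_ZERO;              /* x itself is an integer */
    long long r = ((m % 4) + 4) % 4;    /* m = 2k + 1, so r is 1 or 3 */
    assert(r == 1 || r == 3);
    return r == 1 ? SINPI_PLUS_ONE : SINPI_MINUS_ONE;   /* even k gives +1 */
}
```

For example, `sinpi_exact_case(1.0)` yields SINPI_ZERO and `sinpi_exact_case(-0.5)` yields SINPI_MINUS_ONE, while `sinpi_exact_case(0.25)` reports SINPI_NOT_EXACT.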
Joseph Myers 0ae0af68d8 Implement C23 cospi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the cospi functions (cos(pi*x)).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-04 10:20:44 +00:00
H.J. Lu 1c4cebb84b malloc: Optimize small memory clearing for calloc
Add calloc-clear-memory.h to clear memory of up to 36 bytes (72 bytes
on 64-bit targets) for calloc.  Use repeated stores with 1 branch instead
of up to 3 branches.  On x86-64, this is faster than memset, since calling
memset needs 1 indirect branch, 1 broadcast, and up to 4 branches.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2024-12-04 04:28:15 +08:00
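The calloc commit above clears small blocks with a single dispatch followed by straight-line stores. A minimal sketch of that idea, assuming a limit of 9 words (36 bytes with 4-byte words, 72 bytes with 8-byte words); `clear_small_sketch` is a hypothetical name, and the real calloc-clear-memory.h may be organized differently:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the helper described above.  Zero NWORDS
   words starting at P.  */
void clear_small_sketch(size_t *p, size_t nwords)
{
    assert(p != NULL);
    if (nwords > 9) {               /* larger request: defer to memset */
        memset(p, 0, nwords * sizeof *p);
        return;
    }
    switch (nwords) {               /* one dispatch, then plain stores */
    case 9: p[8] = 0;               /* fall through */
    case 8: p[7] = 0;               /* fall through */
    case 7: p[6] = 0;               /* fall through */
    case 6: p[5] = 0;               /* fall through */
    case 5: p[4] = 0;               /* fall through */
    case 4: p[3] = 0;               /* fall through */
    case 3: p[2] = 0;               /* fall through */
    case 2: p[1] = 0;               /* fall through */
    case 1: p[0] = 0;               /* fall through */
    default: break;                 /* nwords == 0: nothing to clear */
    }
}
```

Whether the switch compiles to exactly one branch depends on the compiler and target, so the sketch only illustrates the shape of the technique.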
Joseph Myers f43eb2cf30 Use Linux 6.12 in build-many-glibcs.py
Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
2024-12-03 03:11:22 +00:00
Carmen Bianca BAKKER c5a3d1bc84 locale: More strictly implement ISO 8601 for Esperanto locale
Esperanto, as an international language and a bit of a non-locale,
usually defaults to international consensus. In this commit, I make the
Esperanto locale more in line with ISO 8601 by setting the first day as
Monday, and the first week as containing January 4.

Closes: BZ #32323
Signed-off-by: Carmen Bianca BAKKER <carmen@carmenbianca.eu>
Reviewed-by: Mike FABIAN <mfabian@redhat.com>
2024-12-02 19:18:18 +01:00
Adhemerval Zanella 17a43505b3 elf: Consolidate stackinfo.h
Also use a sane default for the generic implementation.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-12-02 17:14:58 +00:00
Florian Weimer b7d4de086c manual: Describe struct link_map, support link maps with dlinfo
This does not describe how to use RTLD_DI_ORIGIN and l_name
to reconstruct a full path for an object. The reason
is that I think we should not recommend further use of
RTLD_DI_ORIGIN due to its buffer overflow potential (bug 24298).
This should be covered by another dlinfo extension.  It would
also obsolete the need for the dladdr approach to obtain
the file name for the main executable.

Obtaining the lowest address from load segments in program
headers is quite clumsy and should be provided directly
via dlinfo.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-12-02 11:38:14 +01:00
Joseph Myers 3c2b9dc41c Add threaded test of sem_trywait
All the existing glibc tests of sem_trywait are single-threaded.  Add
one that calls sem_trywait and sem_post in separate threads.

Tested for x86_64.
2024-11-29 20:25:04 +00:00
Joseph Myers 6ae9836ed2 Add test of ELF hash collisions
Add tests that the dynamic linker works correctly with symbol names
involving hash collisions, for both choices of hash style (and
--hash-style=both as well).  I note that there weren't actually any
previous tests using --hash-style (so tests would only cover the
default linker configuration in that regard).  Also test symbol
versions involving hash collisions.

Tested for x86_64.
2024-11-29 16:43:56 +00:00
Sergey Kolosov bde47662b7 nptl: Add new test for pthread_spin_trylock
Add a threaded test for pthread_spin_trylock that attempts to lock an
already acquired spin lock and checks for the correct return code.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-11-29 15:55:20 +01:00
k4lizen e2436d6f5a malloc: send freed small chunks to smallbin
Large chunks get added to the unsorted bin since
sorting them takes time.  For small chunks, the
benefit of adding them to the unsorted bin is
nonexistent and actually hurts performance.

Splitting and malloc_consolidate still add small
chunks to unsorted, but we can hint to the compiler
that this is a relatively rare occurrence.
Benchmarking shows this to be consistently good.

Authored-by: k4lizen <k4lizen@proton.me>
Signed-off-by: Aleksa Siriški <sir@tmina.org>
2024-11-29 13:27:13 +00:00
Wilco Dijkstra a08d9a52f9 AArch64: Remove zva_128 from memset
Remove ZVA 128 support from memset - the new memset no longer
guarantees count >= 256, which can result in underflow and a
crash if ZVA size is 128 ([1]).  Since only one CPU uses a ZVA
size of 128 and its memcpy implementation was removed in commit
e162ab2bf1, remove this special
case too.

[1] https://sourceware.org/pipermail/libc-alpha/2024-November/161626.html

Reviewed-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-11-29 13:27:13 +00:00
Wangyang Guo 2d6427a63c benchtests: Add calloc test
Add two new calloc benchmarks:
- bench-calloc-simple
- bench-calloc-thread
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-11-29 19:04:28 +08:00
Siddhesh Poyarekar 19a198f058 pthread_getcpuclockid: Add descriptive comment to smoke test
Add a descriptive comment to the tst-pthread-cpuclockid-invalid test and
also drop pthread_getcpuclockid from the TODO-testing list since it now
has full coverage.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2024-11-28 13:19:52 -05:00
Adhemerval Zanella 82a3991a84 Remove nios2-linux-gnu
GCC 15 (e876acab6cdd84bb2b32c98fc69fb0ba29c81153) and binutils
(e7a16d9fd65098045ef5959bf98d990f12314111) both removed all Nios II
support, and the architecture has been EOL'ed by the vendor.  The
kernel still has support, but without a proper compiler there
is not much sense in keeping it in glibc.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-11-28 14:03:25 -03:00
Siddhesh Poyarekar 293369689a libio: make _IO_least_marker static
Trivial cleanup to limit _IO_least_marker so that it's clear that it is
unused outside of genops.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2024-11-28 08:27:24 -05:00
Wangyang Guo c69e8cccaf malloc: Avoid func call for tcache quick path in free()
Tcache is an important optimization to accelerate memory free(), so things
within this code path should be kept as simple as possible. This commit
removes the function call when free() invokes the tcache code path by
inlining _int_free().

Result of bench-malloc-thread benchmark

Test Platform: Xeon-8380
Ratio: New / Original time_per_iteration (Lower is Better)

Threads#   | Ratio
-----------|------
1 thread   | 0.879
4 threads  | 0.874

The performance data shows it can improve the bench-malloc-thread benchmark
by ~12% in both single-threaded and multi-threaded scenarios.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2024-11-27 08:24:09 +08:00
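The "~12%" in the commit above follows from the reported New / Original ratios, and the same arithmetic produces the "improvement" columns in the benchmark tables of other commits in this log. A minimal sketch, with `improvement_percent` as a hypothetical helper name:

```c
#include <assert.h>

/* Percentage by which PATCHED improves on ORIGINAL, for timings where
   lower is better.  A ratio r = patched / original can be passed as
   improvement_percent(1.0, r).  */
double improvement_percent(double original, double patched)
{
    assert(original > 0.0);
    return (original - patched) / original * 100.0;
}
```

For the single-thread ratio 0.879 this gives roughly 12.1, and for the 4-thread ratio 0.874 roughly 12.6.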
Florian Weimer 4836a9af89 debug: Fix tst-longjmp_chk3 build failure on Hurd
Explicitly include <unistd.h> for _exit and getpid.
2024-11-26 23:01:28 +01:00
Adhemerval Zanella 3b1c5a539b math: Add internal roundeven_finite
Some CORE-MATH routines use roundeven, and most ISAs do not have
a specific instruction for the operation.  In this case, the call
is routed to the generic implementation.

However, if the ISA does support round() and ctz() there is a better
alternative (as used by CORE-MATH).

This patch adds this optimization and also enables it on powerpc.
On a power10 it shows the following improvement:

expm1f                      master      patched       improvement
latency                     9.8574       7.0139            28.85%
reciprocal-throughput       4.3742       2.6592            39.21%

Checked on powerpc64le-linux-gnu and aarch64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
2024-11-26 15:07:57 -03:00
Julian Zhu 32445b6dd2 RISC-V: Use builtin for fma and fmaf
The built-in functions `__builtin_{fma, fmaf}` are sufficient to generate correct `fmadd.d`/`fmadd.s` instructions on RISC-V.

Signed-off-by: Julian Zhu <jz531210@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-11-25 16:45:59 -03:00
Julian Zhu d2264de5db RISC-V: Use builtin for copysign and copysignf
The built-in functions `__builtin_{copysign, copysignf}` are sufficient to generate correct `fsgnj.d`/`fsgnj.s` instructions on RISC-V.

Signed-off-by: Julian Zhu <jz531210@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-11-25 16:45:59 -03:00
Alejandro Colomar 53fcdf5f74 Silence most -Wzero-as-null-pointer-constant diagnostics
Replace 0 by NULL and {0} by {}.

Omit a few cases that aren't so trivial to fix.

Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117059>
Link: <https://software.codidact.com/posts/292718/292759#answer-292759>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
2024-11-25 16:45:59 -03:00
Yannick Le Pennec 83d4b42ded sysdeps: linux: Fix output of LD_SHOW_AUXV=1 for AT_RSEQ_*
The constants themselves were added to elf.h back in 8754a4133e but the
array in _dl_show_auxv wasn't modified accordingly, resulting in the
following output when running LD_SHOW_AUXV=1 /bin/true on recent Linux:

    AT_??? (0x1b): 0x1c
    AT_??? (0x1c): 0x20

With this patch:

    AT_RSEQ_FEATURE_SIZE: 28
    AT_RSEQ_ALIGN:        32

Tested on Linux 6.11 x86_64

Signed-off-by: Yannick Le Pennec <yannick.lepennec@live.fr>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-11-25 16:45:59 -03:00
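The fix in the LD_SHOW_AUXV commit above is essentially a missing entry in a type-to-label table. A minimal sketch of such a lookup, assuming a hypothetical `auxv_label_sketch` function (the real array in _dl_show_auxv is laid out differently and also controls how values are printed); the type numbers 0x1b and 0x1c are taken from the sample output:

```c
#include <assert.h>
#include <stddef.h>

struct auxv_label {
    unsigned long type;
    const char *name;
};

/* Only the two entries discussed above.  */
static const struct auxv_label labels[] = {
    { 0x1b, "AT_RSEQ_FEATURE_SIZE" },
    { 0x1c, "AT_RSEQ_ALIGN" },
};

/* Return the label for an auxiliary vector type, or "AT_???" when the
   table has no entry for it, as in the output before the patch.  */
const char *auxv_label_sketch(unsigned long type)
{
    for (size_t i = 0; i < sizeof labels / sizeof labels[0]; i++) {
        assert(labels[i].name != NULL);
        if (labels[i].type == type)
            return labels[i].name;
    }
    return "AT_???";
}
```

Without the two table entries, both types fall through to the "AT_???" label, which is the symptom shown in the commit message.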
Florian Weimer 4b7cfcc3fb debug: Wire up tst-longjmp_chk3
The test was added in commit ac8cc9e300
without all the required Makefile scaffolding.  Tweak the test
so that it actually builds (including with dynamic SIGSTKSZ).

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-11-25 19:42:14 +01:00
Michael Jeanson d9f40387d3 nptl: initialize cpu_id_start prior to rseq registration
When adding explicit initialization of rseq fields prior to
registration, I glossed over the fact that 'cpu_id_start' is also
documented as initialized by user-space.

While current kernels don't validate the content of this field on
registration, future ones could.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
2024-11-25 19:42:14 +01:00
Adhemerval Zanella 6976cd3124 math: Fix branch hint for 68d7128942
2024-11-25 13:37:50 -03:00
Sachin Monga 2062e02772 powerpc64le: ROP Changes for strncpy/ppc-mount
Add ROP protect instructions to strncpy and ppc-mount functions.
Modify FRAME_MIN_SIZE to 48 bytes for ELFv2 to reserve additional
16 bytes for ROP save slot and padding.

Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
2024-11-25 10:44:20 -05:00
Vincent Lefevre 68d7128942 math: Fix non-portability in the computation of signgam in lgammaf
The k>>31 in signgam = 1 - (((k&(k>>31))&1)<<1); is not portable:

* The ISO C standard says "If E1 has a signed type and a negative
  value, the resulting value is implementation-defined." (this is
  still in C23).
* If the int type is larger than 32 bits (e.g. a 64-bit type),
  then k = INT_MAX; line 144 will make k>>31 put 1 in bit 0
  (thus signgam will be -1) while 0 is expected.

Moreover, instead of the fx >= 0x1p31f condition, testing fx >= 0
is probably better for 2 reasons:

The signgam expression has more or less a condition on the sign
of fx (the goal of k>>31, which can be dropped with this new
condition). Since fx ≥ 0 should be the most common case, one can
get signgam directly in this case (value 1). And this simplifies
the expression for the other case (fx < 0).

This new condition may be easier/faster to test on the processor
(e.g. by avoiding a load of a constant from the memory).

This is commit d41459c731865516318f813cf4c966dafa0eecbf from CORE-MATH.

Checked on x86_64-linux-gnu.
2024-11-25 09:20:47 -03:00
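The reasoning in the lgammaf commit above can be illustrated with a version that never right-shifts a negative value. A minimal sketch, assuming the caller passes fx = floorf(x) and that fx is not below -2^31; `signgam_from_floor` is a hypothetical name, and this is not the CORE-MATH code itself:

```c
#include <assert.h>

/* Sign of gamma(x) given fx = floor(x): +1 when fx >= 0, otherwise +1
   for even fx and -1 for odd fx.  */
int signgam_from_floor(float fx)
{
    if (fx >= 0.0f)
        return 1;                   /* common case, no bit tricks needed */
    assert(fx >= -0x1p31f);         /* so the conversion to int is defined */
    int k = (int) fx;
    /* Conversion to unsigned is modular, so the low bit is the parity of
       k regardless of how negative integers are represented.  */
    unsigned parity = (unsigned) k & 1u;
    return 1 - (int) (parity << 1);
}
```

For x = -0.5 the floor is -1, which is odd, so the result is -1; for x = -1.5 the floor is -2 and the result is 1, in line with the sign of the gamma function on those intervals.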
Wangyang Guo c621d4f74f malloc: Split _int_free() into 3 sub functions
Split _int_free() into 3 smaller functions for flexible combination:
* _int_free_check -- sanity check for free
* tcache_free -- free memory to tcache (quick path)
* _int_free_chunk -- free memory chunk (slow path)
2024-11-25 12:11:23 +08:00
Samuel Thibault d92a5e1dad hurd: Add MAP_NORESERVE mmap flag
This is already the current default behavior, which will change when
overcommit support is added.
2024-11-25 00:55:33 +01:00
Siddhesh Poyarekar 03b8d76410 nptl: Add smoke test for pthread_getcpuclockid failure
Exercise the case where an exited thread will cause
pthread_getcpuclockid to fail.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2024-11-22 14:01:39 -05:00
Joseph Myers 99671e72bb Add multithreaded test of sem_getvalue
Test coverage of sem_getvalue is fairly limited.  Add a test that runs
it on threads on each CPU.  For this purpose I adapted
tst-skeleton-thread-affinity.c; it didn't seem very suitable to use
as-is or include directly in a different test doing things per-CPU,
but did seem a suitable starting point (thus sharing
tst-skeleton-affinity.c) for such testing.

Tested for x86_64.
2024-11-22 16:58:51 +00:00
Adhemerval Zanella bccb0648ea math: Use tanf from CORE-MATH
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and performs better than the generic tanf.

The code was adapted to glibc style, to use the definition of
math_config.h, to remove errno handling, and to use a generic
128 bit routine for ABIs that do not support it natively.

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (neoverse1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master       patched  improvement
x86_64                       82.3961       54.8052       33.49%
x86_64v2                     82.3415       54.8052       33.44%
x86_64v3                     69.3661       50.4864       27.22%
i686                         219.271       45.5396       79.23%
aarch64                      29.2127       19.1951       34.29%
power10                      19.5060       16.2760       16.56%

reciprocal-throughput         master       patched  improvement
x86_64                       28.3976       19.7334       30.51%
x86_64v2                     28.4568       19.7334       30.65%
x86_64v3                     21.1815       16.1811       23.61%
i686                         105.016       15.1426       85.58%
aarch64                      18.1573       10.7681       40.70%
power10                       8.7207        8.7097        0.13%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-11-22 10:52:27 -03:00
Adhemerval Zanella d846f4c12d math: Use lgammaf from CORE-MATH
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and performs better than the generic lgammaf.

The code was adapted to glibc style, to use the definition of
math_config.h, to remove errno handling, to use math_narrow_eval
on overflow usage, and to adapt to make it reentrant.

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master       patched  improvement
x86_64                       86.5609       70.3278       18.75%
x86_64v2                     78.3030       69.9709       10.64%
x86_64v3                     74.7470       59.8457       19.94%
i686                         387.355       229.761       40.68%
aarch64                      40.8341       33.7563       17.33%
power10                      26.5520       16.1672       39.11%
powerpc                      28.3145       17.0625       39.74%

reciprocal-throughput         master       patched  improvement
x86_64                       68.0461       48.3098       29.00%
x86_64v2                     55.3256       47.2476       14.60%
x86_64v3                     52.3015       38.9028       25.62%
i686                         340.848       195.707       42.58%
aarch64                      36.8000       30.5234       17.06%
power10                      20.4043       12.6268       38.12%
powerpc                      22.6588       13.8866       38.71%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-11-22 10:52:27 -03:00