Commit Graph

43308 Commits

James Chesterman bd0a3526cc benchtests: Add benchtests for rsqrtf
Add benchtests for vector single precision rsqrtf. They are
identical to those found in log2f.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-05 11:15:42 -03:00
Adhemerval Zanella eb03df5404 i386: Fix fmod/fmodf/remainder/remainderf for gcc-12
The __builtin_fmod{f} and __builtin_remainder{f} were added in gcc 13,
and the minimum supported gcc is 12.  This patch adds a configure test
to check whether the compiler enables inlining for fmod/remainder, and
uses inline assembly if not.

Checked on i686-linux-gnu with gcc-12.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-12-04 13:12:50 -03:00
Wilco Dijkstra 83dd79dffb nptl: Check alignment of pthread structs
Report assertion failure if the alignment of external pthread structs is
lower than the internal version.  This triggers on type mismatches like
in BZ #33632.

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
2025-12-04 15:45:15 +00:00
James Chesterman f9bb6bcff6 aarch64: Optimise AdvSIMD atanhf
Optimise AdvSIMD atanhf by vectorising the special case.
There are asymptotes at x = -1 and x = 1, so return inf for these.
For values where |x| > 1, return NaN.

R.Throughput difference on V2 with GCC@15:
58-60% improvement in special cases.
No regression in fast pass.
2025-12-04 10:54:49 -03:00
James Chesterman 0e734b2b0c aarch64: Optimise AdvSIMD asinhf
Optimise AdvSIMD asinhf by vectorising the special case.
For values greater than 0x1p64, scale the input down first.
This is because the output will overflow with inputs greater than
or equal to this value as there is a squaring operation in the
algorithm.
To scale, do:
2asinh(sqrt[(x-1)/2])
Because:
2asinh(x) = +-acosh(2x^2 + 1)
Apply the opposite operations in the opposite order for x, and you get:
acosh(x) = 2asinh(sqrt[(x-1)/2]).
Since acosh(x) very closely approximates asinh(x) for large x, the
same scaled expression can be used for asinh as well.

R.Throughput difference on V2 with GCC@15:
25-58% improvement in special cases.
4% regression in fast pass.
2025-12-04 10:54:49 -03:00
James Chesterman 0e80864c07 aarch64: Optimise AdvSIMD acoshf
Optimise AdvSIMD acoshf by vectorising the special case.
For values greater than 0x1p64, scale the input down first.
This is because the output will overflow with inputs greater than
or equal to this value as there is a squaring operation in the
algorithm.
To scale, do:
2acosh(sqrt[(x+1)/2])
Because:
acosh(x) = 1/2 acosh(2x^2 - 1) for x >= 1.
Apply opposite operations in opposite order for x, and you get:
acosh(x) = 2acosh(sqrt[(x+1)/2]).

R.Throughput difference on V2 with GCC@15:
30-49% improvement in special cases.
2% regression in fast pass.
2025-12-04 10:54:49 -03:00
Yury Khrustalev 6f869f54fb aarch64: Add tests for glibc.cpu.aarch64_bti behaviour
Check that the new tunable changes behaviour correctly:

 * When BTI is enforced, any unmarked binary that is loaded
   results in an error: either an abort or dlopen error when
   this binary is loaded via dlopen.
 * When BTI is not enforced, it is OK to load an unmarked
   binary.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-04 12:44:45 +00:00
Yury Khrustalev dba95d2887 aarch64: Support enforcing BTI on dependencies
Add glibc.cpu.aarch64_bti tunable with 2 values:

 - permissive (default)
 - enforced

and use this tunable to enforce BTI marking on dependencies
when the enforced option is selected.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
2025-12-04 12:44:42 +00:00
Yury Khrustalev 59bac0d5d2 aarch64: Add configure checks for BTI support
We add configure checks for 3 things:
 - Compiler (both CC and TEST_CC) supports -mbranch-protection=bti.
 - Linker supports -z force-bti.
 - The toolchain supplies object files and target libraries with
   the BTI marking.

All three must be true in order for the tests to be valid, so
we check all flags and set the makefile variable accordingly.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-04 12:44:39 +00:00
Yury Khrustalev ccb5083553 aarch64: fix makefile formatting 2025-12-04 12:40:47 +00:00
James Chesterman e3c40c8db0 aarch64: Optimise AdvSIMD log10
Optimise AdvSIMD log10 by vectorising the special case.
For subnormal input values, use the same scaling technique as
described in the single precision equivalent.
Then check for inf, nan and x<=0.
2025-12-04 08:35:25 -03:00
James Chesterman 59c706b418 aarch64: Optimise AdvSIMD log2
Optimise AdvSIMD log2 by vectorising the special case.
For subnormal input values, use the same scaling technique as
described in the single precision equivalent.
Then check for inf, nan and x<=0.
2025-12-04 08:35:25 -03:00
James Chesterman 82d3a8a738 aarch64: Optimise AdvSIMD log
Optimise AdvSIMD log by vectorising the special case.
For subnormal input values, use the same scaling technique as
described in the single precision equivalent.
Then check for inf, nan and x<=0.
2025-12-04 08:35:25 -03:00
James Chesterman 015a13e780 aarch64: Optimise AdvSIMD log1p
Optimise AdvSIMD log1p by vectorising the special case.
The special cases are for when the input is:
Less than or equal to -1
+/- INFINITY
+/- NaN
2025-12-04 08:35:25 -03:00
James Chesterman 57215df30e aarch64: Optimise AdvSIMD log10f
Optimise AdvSIMD log10f by vectorising the special case.
Use scaling technique on subnormal values, then check for inf and
nan values.
The scaling technique will sqrt the input then multiply the output
by 2 because:
log(sqrt(x)) = 1/2 log(x), so log(x) = 2log(sqrt(x))

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-04 08:31:19 -03:00
James Chesterman fe83660a7e aarch64: Optimise AdvSIMD log2f
Optimise AdvSIMD log2f by vectorising the special case.
Use scaling technique on subnormal values, then check for inf and
nan values.
The scaling technique used will sqrt the input then multiply the
output by 2 because:
log(sqrt(x)) = 1/2 log(x), so log(x) = 2log(sqrt(x))

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-04 08:31:15 -03:00
James Chesterman ab8138303c aarch64: Optimise AdvSIMD logf
Optimise AdvSIMD logf by vectorising the special case.
Use scaling technique on subnormal values, then check for inf and
nan values.
The scaling technique used will sqrt the input then multiply the
output by 2 because:
log(sqrt(x)) = 1/2 log(x), so log(x) = 2log(sqrt(x))

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-04 08:31:08 -03:00
James Chesterman f42c135157 aarch64: Optimise AdvSIMD log1pf
Optimise AdvSIMD log1pf by vectorising the special case and by
reducing the range of values passed to the special case.
Previously, high values such as 0x1.1p127 were treated as special
cases, but now the special cases are for when the input is:
Less than or equal to -1
+/- INFINITY
+/- NaN

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-04 08:31:02 -03:00
H.J. Lu 762bb01d4e int128: Check BITS_PER_MP_LIMB == 32 instead of __WORDSIZE == 32
commit 8cd6efca5b
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Nov 20 15:30:06 2025 -0300

    Add add_ssaaaa and sub_ssaaaa to gmp-arch.h

checks __WORDSIZE == 32 to decide if int128 should be used, which breaks
x32 which has int128 and __WORDSIZE == 32.  Check BITS_PER_MP_LIMB == 32,
instead of __WORDSIZE == 32.  This fixes BZ #33677.

Tested on x32, x86-64 and i686.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-12-04 07:46:20 +08:00
Adhemerval Zanella f28a11e43f time: Add TIME_MONOTONIC, TIME_ACTIVE, and TIME_THREAD_ACTIVE
The TIME_MONOTONIC maps to POSIX's CLOCK_MONOTONIC, TIME_ACTIVE to
CLOCK_PROCESS_CPUTIME_ID, and TIME_THREAD_ACTIVE to
CLOCK_THREAD_CPUTIME_ID.

No Linux-specific timers are added as extensions.

Co-authored-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
2025-12-03 11:03:58 -03:00
Joseph Myers 56d0e2cca1 Use Linux 6.18 in build-many-glibcs.py
Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
2025-12-02 16:34:07 +00:00
Yury Khrustalev 11d3cfb570 misc: fix some typos
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-12-02 12:25:36 +00:00
H.J. Lu 3dd2cbfa35 Use 64-bit atomic on sem_t with 8-byte alignment [BZ #33632]
commit 7fec8a5de6
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Nov 13 14:26:08 2025 -0300

    Revert __HAVE_64B_ATOMICS configure check

uses 64-bit atomic operations on sem_t if 64-bit atomics are supported.
But sem_t may be aligned to 32-bit on 32-bit architectures.

1. Add a macro, SEM_T_ALIGN, for sem_t alignment.
2. Add a macro, HAVE_UNALIGNED_64B_ATOMICS.  Define it if unaligned 64-bit
atomic operations are supported.
3. Add a macro, USE_64B_ATOMICS_ON_SEM_T.  Define to 1 if 64-bit atomic
operations are supported and SEM_T_ALIGN is at least 8 bytes, or
HAVE_UNALIGNED_64B_ATOMICS is defined.
4. Assert that size and alignment of sem_t are not lower than those of
the internal struct new_sem.
5. Check USE_64B_ATOMICS_ON_SEM_T, instead of USE_64B_ATOMICS, when using
64-bit atomic operations on sem_t.

This fixes BZ #33632.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-12-02 06:50:49 +08:00
Yury Khrustalev d605dea0a4 scripts: Support custom Git URLs in build-many-glibcs.py
Use environment variables to provide mirror URLs to checkout
sources from Git. Each component has a corresponding env var
that will be used if it's present: <component>_GIT_MIRROR.

Note that '<component>' should be upper case, e.g. GLIBC.

Co-authored-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-12-01 16:22:02 +00:00
Yury Khrustalev af5ce3ec8f scripts: Support custom FTP mirror URL in build-many-glibcs.py
Allow using custom mirror URLs to download tarballs from a mirror
of ftp.gnu.org via the FTP_GNU_ORG_MIRROR env variable (default
value is 'https://ftp.gnu.org').

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-12-01 16:21:54 +00:00
Kacper Piwiński 82f4758410 strops: use strlen instead of strchr for string length
For wide strings the equivalent function __wcslen is used.  This change
makes it more symmetrical.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-12-01 16:42:54 +01:00
Yury Khrustalev 20092f2ef6 nptl: tests: Fix test-wrapper use in tst-dl-debug-tid.sh
The test wrapper script was used twice: once to run the test
command and a second time within the test command, which
seems unnecessary and results in false errors when running
this test.

Fixes 332f8e62af

Reviewed-by: Frédéric Bérat <fberat@redhat.com>
2025-12-01 14:39:39 +00:00
Osama Abdelkader 57ce2d8243 Fix allocation_index increment in malloc_internal
The allocation_index was being incremented before checking if mmap()
succeeds.  If mmap() fails, allocation_index would still be incremented,
creating a gap in the allocations tracking array and making
allocation_index inconsistent with the actual number of successful
allocations.

This fix moves the allocation_index increment to after the mmap()
success check, ensuring it only increments when an allocation actually
succeeds.  This maintains proper tracking for leak detection and
prevents gaps in the allocations array.

Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-12-01 13:35:36 +01:00
Adhemerval Zanella f9e61cd446 NEWS: Add new generic fma/fmaf note
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
2025-11-28 09:29:35 -03:00
Florian Weimer e98bd0c54d iconvdata: Fix invalid pointer arithmetic in ANSI_X3.110 module
The expression inptr + 1 can technically be invalid: if inptr == inend,
inptr may point one element past the end of an array.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-11-28 13:20:39 +01:00
Joseph Myers e535fb910c Define C23 header version macros
C23 defines library macros __STDC_VERSION_<header>_H__ to indicate
that a header has support for new / changed features from C23.  Now
that all the required library features are implemented in glibc,
define these macros.  I'm not sure this is sufficiently much of a
user-visible feature to be worth a mention in NEWS.

Tested for x86_64.

There are various optional C23 features we don't yet have, of which I
might look at the Annex H ones (floating-point encoding conversion
functions and _Float16 functions) next.

* Optional time bases TIME_MONOTONIC, TIME_ACTIVE, TIME_THREAD_ACTIVE.
  See
  <https://sourceware.org/pipermail/libc-alpha/2023-June/149264.html>
  - we need to review / update that patch.  (I think patch 2/2,
  inventing new names for all the nonstandard CLOCK_* supported by the
  Linux kernel, is rather more dubious.)

* Updating conform/ tests for C23.

* Defining the rounding mode macro FE_TONEARESTFROMZERO for RISC-V (as
  far as I know, the only architecture supported by glibc that has
  hardware support for this rounding mode for binary floating point)
  and supporting it throughout glibc and its tests (especially the
  string/numeric conversions in both directions that explicitly handle
  each possible rounding mode, and various tests that do likewise).

* Annex H floating-point encoding conversion functions.  (It's not
  entirely clear which are optional even given support for Annex H;
  there's some wording applied inconsistently about only being
  required when non-arithmetic interchange formats are supported; see
  the comments I raised on the WG14 reflector on 23 Oct 2025.)

* _Float16 functions (and other header and testcase support for this
  type).

* Decimal floating-point support.

* Fully supporting __int128 and unsigned __int128 as integer types
  wider than intmax_t, as permitted by C23.  Would need doing in
  coordination with GCC, see GCC bug 113887 for more discussion of
  what's involved.
2025-11-27 19:32:49 +00:00
Adhemerval Zanella 8a0152b61b math: New generic fmaf implementation
The current implementation relies on setting the rounding mode for
different calculations (FE_TOWARDZERO) to obtain correctly rounded
results.  For most CPUs, this adds significant performance overhead
because it requires executing a typically slow instruction (to
get/set the floating-point status), necessitates flushing the
pipeline, and breaks some compiler assumptions/optimizations.

The original implementation adds tests to handle underflow in corner
cases, whereas this implementation uses a different strategy that
checks both the mantissa and the result to determine whether the
result is not subject to double rounding.

I tested this implementation on various targets (x86_64, i686, arm,
aarch64, powerpc), including some by manually disabling the compiler
instructions.

Performance-wise, it shows large improvements:

reciprocal-throughput       master       patched       improvement
x86_64 [1]                   58.09          7.96             7.33x
i686 [1]                    279.41         16.97            16.46x
aarch64 [2]                  26.09          4.10             6.35x
armhf [2]                    30.25          4.20             7.18x
powerpc [3]                   9.46          1.46             6.45x

latency                     master       patched       improvement
x86_64                       64.50         14.25             4.53x
i686                        304.39         61.04             4.99x
aarch64                      27.71          5.74             4.82x
armhf                        33.46          7.34             4.55x
powerpc                      10.96          2.65             4.13x

Checked on x86_64-linux-gnu and i686-linux-gnu with --disable-multi-arch,
and on arm-linux-gnueabihf.

[1] gcc 15.2.1, Zen3
[2] gcc 15.2.1, Neoverse N1
[3] gcc 15.2.1, POWER10

Signed-off-by: Szabolcs Nagy <nsz@gcc.gnu.org>
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Co-authored-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-27 14:52:25 -03:00
Florian Weimer 15de570246 Linux: Ignore PIDFD_GET_INFO in tst-pidfd-consts
The constant is expected to change between kernel releases.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-11-27 14:34:58 +01:00
Adhemerval Zanella a61f7fd59d math: Sync atanh from CORE-MATH
The CORE-MATH commit dc9465e7 fixes some issues:

Failure: Test: atanh_towardzero (0x8.3f79103b3c64p-4)
Result:
 is:          5.7018661316561103e-01   0x1.23ef7ff0539c6p-1
 should be:   5.7018661316561092e-01   0x1.23ef7ff0539c5p-1
 difference:  1.1102230246251565e-16   0x1.0000000000000p-53
 ulp       :  1.0000
 max.ulp   :  0.0000
Failure: Test: atanh_towardzero (0x8.3f7d95aabaf7p-4)
Result:
 is:          5.7019248543911060e-01   0x1.23f044fac5997p-1
 should be:   5.7019248543911049e-01   0x1.23f044fac5996p-1
 difference:  1.1102230246251565e-16   0x1.0000000000000p-53
 ulp       :  1.0000
 max.ulp   :  0.0000
Failure: Test: atanh_towardzero (0x8.3f805380d6728p-4)
Result:
 is:          5.7019604623795527e-01   0x1.23f0bc75cd113p-1
 should be:   5.7019604623795516e-01   0x1.23f0bc75cd112p-1
 difference:  1.1102230246251565e-16   0x1.0000000000000p-53
 ulp       :  1.0000
 max.ulp   :  0.0000
Maximal error of `atanh_towardzero'
 is      : 1 ulp
 accepted: 0 ulp

Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
2025-11-26 14:10:07 -03:00
Yury Khrustalev bc4bc1650b aarch64: make GCS configure checks aarch64-only
We only need to enable GCS tests on AArch64 targets, but previously
the configure checks for GCS support in the compiler and linker were
added for all targets, which was not efficient.

To enable tests for GCS we need 4 things to be true:

 - Compiler supports GCS branch protection.
 - Test compiler supports GCS branch protection.
 - Linker supports GCS marking of binaries.
 - The CRT objects provided by the toolchain have GCS marking.

To check for the latter, we add a new macro to aclocal.m4 that greps
the output of readelf.

We check all four and then put the result in one make variable to
simplify checks in makefiles.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-26 13:50:15 +00:00
Adhemerval Zanella bf211c3499 math: New generic fma implementation
The current implementation relies on setting the rounding mode for
different calculations (first to FE_TONEAREST and then to FE_TOWARDZERO)
to obtain correctly rounded results. For most CPUs, this adds a significant
performance overhead since it requires executing a typically slow
instruction (to get/set the floating-point status), it necessitates
flushing the pipeline, and breaks some compiler assumptions/optimizations.

This patch introduces a new implementation originally written by Szabolcs
for musl, which utilizes mostly integer arithmetic.  Floating-point
arithmetic is used to raise the expected exceptions, without the need for
fenv.h operations.

I added some changes compared to the original code:

  * Fixed some signaling NaN issues when the 3-argument is NaN.

  * Use math_uint128.h for the 64-bit multiplication operation.  It allows
    the compiler to use 128-bit types where available, which enables some
    optimizations on certain targets (for instance, MIPS64).

  * Fixed an arm32 issue where the libgcc routine might not respect the
    rounding mode [1].  This can also be used on other targets to optimize
    the conversion from int64_t to double.

  * Use -fexcess-precision=standard on i686.

I tested this implementation on various targets (x86_64, i686, arm, aarch64,
powerpc), including some by manually disabling the compiler instructions.

Performance-wise, it shows large improvements:

reciprocal-throughput       master       patched       improvement
x86_64 [2]                289.4640       22.4396            12.90x
i686 [2]                  636.8660      169.3640             3.76x
aarch64 [3]                46.0020       11.3281             4.06x
armhf [3]                   63.989       26.5056             2.41x
powerpc [4]                23.9332       6.40205             3.74x

latency                     master       patched       improvement
x86_64                    293.7360       38.1478             7.70x
i686                      658.4160      187.9940             3.50x
aarch64                    44.5166       14.7157             3.03x
armhf                      63.7678       28.4116             2.24x
power10                    23.8561       11.4250             2.09x

Checked on x86_64-linux-gnu and i686-linux-gnu with --disable-multi-arch,
and on arm-linux-gnueabihf.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91970
[2] gcc 15.2.1, Zen3
[3] gcc 15.2.1, Neoverse N1
[4] gcc 15.2.1, POWER10

Signed-off-by: Szabolcs Nagy <nsz@gcc.gnu.org>
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-26 10:10:06 -03:00
Adhemerval Zanella 5dab2a3195 stdlib: Remove longlong.h
The gmp-arch.h now provides all the required definitions.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-26 10:10:06 -03:00
Adhemerval Zanella 7a0471f149 Add umul_ppmm to gmp-arch.h
To enable “longlong.h” removal, umul_ppmm is moved to gmp-arch.h.
The generic implementation now uses a static inline, which provides
better type checking than the GNU extension to cast the asm constraint
(and it works better with clang).

Most architectures use the generic implementation, which is
expanded from a macro, except for alpha, arm, hppa, x86, m68k, mips,
powerpc, and sparc.  For the 32-bit architectures the compiler
generates good enough code using uint64_t types, while for 64-bit
architectures the patch leverages the math_u128.h definitions, which
use 128-bit integers when available (all 64-bit architectures on
gcc 15).

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-26 10:10:06 -03:00
Adhemerval Zanella 8cd6efca5b Add add_ssaaaa and sub_ssaaaa to gmp-arch.h
To enable “longlong.h” removal, add_ssaaaa and sub_ssaaaa are moved to
gmp-arch.h.  The generic implementation now uses a static inline.  This
provides better type checking than the GNU extension, which casts the
asm constraint; and it also works better with clang.

Most architectures use the generic implementation, except for
arc, arm, hppa, x86, m68k, powerpc, and sparc.  For the 32-bit
architectures the compiler generates good enough code using uint64_t
types, while for 64-bit architectures the patch leverages the
math_u128.h definitions, which use 128-bit integers when available
(all 64-bit architectures on gcc 15).

The strongly typed implementation required some changes.  I adjusted
_FP_W_TYPE, _FP_WS_TYPE, and _FP_I_TYPE to use the same type as
mp_limb_t on aarch64, powerpc64le, x86_64, and riscv64.  This basically
means using “long” instead of “long long.”

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-26 10:10:02 -03:00
Adhemerval Zanella 476e962af7 Add gmp-arch and udiv_qrnnd
To enable “longlong.h” removal, the udiv_qrnnd is moved to a gmp-arch.h
file.  It allows each architecture to implement its own arch-specific
optimizations.  The generic implementation now uses a static inline,
which provides better type checking than the GNU extension to cast the
asm constraint (and it works better with clang).

Most architectures use the generic implementation, which is
expanded from a macro, except for alpha, x86, m68k, sh, and sparc.
I kept alpha, which uses out-of-line implementations, and x86,
where there is no easy way to use the div{q} instruction from C code.
For the rest, the compiler generates good enough code.

The hppa also provides arch-specific implementations, but they were
never referenced from “longlong.h” and thus never used.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-25 14:52:15 -03:00
Adhemerval Zanella e45174fe8c Add new math improvements to NEWS
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
2025-11-25 14:51:56 -03:00
Yury Khrustalev 6a29bbcf5a scripts: Fix minor lint warnings in build-many-glibcs.py
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-11-25 13:56:56 +00:00
Arjun Shankar 244c404ae8 malloc: Add threaded variants of single-threaded malloc tests
Single-threaded malloc tests exercise only the SINGLE_THREAD_P paths in
the malloc implementation.  This commit runs variants of these tests in
a multi-threaded environment in order to exercise the alternate code
paths in the same test scenarios, thus potentially improving coverage.

$(test)-threaded-main and $(test)-threaded-worker variants are
introduced for most single-threaded malloc tests (with a small number of
exceptions).  The -main variants run the base test in a main thread
while the test environment has an alternate thread running, whereas the
-worker variants run the test in an alternate thread while the main
thread waits on it.

The tests themselves are unmodified, and the change is accomplished by
using -DTEST_IN_THREAD at compile time, which instructs support/
infrastructure to run the test while an alternate thread waits on it.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-11-24 16:47:52 +01:00
Arjun Shankar bd0e88f05c support: Add support for running tests in a multi-threaded environment
It can be useful to be able to write a single-threaded test but run it
as part of a multi-threaded program simply to exercise glibc
synchronization code paths, e.g. the malloc implementation.

This commit adds support to enable this kind of testing.  Tests that
define TEST_IN_THREAD, either as TEST_THREAD_MAIN or TEST_THREAD_WORKER,
and then use support infrastructure (by including test-driver.c) will be
accordingly run in either the main thread, or in a second "worker"
thread while the other thread waits.

This can be used in new tests, or to easily make and run copies of
existing tests without modifying the tests themselves.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-11-24 16:47:52 +01:00
Samuel Thibault 0f7b73f2ed htl: Fix conditions for thread list variables
_dl_stack_used/user/etc. vs _dl_pthread_num_threads etc. is really an
nptl-vs-htl question rather than one of pthread being in libc.
2025-11-22 21:55:02 +01:00
Samuel Thibault c71ee65a79 pthread: Simplify condition for hidden proto
This is not needed yet for htl (only the Linux mq_notify), but we may
as well simplify the header.
2025-11-22 21:55:02 +01:00
gfleury 585eee3962 htl: move c11 symbols into libc.
thrd_{create,detach,exit,join}.
mtx_{init,destroy,lock,trylock,unlock,timedlock}.
cnd_{broadcast,destroy,init,signal,timedwait,wait}.
tss_{create,delete,get,set}. call_once.
Message-ID: <20251121191336.1224485-1-gfleury@disroot.org>
2025-11-22 03:28:48 +01:00
Samuel Thibault 604bdb0f8e htl: Also use __libc_thread_freeres to clean TLS state 2025-11-22 03:27:40 +01:00
Adhemerval Zanella aa6066087f benchtests: Fix bench-build after cd748a63ab
The benchtests do not define _LIBC.
2025-11-21 13:22:34 -03:00
Adhemerval Zanella 907089ba36 linux: Handle EINVAL as unsupported on tst-pidfd_getinfo
Some kernels return EINVAL for ioctl (PIDFD_GET_INFO) on pidfd
descriptors.

Checked on aarch64-linux-gnu with Linux 6.12.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-11-21 13:13:26 -03:00