The test wrapper script was used twice: once to run the test
command and a second time within the test command itself, which
seems unnecessary and results in false errors when running
this test.
Fixes 332f8e62af
Reviewed-by: Frédéric Bérat <fberat@redhat.com>
The allocation_index was being incremented before checking if mmap()
succeeds. If mmap() fails, allocation_index would still be incremented,
creating a gap in the allocations tracking array and making
allocation_index inconsistent with the actual number of successful
allocations.
This fix moves the allocation_index increment to after the mmap()
success check, ensuring it only increments when an allocation actually
succeeds. This maintains proper tracking for leak detection and
prevents gaps in the allocations array.
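The ordering described above can be illustrated with a minimal sketch. This is not the code from the patch: MAX_ALLOCATIONS, allocations and tracked_mmap are hypothetical names chosen for illustration, and only the "increment after the success check" pattern is taken from the description.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define MAX_ALLOCATIONS 16	/* Hypothetical capacity.  */

static void *allocations[MAX_ALLOCATIONS];
static size_t allocation_index;

/* Map SIZE bytes and record the mapping.  Returns NULL on failure.
   The index only advances once mmap has succeeded, so a failed call
   leaves no gap in the tracking array.  */
static void *
tracked_mmap (size_t size)
{
  if (allocation_index >= MAX_ALLOCATIONS)
    return NULL;

  void *p = mmap (NULL, size, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;		/* allocation_index is left untouched.  */

  allocations[allocation_index++] = p;
  return p;
}
```

With this ordering, allocation_index always equals the number of live entries in the array, which is what a leak check would iterate over.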
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
The expression inptr + 1 can technically be invalid: if inptr == inend,
inptr may point one element past the end of an array.
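One common way to avoid forming such a pointer is to compare distances instead of advanced pointers. The sketch below is an assumption about the general technique, not the change made in this commit; has_remaining is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if at least N more elements are available in
   [INPTR, INEND).  Both pointers must point into, or one past the
   end of, the same array.  Subtracting them is always valid, whereas
   INPTR + N may not be when INPTR is already at INEND.  */
static bool
has_remaining (const unsigned char *inptr, const unsigned char *inend,
	       ptrdiff_t n)
{
  return inend - inptr >= n;
}
```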
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
C23 defines library macros __STDC_VERSION_<header>_H__ to indicate
that a header has support for new / changed features from C23. Now
that all the required library features are implemented in glibc,
define these macros. I'm not sure this is sufficiently much of a
user-visible feature to be worth a mention in NEWS.
Tested for x86_64.
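As an illustration of how user code might consume one of these macros, here is a minimal sketch using <string.h>. The helper name string_h_c23_version is hypothetical; the macro name and its value 202311L come from C23.

```c
#include <string.h>

/* Return the value of __STDC_VERSION_STRING_H__ if the C library's
   <string.h> defines it, or 0 if the header predates the macro.  */
static long
string_h_c23_version (void)
{
#ifdef __STDC_VERSION_STRING_H__
  return __STDC_VERSION_STRING_H__;
#else
  return 0L;
#endif
}
```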
There are various optional C23 features we don't yet have, of which I
might look at the Annex H ones (floating-point encoding conversion
functions and _Float16 functions) next.
* Optional time bases TIME_MONOTONIC, TIME_ACTIVE, TIME_THREAD_ACTIVE.
See
<https://sourceware.org/pipermail/libc-alpha/2023-June/149264.html>
- we need to review / update that patch. (I think patch 2/2,
inventing new names for all the nonstandard CLOCK_* supported by the
Linux kernel, is rather more dubious.)
* Updating conform/ tests for C23.
* Defining the rounding mode macro FE_TONEARESTFROMZERO for RISC-V (as
far as I know, the only architecture supported by glibc that has
hardware support for this rounding mode for binary floating point)
and supporting it throughout glibc and its tests (especially the
string/numeric conversions in both directions that explicitly handle
each possible rounding mode, and various tests that do likewise).
* Annex H floating-point encoding conversion functions. (It's not
entirely clear which are optional even given support for Annex H;
there's some wording applied inconsistently about only being
required when non-arithmetic interchange formats are supported; see
the comments I raised on the WG14 reflector on 23 Oct 2025.)
* _Float16 functions (and other header and testcase support for this
type).
* Decimal floating-point support.
* Fully supporting __int128 and unsigned __int128 as integer types
wider than intmax_t, as permitted by C23. Would need doing in
coordination with GCC, see GCC bug 113887 for more discussion of
what's involved.
The current implementation relies on setting the rounding mode for
different calculations (FE_TOWARDZERO) to obtain correctly rounded
results. For most CPUs, this adds significant performance overhead
because it requires executing a typically slow instruction (to
get/set the floating-point status), necessitates flushing the
pipeline, and breaks some compiler assumptions/optimizations.
The original implementation adds tests to handle underflow in corner
cases, whereas this implementation uses a different strategy that
checks both the mantissa and the result to determine whether the
result is not subject to double rounding.
I tested this implementation on various targets (x86_64, i686, arm,
aarch64, powerpc), including some by manually disabling the compiler
instructions.
Performance-wise, it shows large improvements:
reciprocal-throughput    master   patched   improvement
x86_64 [1]                58.09      7.96         7.33x
i686 [1]                 279.41     16.97        16.46x
aarch64 [2]               26.09      4.10         6.35x
armhf [2]                 30.25      4.20         7.18x
powerpc [3]                9.46      1.46         6.45x
latency                  master   patched   improvement
x86_64                    64.50     14.25         4.53x
i686                     304.39     61.04         4.99x
aarch64                   27.71      5.74         4.82x
armhf                     33.46      7.34         4.55x
powerpc                   10.96      2.65         4.13x
Checked on x86_64-linux-gnu and i686-linux-gnu with --disable-multi-arch,
and on arm-linux-gnueabihf.
[1] gcc 15.2.1, Zen3
[2] gcc 15.2.1, Neoverse N1
[3] gcc 15.2.1, POWER10
Signed-off-by: Szabolcs Nagy <nsz@gcc.gnu.org>
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Co-authored-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
We only need to enable GCS tests on AArch64 targets. Previously,
however, the configure checks for GCS support in the compiler and linker
were added for all targets, which was inefficient.
To enable tests for GCS we need 4 things to be true:
- Compiler supports GCS branch protection.
- Test compiler supports GCS branch protection.
- Linker supports GCS marking of binaries.
- The CRT objects provided by the toolchain have GCS marking.
To check for the latter, we add a new macro to aclocal.m4 that allows
grepping the output of readelf.
We check all four and then put the result in one make variable to
simplify checks in makefiles.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The current implementation relies on setting the rounding mode for
different calculations (first to FE_TONEAREST and then to FE_TOWARDZERO)
to obtain correctly rounded results. For most CPUs, this adds a significant
performance overhead since it requires executing a typically slow
instruction (to get/set the floating-point status), it necessitates
flushing the pipeline, and breaks some compiler assumptions/optimizations.
This patch introduces a new implementation originally written by Szabolcs
for musl, which utilizes mostly integer arithmetic. Floating-point
arithmetic is used to raise the expected exceptions, without the need for
fenv.h operations.
I added some changes compared to the original code:
* Fixed some signaling NaN issues when the third argument is NaN.
* Use math_uint128.h for the 64-bit multiplication operation. It allows
the compiler to use 128-bit types where available, which enables some
optimizations on certain targets (for instance, MIPS64).
* Fixed an arm32 issue where the libgcc routine might not respect the
rounding mode [1]. This can also be used on other targets to optimize
the conversion from int64_t to double.
* Use -fexcess-precision=standard on i686.
I tested this implementation on various targets (x86_64, i686, arm, aarch64,
powerpc), including some by manually disabling the compiler instructions.
Performance-wise, it shows large improvements:
reciprocal-throughput     master    patched   improvement
x86_64 [2]              289.4640    22.4396        12.90x
i686 [2]                636.8660   169.3640         3.76x
aarch64 [3]              46.0020    11.3281         4.06x
armhf [3]                63.989     26.5056         2.41x
powerpc [4]              23.9332     6.40205        3.74x
latency                   master    patched   improvement
x86_64                  293.7360    38.1478         7.70x
i686                    658.4160   187.9940         3.50x
aarch64                  44.5166    14.7157         3.03x
armhf                    63.7678    28.4116         2.24x
power10                  23.8561    11.4250         2.09x
Checked on x86_64-linux-gnu and i686-linux-gnu with --disable-multi-arch,
and on arm-linux-gnueabihf.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91970
[2] gcc 15.2.1, Zen3
[3] gcc 15.2.1, Neoverse N1
[4] gcc 15.2.1, POWER10
Signed-off-by: Szabolcs Nagy <nsz@gcc.gnu.org>
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To enable “longlong.h” removal, umul_ppmm is moved to gmp-arch.h.
The generic implementation now uses a static inline, which provides
better type checking than the GNU extension to cast the asm constraint
(and it works better with clang).
Most architectures use the generic implementation, which is
expanded from a macro; the exceptions are alpha, arm, hppa, x86, m68k,
mips, powerpc, and sparc. On 32-bit architectures the compiler generates
good enough code using uint64_t types, whereas on 64-bit architectures
the patch leverages the math_u128.h definitions, which use 128-bit
integers when available (all 64-bit architectures on gcc 15).
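To make the idea concrete, here is a minimal sketch of a static inline double-word multiply for 64-bit limbs. It is not the gmp-arch.h code: umul_ppmm_sketch is a hypothetical name, and testing __SIZEOF_INT128__ directly is an assumption standing in for whatever the math_u128.h definitions provide.

```c
#include <stdint.h>

/* Compute the full 128-bit product A * B and store its high and low
   64-bit halves in *HI and *LO.  */
static inline void
umul_ppmm_sketch (uint64_t *hi, uint64_t *lo, uint64_t a, uint64_t b)
{
#ifdef __SIZEOF_INT128__
  /* The compiler provides a 128-bit integer type.  */
  unsigned __int128 p = (unsigned __int128) a * b;
  *hi = (uint64_t) (p >> 64);
  *lo = (uint64_t) p;
#else
  /* Portable fallback: schoolbook multiplication on 32-bit halves.  */
  uint64_t al = (uint32_t) a, ah = a >> 32;
  uint64_t bl = (uint32_t) b, bh = b >> 32;
  uint64_t ll = al * bl;
  uint64_t lh = al * bh;
  uint64_t hl = ah * bl;
  uint64_t hh = ah * bh;
  /* Middle column plus the carry out of the low column.  The sum of
     three values below 2^32 cannot overflow 64 bits.  */
  uint64_t mid = (ll >> 32) + (uint32_t) lh + (uint32_t) hl;
  *lo = (mid << 32) | (uint32_t) ll;
  *hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
#endif
}
```

Because the helper takes typed parameters, passing a mismatched pointer or operand type is diagnosed at compile time, which is the type-checking benefit mentioned above.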
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To enable “longlong.h” removal, add_ssaaaa and sub_ssaaaa are moved to
gmp-arch.h. The generic implementation now uses a static inline. This
provides better type checking than the GNU extension, which casts the
asm constraint; and it also works better with clang.
Most architectures use the generic implementation, with the exception
of arc, arm, hppa, x86, m68k, powerpc, and sparc. On 32-bit
architectures the compiler generates good enough code using uint64_t
types, whereas on 64-bit architectures the patch leverages the
math_u128.h definitions, which use 128-bit integers when available
(all 64-bit architectures on gcc 15).
The strongly typed implementation required some changes. I adjusted
_FP_W_TYPE, _FP_WS_TYPE, and _FP_I_TYPE to use the same type as
mp_limb_t on aarch64, powerpc64le, x86_64, and riscv64. This basically
means using “long” instead of “long long”.
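For 32-bit limbs, the double-word add and subtract can be sketched with plain uint64_t arithmetic, as the description suggests. The names add_ssaaaa_sketch and sub_ssaaaa_sketch are hypothetical, and the pointer-based signature is an assumption made for illustration rather than the interface used in gmp-arch.h.

```c
#include <stdint.h>

/* (*SH, *SL) = (AH, AL) + (BH, BL), discarding any carry out of the
   high word.  */
static inline void
add_ssaaaa_sketch (uint32_t *sh, uint32_t *sl,
		   uint32_t ah, uint32_t al, uint32_t bh, uint32_t bl)
{
  uint64_t a = ((uint64_t) ah << 32) | al;
  uint64_t b = ((uint64_t) bh << 32) | bl;
  uint64_t s = a + b;		/* Wraps modulo 2^64.  */
  *sh = (uint32_t) (s >> 32);
  *sl = (uint32_t) s;
}

/* (*SH, *SL) = (AH, AL) - (BH, BL), discarding any borrow out of the
   high word.  */
static inline void
sub_ssaaaa_sketch (uint32_t *sh, uint32_t *sl,
		   uint32_t ah, uint32_t al, uint32_t bh, uint32_t bl)
{
  uint64_t a = ((uint64_t) ah << 32) | al;
  uint64_t b = ((uint64_t) bh << 32) | bl;
  uint64_t s = a - b;		/* Wraps modulo 2^64.  */
  *sh = (uint32_t) (s >> 32);
  *sl = (uint32_t) s;
}
```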
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To enable “longlong.h” removal, udiv_qrnnd is moved to a gmp-arch.h
file. This allows each architecture to implement its own arch-specific
optimizations. The generic implementation now uses a static inline,
which provides better type checking than the GNU extension to cast the
asm constraint (and it works better with clang).
Most architectures use the generic implementation, which is
expanded from a macro; the exceptions are alpha, x86, m68k, sh, and sparc.
I kept the alpha one, which uses out-of-line implementations, and the
x86 one, where there is no easy way to use the div{q} instruction from
C code. For the rest, the compiler generates good enough code.
hppa also provides arch-specific implementations, but they are not
routed in “longlong.h” and thus never used.
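The operation itself divides a two-word numerator by a one-word divisor. A minimal sketch for 32-bit limbs, where the compiler's 64-bit division does the work, could look as follows. udiv_qrnnd_sketch is a hypothetical name and the signature is an assumption, not the gmp-arch.h interface.

```c
#include <assert.h>
#include <stdint.h>

/* Divide the two-word value (N1, N0) by D, storing the quotient in *Q
   and the remainder in *R.  As with the traditional macro, the caller
   must ensure N1 < D so that the quotient fits in one word; this also
   rules out D == 0.  */
static inline void
udiv_qrnnd_sketch (uint32_t *q, uint32_t *r,
		   uint32_t n1, uint32_t n0, uint32_t d)
{
  assert (n1 < d);
  uint64_t n = ((uint64_t) n1 << 32) | n0;
  *q = (uint32_t) (n / d);
  *r = (uint32_t) (n % d);
}
```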
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Single-threaded malloc tests exercise only the SINGLE_THREAD_P paths in
the malloc implementation. This commit runs variants of these tests in
a multi-threaded environment in order to exercise the alternate code
paths in the same test scenarios, thus potentially improving coverage.
$(test)-threaded-main and $(test)-threaded-worker variants are
introduced for most single-threaded malloc tests (with a small number of
exceptions). The -main variants run the base test in a main thread
while the test environment has an alternate thread running, whereas the
-worker variants run the test in an alternate thread while the main
thread waits on it.
The tests themselves are unmodified, and the change is accomplished by
using -DTEST_IN_THREAD at compile time, which instructs support/
infrastructure to run the test while an alternate thread waits on it.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
It can be useful to be able to write a single-threaded test but run it
as part of a multi-threaded program simply to exercise glibc
synchronization code paths, e.g. the malloc implementation.
This commit adds support to enable this kind of testing. Tests that
define TEST_IN_THREAD, either as TEST_THREAD_MAIN or TEST_THREAD_WORKER,
and then use support infrastructure (by including test-driver.c) will be
accordingly run in either the main thread, or in a second "worker"
thread while the other thread waits.
This can be used in new tests, or to easily make and run copies of
existing tests without modifying the tests themselves.
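The two modes can be pictured with a small standalone sketch. This is not the support/ infrastructure: SKETCH_THREAD_MAIN, SKETCH_THREAD_WORKER, run_test_in_thread and sample_malloc_test are hypothetical names, and the real driver is selected at compile time through TEST_IN_THREAD rather than by a run-time argument.

```c
#include <pthread.h>
#include <stdlib.h>

#define SKETCH_THREAD_MAIN 1	/* Test runs in main, helper thread idles.  */
#define SKETCH_THREAD_WORKER 2	/* Test runs in worker, main waits.  */

struct worker_args
{
  int (*test) (void);
  int result;
};

static pthread_mutex_t release_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t release_cond = PTHREAD_COND_INITIALIZER;
static int released;

/* Worker mode: run the test function in the new thread.  */
static void *
worker_run (void *closure)
{
  struct worker_args *args = closure;
  args->result = args->test ();
  return NULL;
}

/* Main mode: the second thread just blocks until it is released.  */
static void *
helper_wait (void *unused)
{
  (void) unused;
  pthread_mutex_lock (&release_lock);
  while (!released)
    pthread_cond_wait (&release_cond, &release_lock);
  pthread_mutex_unlock (&release_lock);
  return NULL;
}

/* Run TEST with a second thread present.  Returns the result of TEST,
   or -1 if the thread could not be created.  */
static int
run_test_in_thread (int mode, int (*test) (void))
{
  pthread_t thr;

  if (mode == SKETCH_THREAD_WORKER)
    {
      struct worker_args args = { test, -1 };
      if (pthread_create (&thr, NULL, worker_run, &args) != 0)
	return -1;
      pthread_join (thr, NULL);
      return args.result;
    }

  pthread_mutex_lock (&release_lock);
  released = 0;
  pthread_mutex_unlock (&release_lock);
  if (pthread_create (&thr, NULL, helper_wait, NULL) != 0)
    return -1;
  int result = test ();
  pthread_mutex_lock (&release_lock);
  released = 1;
  pthread_cond_signal (&release_cond);
  pthread_mutex_unlock (&release_lock);
  pthread_join (thr, NULL);
  return result;
}

/* A stand-in for an existing single-threaded test.  */
static int
sample_malloc_test (void)
{
  void *p = malloc (64);
  if (p == NULL)
    return 1;
  free (p);
  return 0;
}
```

In both modes the process has more than one thread while the test body runs, so a single-thread fast path would no longer apply, even though the test function itself is unchanged.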
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Some kernels return EINVAL for ioctl (PIDFD_GET_INFO) on pidfd
descriptors.
Checked on aarch64-linux-gnu with Linux 6.12.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
clang warns of the implicit conversion of RAND_MAX to float:
error: implicit conversion from 'int' to 'float' changes value from
2147483647 to 2147483648 [-Werror,-Wimplicit-const-int-float-conversion]
So make it explicit.
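A minimal sketch of the explicit form, using a hypothetical helper random_unit_float rather than the code touched by this commit:

```c
#include <stdlib.h>

/* Return a pseudo-random value in [0, 1].  The explicit casts state
   that the rounding of RAND_MAX to the nearest float is intended.  */
static float
random_unit_float (void)
{
  return (float) rand () / (float) RAND_MAX;
}
```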
Reviewed-by: Sam James <sam@gentoo.org>
Similar to tst-printf-bz18872.sh, add the attribute_optimize to avoid
build failures with compilers that do not support "GCC optimize" pragma.
Reviewed-by: Sam James <sam@gentoo.org>
clang generates internal calls for some _chk symbols, so add internal
aliases for them, and stub some with rtld-stubbed-symbols to avoid
ld.so linker issues.
Reviewed-by: Sam James <sam@gentoo.org>
It is worse than the ldbl-64 version on recent x86 hardware. With
Zen3 and gcc-15:
ldbl-96 removal
reciprocal-throughput      master    patched   improvement
x86_64                  1176.2200   289.4640         4.06x
i686                    1476.0600   636.8660         2.32x
latency                    master    patched   improvement
x86_64                  1176.2200   293.7360         4.00x
i686                    1480.0700   658.4160         2.25x
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
C23 makes various standard library functions, that return a pointer
into an input array, into macros that return a pointer to const when
the relevant argument passed to the macro is a pointer to const. (The
requirement is for macros, with the existing function types applying
when macro expansion is suppressed. When a null pointer constant is
passed, such as integer 0, that's the same as a pointer to non-const.)
Implement this feature. This only applies to C, not C++, since such
macros are not an appropriate way of doing this for C++ and all the
affected functions other than bsearch have overloads to implement an
equivalent feature for C++ anyway. Nothing is done to apply such a
change to any non-C23 functions with the same property of returning a
pointer into an input array.
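One way such a const-preserving macro can be written in standard C is with _Generic. The sketch below is an illustration under assumptions, not the definition added to the glibc headers; strchr_sketch is a hypothetical name. The conditional expression in the controlling position has type const void * when the argument points to const, and void * otherwise, including when the argument is a null pointer constant such as 0.

```c
#include <string.h>

/* Call the strchr function (the parentheses suppress any macro of the
   same name) and give the result the constness of S.  Only the
   selected association is evaluated, so S is evaluated once.  */
#define strchr_sketch(s, c)						\
  _Generic (0 ? (s) : (void *) 1,					\
	    const void *: (const char *) (strchr) ((s), (c)),		\
	    default: (strchr) ((s), (c)))
```

With this definition, assigning the result of strchr_sketch on a const char * argument to a plain char * would draw a discarded-qualifiers diagnostic, while a char * argument keeps returning char *.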
The feature is also disabled when _LIBC is defined, since there are
various places in glibc that either redefine these identifiers as
macros, or define the functions themselves, and would need changing to
work in the presence of these macro definitions. A natural question
is whether we should in fact change those places and not disable the
macro definitions for _LIBC. If so, we'd need a solution for the
places in glibc that define the macro *before* including the relevant
header (in order in effect to disable the header declaration of the
function by renaming that declaration).
One testcase has #undef added to avoid conflicting with this feature
and another has const added; -Wno-discarded-qualifiers is added for
building zic (but could be removed once there's a new upstream tzcode
release that's const-safe with this C23 change and glibc has updated
to code from that new release). Probably other places in glibc proper
would need const added if we remove the _LIBC conditionals.
Another question would be whether some GCC extension should be added
to support this feature better with macros that only expand each
argument once (as well as reducing duplication of diagnostics for bad
usages such as non-pointer and pointer-to-volatile-qualified
arguments).
Tested for x86_64.
Although binutils has supported --no-undefined-version for a long time
(319416359200 back in 2002), --undefined-version was only added more
recently (27fb6a1a7fcd on 2.40).
Reviewed-by: Sam James <sam@gentoo.org>
Directly call _int_free_chunk during tcache shutdown to avoid recursion.
Calling __libc_free on a block from tcache gets flagged as a double free,
and tcache_double_free_verify checks every tcache chunk (quadratic
overhead).
Reviewed-by: Arjun Shankar <arjun@redhat.com>
The CORE-MATH commit 6736002f fixes some issues for RNDZ:
Failure: Test: acosh_towardzero (0x1.08000c1e79fp+0)
Result:
is: 2.4935636091994373e-01 0x1.feae8c399b18cp-3
should be: 2.4935636091994370e-01 0x1.feae8c399b18bp-3
difference: 2.7755575615628913e-17 0x1.0000000000000p-55
ulp : 1.0000
max.ulp : 0.0000
Failure: Test: acosh_towardzero (0x1.080016353964ep+0)
Result:
is: 2.4935874767710369e-01 0x1.feafcc91f518ep-3
should be: 2.4935874767710367e-01 0x1.feafcc91f518dp-3
difference: 2.7755575615628913e-17 0x1.0000000000000p-55
ulp : 1.0000
max.ulp : 0.0000
Maximal error of `acosh_towardzero'
is : 1 ulp
accepted: 0 ulp
This only happens when the ISA supports fma, such as x86_64-v3, aarch64,
or powerpc.
Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
Verify that the kernel side of the termios interface gets the various
speed fields set according to our current canonicalization policy.
[ v2.1: fix formatting - Adhemerval Netto ]
[ v4: fix typo in patch description - Dan Horák ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (v2.1)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Remove legacy code that supported a deprecated Arm Optimised Routines
feature for throwing SIMD exceptions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
powf:
Update scalar special case function to best use new interface.
pow:
Make specialcase NOINLINE to prevent str/ldr leaking in fast path.
Remove dependency on sv_call2, as new callback impl is not a
performance gain.
Replace with vectorised specialcase since structure of scalar
routine is fairly simple.
Throughput gain of about 5-10% on V1 for large values and 25% for subnormal `x`.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Fixed svld1rq using incorrect predicates (BZ #33642).
Next to no performance variations (tested on V1).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>