The new file names are COPYINGv2 and COPYING.LESSERv2. Lots of
copyright headers mention COPYING.LIB, so add a symbolic link.
(This is not the first symbolic link in the repository, so this
should be fine.)
The files come from gnulib commit 3cc5b69dda06890929a2d0433f30708.
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
The license is referenced in various headers, so we should ship it.
The text was copied from gnulib commit d64d66cc4897d605f543257dcd0,
file doc/COPYINGv3.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Commit 3360913c37 ("elf: Add SFrame
stack tracing") added this file with an inconsistent copyright header.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This is notably needed so that the main thread structure is always
initialized and some pthread functions, e.g. pthread_cancel, can work
from the main thread even when no other threads have been created.
Add fast path optimization for frexpl (80-bit x87 extended precision) using
a single unsigned comparison to identify normal floating-point numbers and
return immediately via arithmetic on the exponent field.
The implementation uses an arithmetic operation (se - ex) to adjust
the exponent directly, which is simpler than bit masking. For subnormals,
the traditional multiply-based normalization is retained as it handles the
split word format more reliably.
The zero/infinity/NaN check groups these special cases together for better
branch prediction.
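A minimal sketch of the idea, assuming the x86 little-endian 80-bit
layout (the union and function name are illustrative; glibc's real code
uses its long double access macros):

  #include <stdint.h>

  union ldshape { long double f; struct { uint64_t m; uint16_t se; } i; };

  long double
  frexpl_sketch (long double x, int *e)
  {
    union ldshape u = { .f = x };
    unsigned int ex = u.i.se & 0x7fff;        /* biased exponent */

    if (ex - 1u < 0x7ffeu)                    /* normal number */
      {
        *e = (int) ex - 0x3ffe;               /* unbiased exponent + 1 */
        u.i.se -= ex - 0x3ffe;                /* exponent of 0.5, sign kept */
        return u.f;
      }
    if (ex == 0x7fff || x == 0)               /* inf, NaN, zero */
      {
        *e = 0;
        return x;
      }
    /* Subnormal: traditional multiply-based normalization.  */
    u.f = x * 0x1p64L;
    ex = u.i.se & 0x7fff;
    *e = (int) ex - 0x3ffe - 64;
    u.i.se -= ex - 0x3ffe;
    return u.f;
  }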
Benchmark results on Intel Core i9-13900H (13th Gen):
Baseline: 25.543 ns/op
Optimized: 25.531 ns/op
Speedup: 1.00x (neutral)
Zero: 17.774 ns/op
Denormal: 23.900 ns/op
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Commit 53807741fb added a configure check
for 64-bit atomic operations that were not previously enabled on some
32-bit ABIs.
However, the NPTL semaphore code casts a sem_t to a new_sem and issues
a 64-bit atomic operation for __HAVE_64B_ATOMICS. Since sem_t has
32-bit alignment on 32-bit architectures, this prevents the use of
64-bit atomics even if the ABI supports them.
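A minimal illustration of the alignment constraint (the 8-byte
requirement for 64-bit atomics is the general hardware rule, not a
glibc API; compile for a 32-bit target):

  #include <semaphore.h>
  #include <stdio.h>

  int
  main (void)
  {
    /* On many 32-bit ABIs this prints 4.  A 64-bit atomic operation
       generally requires 8-byte alignment, so the 64-bit new_sem
       field overlaid onto sem_t storage cannot be accessed with
       64-bit atomics even when the CPU supports them.  */
    printf ("alignof (sem_t) = %zu\n", _Alignof (sem_t));
    return 0;
  }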
Assume 64-bit atomic support from __WORDSIZE, which matches how glibc
defined it before the broken change. Also rename __HAVE_64B_ATOMICS to
USE_64B_ATOMICS to better convey the flag's meaning.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The file was added with a GPL reference (but LGPL statement) in
commit 0d6bed7150 ("hppa: Add
____longjmp_check C implementation.").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
As discussed in bug 28327, C23 changed the fromfp functions to return
floating types instead of intmax_t / uintmax_t. (Although the
motivation in N2548 was reducing the use of intmax_t in library
interfaces, the new version does have the advantage of being able to
specify arbitrary integer widths for e.g. assigning the result to a
_BitInt, as well as being able to indicate an error case in-band with
a NaN return.)
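A hedged usage sketch of the C23-style interface (assumes a glibc that
contains this change and _GNU_SOURCE; with the old TS 18661 interface
the return type would be intmax_t instead):

  #define _GNU_SOURCE
  #include <math.h>
  #include <stdio.h>

  int
  main (void)
  {
    /* Round to nearest, ties to even, result representable in a
       signed 8-bit range: 3.5 -> 4.0.  */
    double r = fromfp (3.5, FP_INT_TONEAREST, 8);
    printf ("%g\n", r);
    /* 1000 does not fit in 8 signed bits: the C23 version reports
       the domain error in-band with a NaN instead of returning an
       unspecified integer value.  */
    double bad = fromfp (1000.0, FP_INT_TONEAREST, 8);
    printf ("%d\n", isnan (bad));
    return 0;
  }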
As with other such changes from interfaces introduced in TS 18661,
implement the new types as a replacement for the old ones, with the
old functions remaining as compat symbols but not supported as an API.
The test generator used for many of the tests is updated to handle
both versions of the functions.
Tested for x86_64 and x86, and with build-many-glibcs.py.
Also tested tgmath tests for x86_64 with GCC 7 to make sure that the
modified case for older compilers in <tgmath.h> does work.
Also tested for powerpc64le to cover the ldbl-128ibm implementation
and the other things that are handled differently for that
configuration. The new tests fail for ibm128, but all the failures
relate to incorrect signs of zero results and turn out to arise from
bugs in the underlying roundl, ceill, truncl and floorl
implementations that I've reported in bug 33623, rather than
indicating any bug in the actual new implementation of the functions
for that format. So given fixes for those functions (which shouldn't
be hard, and of course should add to the tests for those functions
rather than relying only on indirect testing via fromfp), the fromfp
tests should start passing for ibm128 as well.
Remove uses of float_t and double_t. This is not useful on modern machines,
and does not help given GCC defaults to -fexcess-precision=fast.
One use of double_t remains to allow forcing the precision to double
on targets where FLT_EVAL_METHOD=2. This fixes BZ #33563 on
i486-pc-linux-gnu.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Remove ldbl-128/s_fma.c - it makes no sense to use emulated float128
operations to emulate FMA. Benchmarking shows dbl-64/s_fma.c is about
twice as fast. Remove redundant dbl-64/s_fma.c includes in targets
that were trying to work around this issue.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It has been added in Linux 6.10 (8be7258aad44b5e25977a98db136f677fa6f4370)
as a way to block operations on a pre-existing memory mapping, such as
mapping over it, moving it to another location, shrinking or expanding
its size, or otherwise modifying it.
Although the system call only works on 64-bit CPUs, the entrypoint was added
for all ABIs (since the kernel might eventually implement it for additional
ones and/or the ABI can execute on a 64-bit kernel).
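A hedged usage sketch, assuming a glibc that declares the mseal
entrypoint in <sys/mman.h> and a 64-bit Linux 6.10+ kernel (older
kernels fail with ENOSYS):

  #define _GNU_SOURCE
  #include <sys/mman.h>
  #include <errno.h>
  #include <stdio.h>
  #include <string.h>

  int
  main (void)
  {
    size_t len = 4096;
    void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
      return 1;
    if (mseal (p, len, 0) != 0)          /* flags must currently be 0 */
      {
        printf ("mseal: %s\n", strerror (errno));
        return 1;
      }
    /* The mapping is now sealed: this fails with EPERM instead of
       changing the protection.  */
    if (mprotect (p, len, PROT_READ) != 0)
      printf ("mprotect blocked: %s\n", strerror (errno));
    return 0;
  }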
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
When R_LARCH_IRELATIVE is resolved by apply_irel, the ifunc resolver is
called via elf_ifunc_invoke so it can read HWCAP from the __ifunc_arg_t
argument. But when R_LARCH_IRELATIVE is resolved by elf_machine_rela (this
happens if we dlopen() a shared object containing R_LARCH_IRELATIVE),
the ifunc resolver is invoked directly with no argument or a different
one. This causes a segfault if the resolver uses the __ifunc_arg_t.
Although the LoongArch psABI does not specify this argument, it is more
convenient to have it, and per Hyrum's law there may be objects in the
wild that already rely on it (they just have not blown up yet because
they are not dlopen()ed). So make the handling of R_LARCH_IRELATIVE in
elf_machine_rela the same as in apply_irel.
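An illustrative resolver shape (the struct layout, the HWCAP bit, and
the signature below are modeled on AArch64's __ifunc_arg_t convention
and are not taken from the LoongArch psABI):

  #include <stdint.h>

  struct ifunc_arg_sketch          /* stand-in for __ifunc_arg_t */
  {
    unsigned long _size;           /* sizeof (struct), for extension */
    unsigned long _hwcap;
  };

  static int impl_generic (void) { return 0; }
  static int impl_simd (void) { return 1; }

  static __typeof__ (impl_generic) *
  my_func_resolver (uint64_t hwcap, const struct ifunc_arg_sketch *arg)
  {
    /* With elf_machine_rela fixed to pass the same argument as
       apply_irel, this dereference is also safe in dlopen'ed objects.
       The bit tested here is hypothetical.  */
    if (arg != NULL && (arg->_hwcap & (1UL << 4)))
      return impl_simd;
    return impl_generic;
  }

  int my_func (void) __attribute__ ((ifunc ("my_func_resolver")));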
This fixes BZ #33610.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
The definition of once_flag in <stdlib.h> conflicts with std::once_flag
if “using namespace std;” is active.
Updates commit a7ddbf456d
("Add once_flag, ONCE_FLAG_INIT and call_once to stdlib.h for C23").
Suggested-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Benchmarks indicate evex can be more profitable than AVX512 on Hygon
hardware, so add Prefer_No_AVX512 to make it run with evex.
Change-Id: Icc59492f71fde7a783a8bd315714ffd6f7ecaf29
Signed-off-by: Li jing <lijing@hygon.cn>
Signed-off-by: Xie jiamei <xiejiamei@hygon.cn>
Add fast path optimization for frexpl (128-bit IEEE quad precision) using
a single unsigned comparison to identify normal floating-point numbers and
return immediately via arithmetic on the exponent field.
The implementation uses arithmetic operations hx = hx - (ex << 48)
to adjust the exponent in place, which is simpler and more efficient than
bit masking. For subnormals, the traditional multiply-based normalization
is retained for reliability with the split 64-bit word format.
The zero/infinity/NaN check groups these special cases together for better
branch prediction.
This optimization provides the same algorithmic improvements as the other
frexp variants while maintaining correctness for all edge cases.
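A minimal sketch under the assumption of a little-endian target with
_Float128 (the word-access macro names differ in the real code; the hx
arithmetic is the point):

  #include <stdint.h>
  #include <string.h>

  _Float128
  frexpf128_sketch (_Float128 x, int *e)
  {
    uint64_t w[2];
    memcpy (w, &x, sizeof w);
    uint64_t hx = w[1];             /* sign + exponent + top mantissa */
    unsigned int ex = (hx >> 48) & 0x7fff;

    if (ex - 1u < 0x7ffeu)          /* normal number */
      {
        *e = (int) ex - 0x3ffe;     /* unbiased exponent + 1 */
        w[1] = hx - ((uint64_t) (ex - 0x3ffe) << 48);
        memcpy (&x, w, sizeof x);
        return x;
      }
    if (ex == 0x7fff || x == 0)     /* inf, NaN, zero */
      {
        *e = 0;
        return x;
      }
    /* Subnormal: traditional multiply-based normalization.  */
    x *= 0x1p114f128;
    memcpy (w, &x, sizeof w);
    hx = w[1];
    ex = (hx >> 48) & 0x7fff;
    *e = (int) ex - 0x3ffe - 114;
    w[1] = hx - ((uint64_t) (ex - 0x3ffe) << 48);
    memcpy (&x, w, sizeof x);
    return x;
  }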
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Add fast path optimization for frexp using a single unsigned comparison
to identify normal floating-point numbers and return immediately via
arithmetic on the bit representation.
The implementation uses asuint64()/asdouble() from math_config.h and arithmetic
operations to adjust the exponent, which generates better code than bit masking
on ARM and RISC-V architectures. For subnormals, stdc_leading_zeros provides
faster normalization than the traditional multiply approach.
The zero/infinity/NaN check is simplified to (int64_t)(ix << 1) <= 0, which
is more efficient than separate comparisons.
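A self-contained sketch of the approach (asuint64/asdouble are
recreated here with memcpy; glibc's versions live in math_config.h, and
the real patch uses stdc_leading_zeros where __builtin_clzll appears
below):

  #include <stdint.h>
  #include <string.h>

  static inline uint64_t asuint64 (double x)
  { uint64_t u; memcpy (&u, &x, sizeof u); return u; }
  static inline double asdouble (uint64_t u)
  { double x; memcpy (&x, &u, sizeof x); return x; }

  double
  frexp_sketch (double x, int *e)
  {
    uint64_t ix = asuint64 (x);
    unsigned int ex = (ix >> 52) & 0x7ff;

    if (ex - 1u < 0x7feu)              /* normal: one unsigned compare */
      {
        *e = (int) ex - 0x3fe;         /* unbiased exponent + 1 */
        return asdouble (ix - ((uint64_t) (ex - 0x3feu) << 52));
      }
    if ((int64_t) (ix << 1) <= 0)      /* zero, inf or NaN */
      {
        *e = 0;
        return x;
      }
    /* Subnormal: normalize with a leading-zero count.  */
    int lz = __builtin_clzll (ix << 12);
    uint64_t mant = (ix & 0xfffffffffffffULL) << (lz + 1);
    *e = -1022 - lz;
    return asdouble ((ix & 0x8000000000000000ULL)
                     | (0x3feULL << 52) | (mant & 0xfffffffffffffULL));
  }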
Benchmark results on Intel Core i9-13900H (13th Gen):
Baseline: 6.778 ns/op
Optimized: 4.007 ns/op
Speedup: 1.69x (40.9% faster)
Zero: 3.580 ns/op (fast path)
Denormal: 6.096 ns/op (slower, rare case)
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add fast path optimization for frexpf using a single unsigned comparison
to identify normal floating-point numbers and return immediately via
arithmetic on the bit representation.
The implementation uses asuint()/asfloat() from math_config.h and arithmetic
operations to adjust the exponent, which generates better code than bit masking
on ARM and RISC-V architectures. For subnormals, stdc_leading_zeros provides
faster normalization than the traditional multiply approach.
The zero/infinity/NaN check is simplified to (int32_t)(hx << 1) <= 0, which
is more efficient than separate comparisons.
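The same shape for float, with the binary32 constants (8 exponent bits
at bit 23, bias 127; names are again illustrative):

  #include <stdint.h>
  #include <string.h>

  static inline uint32_t asuint (float x)
  { uint32_t u; memcpy (&u, &x, sizeof u); return u; }
  static inline float asfloat (uint32_t u)
  { float x; memcpy (&x, &u, sizeof x); return x; }

  float
  frexpf_sketch (float x, int *e)
  {
    uint32_t hx = asuint (x);
    unsigned int ex = (hx >> 23) & 0xff;

    if (ex - 1u < 0xfeu)               /* normal: one unsigned compare */
      {
        *e = (int) ex - 0x7e;          /* unbiased exponent + 1 */
        return asfloat (hx - ((ex - 0x7eu) << 23));
      }
    if ((int32_t) (hx << 1) <= 0)      /* zero, inf or NaN */
      {
        *e = 0;
        return x;
      }
    /* Subnormal: normalize with a leading-zero count.  */
    int lz = __builtin_clz (hx << 9);
    uint32_t mant = (hx & 0x7fffffu) << (lz + 1);
    *e = -126 - lz;
    return asfloat ((hx & 0x80000000u) | (0x7eu << 23)
                    | (mant & 0x7fffffu));
  }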
Benchmark results on Intel Core i9-13900H (13th Gen):
Baseline: 5.858 ns/op
Optimized: 4.003 ns/op
Speedup: 1.46x (31.7% faster)
Zero: 3.580 ns/op (fast path)
Denormal: 5.597 ns/op (slower, rare case)
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add benchmark support for frexp, frexpf, and frexpl to measure the
performance improvement of the fast path optimization.
- Created frexp-inputs, frexpf-inputs, frexpl-inputs with random test values
- Added frexp, frexpf, frexpl to bench-math list
- Added CFLAGS to disable builtins for accurate benchmarking
These benchmarks will be used to quantify the performance gains from the
fast path optimization for normal floating-point numbers.
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
clang might generate an abort call when cleanup functions (set by
__attribute__ ((cleanup))) call functions not marked as nothrow.
The Hurd already provides abort for the loader in
sysdeps/mach/hurd/dl-sysdep.c, and adding it to rtld-stubbed-symbols
triggers duplicate symbols.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
clang does not support the %v modifier to select the AVX encoding, nor
the '%d' asm operand modifier, and for AVX builds it requires all 3
operands to be spelled out.
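For illustration, the GCC-only pattern the header has to abstract looks
like this (the function name is made up; the instruction is one
example):

  static inline double
  sqrt_asm_sketch (double x)
  {
    double r;
    /* Without AVX this assembles as:  sqrtsd  %xmm1, %xmm0
       With AVX, '%v' adds the prefix and '%d0' duplicates the output
       operand:                        vsqrtsd %xmm1, %xmm0, %xmm0  */
    asm ("%vsqrtsd %1, %d0" : "=x" (r) : "xm" (x));
    return r;
  }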
This patch adds a new internal header, math-inline-asm.h, with
functions that abstract the inline asm differences between gcc and
clang.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Some symbols that might be auto-generated by the compiler are redefined
to an internal alias (for instance mempcpy to __mempcpy). However, if
fortify is enabled, the fortify wrapper is defined before the alias
redefinition, and clang warns that an attribute declaration must
precede the definition.
Use an asm alias instead of an attribute if the compiler does not
support it.
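A sketch of the asm-alias form, with mempcpy as the example: the
declaration redirects calls to the internal name without needing an
attribute that must precede the fortify wrapper's definition.

  #include <stddef.h>

  /* Calls to mempcpy in this translation unit now reference the
     __mempcpy symbol directly.  */
  extern void *mempcpy (void *__dest, const void *__src, size_t __n)
       __asm__ ("__mempcpy");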
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
count_leading_zeros is not used anymore, so there is no need to
provide the table for possible use. The hppa port already provides
the compat symbol in libgcc-compat.c.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
This adds testing for the fix added in commit:
0fceed2545
"nss: Group merge does not react to ERANGE during merge (bug 33361)"
The in-use group size is made large enough to trigger ERANGE
for the initial buffers and cause a retry. The actual size is
approximately twice that required to trigger the defect, though
any size larger than NSS_BUFLEN_GROUP triggers the defect.
Without the fix the group is not merged and the failure is detected,
but with the fix the ERANGE error is handled, buffers are enlarged
and subsequently correctly merged.
Tested with a/b testing before and after patching.
Tested on x86_64 with no regression.
Co-authored-by: Patsy Griffin <patsy@redhat.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
The variable was removed in commit 2c421fc430
("AArch64: Cleanup PAC and BTI"), so this Makefile fragment is
always excluded.
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
Two tests fail from time to time when a new flag is added for the
p{write,read}v2 functions in a new Linux kernel:
- misc/tst-preadvwritev2
- misc/tst-preadvwritev64v2
This is disruptive when testing glibc on a system with a newer kernel,
and it seems we can improve the testing for invalid flags by setting
all the bits that are not supposed to be supported (rather than setting
only the next unsupported bit).
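A sketch of the idea (the RWF_* list is what current glibc headers
expose and may lag the kernel, which is exactly why the complement is
used):

  #define _GNU_SOURCE
  #include <sys/uio.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>

  int
  main (void)
  {
    char buf[16];
    struct iovec iov = { buf, sizeof buf };
    int fd = open ("/dev/null", O_RDONLY);
    int known = RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT | RWF_APPEND;
    /* Every bit outside the known set is requested at once, so a
       kernel that starts accepting one new flag still rejects the
       call (EOPNOTSUPP), keeping the test stable.  */
    ssize_t ret = preadv2 (fd, &iov, 1, -1, ~known);
    printf ("ret=%zd errno=%d\n", ret, errno);
    return 0;
  }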
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Using TEST_VERIFY (crname_target != crname) instructs some analysis
tools that crname_target == crname might hold. Under this assumption,
they report a use-after-free for crname_target->offset below, caused
by the previous free (crname).
Reviewed-by: Collin Funk <collin.funk1@gmail.com>