The __HAVE_64B_ATOMICS can be define based on __WORDSIZE, and
the __ARCH_ACQ_INSTR, MUTEX_HINT_*, and barriers definition are
defined by the target cpu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
These are already provided by the generic include/atomic.h and
the resulting macros are not Linux specific.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Both m68k and m68k-colfire do not support 64 bit atomis. The
atomic_barrier syscall on m68k is a no-op, so it can use the compiler
builtin.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The libgcc provides the required support to calling the kernel
auxiliary routines for !__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
These are already provided by the generic include/atomic.h.
Reviewed-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
These are already provided by the generic include/atomic.h. Also
remove outdated comment from unsupported gcc versions.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Build programs in $(others-noinstall) like tests only if libgcc_s is
available. Otherwise, "build-many-glibcs.py compilers" will fail to
build the initial glibc with the initial limited gcc due to the missing
libgcc_s.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
C23 makes assert into a variadic macro to handle cases of an argument
that would be interpreted as a single function argument but more than
one macro argument (in particular, compound literals with an
unparenthesized comma in an initializer list); this change was made by
N2829. Note that this only applies to assert, not to other macros
specified in the C standard with particular numbers of arguments.
Implement this support in glibc. This change is only for C; C++ would
need a separate change to its separate assert implementations. It's
also applied only in C23 mode. It depends on support for (C99)
variadic macros, and also (in order to detect calls where more than
one expression is passed, via an unevaluated function call) a C99
boolean type. These requirements are encapsulated in the definition
of __ASSERT_VARIADIC. Tests with -std=c99 and -std=gnu99 (using
implementations continue to work.
I don't think we have a way in the glibc testsuite to validate that
passing more than one expression as an argument does produce the
desired error.
Tested for x86_64.
The Dynamic Linker chapter now includes a new section detailing
environment variables that influence its behavior.
This new section documents the `LD_DEBUG` environment variable,
explaining how to enable debugging output and listing its various
keywords like `libs`, `reloc`, `files`, `symbols`, `bindings`,
`versions`, `scopes`, `tls`, `all`, `statistics`, `unused`, and `help`.
It also documents `LD_DEBUG_OUTPUT`, which controls where the debug
output is written, allowing redirection to a file with the process ID
appended.
This provides users with essential information for controlling and
debugging the dynamic linker.
Reviewed-by: DJ Delorie <dj@redhat.com>
Introduce the `DL_DEBUG_TLS` debug mask to enable detailed logging for
Thread-Local Storage (TLS) and Thread Control Block (TCB) management.
This change integrates a new `tls` option into the `LD_DEBUG`
environment variable, allowing developers to trace:
- TCB allocation, deallocation, and reuse events in `dl-tls.c`,
`nptl/allocatestack.c`, and `nptl/nptl-stack.c`.
- Thread startup events, including the TID and TCB address, in
`nptl/pthread_create.c`.
A new test, `tst-dl-debug-tid`, has been added to validate the
functionality of this new debug logging, ensuring that relevant messages
are correctly generated for both main and worker threads.
This enhances the debugging capabilities for diagnosing issues related
to TLS allocation and thread lifecycle within the dynamic linker.
Reviewed-by: DJ Delorie <dj@redhat.com>
Introduce a RISC-V specific string-misc.h to provide an optimized
repeat_bytes implementation when the Zbkb extension is available.
The new version uses packh/packw/pack instruction count and avoiding
high latency instructions. This helper is used by several mem and string
functions, and falls back to the generic implementation when Zbkb is not
present.
Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Remove xfail from pow testcase since pow and powf have been fixed.
Also check float128 maximum value. See BZ #33563.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fix pow (DBL_MAX, 1.0) to return DBL_MAX when rouding upwards without FMA.
This fixes BZ #33563.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fix powf (0x1.fffffep+127, 1.0f) to return 0x1.fffffep+127 when
rouding upwards. Cleanup the special case code - performance
improves by ~1.2%. This fixes BZ #33563.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When building with -Og to enable debugging, there is currently a compiler error
because if __libc_message_wrapper() is not inline, the __va_arg_pack_len macro
cannot be used.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
According to the tcc (tiny C compiler) Changelog, tcc supports
__attribute__ since 0.9.3. Looking at history of tcc at
<https://repo.or.cz/tinycc.git>, __attribute__ support was added
in commit 14658993425878be300aae2e879560698e0c6c4c on 2002-01-03,
which also looks like the release of 0.9.3. While I'm unable to
find release tags for tcc before 0.9.18 (2003-04-14), the next
release (0.9.28) will include __attribute__((cleanup(func)) which
I rely on.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Simplify the alignment steps for SZREG and BLOCK_SIZE multiples. The previous
three-instruction sequences
addi a7, a2, -SZREG
andi a7, a7, -SZREG
addi a7, a7, SZREG
and
addi a7, a2, -BLOCK_SIZE
andi a7, a7, -BLOCK_SIZE
addi a7, a7, BLOCK_SIZE
are equivalent to a single
andi a7, a2, -SZREG
andi a7, a2, -BLOCK_SIZE
because SZREG and BLOCK_SIZE are powers of two in this context, making the
surrounding addi steps cancel out. Folding to one instruction reduces code
size with identical semantics.
No functional change.
sysdeps/riscv/multiarch/memcpy_noalignment.S: Remove redundant addi around
alignment; keep a single andi for SZREG/BLOCK_SIZE rounding.
Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Tidy the temporary register allocation to favor registers eligible for
compressed encodings when Zca/Zcb are enabled. This keeps the ABI and
clobber set unchanged and does not alter control flow or memory access
behavior.
No functional change.
sysdeps/riscv/multiarch/memcpy_noalignment.S: Reassign temps to improve
compressed encoding opportunities.
Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
It improves latency for about 3-10% and throughput for about 5-15%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency for about 1-10% and throughput for about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency for about 3-7% and throughput for about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency for about 2% and throughput for about 5%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency for about 2-10% and throughput for about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency for about 3-10% and throughput for about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The m68k provided an optimized version through __m81_u(fmod)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The m68k provided an optimized version through __m81_u(fmodf)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. It allows us to move the
implementation to a C one.
The performance on a Zen3 chip is slight better:
reciprocal-throughput input master no-SVID improvement
i686 subnormals 22.4741 20.1571 10.31%
i686 normal 74.1631 70.3606 5.13%
i686 close-exponent 22.5625 20.2435 10.28%
Tested on i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. It allows us to move the
implementation to a C one. The performance on a Zen3 chip is
similar to the SVID one.
Tested on i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The ifunc selector for wmemset had a stray '!' in the
X86_ISA_CPU_FEATURES_ARCH_P(...) check:
if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
&& X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
AVX_Fast_Unaligned_Load, !))
This effectively negated the predicate and caused the AVX2/AVX512
paths to be skipped, making the dispatcher fall back to the SSE2
implementation even on CPUs where AVX2/AVX512 are available. The
regression leads to noticeable throughput loss for wmemset.
Remove the stray '!' so the AVX_Fast_Unaligned_Load capability is
tested as intended and the correct AVX2/EVEX variants are selected.
Impact:
- On AVX2/AVX512-capable x86_64, wmemset no longer incorrectly
falls back to SSE2; perf now shows __wmemset_evex/avx2 variants.
Testing:
- benchtests/bench-wmemset shows improved bandwidth across sizes.
- perf confirm the selected symbol is no longer SSE2.
Signed-off-by: xiejiamei <xiejiamei@hygon.com>
Signed-off-by: Li jing <lijing@hygon.cn>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It issues:
../sysdeps/aarch64/tst-ifunc-arg-4.c:39:1: error: unused function 'resolver' [-Werror,-Wunused-function]
39 | resolver (uint64_t arg0, const uint64_t arg1[])
| ^~~~~~~~
1 error generated.
clang-19 and onwards do not trigger the warning.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Recent lld version default to --no-undefined-version, which triggers
errors when building multiple libraries. For ld.so on x86_64 it fails
with:
ld.lld: error: version script assignment of 'GLIBC_2.4' to symbol '__stack_chk_guard' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_PRIVATE' to symbol '__nptl_set_robust_list_avail' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_PRIVATE' to symbol '__pointer_chk_guard' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_PRIVATE' to symbol '_dl_starting_up' failed: symbol not defined
While for libc.so:
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_clearerr' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_fgetc' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_fileno' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_freopen' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_fscanf' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_fseek' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_peekc_unlocked' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_stderr_' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_stdin_' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_stdout_' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_pclose' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_perror' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_rewind' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_scanf' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_setbuf' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_setlinebuf' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_wdefault_setbuf' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '_IO_wfile_setbuf' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '__ctype32_tolower' failed: symbol not defined
ld.lld: error: version script assignment of 'GLIBC_2.17' to symbol '__ctype32_toupper' failed: symbol not defined
ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
The version script is created with multiple missing symbols to simplify
the build for multiple ABIs, each of which may have different symbols.
For instance, __stack_chk_guard is defined by default. This avoids
requiring each ABI to add this symbol to its version script, depending
on the stack protector ABI it uses.
The libc.so warnings do show unused symbols being defined (like
_IO_clearerr), which might trigger potential errors depending on how
symbols are exported. However, since we already have ABI checks for
missing and extra symbols, the linker's extra checks are not really
necessary.
The --no-undefined-version is the default for ld.bfd.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
clang 20 issues an warning for the unused '-c' argument used to create
errlist-data-aux-shared.S, errlist-data-aux.S, siglist-aux-shared.S,
and siglist-aux.S. Filter out the '-c' from the $(compile-command.c).
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The clang default to warning for missing fall-through and it does
not support all comment-like annotation that gcc does. Use C23
[[fallthrough]] annotation instead.
proper attribute instead.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
The internal header redefines the some internal argp functions with
attribute_hidden, which triggers clang warning of mismatched attributes.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>