/* Copyright (C) 2000-2025 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library.  If not, see
   <https://www.gnu.org/licenses/>.  */

#ifndef _LINUX_MIPS_SYSDEP_H
#define _LINUX_MIPS_SYSDEP_H 1

/* There is some commonality with the other Linux and MIPS ports; pull
   in the shared sysdep definitions.  */
#include <sysdeps/unix/sysv/linux/mips/sysdep.h>
#include <sysdeps/unix/sysv/linux/sysdep.h>
#include <sysdeps/unix/mips/mips64/sysdep.h>

#include <tls.h>

/* For Linux we can use the system call table in the kernel header file
     /usr/include/asm/unistd.h.
   But these symbols do not follow the SYS_* syntax, so we have to
   redefine the `SYS_ify' macro here.  */
#undef SYS_ify
#define SYS_ify(syscall_name)	__NR_##syscall_name
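
/* For instance, SYS_ify (openat) expands to __NR_openat, i.e. the raw
   number the kernel expects in $v0, so callers can name syscalls
   symbolically (an illustrative use only; the INTERNAL_SYSCALL
   machinery is defined further down in this header):

     long int r = INTERNAL_SYSCALL (openat, 4, fd, name, flags, mode);
 */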

#ifdef __ASSEMBLER__

/* We don't want the label for the error handler to be visible in the
   symbol table when we define it here.  */
# undef SYSCALL_ERROR_LABEL
# define SYSCALL_ERROR_LABEL 99b
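
/* A sketch of the assumed use, not taken from this header: the
   assembler syscall wrappers define a numeric local label on the error
   path and branch backwards to it, roughly

     99:	<set errno from $v0 and return -1>
	...
	syscall
	bne	$a3, $zero, SYSCALL_ERROR_LABEL	# i.e. 99b

   Numeric labels may repeat, and the "99b" reference binds to the
   nearest definition looking backwards, so no named symbol for the
   error path ever reaches the symbol table.  */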

#else   /* ! __ASSEMBLER__ */

#undef HAVE_INTERNAL_BRK_ADDR_SYMBOL
#define HAVE_INTERNAL_BRK_ADDR_SYMBOL 1

#include <syscall_types.h>
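
/* syscall_types.h is expected to supply __syscall_arg_t (the
   register-width type syscall arguments are passed in) and the __SSC
   conversion macro used by the internal_syscall* bodies below, which
   widens each argument to that type, e.g. (a hypothetical caller)

     __syscall_arg_t a = __SSC (some_int_arg);

   On a kABI with 64-bit registers a 32-bit value must be extended to
   the full register width before being handed to the kernel.  */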

/* Note that the original Linux syscall restart convention required the
   instruction immediately preceding SYSCALL to initialize $v0 with the
   syscall number.  Then if a restart triggered, $v0 would have been
   clobbered by the syscall interrupted, and needed to be reinitialized.
   The kernel would decrement the PC by 4 before switching back to the
   user mode so that $v0 had been reloaded before SYSCALL was executed
   again.  This implied the place $v0 was loaded from must have been
   preserved across a syscall, e.g. an immediate, static register, stack
   slot, etc.

   The convention was relaxed in Linux with a change applied to the kernel
   GIT repository as commit 96187fb0bc30cd7919759d371d810e928048249d, that
   first appeared in the 2.6.36 release.  Since then the kernel has had
   code that reloads $v0 upon syscall restart and resumes right at the
   SYSCALL instruction, so no special arrangement is needed anymore.

   For backwards compatibility with existing kernel binaries we support
   the old convention by choosing the instruction preceding SYSCALL
   carefully.  This also means we have to force a 32-bit encoding of the
   microMIPS MOVE instruction if one is used.  */

#ifdef __mips_micromips
# define MOVE32 "move32"
#else
# define MOVE32 "move"
#endif

#undef INTERNAL_SYSCALL
#define INTERNAL_SYSCALL(name, nr, args...)				\
	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
			      "IK" (SYS_ify (name)),			\
			      0, args)
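
/* A minimal sketch of the expansion, assuming the usual asm/unistd.h
   numbering: INTERNAL_SYSCALL (getpid, 0) becomes internal_syscall0
   with

     v0_init = "li	%0, %2			# getpid\n\t"
     input   = "IK" (__NR_getpid)

   so $v0 is loaded from an immediate by the single instruction
   preceding SYSCALL, which satisfies the old restart convention
   described above.  The "I"/"K" constraints let GCC pass the number as
   a signed or unsigned 16-bit immediate for the LI.  */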

#undef INTERNAL_SYSCALL_NCS
#define INTERNAL_SYSCALL_NCS(number, nr, args...)			\
	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
			      "r" (__s0),				\
			      number, args)
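
/* Likewise a sketch: with a non-constant NUMBER the expansion is

     v0_init = MOVE32 "\t%0, %2\n\t"	(i.e. "move $2, $16")
     input   = "r" (__s0)

   The number is parked in the call-saved register $s0, so it is still
   intact if an old kernel steps the PC back by 4 and re-executes the
   MOVE; forcing the 32-bit MOVE32 encoding on microMIPS keeps that
   4-byte back-step landing exactly on the MOVE instruction.  */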

#define internal_syscall0(v0_init, input, number, dummy...)		\
({									\
	long int _sys_result;						\
									\
	{								\
	register __syscall_arg_t __s0 asm ("$16") __attribute__ ((unused))\
	  = (number);							\
	register __syscall_arg_t __v0 asm ("$2");			\
	register __syscall_arg_t __a3 asm ("$7");			\
	__asm__ volatile (						\
	".set\tnoreorder\n\t"						\
	v0_init								\
	"syscall\n\t"							\
	".set\treorder"							\
	: "=r" (__v0), "=r" (__a3)					\
	: input								\
	: __SYSCALL_CLOBBERS);						\
	_sys_result = __a3 != 0 ? -__v0 : __v0;				\
	}								\
	_sys_result;							\
})
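
/* A hedged usage sketch (hypothetical caller): the kernel flags an
   error in $a3 and leaves a positive error code in $v0, which the
   statement above folds into the usual Linux "negative errno" return:

     long int r = INTERNAL_SYSCALL (getpid, 0);
     if (INTERNAL_SYSCALL_ERROR_P (r))	      -- r is in [-4095, -1]
       return -INTERNAL_SYSCALL_ERRNO (r);    -- the positive errno

   INTERNAL_SYSCALL_ERROR_P and INTERNAL_SYSCALL_ERRNO come from the
   generic <sysdeps/unix/sysv/linux/sysdep.h> included above.  */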

#define internal_syscall1(v0_init, input, number, arg1)			\
({									\
	long int _sys_result;						\
									\
	{								\
	__syscall_arg_t _arg1 = __SSC (arg1);				\
	register __syscall_arg_t __s0 asm ("$16") __attribute__ ((unused))\
	  = (number);							\
	register __syscall_arg_t __v0 asm ("$2");			\
	register __syscall_arg_t __a0 asm ("$4") = _arg1;		\
	register __syscall_arg_t __a3 asm ("$7");			\
	__asm__ volatile (						\
	".set\tnoreorder\n\t"						\
	v0_init								\
	"syscall\n\t"							\
	".set\treorder"							\
	: "=r" (__v0), "=r" (__a3)					\
	: input, "r" (__a0)						\
	: __SYSCALL_CLOBBERS);						\
	_sys_result = __a3 != 0 ? -__v0 : __v0;				\
	}								\
	_sys_result;							\
})

#define internal_syscall2(v0_init, input, number, arg1, arg2)		\
({									\
	long int _sys_result;						\
									\
	{								\
	__syscall_arg_t _arg1 = __SSC (arg1);				\
	__syscall_arg_t _arg2 = __SSC (arg2);				\
	register __syscall_arg_t __s0 asm ("$16") __attribute__ ((unused))\
	  = (number);							\
	register __syscall_arg_t __v0 asm ("$2");			\
	register __syscall_arg_t __a0 asm ("$4") = _arg1;		\
	register __syscall_arg_t __a1 asm ("$5") = _arg2;		\
	register __syscall_arg_t __a3 asm ("$7");			\
	__asm__ volatile (						\
	".set\tnoreorder\n\t"						\
	v0_init								\
	"syscall\n\t"							\
	".set\treorder"							\
	: "=r" (__v0), "=r" (__a3)					\
	: input, "r" (__a0), "r" (__a1)					\
	: __SYSCALL_CLOBBERS);						\
	_sys_result = __a3 != 0 ? -__v0 : __v0;				\
	}								\
	_sys_result;							\
})

#define internal_syscall3(v0_init, input, number, arg1, arg2, arg3)	\
({									\
	long int _sys_result;						\
									\
	{								\
	__syscall_arg_t _arg1 = __SSC (arg1);				\
	__syscall_arg_t _arg2 = __SSC (arg2);				\
	__syscall_arg_t _arg3 = __SSC (arg3);				\
	register __syscall_arg_t __s0 asm ("$16") __attribute__ ((unused))\
	  = (number);							\
	register __syscall_arg_t __v0 asm ("$2");			\
	register __syscall_arg_t __a0 asm ("$4") = _arg1;		\
	register __syscall_arg_t __a1 asm ("$5") = _arg2;		\
	register __syscall_arg_t __a2 asm ("$6") = _arg3;		\
	register __syscall_arg_t __a3 asm ("$7");			\
	__asm__ volatile (						\
	".set\tnoreorder\n\t"						\
	v0_init								\
	"syscall\n\t"							\
	".set\treorder"							\
	: "=r" (__v0), "=r" (__a3)					\
	: input, "r" (__a0), "r" (__a1), "r" (__a2)			\
	: __SYSCALL_CLOBBERS);						\
	_sys_result = __a3 != 0 ? -__v0 : __v0;				\
	}								\
	_sys_result;							\
})

#define internal_syscall4(v0_init, input, number, arg1, arg2, arg3,	\
			  arg4)						\
({									\
	long int _sys_result;						\
									\
	{								\
	__syscall_arg_t _arg1 = __SSC (arg1);				\
	__syscall_arg_t _arg2 = __SSC (arg2);				\
	__syscall_arg_t _arg3 = __SSC (arg3);				\
	__syscall_arg_t _arg4 = __SSC (arg4);				\
	register __syscall_arg_t __s0 asm ("$16") __attribute__ ((unused))\
	  = (number);							\
	register __syscall_arg_t __v0 asm ("$2");			\
	register __syscall_arg_t __a0 asm ("$4") = _arg1;		\
	register __syscall_arg_t __a1 asm ("$5") = _arg2;		\
	register __syscall_arg_t __a2 asm ("$6") = _arg3;		\
	register __syscall_arg_t __a3 asm ("$7") = _arg4;		\
	__asm__ volatile (						\
	".set\tnoreorder\n\t"						\
	v0_init								\
	"syscall\n\t"							\
	".set\treorder"							\
	: "=r" (__v0), "+r" (__a3)					\
	: input, "r" (__a0), "r" (__a1), "r" (__a2)			\
	: __SYSCALL_CLOBBERS);						\
	_sys_result = __a3 != 0 ? -__v0 : __v0;				\
	}								\
	_sys_result;							\
})
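
/* When a fourth argument is present, $a3 carries it into the kernel
   and brings the error flag back out, hence the "+r" (__a3) read-write
   constraint above in place of the pure output "=r" used by the
   shorter forms.  */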

#define internal_syscall5(v0_init, input, number, arg1, arg2, arg3,	\
			  arg4, arg5)					\
({									\
	long int _sys_result;						\
									\
	{								\
|
	__syscall_arg_t _arg1 = __SSC (arg1);				\
	__syscall_arg_t _arg2 = __SSC (arg2);				\
	__syscall_arg_t _arg3 = __SSC (arg3);				\
	__syscall_arg_t _arg4 = __SSC (arg4);				\
	__syscall_arg_t _arg5 = __SSC (arg5);				\
	register __syscall_arg_t __s0 asm ("$16") __attribute__ ((unused))\
	  = (number);							\
	register __syscall_arg_t __v0 asm ("$2");			\
	register __syscall_arg_t __a0 asm ("$4") = _arg1;		\
	register __syscall_arg_t __a1 asm ("$5") = _arg2;		\
	register __syscall_arg_t __a2 asm ("$6") = _arg3;		\
	register __syscall_arg_t __a3 asm ("$7") = _arg4;		\
	register __syscall_arg_t __a4 asm ("$8") = _arg5;		\
	__asm__ volatile (						\
	".set\tnoreorder\n\t"						\
	v0_init								\
	"syscall\n\t"							\
	".set\treorder"							\
	: "=r" (__v0), "+r" (__a3)					\
	: input, "r" (__a0), "r" (__a1), "r" (__a2), "r" (__a4)	\
	: __SYSCALL_CLOBBERS);						\
	_sys_result = __a3 != 0 ? -__v0 : __v0;				\
	}								\
	_sys_result;							\
})

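/* Note: the five-argument variant above and the six-argument variant
   below pass the extra arguments in $8 and $9 (a4 and a5 in the
   64-bit register naming), as the kernel expects for these ABIs.  The
   o32 kABI instead passes the fifth and later syscall arguments on
   the stack, so it cannot share these definitions.  */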
#define internal_syscall6(v0_init, input, number, arg1, arg2, arg3,	\
			  arg4, arg5, arg6)				\
({									\
	long int _sys_result;						\
									\
	{								\
	__syscall_arg_t _arg1 = __SSC (arg1);				\
	__syscall_arg_t _arg2 = __SSC (arg2);				\
	__syscall_arg_t _arg3 = __SSC (arg3);				\
	__syscall_arg_t _arg4 = __SSC (arg4);				\
	__syscall_arg_t _arg5 = __SSC (arg5);				\
	__syscall_arg_t _arg6 = __SSC (arg6);				\
	register __syscall_arg_t __s0 asm ("$16") __attribute__ ((unused))\
	  = (number);							\
	register __syscall_arg_t __v0 asm ("$2");			\
	register __syscall_arg_t __a0 asm ("$4") = _arg1;		\
	register __syscall_arg_t __a1 asm ("$5") = _arg2;		\
	register __syscall_arg_t __a2 asm ("$6") = _arg3;		\
	register __syscall_arg_t __a3 asm ("$7") = _arg4;		\
	register __syscall_arg_t __a4 asm ("$8") = _arg5;		\
	register __syscall_arg_t __a5 asm ("$9") = _arg6;		\
	__asm__ volatile (						\
	".set\tnoreorder\n\t"						\
	v0_init								\
	"syscall\n\t"							\
	".set\treorder"							\
	: "=r" (__v0), "+r" (__a3)					\
	: input, "r" (__a0), "r" (__a1), "r" (__a2), "r" (__a4),	\
	  "r" (__a5)							\
	: __SYSCALL_CLOBBERS);						\
	_sys_result = __a3 != 0 ? -__v0 : __v0;				\
	}								\
	_sys_result;							\
})

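/* Registers the kernel may clobber across a syscall, used in the asm
   clobber lists above.  MIPSr6 removed the hi/lo multiply/divide
   registers, so they are listed only for pre-R6 ISAs.  */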
#if __mips_isa_rev >= 6
# define __SYSCALL_CLOBBERS "$1", "$3", "$10", "$11", "$12", "$13",	\
  "$14", "$15", "$24", "$25", "memory"
#else
# define __SYSCALL_CLOBBERS "$1", "$3", "$10", "$11", "$12", "$13",	\
  "$14", "$15", "$24", "$25", "hi", "lo", "memory"
#endif

#endif /* __ASSEMBLER__ */

#endif /* linux/mips/sysdep.h */