Commit Graph

58 Commits

Author SHA1 Message Date
Chris von Recklinghausen 3da8288d26 entry: kmsan: introduce kmsan_unpoison_entry_regs()
JIRA: https://issues.redhat.com/browse/RHEL-1848

commit 6cae637fa26df867449c6bc20ea8bc693abe49b0
Author: Alexander Potapenko <glider@google.com>
Date:   Thu Sep 15 17:04:14 2022 +0200

    entry: kmsan: introduce kmsan_unpoison_entry_regs()

    struct pt_regs passed into IRQ entry code is set up by uninstrumented asm
    functions, therefore KMSAN may not notice the registers are initialized.

    kmsan_unpoison_entry_regs() unpoisons the contents of struct pt_regs,
    preventing potential false positives.  Unlike kmsan_unpoison_memory(), it
    can be called under kmsan_in_runtime(), which is often the case in IRQ
    entry code.

    Link: https://lkml.kernel.org/r/20220915150417.722975-41-glider@google.com
    Signed-off-by: Alexander Potapenko <glider@google.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Konovalov <andreyknvl@google.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Eric Biggers <ebiggers@google.com>
    Cc: Eric Biggers <ebiggers@kernel.org>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Herbert Xu <herbert@gondor.apana.org.au>
    Cc: Ilya Leoshkevich <iii@linux.ibm.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Marco Elver <elver@google.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Vegard Nossum <vegard.nossum@oracle.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-10-20 06:14:42 -04:00
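
A minimal sketch of how an entry path uses the helper above, assuming only the API named in the message (the wrapper function is hypothetical):

    #include <linux/kmsan.h>
    #include <linux/ptrace.h>

    /* hypothetical IRQ entry path; regs was filled in by uninstrumented asm */
    noinstr void example_irqentry(struct pt_regs *regs)
    {
            /* Mark the register snapshot initialized so KMSAN does not
             * flag reads of it; safe even under kmsan_in_runtime(). */
            kmsan_unpoison_entry_regs(regs);

            /* ... remaining IRQ entry work ... */
    }
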
Waiman Long 641b964827 context_tracking: Take NMI eqs entrypoints over RCU
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516

commit 493c1822825f00025d6754ec0632990a27edc6f8
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Wed, 8 Jun 2022 16:40:27 +0200

    context_tracking: Take NMI eqs entrypoints over RCU

    The RCU dynticks counter is going to be merged into the context tracking
    subsystem. Prepare by moving the NMI extended quiescent states
    entrypoints to context tracking. For now those are dumb redirections to
    existing RCU calls.

    Acked-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
    Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
    Cc: Yu Liao <liaoyu15@huawei.com>
    Cc: Phil Auld <pauld@redhat.com>
    Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    Cc: Alex Belits <abelits@marvell.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2023-03-30 08:36:17 -04:00
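
The "dumb redirection" stage can be sketched as follows, assuming the ct_nmi_*() naming used by the context-tracking rework:

    #include <linux/hardirq.h>

    void ct_nmi_enter(void)
    {
            rcu_nmi_enter();        /* RCU starts watching for this NMI */
    }

    void ct_nmi_exit(void)
    {
            rcu_nmi_exit();         /* RCU stops watching again */
    }
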
Waiman Long 4cabb4dcd7 context_tracking: Take IRQ eqs entrypoints over RCU
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
Conflicts: The drivers/cpuidle/cpuidle-riscv-sbi.c hunk is dropped as
	   RISC-V is not a supported arch and the file is not currently
	   present in Centos-Stream-9.

commit 6f0e6c1598b1a3d19fc30db86b6e26d6f881b43d
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Wed, 8 Jun 2022 16:40:26 +0200

    context_tracking: Take IRQ eqs entrypoints over RCU

    The RCU dynticks counter is going to be merged into the context tracking
    subsystem. Prepare by moving the IRQ extended quiescent states
    entrypoints to context tracking. For now those are dumb redirections to
    existing RCU calls.

    [ paulmck: Apply Stephen Rothwell feedback from -next. ]
    [ paulmck: Apply Nathan Chancellor feedback. ]

    Acked-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
    Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
    Cc: Yu Liao <liaoyu15@huawei.com>
    Cc: Phil Auld <pauld@redhat.com>
    Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    Cc: Alex Belits <abelits@marvell.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2023-03-30 08:36:16 -04:00
Juri Lelli 942fff8981 x86: Support for lazy preemption
Bugzilla: https://bugzilla.redhat.com/2171995
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git
Conflicts: Missing commit 5443f98fb9e0 ("x86: add CPU field to
           struct thread_info"). Not required for this change.

commit 7f0d38b7b7dd7fb3dfb0be514c9264c9ba10e681
Author:    Thomas Gleixner <tglx@linutronix.de>
Date:      Thu Nov 1 11:03:47 2012 +0100

    x86: Support for lazy preemption

    Implement the x86 pieces for lazy preempt.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
2023-02-27 13:46:10 +01:00
Juri Lelli 013bd3e975 x86/entry: Use should_resched() in idtentry_exit_cond_resched()
Bugzilla: https://bugzilla.redhat.com/2171995
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit 48411e850ea8e16f90b6e747874ea41c0a0ed67a
Author:    Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:      Tue Jun 30 11:45:14 2020 +0200

    x86/entry: Use should_resched() in idtentry_exit_cond_resched()

    The TIF_NEED_RESCHED bit is inlined on x86 into the preemption counter.
    By using should_resched(0) instead of need_resched(), the same check can
    be performed using the same variable as preempt_count(), which was
    evaluated just before.

    Use should_resched(0) instead of need_resched().

    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
2023-02-27 13:46:10 +01:00
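
A sketch of the condensed check (the function name is hypothetical; the point is that one test covers both the preempt count and TIF_NEED_RESCHED, since x86 folds the latter into the former):

    #include <linux/preempt.h>
    #include <linux/sched.h>

    static void example_exit_cond_resched(void)
    {
            /* true only if preempt_count() == 0 and a resched is pending */
            if (should_resched(0))
                    preempt_schedule_irq();
    }
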
Waiman Long fa072c44f8 lockdep: Fix -Wunused-parameter for _THIS_IP_
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2141431

commit 8b023accc8df70e72f7704d29fead7ca914d6837
Author: Nick Desaulniers <ndesaulniers@google.com>
Date:   Mon, 14 Mar 2022 15:19:03 -0700

    lockdep: Fix -Wunused-parameter for _THIS_IP_

    While looking into a bug related to the compiler's handling of addresses
    of labels, I noticed some uses of _THIS_IP_ seemed unused in lockdep.
    Drive by cleanup.

    -Wunused-parameter:
    kernel/locking/lockdep.c:1383:22: warning: unused parameter 'ip'
    kernel/locking/lockdep.c:4246:48: warning: unused parameter 'ip'
    kernel/locking/lockdep.c:4844:19: warning: unused parameter 'ip'

    Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Waiman Long <longman@redhat.com>
    Link: https://lore.kernel.org/r/20220314221909.2027027-1-ndesaulniers@google.com

Signed-off-by: Waiman Long <longman@redhat.com>
2022-11-10 11:38:05 -05:00
Chris von Recklinghausen 3b8acb1eac resume_user_mode: Move to resume_user_mode.h
Conflicts: block/blk-cgroup.c - We already have
	672fdcf0e7de block: partition include/linux/blk-cgroup.h
	so keep include of linux/blk-cgroup.h

Bugzilla: https://bugzilla.redhat.com/2120352

commit 03248addadf1a5ef0a03cbcd5ec905b49adb9658
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Wed Feb 9 12:20:45 2022 -0600

    resume_user_mode: Move to resume_user_mode.h

    Move set_notify_resume and tracehook_notify_resume into resume_user_mode.h.
    While doing that rename tracehook_notify_resume to resume_user_mode_work.

    Update all of the places that included tracehook.h for these functions to
    include resume_user_mode.h instead.

    Update all of the callers of tracehook_notify_resume to call
    resume_user_mode_work.

    Reviewed-by: Kees Cook <keescook@chromium.org>
    Link: https://lkml.kernel.org/r/20220309162454.123006-12-ebiederm@xmission.com
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:47 -04:00
Chris von Recklinghausen 6fb7c30612 task_work: Call tracehook_notify_signal from get_signal on all architectures
Bugzilla: https://bugzilla.redhat.com/2120352

commit 8ba62d37949e248c698c26e0d82d72fda5d33ebf
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Wed Feb 9 09:51:14 2022 -0600

    task_work: Call tracehook_notify_signal from get_signal on all architectures

    Always handle TIF_NOTIFY_SIGNAL in get_signal().  With commit 35d0b389f3
    ("task_work: unconditionally run task_work from get_signal()") always
    calling task_work_run(), all of the work of tracehook_notify_signal is
    already happening except clearing TIF_NOTIFY_SIGNAL.

    Factor clear_notify_signal out of tracehook_notify_signal and use it in
    get_signal so that get_signal only needs one call of task_work_run.

    To keep the semantics in sync, update xfer_to_guest_mode_work (which
    does not call get_signal) to call tracehook_notify_signal if either
    _TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL is set.

    Reviewed-by: Kees Cook <keescook@chromium.org>
    Link: https://lkml.kernel.org/r/20220309162454.123006-8-ebiederm@xmission.com
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:47 -04:00
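
A sketch of the factored-out step, assuming the clear_notify_signal() and task_work_pending() helpers from this series (the function name is hypothetical):

    #include <linux/sched/signal.h>
    #include <linux/task_work.h>

    static void example_get_signal_prologue(void)
    {
            clear_notify_signal();  /* was buried in tracehook_notify_signal() */
            if (unlikely(task_work_pending(current)))
                    task_work_run();        /* the single task_work_run() call */
    }
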
Chris von Recklinghausen 1fbfa0be2e ptrace: Remove arch_syscall_{enter,exit}_tracehook
Bugzilla: https://bugzilla.redhat.com/2120352

commit 0cfcb2b9ef48bbcaf5d43b9f1893f63a938e8176
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Thu Jan 27 12:00:55 2022 -0600

    ptrace: Remove arch_syscall_{enter,exit}_tracehook

    These functions are always one-to-one wrappers around
    ptrace_report_syscall_entry and ptrace_report_syscall_exit.
    So directly call the functions they are wrapping instead.

    Reviewed-by: Kees Cook <keescook@chromium.org>
    Link: https://lkml.kernel.org/r/20220309162454.123006-4-ebiederm@xmission.com
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:47 -04:00
Chris von Recklinghausen 8d04d258fd ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
Conflicts: in arch/ only keep changes to arch/Kconfig,
	arch/arm64/kernel/ptrace.c, and arch/powerpc/kernel/ptrace/ptrace.c.
	The rest of the arch/ files in the upstream version of this patch are
	unsupported.

Bugzilla: https://bugzilla.redhat.com/2120352

commit 153474ba1a4aed0a7b797b4c2be8c35c7a4e57bd
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Thu Jan 27 11:46:37 2022 -0600

    ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h

    Rename tracehook_report_syscall_{entry,exit} to
    ptrace_report_syscall_{entry,exit} and place them in ptrace.h

    There is no longer any generic tracehook infrastructure, so make
    these ptrace-specific functions part of ptrace proper.

    Reviewed-by: Kees Cook <keescook@chromium.org>
    Link: https://lkml.kernel.org/r/20220309162454.123006-3-ebiederm@xmission.com
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:47 -04:00
Chris von Recklinghausen 38e6b77fa9 entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
Bugzilla: https://bugzilla.redhat.com/2120352

commit a68de80f61f6af397bc06fb391ff2e571c9c4d80
Author: Sean Christopherson <seanjc@google.com>
Date:   Wed Sep 1 13:30:27 2021 -0700

    entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()

    Invoke rseq_handle_notify_resume() from tracehook_notify_resume() now
    that the two functions are always called back-to-back by architectures
    that have rseq.  The rseq helper is stubbed out for architectures that
    don't support rseq, i.e. this is a nop across the board.

    Note, tracehook_notify_resume() is horribly named and arguably does not
    belong in tracehook.h as literally every line of code in it has nothing
    to do with tracing.  But, that's been true since commit a42c6ded82
    ("move key_repace_session_keyring() into tracehook_notify_resume()")
    first usurped tracehook_notify_resume() back in 2012.  Punt cleaning that
    mess up to future patches.

    No functional change intended.

    Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210901203030.1292304-3-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:24 -04:00
Tobias Huschle 536ad3c14d entry: Rename arch_check_user_regs() to arch_enter_from_user_mode()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2110299
Upstream status: https://github.com/torvalds/linux.git
Tested: by IBM
Build-Info: http://brewweb.devel.redhat.com/brew/taskinfo?taskID=47942643
Conflicts: None
commit 6d97af487dee3176cb1342d4ab16637e495440ad
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Wed May 4 08:23:50 2022 +0200

    entry: Rename arch_check_user_regs() to arch_enter_from_user_mode()

    arch_check_user_regs() is used at the moment to verify that struct pt_regs
    contains valid values when entering the kernel from userspace; note that
    the NMI codepath doesn't call this function. s390 needs a place in the
    generic entry code to modify a cpu data structure when switching from
    userspace to kernel mode. As arch_check_user_regs() is exactly this,
    rename it to arch_enter_from_user_mode().

    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Andy Lutomirski <luto@kernel.org>
    Link: https://lore.kernel.org/r/20220504062351.2954280-2-tmricht@linux.ibm.com
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Signed-off-by: Tobias Huschle <thuschle@redhat.com>
2022-09-26 08:16:44 +00:00
Vitaly Kuznetsov df89611134 entry: Snapshot thread flags
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2074832

commit 6ce895128b3bff738fe8d9dd74747a03e319e466
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Mon Nov 29 13:06:44 2021 +0000

    entry: Snapshot thread flags

    Some thread flags can be set remotely, and so even when IRQs are disabled,
    the flags can change under our feet. Generally this is unlikely to cause a
    problem in practice, but it is somewhat unsound, and KCSAN will
    legitimately warn that there is a data race.

    To avoid such issues, a snapshot of the flags has to be taken prior to
    using them. Some places already use READ_ONCE() for that, others do not.

    Convert them all to the new flag accessor helpers.

    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Paul E. McKenney <paulmck@kernel.org>
    Link: https://lore.kernel.org/r/20211129130653.2037928-3-mark.rutland@arm.com

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
2022-05-30 16:46:29 +02:00
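
A sketch of the snapshot pattern, assuming the read_thread_flags() accessor added alongside this change (the function name is hypothetical):

    #include <linux/sched.h>
    #include <linux/thread_info.h>

    static void example_exit_to_user_work(void)
    {
            /* one READ_ONCE() snapshot; remote setters cannot tear the checks */
            unsigned long ti_work = read_thread_flags();

            if (ti_work & _TIF_NEED_RESCHED)
                    schedule();
            /* further checks test the same snapshot, not the live flags */
    }
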
Phil Auld 922fbdc63c entry: Fix compile error in dynamic_irqentry_exit_cond_resched()
Bugzilla: http://bugzilla.redhat.com/2065226
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/urgent

commit 0a70045ed8516dfcff4b5728557e1ef3fd017c53
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Wed Mar 30 10:43:28 2022 +0200

    entry: Fix compile error in dynamic_irqentry_exit_cond_resched()

    kernel/entry/common.c: In function ‘dynamic_irqentry_exit_cond_resched’:
    kernel/entry/common.c:409:14: error: implicit declaration of function ‘static_key_unlikely’; did you mean ‘static_key_enable’? [-Werror=implicit-function-declaration]
      409 |         if (!static_key_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
          |              ^~~~~~~~~~~~~~~~~~~
          |              static_key_enable

    static_key_unlikely() should be static_branch_unlikely().

    Fixes: 99cf983cc8bca ("sched/preempt: Add PREEMPT_DYNAMIC using static keys")
    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Mark Rutland <mark.rutland@arm.com>
    Link: https://lore.kernel.org/r/20220330084328.1805665-1-svens@linux.ibm.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2022-04-07 09:35:08 -04:00
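
With the fix applied, the function reads as sketched below:

    void dynamic_irqentry_exit_cond_resched(void)
    {
            if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
                    return;
            raw_irqentry_exit_cond_resched();
    }
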
Phil Auld 0e372dcf73 sched/preempt: Add PREEMPT_DYNAMIC using static keys
Bugzilla: http://bugzilla.redhat.com/2065226

commit 99cf983cc8bca4adb461b519664c939a565cfd4d
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Mon Feb 14 16:52:14 2022 +0000

    sched/preempt: Add PREEMPT_DYNAMIC using static keys

    Where an architecture selects HAVE_STATIC_CALL but not
    HAVE_STATIC_CALL_INLINE, each static call has an out-of-line trampoline
    which will either branch to a callee or return to the caller.

    On such architectures, a number of constraints can conspire to make
    those trampolines more complicated and potentially less useful than we'd
    like. For example:

    * Hardware and software control flow integrity schemes can require the
      addition of "landing pad" instructions (e.g. `BTI` for arm64), which
      will also be present at the "real" callee.

    * Limited branch ranges can require that trampolines generate or load an
      address into a register and perform an indirect branch (or at least
      have a slow path that does so). This loses some of the benefits of
      having a direct branch.

    * Interaction with SW CFI schemes can be complicated and fragile, e.g.
      requiring that we can recognise idiomatic codegen and remove
      indirections it understands, at least until clang provides more
      helpful mechanisms for dealing with this.

    For PREEMPT_DYNAMIC, we don't need the full power of static calls, as we
    really only need to enable/disable specific preemption functions. We can
    achieve the same effect without a number of the pain points above by
    using static keys to fold early returns into the preemption functions
    themselves rather than in an out-of-line trampoline, effectively
    inlining the trampoline into the start of the function.

    For arm64, this results in good code generation. For example, the
    dynamic_cond_resched() wrapper looks as follows when enabled. When
    disabled, the first `B` is replaced with a `NOP`, resulting in an early
    return.

    | <dynamic_cond_resched>:
    |        bti     c
    |        b       <dynamic_cond_resched+0x10>     // or `nop`
    |        mov     w0, #0x0
    |        ret
    |        mrs     x0, sp_el0
    |        ldr     x0, [x0, #8]
    |        cbnz    x0, <dynamic_cond_resched+0x8>
    |        paciasp
    |        stp     x29, x30, [sp, #-16]!
    |        mov     x29, sp
    |        bl      <preempt_schedule_common>
    |        mov     w0, #0x1
    |        ldp     x29, x30, [sp], #16
    |        autiasp
    |        ret

    ... compared to the regular form of the function:

    | <__cond_resched>:
    |        bti     c
    |        mrs     x0, sp_el0
    |        ldr     x1, [x0, #8]
    |        cbz     x1, <__cond_resched+0x18>
    |        mov     w0, #0x0
    |        ret
    |        paciasp
    |        stp     x29, x30, [sp, #-16]!
    |        mov     x29, sp
    |        bl      <preempt_schedule_common>
    |        mov     w0, #0x1
    |        ldp     x29, x30, [sp], #16
    |        autiasp
    |        ret

    Any architecture which implements static keys should be able to use this
    to implement PREEMPT_DYNAMIC with similar cost to non-inlined static
    calls. Since this is likely to have greater overhead than (inlined)
    static calls, PREEMPT_DYNAMIC is only defaulted to enabled when
    HAVE_PREEMPT_DYNAMIC_CALL is selected.

    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Ard Biesheuvel <ardb@kernel.org>
    Acked-by: Frederic Weisbecker <frederic@kernel.org>
    Link: https://lore.kernel.org/r/20220214165216.2231574-6-mark.rutland@arm.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2022-04-07 09:35:08 -04:00
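
A sketch of the key-based shape, assuming the sk_dynamic_cond_resched key named in this series: when the key is disabled the branch is patched to a NOP and the function returns 0 immediately, which is the early return visible in the disassembly above.

    #include <linux/jump_label.h>

    DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);

    int dynamic_cond_resched(void)
    {
            if (!static_branch_unlikely(&sk_dynamic_cond_resched))
                    return 0;               /* folded early return */
            return __cond_resched();        /* the regular function body */
    }
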
Phil Auld 05470a1cf6 sched/preempt: Simplify irqentry_exit_cond_resched() callers
Bugzilla: http://bugzilla.redhat.com/2065226

commit 4624a14f4daa8ab4578d274555fd8847254ce339
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Mon Feb 14 16:52:12 2022 +0000

    sched/preempt: Simplify irqentry_exit_cond_resched() callers

    Currently callers of irqentry_exit_cond_resched() need to be aware of
    whether the function should be indirected via a static call, leading to
    ugly ifdeffery in callers.

    Save them the hassle with a static inline wrapper that does the right
    thing. The raw_irqentry_exit_cond_resched() will also be useful in
    subsequent patches which will add conditional wrappers for preemption
    functions.

    Note: in arch/x86/entry/common.c, xen_pv_evtchn_do_upcall() always calls
    irqentry_exit_cond_resched() directly, even when PREEMPT_DYNAMIC is in
    use. I believe this is a latent bug (which this patch corrects), but I'm
    not entirely certain this wasn't deliberate.

    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Ard Biesheuvel <ardb@kernel.org>
    Acked-by: Frederic Weisbecker <frederic@kernel.org>
    Link: https://lore.kernel.org/r/20220214165216.2231574-4-mark.rutland@arm.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2022-04-07 09:35:07 -04:00
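
A sketch of the wrapper, following the raw_/static-call naming in the message above:

    #include <linux/static_call.h>

    void raw_irqentry_exit_cond_resched(void);

    #ifdef CONFIG_PREEMPT_DYNAMIC
    DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
    #define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)()
    #else
    #define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
    #endif

Callers then invoke irqentry_exit_cond_resched() unconditionally and the header picks the right indirection.
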
Frederic Weisbecker f268c3737e tick/nohz: Only check for RCU deferred wakeup on user/guest entry when needed
Checking for and processing RCU-nocb deferred wakeup upon user/guest
entry is only relevant when nohz_full runs on the local CPU, otherwise
the periodic tick should take care of it.

Make sure we don't needlessly pollute these fast paths: a -3%
performance regression on will-it-scale.per_process_ops has been
reported so far.

Fixes: 47b8ff194c ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Fixes: 4ae7dc97f7 ("entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210527113441.465489-1-frederic@kernel.org
2021-05-31 10:14:49 +02:00
Linus Torvalds 3b671bf4a7 A trivial cleanup of typo fixes.

Merge tag 'core-entry-2021-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core entry updates from Thomas Gleixner:
 "A trivial cleanup of typo fixes"

* tag 'core-entry-2021-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Fix typos in comments
2021-04-26 09:41:15 -07:00
Zhouyi Zhou 0c89d87d1d preempt/dynamic: Fix typo in macro conditional statement
Commit 40607ee97e ("preempt/dynamic: Provide irqentry_exit_cond_resched()
static call") tried to provide the irqentry_exit_cond_resched() static call
in irqentry_exit(), but has a typo in the macro conditional statement.

Fixes: 40607ee97e ("preempt/dynamic: Provide irqentry_exit_cond_resched() static call")
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210410073523.5493-1-zhouzhouyi@gmail.com
2021-04-19 20:02:57 +02:00
Ingo Molnar 97258ce902 entry: Fix typos in comments
Fix 3 single-word typos in the generic syscall entry code.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-03-22 03:57:39 +01:00
Frederic Weisbecker 47b8ff194c entry: Explicitly flush pending rcuog wakeup before last rescheduling point
Following the idle loop model, cleanly check for pending rcuog wakeup
before the last rescheduling point on resuming to user mode. This
way we can avoid doing it from rcu_user_enter() with the last-resort
self-IPI hack that enforces rescheduling.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-5-frederic@kernel.org
2021-02-17 14:12:43 +01:00
Peter Zijlstra (Intel) 40607ee97e preempt/dynamic: Provide irqentry_exit_cond_resched() static call
Provide a static call to control IRQ preemption (called under CONFIG_PREEMPT)
so that we can override its behaviour when preempt= is overridden.

Since the default behaviour is full preemption, its call is
initialized to provide IRQ preemption when preempt= isn't passed.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-8-frederic@kernel.org
2021-02-17 14:12:42 +01:00
Gabriel Krisman Bertazi 6342adcaa6 entry: Ensure trap after single-step on system call return
Commit 2991552447 ("entry: Drop usage of TIF flags in the generic syscall
code") introduced a bug on architectures using the generic syscall entry
code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall
return after receiving a TIF_SINGLESTEP.

The reason is that the meaning of the TIF_SINGLESTEP flag is overloaded to
cause the trap after a system call is executed, but since the above commit,
the syscall call handler only checks for the SYSCALL_WORK flags on the exit
work.

Split the meaning of TIF_SINGLESTEP such that it only means single-step
mode, and create a new type of SYSCALL_WORK to request a trap immediately
after a syscall in single-step mode.  In the current implementation, the
SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity.

Update x86 to flip this bit when a tracer enables single stepping.

Fixes: 2991552447 ("entry: Drop usage of TIF flags in the generic syscall code")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kyle Huey <me@kylehuey.com>
Link: https://lore.kernel.org/r/87h7mtc9pr.fsf_-_@collabora.com
2021-02-06 00:21:42 +01:00
Yuxuan Shui 41c1a06d1d entry: Unbreak single step reporting behaviour
The move of TIF_SYSCALL_EMU to SYSCALL_WORK_SYSCALL_EMU broke single step
reporting. The original code reported the single step when TIF_SINGLESTEP
was set and TIF_SYSCALL_EMU was not set. The SYSCALL_WORK conversion got
the logic wrong and now the reporting only happens when both bits are set.

Restore the original behaviour.

[ tglx: Massaged changelog and dropped the pointless double negation ]

Fixes: 64eb35f701 ("ptrace: Migrate TIF_SYSCALL_EMU to use SYSCALL_WORK flag")
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Link: https://lore.kernel.org/r/877do3gaq9.fsf@m5Zedd9JOGzJrf0
2021-01-28 13:46:55 +01:00
Linus Torvalds edd7ab7684 The new preemptible kmap_local() implementation:

Merge tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull kmap updates from Thomas Gleixner:
 "The new preemtible kmap_local() implementation:

   - Consolidate all kmap_atomic() internals into a generic
     implementation which builds the base for the kmap_local() API and
     make the kmap_atomic() interface wrappers which handle the
     disabling/enabling of preemption and pagefaults.

   - Switch the storage from per-CPU to per task and provide scheduler
     support for clearing mapping when scheduling out and restoring them
     when scheduling back in.

   - Merge the migrate_disable/enable() code, which is also part of the
     scheduler pull request. This was required to make the kmap_local()
     interface available which does not disable preemption when a
     mapping is established. It has to disable migration instead to
     guarantee that the virtual address of the mapped slot is the same
     across preemption.

   - Provide better debug facilities: guard pages and enforced
     utilization of the mapping mechanics on 64bit systems when the
     architecture allows it.

   - Provide the new kmap_local() API which can now be used to cleanup
     the kmap_atomic() usage sites all over the place. Most of the usage
     sites do not require the implicit disabling of preemption and
     pagefaults so the penalty on 64bit and 32bit non-highmem systems is
     removed and quite some of the code can be simplified. A wholesale
     conversion is not possible because some usage depends on the
     implicit side effects and some need to be cleaned up because they
     work around these side effects.

     The migrate disable side effect is only effective on highmem
     systems and when enforced debugging is enabled. On 64bit and 32bit
     non-highmem systems the overhead is completely avoided"

* tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  ARM: highmem: Fix cache_is_vivt() reference
  x86/crashdump/32: Simplify copy_oldmem_page()
  io-mapping: Provide iomap_local variant
  mm/highmem: Provide kmap_local*
  sched: highmem: Store local kmaps in task struct
  x86: Support kmap_local() forced debugging
  mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
  mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL
  microblaze/mm/highmem: Add dropped #ifdef back
  xtensa/mm/highmem: Make generic kmap_atomic() work correctly
  mm/highmem: Take kmap_high_get() properly into account
  highmem: High implementation details and document API
  Documentation/io-mapping: Remove outdated blurb
  io-mapping: Cleanup atomic iomap
  mm/highmem: Remove the old kmap_atomic cruft
  highmem: Get rid of kmap_types.h
  xtensa/mm/highmem: Switch to generic kmap atomic
  sparc/mm/highmem: Switch to generic kmap atomic
  powerpc/mm/highmem: Switch to generic kmap atomic
  nds32/mm/highmem: Switch to generic kmap atomic
  ...
2020-12-14 18:35:53 -08:00
Sven Schnelle c6156e1da6 entry: Add syscall_exit_to_user_mode_work()
This is the same as syscall_exit_to_user_mode() but without calling
exit_to_user_mode(). This can be used if there is an architectural reason
to avoid the combo function, e.g. restarting a syscall without returning to
userspace. Before returning to user space the caller has to invoke
exit_to_user_mode().

[ tglx: Amended comments ]

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201201142755.31931-6-svens@linux.ibm.com
2020-12-02 15:07:58 +01:00
Sven Schnelle 310de1a678 entry: Add exit_to_user_mode() wrapper
Called from architecture specific code when syscall_exit_to_user_mode() is
not suitable. It simply calls __exit_to_user_mode().

This way __exit_to_user_mode() can still be inlined because it is declared
static __always_inline.

[ tglx: Amended comments and moved it to a different place in the header ]

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201201142755.31931-5-svens@linux.ibm.com
2020-12-02 15:07:57 +01:00
Sven Schnelle 96e2fbccd0 entry: Add enter_from_user_mode() wrapper
To be called from architecture specific code if the combo interfaces are
not suitable. It simply calls __enter_from_user_mode(). This way
__enter_from_user_mode() will still be inlined because it is declared static
__always_inline.

[ tglx: Amend comments and move it to a different location in the header ]

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201201142755.31931-4-svens@linux.ibm.com
2020-12-02 15:07:57 +01:00
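
Both wrappers follow the same pattern; a sketch (bodies elided) of the enter side:

    /* the inner helper keeps its inlined fast path for existing callers */
    static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
    {
            /* lockdep/context-tracking establishment lives here */
    }

    /* thin, callable entry point for architecture-specific code */
    noinstr void enter_from_user_mode(struct pt_regs *regs)
    {
            __enter_from_user_mode(regs);
    }
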
Sven Schnelle bb793562f0 entry: Rename exit_to_user_mode()
In order to make this function publicly available rename it so it can still
be inlined. An additional exit_to_user_mode() function will be added with
a later commit.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201201142755.31931-3-svens@linux.ibm.com
2020-12-02 15:07:57 +01:00
Sven Schnelle 6666bb714f entry: Rename enter_from_user_mode()
In order to make this function publicly available rename it so it can still
be inlined. An additional enter_from_user_mode() function will be added with
a later commit.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201201142755.31931-2-svens@linux.ibm.com
2020-12-02 15:07:57 +01:00
Gabriel Krisman Bertazi 11894468e3 entry: Support Syscall User Dispatch on common syscall entry
Syscall User Dispatch (SUD) must take precedence over seccomp and
ptrace, since the use case is emulation (it can be invoked with a
different ABI) such that seccomp filtering by syscall number doesn't
make sense in the first place.  In addition, either the syscall is
dispatched back to userspace, in which case there is no resource to
trace, or the syscall will be executed, and seccomp/ptrace will execute
next.

Since SUD runs before tracepoints, it needs to be a SYSCALL_WORK_EXIT as
well, just to prevent a trace exit event when dispatch was triggered.
For that, the on_syscall_dispatch() check examines the context to skip the
tracepoint, audit and other work.

[ tglx: Add a comment on the exit side ]

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201127193238.821364-5-krisman@collabora.com
2020-12-02 15:07:56 +01:00
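
A sketch of the precedence described above, as it would sit at the top of the syscall entry work (names follow the SYSCALL_WORK conversion earlier in this log):

    if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
            if (syscall_user_dispatch(regs))
                    return -1L;     /* dispatched to userspace: nothing to trace */
    }
    /* seccomp, ptrace and tracepoints run only if we get here */
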
Thomas Gleixner 5fbda3ecd1 sched: highmem: Store local kmaps in task struct
Instead of storing the map per CPU provide and use per task storage. That
prepares for local kmaps which are preemptible.

The context switch code is preparatory and not yet in use because
kmap_atomic() runs with preemption disabled. Will be made usable in the
next step.

The context switch logic is safe even when an interrupt happens after
clearing or before restoring the kmaps. The kmap index in task struct is
not modified so any nesting kmap in an interrupt will use unused indices
and on return the counter is the same as before.

Also add an assert into the return to user space code. Going back to user
space with an active kmap local is a no-no.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20201118204007.372935758@linutronix.de
2020-11-24 14:42:09 +01:00
Gabriel Krisman Bertazi 2991552447 entry: Drop usage of TIF flags in the generic syscall code
Now that the flags migration in the common syscall entry code is complete
and the code relies exclusively on thread_info::syscall_work, clean up the
accesses to TI flags in that path.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-10-krisman@collabora.com
2020-11-16 21:53:16 +01:00
Gabriel Krisman Bertazi 64eb35f701 ptrace: Migrate TIF_SYSCALL_EMU to use SYSCALL_WORK flag
On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SYSCALL_EMU, use it in the generic entry code and
convert the code which uses the TIF specific helper functions to use the
new *_syscall_work() helpers which either resolve to the new mode for users
of the generic entry code or to the TIF based functions for the other
architectures.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-8-krisman@collabora.com
2020-11-16 21:53:16 +01:00
Gabriel Krisman Bertazi 64c19ba29b ptrace: Migrate to use SYSCALL_TRACE flag
On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SYSCALL_TRACE, use it in the generic entry code and
convert the code which uses the TIF specific helper functions to use the
new *_syscall_work() helpers which either resolve to the new mode for users
of the generic entry code or to the TIF based functions for the other
architectures.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-7-krisman@collabora.com
2020-11-16 21:53:16 +01:00
Gabriel Krisman Bertazi 524666cb5d tracepoints: Migrate to use SYSCALL_WORK flag
On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SYSCALL_TRACEPOINT, use it in the generic entry code
and convert the code which uses the TIF specific helper functions to use
the new *_syscall_work() helpers which either resolve to the new mode for
users of the generic entry code or to the TIF based functions for the other
architectures.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-6-krisman@collabora.com
2020-11-16 21:53:15 +01:00
Gabriel Krisman Bertazi 23d67a5485 seccomp: Migrate to use SYSCALL_WORK flag
On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SECCOMP, use it in the generic entry code and convert
the code which uses the TIF specific helper functions to use the new
*_syscall_work() helpers which either resolve to the new mode for users of
the generic entry code or to the TIF based functions for the other
architectures.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-5-krisman@collabora.com
2020-11-16 21:53:15 +01:00
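
The migration pattern shared by these patches can be sketched with the set/test/clear_syscall_work() helpers of the generic entry code:

    set_syscall_work(SECCOMP);              /* enable: bit in syscall_work */
    if (test_syscall_work(SECCOMP))         /* checked on syscall entry */
            ;                               /* ... run __secure_computing() ... */
    clear_syscall_work(SECCOMP);            /* disable */

On the other architectures the same call sites resolve to the TIF-flag based helpers instead.
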
Gabriel Krisman Bertazi b86678cf0f entry: Wire up syscall_work in common entry code
Prepare the common entry code to use the SYSCALL_WORK flags. They will
be defined in subsequent patches for each type of syscall
work. SYSCALL_WORK_ENTRY/EXIT are defined for the transition, as they
will replace the TIF_ equivalent defines.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20201116174206.2639648-4-krisman@collabora.com
2020-11-16 21:53:15 +01:00
Ira Weiny 78a56e0494 entry: Fix spelling/typo errors in irq entry code
s/reguired/required/
s/Interupts/Interrupts/
s/quiescient/quiescent/
s/assemenbly/assembly/

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201104230157.3378023-1-ira.weiny@intel.com
2020-11-15 23:54:00 +01:00
Thomas Gleixner b6be002bcd x86/entry: Move nmi entry/exit into common code
Lockdep state handling on NMI enter and exit is nothing specific to X86. It's
not any different on other architectures. Also the extra state type is not
necessary, irqentry_state_t can carry the necessary information as well.

Move it to common code and extend irqentry_state_t to carry lockdep state.

[ Ira: Make exit_rcu and lockdep a union as they are mutually exclusive
  between the IRQ and NMI exceptions, and add kernel documentation for
  struct irqentry_state_t ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201102205320.1458656-7-ira.weiny@intel.com
2020-11-04 22:55:36 +01:00
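
The resulting state type can be sketched as below, per the union note above:

    typedef struct irqentry_state {
            union {
                    bool    exit_rcu;       /* used by IRQ exceptions */
                    bool    lockdep;        /* used by NMI exceptions */
            };
    } irqentry_state_t;
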
Thomas Gleixner 01be83eea0 Merge branch 'core/urgent' into core/entry
Pick up the entry fix before further modifications.
2020-11-04 18:14:52 +01:00
Thomas Gleixner 9d820f68b2 entry: Fix the incorrect ordering of lockdep and RCU check
When an exception/interrupt hits kernel space and the kernel is not
currently in the idle task then RCU must be watching.

irqentry_enter() validates this via rcu_irq_enter_check_tick(), which in
turn invokes lockdep when taking a lock. But at that point lockdep does not
yet know about the fact that interrupts have been disabled by the CPU,
which triggers a lockdep splat complaining about inconsistent state.

Invoking trace_hardirqs_off() before rcu_irq_enter_check_tick() defeats the
point of rcu_irq_enter_check_tick() because trace_hardirqs_off() uses RCU.

So use the same sequence as for the idle case and tell lockdep about the
irq state change first, invoke the RCU check and then do the lockdep and
tracer update.

Fixes: a5497bab5f ("entry: Provide generic interrupt entry/exit code")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87y2jhl19s.fsf@nanos.tec.linutronix.de
2020-11-04 18:06:14 +01:00
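
The corrected sequence for the kernel-mode, RCU-watching path can be sketched as:

    /* tell lockdep about the CPU-disabled irq state first ... */
    lockdep_hardirqs_off(CALLER_ADDR0);
    /* ... so the RCU check may safely take locks ... */
    rcu_irq_enter_check_tick();
    /* ... and only then run the tracer update, which uses RCU */
    trace_hardirqs_off_finish();
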
Ira Weiny 45ff510517 entry: Fixup irqentry_enter() comment
irq_enter_from_user_mode() was changed to irqentry_enter_from_user_mode().
Update the comment within irqentry_enter() to reflect this change.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201028163632.965518-1-ira.weiny@intel.com
2020-10-29 11:31:29 +01:00
Jens Axboe 12db8b6900 entry: Add support for TIF_NOTIFY_SIGNAL
Add TIF_NOTIFY_SIGNAL handling in the generic entry code: when the flag is
set, signal_pending() returns true inside a wait loop. That causes an
exit of the loop so that notify_signal tracehooks can be run. If the wait
loop is currently inside a system call, the system call is restarted once
task_work has been processed.

In preparation for only having arch_do_signal() handle syscall restarts if
_TIF_SIGPENDING isn't set, rename it to arch_do_signal_or_restart().  Pass
in a boolean that tells the architecture specific signal handler if it
should attempt to get a signal, or just process a potential syscall
restart.

For !CONFIG_GENERIC_ENTRY archs, add the TIF_NOTIFY_SIGNAL handling to
get_signal(). This is done to minimize the needed architecture changes to
support this feature.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20201026203230.386348-3-axboe@kernel.dk
2020-10-29 09:37:36 +01:00
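
The wait-loop behaviour can be sketched as follows (done_condition is a hypothetical predicate):

    for (;;) {
            set_current_state(TASK_INTERRUPTIBLE);
            if (done_condition)
                    break;
            if (signal_pending(current))    /* also true for TIF_NOTIFY_SIGNAL */
                    break;                  /* unroll so task_work can run */
            schedule();
    }
    __set_current_state(TASK_RUNNING);
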
Linus Torvalds 4a22709e21 arch-cleanup-2020-10-22

Merge tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block

Pull arch task_work cleanups from Jens Axboe:
 "Two cleanups that don't fit other categories:

   - Finally get the task_work_add() cleanup done properly, so we don't
     have random 0/1/false/true/TWA_SIGNAL confusing use cases. Updates
     all callers, and also fixes up the documentation for
     task_work_add().

   - While working on some TIF related changes for 5.11, this
     TIF_NOTIFY_RESUME cleanup fell out of that. Remove some arch
     duplication for how that is handled"

* tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block:
  task_work: cleanup notification modes
  tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()
2020-10-23 10:06:38 -07:00
Linus Torvalds 41eea65e2a Merge tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:

 - Debugging for smp_call_function()

 - RT raw/non-raw lock ordering fixes

 - Strict grace periods for KASAN

 - New smp_call_function() torture test

 - Torture-test updates

 - Documentation updates

 - Miscellaneous fixes

[ This doesn't actually pull the tag - I've dropped the last merge from
  the RCU branch due to questions about the series.   - Linus ]

* tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  smp: Make symbol 'csd_bug_count' static
  kernel/smp: Provide CSD lock timeout diagnostics
  smp: Add source and destination CPUs to __call_single_data
  rcu: Shrink each possible cpu krcp
  rcu/segcblist: Prevent useless GP start if no CBs to accelerate
  torture: Add gdb support
  rcutorture: Allow pointer leaks to test diagnostic code
  rcutorture: Hoist OOM registry up one level
  refperf: Avoid null pointer dereference when buf fails to allocate
  rcutorture: Properly synchronize with OOM notifier
  rcutorture: Properly set rcu_fwds for OOM handling
  torture: Add kvm.sh --help and update help message
  rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
  torture: Update initrd documentation
  rcutorture: Replace HTTP links with HTTPS ones
  locktorture: Make function torture_percpu_rwsem_init() static
  torture: document --allcpus argument added to the kvm.sh script
  rcutorture: Output number of elapsed grace periods
  rcutorture: Remove KCSAN stubs
  rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
  ...
2020-10-18 14:34:50 -07:00
Jens Axboe 3c532798ec tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()
All the callers currently do this, clean it up and move the clearing
into tracehook_notify_resume() instead.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-17 15:04:36 -06:00
Linus Torvalds f94ab23113 * Misc minor cleanups.

Merge tag 'x86_cleanups_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Borislav Petkov:
 "Misc minor cleanups"

* tag 'x86_cleanups_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry: Fix typo in comments for syscall_enter_from_user_mode()
  x86/resctrl: Fix spelling in user-visible warning messages
  x86/entry/64: Do not include inst.h in calling.h
  x86/mpparse: Remove duplicate io_apic.h include
2020-10-12 10:51:02 -07:00
Ingo Molnar b36c830f8c Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull v5.10 RCU changes from Paul E. McKenney:

- Debugging for smp_call_function().

- Strict grace periods for KASAN.  The point of this series is to find
  RCU-usage bugs, so the corresponding new RCU_STRICT_GRACE_PERIOD
  Kconfig option depends on both DEBUG_KERNEL and RCU_EXPERT, and is
  further disabled by default.  Finally, the help text includes
  a goodly list of scary caveats.

- New smp_call_function() torture test.

- Torture-test updates.

- Documentation updates.

- Miscellaneous fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-10-09 08:21:56 +02:00
Kees Cook 900ffe39fe x86/entry: Fix typo in comments for syscall_enter_from_user_mode()
Just to help myself and others with finding the correct function names,
fix a typo for "usermode" vs "user_mode".

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200919080936.259819-1-keescook@chromium.org
2020-09-22 18:24:46 +02:00