Commit Graph

61 Commits

Author SHA1 Message Date
Baoquan He 308e9a3386 Document/kexec: generalize crash hotplug description
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: In Documentation/ABI/testing/sysfs-devices-system-cpu, there
           is a conflict because of context fuzz.

commit c91c6062d6cd1bc366efb04973ee449c30398a49
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date:   Mon Aug 12 09:46:51 2024 +0530

    Document/kexec: generalize crash hotplug description

    Commit 79365026f869 ("crash: add a new kexec flag for hotplug support")
    generalizes the crash hotplug support to allow architectures to update
    multiple kexec segments on CPU/Memory hotplug and not just elfcorehdr.
    Therefore, update the relevant kernel documentation to reflect the same.

    No functional change.

    Link: https://lkml.kernel.org/r/20240812041651.703156-1-sourabhjain@linux.ibm.com
    Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Reviewed-by: Petr Tesarik <ptesarik@suse.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Hari Bathini <hbathini@linux.ibm.com>
    Cc: Petr Tesarik <petr@tesarici.cz>
    Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:36 +08:00
Baoquan He ca346af63f crash: add prefix for crash dumping messages
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 4707c13de3e42a47f0d99fe5fb58fa9dd23b455e
Author: Baoquan He <bhe@redhat.com>
Date:   Thu Apr 18 11:58:43 2024 +0800

    crash: add prefix for crash dumping messages

    Add pr_fmt() to kernel/crash_core.c so that debugging messages are
    printed with the module name as a prefix.

    Also add the prefix 'crashkernel:' to two message printing lines in
    kernel/crash_reserve.c. Almost all debugging messages in that file
    already carry the 'crashkernel:' prefix or contain the keyword
    crashkernel at the beginning or in the middle, so adding pr_fmt()
    there would be redundant.

    Link: https://lkml.kernel.org/r/20240418035843.1562887-1-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Jiri Slaby <jirislaby@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:36 +08:00
Baoquan He 930e56cdd6 crash: add a new kexec flag for hotplug support
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 79365026f86948b52c3cb7bf099dded92c559b4c
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date:   Tue Mar 26 11:24:09 2024 +0530

    crash: add a new kexec flag for hotplug support

    Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
    introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
    this flag to indicate to the kernel that it is safe to modify the
    elfcorehdr of the kdump image loaded using the kexec_load system call.

    However, architectures may need to update kexec segments other than
    elfcorehdr, for example the FDT (Flattened Device Tree) on PowerPC.
    Introducing a new kexec flag for every new kexec segment may not be a
    good solution. Hence, a generic kexec flag bit,
    `KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
    hotplug support intent between the kexec tool and the kernel for the
    kexec_load system call.

    Now we have two kexec flags that enable crash hotplug support for the
    kexec_load system call. The first is KEXEC_UPDATE_ELFCOREHDR (only used
    on x86), and the second is KEXEC_CRASH_HOTPLUG_SUPPORT (for all
    architectures).

    To simplify the process of finding and reporting the crash hotplug
    support, the following changes are introduced:

    1. Define an arch-specific function to process the kexec flags and
       determine crash hotplug support

    2. Rename the @update_elfcorehdr member of struct kimage to
       @hotplug_support and populate it for both kexec_load and
       kexec_file_load syscalls, because an architecture can update more
       than one kexec segment

    3. Let the generic function crash_check_hotplug_support report hotplug
       support for the loaded kdump image based on the value of
       @hotplug_support

    To bring the x86 crash hotplug support in line with the above points,
    the following changes have been made:

    - Introduce the arch_crash_hotplug_support function to process kexec
      flags and determine crash hotplug support

    - Remove the arch_crash_hotplug_[cpu|memory]_support functions
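
The flag handling described above can be sketched as a small Python model. This is only an illustration, not the kernel's C code: the numeric flag values and the shortcut for images loaded through kexec_file_load (`file_mode`) are assumptions made for the sketch, and only the two flag names come from the commit message.

```python
# Assumed bit values for illustration; check the kernel UAPI header for the real ones.
KEXEC_UPDATE_ELFCOREHDR = 0x4
KEXEC_CRASH_HOTPLUG_SUPPORT = 0x8


def arch_crash_hotplug_support(file_mode: bool, kexec_flags: int) -> bool:
    """Model of an x86-style check: does the loaded kdump image allow hotplug updates?"""
    # Assumption: images loaded via kexec_file_load are built by the kernel
    # itself, so the kernel may always update their segments.
    if file_mode:
        return True
    # For kexec_load, either the legacy or the generic flag signals support.
    wanted = KEXEC_UPDATE_ELFCOREHDR | KEXEC_CRASH_HOTPLUG_SUPPORT
    return bool(kexec_flags & wanted)


def crash_check_hotplug_support(hotplug_support: bool) -> int:
    """Model of the generic reporter: expose the stored member as 0 or 1."""
    return 1 if hotplug_support else 0
```

In this model the result of `arch_crash_hotplug_support()` is what would be stored in the `hotplug_support` member at load time, and the generic reporter only reads it back.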

    Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://msgid.link/20240326055413.186534-3-sourabhjain@linux.ibm.com

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:36 +08:00
Baoquan He 009e42dc91 crash: forward memory_notify arg to arch crash hotplug handler
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 118005713e35a1893c6ee47ab2926cca277737de
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date:   Tue Mar 26 11:24:08 2024 +0530

    crash: forward memory_notify arg to arch crash hotplug handler

    In the event of memory hotplug or online/offline events, the crash
    memory hotplug notifier `crash_memhp_notifier()` receives a
    `memory_notify` object but doesn't forward that object to the
    generic and architecture-specific crash hotplug handler.

    The `memory_notify` object contains the starting PFN (Page Frame Number)
    and the number of pages in the hot-removed memory. This information is
    necessary for architectures like PowerPC to update/recreate the kdump
    image, specifically `elfcorehdr`.

    So update the function signature of `crash_handle_hotplug_event()` and
    `arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
    as an argument from crash memory hotplug notifier.

    Since no such object is available in the case of CPU hotplug event, the
    crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
    crash hotplug handler.
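
The call chain described above can be modelled roughly as follows. This is a Python sketch, not the kernel code: the function names mirror those in the commit message, but their signatures are simplified, the `MemoryNotify` class carries only the two fields the message mentions, and the returned strings are purely illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryNotify:
    """Simplified stand-in for the kernel's memory_notify object."""
    start_pfn: int   # first page frame number of the affected range
    nr_pages: int    # number of pages in the affected range


def arch_crash_handle_hotplug_event(arg: Optional[MemoryNotify]) -> str:
    """Arch handler: uses the memory range when one is supplied."""
    if arg is None:
        return "cpu hotplug: no memory range to account for"
    end_pfn = arg.start_pfn + arg.nr_pages - 1
    return f"memory hotplug: PFNs {arg.start_pfn:#x}-{end_pfn:#x}"


def crash_handle_hotplug_event(arg: Optional[MemoryNotify]) -> str:
    """Generic handler: forwards the argument unchanged to the arch handler."""
    return arch_crash_handle_hotplug_event(arg)


def crash_memhp_notifier(mhp: MemoryNotify) -> str:
    """Memory notifier: now passes its memory_notify object along."""
    return crash_handle_hotplug_event(mhp)


def crash_cpuhp_online(cpu: int) -> str:
    """CPU notifier: has no memory range, so it passes None."""
    return crash_handle_hotplug_event(None)
```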

    Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://msgid.link/20240326055413.186534-2-sourabhjain@linux.ibm.com

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:36 +08:00
Baoquan He 6ddb054bd6 crash: split crash dumping code out from kexec_core.c
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: There's a conflict in the last hunk of include/linux/kexec.h
           because of the fuzz caused by earlier backported commits
           related to commit f4af41bf177a ("kexec: fix the unexpected
           kexec_dprintk() macro").

commit 02aff8480533817a29e820729360866441d7403d
Author: Baoquan He <bhe@redhat.com>
Date:   Wed Jan 24 13:12:44 2024 +0800

    crash: split crash dumping code out from kexec_core.c

    Currently, KEXEC_CORE selects CRASH_CORE automatically because crash code
    needs to be built in to avoid compile errors when building kexec code,
    even though the crash dumping functionality is not enabled. E.g.:
    --------------------
    CONFIG_CRASH_CORE=y
    CONFIG_KEXEC_CORE=y
    CONFIG_KEXEC=y
    CONFIG_KEXEC_FILE=y
    ---------------------

    After splitting out the crashkernel reservation code and the vmcoreinfo
    exporting code, only crash related code is left in kernel/crash_core.c.
    Now move the crash related code from kexec_core.c to crash_core.c and
    only build it in when CONFIG_CRASH_DUMP=y.

    Also wrap the crash code inside CONFIG_CRASH_DUMP ifdeffery scope, or
    replace inappropriate CONFIG_KEXEC_CORE ifdefs with CONFIG_CRASH_DUMP
    ifdefs in generic kernel files.

    With these changes, the crash_core code is separated from the kexec
    code and can be disabled entirely if only the kexec reboot feature is
    wanted.

    Link: https://lkml.kernel.org/r/20240124051254.67105-5-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Hari Bathini <hbathini@linux.ibm.com>
    Cc: Pingfan Liu <piliu@redhat.com>
    Cc: Klara Modin <klarasmodin@gmail.com>
    Cc: Michael Kelley <mhklinux@outlook.com>
    Cc: Nathan Chancellor <nathan@kernel.org>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Yang Li <yang.lee@linux.alibaba.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:35 +08:00
Baoquan He bb39202cd8 crash: split vmcoreinfo exporting code out from crash_core.c
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: There are conflicts in kernel/crash_core.c because the dependency
           commit 55c49fee57af ("mm/vmalloc: remove vmap_area_list") has not
           been backported yet, and because commit d99e3140a4d3 ("mm: turn
           folio_test_hugetlb into a PageType"), which is later than this
           commit upstream, has already been merged into RHEL.

commit 443cbaf9e2fdbef7d7cae457434a6cb8a679441b
Author: Baoquan He <bhe@redhat.com>
Date:   Wed Jan 24 13:12:42 2024 +0800

    crash: split vmcoreinfo exporting code out from crash_core.c

    Now move the relevant code into separate files:
    kernel/vmcore_info.c, include/linux/vmcore_info.h.

    And add the config item VMCORE_INFO to control its enabling.

    Also update the old CONFIG_CRASH_CORE ifdeffery, the includes of
    <linux/crash_core.h>, and the config item dependencies on CRASH_CORE
    accordingly.

    Also rename files as follows:
     - arch/xxx/kernel/{crash_core.c => vmcore_info.c}
    because they are only related to vmcoreinfo exporting on x86, arm64,
    riscv.

    Also remove the config item CRASH_CORE, and rely on CONFIG_KEXEC_CORE to
    decide whether to build in crash_core.c.

    [yang.lee@linux.alibaba.com: remove duplicated include in vmcore_info.c]
      Link: https://lkml.kernel.org/r/20240126005744.16561-1-yang.lee@linux.alibaba.com
    Link: https://lkml.kernel.org/r/20240124051254.67105-3-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Pingfan Liu <piliu@redhat.com>
    Cc: Klara Modin <klarasmodin@gmail.com>
    Cc: Michael Kelley <mhklinux@outlook.com>
    Cc: Nathan Chancellor <nathan@kernel.org>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Yang Li <yang.lee@linux.alibaba.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:35 +08:00
Baoquan He bd724f6470 kexec: split crashkernel reservation code out from crash_core.c
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: There's a conflict in arch/powerpc/mm/nohash/kaslr_booke.c;
           the 1st hunk needs to be edited manually because of fuzz.
           Changes related to risc-v are discarded.

commit 85fcde402db191b5f222ebfecda653777d7d084e
Author: Baoquan He <bhe@redhat.com>
Date:   Wed Jan 24 13:12:41 2024 +0800

    kexec: split crashkernel reservation code out from crash_core.c

    Patch series "Split crash out from kexec and clean up related config
    items", v3.

    Motivation:
    =============
    Previously, LKP reported a build error. On investigation, it can't
    be resolved reasonably with the present messy kdump config items.

     https://lore.kernel.org/oe-kbuild-all/202312182200.Ka7MzifQ-lkp@intel.com/

    The kdump (crash dumping) related config items can cause confusion:

    Firstly,

    CRASH_CORE enables codes including
     - crashkernel reservation;
     - elfcorehdr updating;
     - vmcoreinfo exporting;
     - crash hotplug handling;

    Now fadump of powerpc, kcore dynamic debugging and kdump all select
    CRASH_CORE, while:
     - fadump needs crashkernel parsing, vmcoreinfo exporting, and accessing
       global variable 'elfcorehdr_addr';
     - kcore only needs vmcoreinfo exporting;
     - kdump needs all of the current kernel/crash_core.c.

    So enabling only PROC_KCORE or FA_DUMP will enable CRASH_CORE, which
    misleads people into thinking that crash dumping is enabled when it
    actually is not.

    Secondly,

    It's not reasonable to allow KEXEC_CORE select CRASH_CORE.

    Because KEXEC_CORE enables code that allocates control pages, copies
    kexec/kdump segments, and prepares for switching. This code is
    shared by both kexec reboot and kdump. We may want kexec reboot
    but not kdump. In that case, CRASH_CORE should not be selected.

     --------------------
     CONFIG_CRASH_CORE=y
     CONFIG_KEXEC_CORE=y
     CONFIG_KEXEC=y
     CONFIG_KEXEC_FILE=y
     ---------------------

    Thirdly,

    It's not reasonable to allow CRASH_DUMP select KEXEC_CORE.

    That could allow KEXEC_CORE and CRASH_DUMP to be enabled independently
    of KEXEC or KEXEC_FILE. However, without KEXEC or KEXEC_FILE, the
    built-in KEXEC_CORE code doesn't make any sense because no kernel
    loading or switching will happen to utilize it.
     ---------------------
     CONFIG_CRASH_CORE=y
     CONFIG_KEXEC_CORE=y
     CONFIG_CRASH_DUMP=y
     ---------------------

    In this case, what is worse, on arch sh and arm, KEXEC relies on MMU,
    while CRASH_DUMP can still be enabled when !MMU; a compile error is then
    seen, as the lkp test robot reported in the above link.

     ------arch/sh/Kconfig------
     config ARCH_SUPPORTS_KEXEC
             def_bool MMU

     config ARCH_SUPPORTS_CRASH_DUMP
             def_bool BROKEN_ON_SMP
     ---------------------------

    Changes:
    ===========
    1, split out crash_reserve.c from crash_core.c;
    2, split out vmcore_info.c from crash_core.c;
    3, move crash related codes in kexec_core.c into crash_core.c;
    4, remove dependency of FA_DUMP on CRASH_DUMP;
    5, clean up kdump related config items;
    6, wrap up crash codes in crash related ifdefs on all 8 arch-es
       which support crash dumping, except ppc;

    Achievement:
    ===========
    With above changes, I can rearrange the config item logic as below (the right
    item depends on or is selected by the left item):

        PROC_KCORE -----------> VMCORE_INFO

                   |----------> VMCORE_INFO
        FA_DUMP----|
                   |----------> CRASH_RESERVE

                                                        ---->VMCORE_INFO
                                                       /
                                                       |---->CRASH_RESERVE
        KEXEC      --|                                /|
                     |--> KEXEC_CORE--> CRASH_DUMP-->/-|---->PROC_VMCORE
        KEXEC_FILE --|                               \ |
                                                       \---->CRASH_HOTPLUG

        KEXEC      --|
                     |--> KEXEC_CORE (for kexec reboot only)
        KEXEC_FILE --|
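
The arrows in the diagram above can be read as a small dependency graph. The following Python sketch encodes only those arrows; which arrow is a `select` and which is a `depends on` is an assumption of this sketch (the diagram does not distinguish them), and `resolve()` is a hypothetical helper, not how Kconfig itself evaluates options.

```python
# Options pulled in automatically once the key option is enabled (assumed "select").
SELECTS = {
    "KEXEC": ["KEXEC_CORE"],
    "KEXEC_FILE": ["KEXEC_CORE"],
    "CRASH_DUMP": ["VMCORE_INFO", "CRASH_RESERVE"],
    "PROC_KCORE": ["VMCORE_INFO"],
    "FA_DUMP": ["VMCORE_INFO", "CRASH_RESERVE"],
}

# Options that can only be enabled when their prerequisites are on (assumed "depends on").
DEPENDS = {
    "CRASH_DUMP": ["KEXEC_CORE"],
    "PROC_VMCORE": ["CRASH_DUMP"],
    "CRASH_HOTPLUG": ["CRASH_DUMP"],
}


def resolve(requested):
    """Return the set of options that end up enabled for the requested ones."""
    enabled = set()
    changed = True
    while changed:
        changed = False
        # Enable requested options whose prerequisites are already satisfied.
        for opt in requested:
            if opt not in enabled and all(d in enabled for d in DEPENDS.get(opt, [])):
                enabled.add(opt)
                changed = True
        # Propagate selections from everything enabled so far.
        for opt in list(enabled):
            for selected in SELECTS.get(opt, []):
                if selected not in enabled:
                    enabled.add(selected)
                    changed = True
    return enabled
```

Under these assumptions, requesting only `KEXEC_FILE` yields the kexec-reboot-only configuration, and requesting `CRASH_DUMP` without `KEXEC` or `KEXEC_FILE` enables nothing.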

    Test
    ========
    On all 8 architectures, including x86_64, arm64, s390x, sh, arm, mips,
    riscv, loongarch, I tried the three config item settings below and
    all builds passed. Take the configs on x86_64 as an example:

    (1) Both CONFIG_KEXEC and KEXEC_FILE are unset, then all kexec/kdump
    items are unset automatically:
    # Kexec and crash features
    # CONFIG_KEXEC is not set
    # CONFIG_KEXEC_FILE is not set
    # end of Kexec and crash features

    (2) set CONFIG_KEXEC_FILE and 'make olddefconfig':
    ---------------
    # Kexec and crash features
    CONFIG_CRASH_RESERVE=y
    CONFIG_VMCORE_INFO=y
    CONFIG_KEXEC_CORE=y
    CONFIG_KEXEC_FILE=y
    CONFIG_CRASH_DUMP=y
    CONFIG_CRASH_HOTPLUG=y
    CONFIG_CRASH_MAX_MEMORY_RANGES=8192
    # end of Kexec and crash features
    ---------------

    (3) unset CONFIG_CRASH_DUMP in case 2 and execute 'make olddefconfig':
    ------------------------
    # Kexec and crash features
    CONFIG_KEXEC_CORE=y
    CONFIG_KEXEC_FILE=y
    # end of Kexec and crash features
    ------------------------

    Note:
    For ppc, it needs investigation to make clear how to split out the crash
    code in the arch folder. Hopefully Hari and Pingfan can have a look and
    see if it's doable. For now, ppc either has both kexec and crash
    enabled, or has both of them disabled.

    This patch (of 14):

    Both kdump and fa_dump of ppc rely on crashkernel reservation.  Move the
    relevant code into separate files: crash_reserve.c,
    include/linux/crash_reserve.h.

    Also add the config item CRASH_RESERVE to control enabling of that
    code, and update the config items related to crashkernel
    reservation.

    Also change the ifdeffery from CONFIG_CRASH_CORE to CONFIG_CRASH_RESERVE
    where those scopes are only crashkernel reservation related.

    Also rename arch/XXX/include/asm/{crash_core.h => crash_reserve.h} on
    arm64, x86 and risc-v because those architectures' crash_core.h is only
    related to crashkernel reservation.

    [akpm@linux-foundation.org: s/CRASH_RESEERVE/CRASH_RESERVE/, per Klara Modin]
    Link: https://lkml.kernel.org/r/20240124051254.67105-1-bhe@redhat.com
    Link: https://lkml.kernel.org/r/20240124051254.67105-2-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Pingfan Liu <piliu@redhat.com>
    Cc: Klara Modin <klarasmodin@gmail.com>
    Cc: Michael Kelley <mhklinux@outlook.com>
    Cc: Nathan Chancellor <nathan@kernel.org>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Yang Li <yang.lee@linux.alibaba.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:35 +08:00
Baoquan He 0f4b8d6066 kernel/crash_core.c: make __crash_hotplug_lock static
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 4e87ff59cebb53d3ce1333245e64ab7d51ebf118
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Tue Jan 9 21:35:21 2024 -0800

    kernel/crash_core.c: make __crash_hotplug_lock static

    sparse warnings:
    kernel/crash_core.c:749:1: sparse: sparse: symbol '__crash_hotplug_lock' was not declared. Should it be static?

    Fixes: e2a8f20dd8e9 ("Crash: add lock to serialize crash hotplug handling")
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202401080654.IjjU5oK7-lkp@intel.com/
    Cc: Baoquan He <bhe@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:34 +08:00
Baoquan He 2ede072ee3 kdump: defer the insertion of crashkernel resources
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 4a693ce65b186fddc1a73621bd6f941e6e3eca21
Author: Huacai Chen <chenhuacai@kernel.org>
Date:   Fri Dec 29 16:02:13 2023 +0800

    kdump: defer the insertion of crashkernel resources

    In /proc/iomem, sub-regions should be inserted after their parent,
    otherwise the insertion of the parent resource fails.  But after generic
    crashkernel reservation was applied, in both RISC-V and ARM64 (LoongArch
    will also use generic reservation later on), crashkernel resources are
    inserted before their parent, which causes the parent to disappear in
    /proc/iomem.  So we defer the insertion of crashkernel resources to an
    early_initcall().

    1, Without 'crashkernel' parameter:

     100d0100-100d01ff : LOON0001:00
       100d0100-100d01ff : LOON0001:00 LOON0001:00
     100e0000-100e0bff : LOON0002:00
       100e0000-100e0bff : LOON0002:00 LOON0002:00
     1fe001e0-1fe001e7 : serial
     90400000-fa17ffff : System RAM
       f6220000-f622ffff : Reserved
       f9ee0000-f9ee3fff : Reserved
       fa120000-fa17ffff : Reserved
     fa190000-fe0bffff : System RAM
       fa190000-fa1bffff : Reserved
     fe4e0000-47fffffff : System RAM
       43c000000-441ffffff : Reserved
       47ff98000-47ffa3fff : Reserved
       47ffa4000-47ffa7fff : Reserved
       47ffa8000-47ffabfff : Reserved
       47ffac000-47ffaffff : Reserved
       47ffb0000-47ffb3fff : Reserved

    2, With 'crashkernel' parameter, before this patch:

     100d0100-100d01ff : LOON0001:00
       100d0100-100d01ff : LOON0001:00 LOON0001:00
     100e0000-100e0bff : LOON0002:00
       100e0000-100e0bff : LOON0002:00 LOON0002:00
     1fe001e0-1fe001e7 : serial
     e6200000-f61fffff : Crash kernel
     fa190000-fe0bffff : System RAM
       fa190000-fa1bffff : Reserved
     fe4e0000-47fffffff : System RAM
       43c000000-441ffffff : Reserved
       47ff98000-47ffa3fff : Reserved
       47ffa4000-47ffa7fff : Reserved
       47ffa8000-47ffabfff : Reserved
       47ffac000-47ffaffff : Reserved
       47ffb0000-47ffb3fff : Reserved

    3, With 'crashkernel' parameter, after this patch:

     100d0100-100d01ff : LOON0001:00
       100d0100-100d01ff : LOON0001:00 LOON0001:00
     100e0000-100e0bff : LOON0002:00
       100e0000-100e0bff : LOON0002:00 LOON0002:00
     1fe001e0-1fe001e7 : serial
     90400000-fa17ffff : System RAM
       e6200000-f61fffff : Crash kernel
       f6220000-f622ffff : Reserved
       f9ee0000-f9ee3fff : Reserved
       fa120000-fa17ffff : Reserved
     fa190000-fe0bffff : System RAM
       fa190000-fa1bffff : Reserved
     fe4e0000-47fffffff : System RAM
       43c000000-441ffffff : Reserved
       47ff98000-47ffa3fff : Reserved
       47ffa4000-47ffa7fff : Reserved
       47ffa8000-47ffabfff : Reserved
       47ffac000-47ffaffff : Reserved
       47ffb0000-47ffb3fff : Reserved
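
The ordering problem shown in the three listings can be reproduced with a toy resource tree. This Python sketch is a simplified model, not the kernel's resource code: `request()` and `insert()` are hypothetical helpers, the rule "a new entry fails if it overlaps an existing sibling" is an assumption standing in for the real conflict check, and the two address ranges are taken from the listings above.

```python
class Resource:
    def __init__(self, name, start, end):
        self.name, self.start, self.end = name, start, end
        self.children = []

    def overlaps(self, other):
        return self.start <= other.end and other.start <= self.end

    def contains(self, other):
        return self.start <= other.start and other.end <= self.end


def request(parent, new):
    """Add `new` directly under `parent`; fail if it overlaps any sibling."""
    if not parent.contains(new):
        return False
    if any(child.overlaps(new) for child in parent.children):
        return False
    parent.children.append(new)
    return True


def insert(root, new):
    """Descend to the deepest resource that fully contains `new`, then add it."""
    node = root
    while True:
        holder = next((c for c in node.children if c.contains(new)), None)
        if holder is None:
            return request(node, new)
        node = holder


def build_iomem(defer_crashkernel):
    """Build a tiny tree with one System RAM range and one Crash kernel range."""
    root = Resource("iomem", 0x0, 0x47fffffff)
    ram = Resource("System RAM", 0x90400000, 0xfa17ffff)
    crash = Resource("Crash kernel", 0xe6200000, 0xf61fffff)
    if not defer_crashkernel:
        insert(root, crash)      # child goes in first and lands at the top level
    request(root, ram)           # fails if the child is already in the way
    if defer_crashkernel:
        insert(root, crash)      # child goes in after its parent exists
    return root
```

With `defer_crashkernel=False` the System RAM entry is lost, as in listing 2; with `defer_crashkernel=True` the Crash kernel entry nests under System RAM, as in listing 3.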

    Link: https://lkml.kernel.org/r/20231229080213.2622204-1-chenhuacai@loongson.cn
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Fixes: 0ab97169aa05 ("crash_core: add generic function to do reservation")
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: <stable@vger.kernel.org>    [6.6+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:34 +08:00
Baoquan He 25d93b509c crash_core: fix and simplify the logic of crash_exclude_mem_range()
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 6dff315972640bfe542e2d044933751afd8e6c4a
Author: Yuntao Wang <ytcoode@gmail.com>
Date:   Tue Jan 2 22:49:05 2024 +0800

    crash_core: fix and simplify the logic of crash_exclude_mem_range()

    The purpose of crash_exclude_mem_range() is to remove all memory ranges
    that overlap with [mstart-mend].  However, the current logic only removes
    the first overlapping memory range.

    Commit a2e9a95d21 ("kexec: Improve & fix crash_exclude_mem_range() to
    handle overlapping ranges") attempted to address this issue, but it did
    not fix all error cases.

    Let's fix and simplify the logic of crash_exclude_mem_range().
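
The intended behaviour, removing every part of every range that overlaps [mstart-mend], can be sketched in Python. This is an independent illustration of the goal stated above, not a transcription of the kernel function; `exclude_range` is a hypothetical name, ranges are inclusive `(start, end)` tuples, and unlike the kernel's fixed-size array this version simply builds a new list.

```python
def exclude_range(ranges, mstart, mend):
    """Return `ranges` with the inclusive interval [mstart, mend] cut out."""
    result = []
    for start, end in ranges:
        # No overlap: keep the range untouched.
        if mend < start or mstart > end:
            result.append((start, end))
            continue
        # Keep whatever lies below the excluded interval.
        if start < mstart:
            result.append((start, mstart - 1))
        # Keep whatever lies above the excluded interval.
        if end > mend:
            result.append((mend + 1, end))
    return result
```

A range fully covered by the excluded interval disappears, a range that straddles one edge is trimmed, and a range that contains the whole interval is split in two.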

    Link: https://lkml.kernel.org/r/20240102144905.110047-4-ytcoode@gmail.com
    Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Hari Bathini <hbathini@linux.ibm.com>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
    Cc: Takashi Iwai <tiwai@suse.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:34 +08:00
Baoquan He e1915ce210 crash_core: fix the check for whether crashkernel is from high memory
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 1dd11e977360ad3493812da0b05ffd9adcdd15a1
Author: Yuntao Wang <ytcoode@gmail.com>
Date:   Sat Dec 9 22:14:38 2023 +0800

    crash_core: fix the check for whether crashkernel is from high memory

    If crash_base is equal to CRASH_ADDR_LOW_MAX, it also indicates that
    the crashkernel memory is allocated from high memory. However, the
    current check only considers the case where crash_base is greater than
    CRASH_ADDR_LOW_MAX. Fix it.

    The runtime effect is that crashkernel high memory is successfully
    reserved, whereas the crashkernel low memory is bypassed in this case,
    then kdump kernel bootup will fail because of no low memory under 4G.
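
The boundary case can be shown with a short Python model. The function names are hypothetical, the 4G value used for the low-memory limit is an assumption based on the "no low memory under 4G" wording above, and the real limit is the arch-defined CRASH_ADDR_LOW_MAX.

```python
LOW_MAX = 4 << 30  # assumed stand-in for CRASH_ADDR_LOW_MAX (4 GiB)


def needs_low_reservation_old(crash_base, low_size, low_max=LOW_MAX):
    """Check before the fix: misses a base exactly at the limit."""
    return crash_base > low_max and low_size > 0


def needs_low_reservation(crash_base, low_size, low_max=LOW_MAX):
    """Check after the fix: a base at or above the limit is high memory."""
    return crash_base >= low_max and low_size > 0
```

For `crash_base == LOW_MAX` the old check returns False, so the low reservation would be skipped even though the crashkernel region sits entirely in high memory.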

    This patch also includes some minor cleanups.

    Link: https://lkml.kernel.org/r/20231209141438.77233-1-ytcoode@gmail.com
    Fixes: 0ab97169aa05 ("crash_core: add generic function to do reservation")
    Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:34 +08:00
Baoquan He f44d72f378 crash_core: remove duplicated including of kexec.h
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 4459cd2e167e7208e57d517d16282408d9035dad
Author: Wang Jinchao <wangjinchao@xfusion.com>
Date:   Fri Dec 15 16:54:51 2023 +0800

    crash_core: remove duplicated including of kexec.h

    Remove second include of linux/kexec.h

    Link: https://lkml.kernel.org/r/202312151654+0800-wangjinchao@xfusion.com
    Signed-off-by: Wang Jinchao <wangjinchao@xfusion.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:33 +08:00
Baoquan He 02c5a23604 crash_core.c: remove unneeded functions
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit c37e56cac3d62c69f093904afbc58fc428484d14
Author: Baoquan He <bhe@redhat.com>
Date:   Thu Sep 14 11:31:42 2023 +0800

    crash_core.c: remove unneeded functions

    So far, nobody calls functions parse_crashkernel_high() and
    parse_crashkernel_low(), remove both of them.

    Link: https://lkml.kernel.org/r/20230914033142.676708-10-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chen Jiahao <chenjiahao16@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:33 +08:00
Baoquan He ba669292fa crash_core: move crashk_*res definition into crash_core.c
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: There's a conflict in kernel/kexec_core.c because of fuzz caused
           by the previously backported commit cbc2fe9d9cb2 ("kexec_file:
           add kexec_file flag to control debug printing").

commit b631b95dded5e7f007a3a79cbaf82ef50c1e2cf7
Author: Baoquan He <bhe@redhat.com>
Date:   Thu Sep 14 11:31:38 2023 +0800

    crash_core: move crashk_*res definition into crash_core.c

    Both crashk_res and crashk_low_res are used to mark the reserved
    crashkernel regions in iomem_resource tree.  And later the generic
    crashkernel reservation will be added into crash_core.c.  So move
    crashk_res and crashk_low_res definition into crash_core.c to avoid
    compiling error if CONFIG_CRASH_CORE=on while CONFIG_KEXEC_CORE is unset.

    Meanwhile include <asm/crash_core.h> in <linux/crash_core.h> if generic
    reservation is needed.  In that case, <asm/crash_core.h> needs to be
    added by the ARCH.  In asm/crash_core.h, the ARCH can provide its own
    macro definitions to override macros in <linux/crash_core.h> if needed.
    Wrap the include in CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
    ifdeffery scope to avoid compile errors in other ARCH-es which don't
    take the generic reservation way yet.

    Link: https://lkml.kernel.org/r/20230914033142.676708-6-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chen Jiahao <chenjiahao16@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:33 +08:00
Baoquan He 52d3c063ff crash_core: add generic function to do reservation
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 0ab97169aa0517079b22c2e64192906caa5dc6d5
Author: Baoquan He <bhe@redhat.com>
Date:   Thu Sep 14 11:31:37 2023 +0800

    crash_core: add generic function to do reservation

    Architectures like x86_64, arm64 and riscv have a vast virtual
    address space and usually have a huge amount of physical RAM.  Their
    crashkernel reservation doesn't have to be limited to under 4G RAM, but
    can be extended to the whole physical memory via crashkernel=,high
    support.

    Now add function reserve_crashkernel_generic() to reserve crashkernel
    memory if users specify any form of the kernel parameters, like
    crashkernel=xM[@offset] or crashkernel=,high|low.

    This is preparation to simplify the code of crashkernel=,high support
    in architectures.

    Link: https://lkml.kernel.org/r/20230914033142.676708-5-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chen Jiahao <chenjiahao16@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:33 +08:00
Baoquan He 977ee68f8e crash_core: change parse_crashkernel() to support crashkernel=,high|low parsing
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 70916e9c8d9f1a286c99727072b22e395097909f
Author: Baoquan He <bhe@redhat.com>
Date:   Thu Sep 14 11:31:36 2023 +0800

    crash_core: change parse_crashkernel() to support crashkernel=,high|low parsing

    Now parse_crashkernel() is a real entry point for all kinds of crashkernel
    parsing on any architecture.

    And wrap the crashkernel=,high|low handling inside
    CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION ifdeffery scope.
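
To make the parameter forms concrete, here is a rough Python sketch of parsing crashkernel=size[@offset] together with the ,high and ,low variants. It is a simplified illustration and not the kernel parser: the returned dictionary layout is hypothetical, only K/M/G suffixes are handled, the precedence of the plain form over ,high|low is an assumption of this sketch, and other syntaxes the kernel accepts are not supported.

```python
import re

PREFIX = "crashkernel="
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def memparse(text):
    """Parse a size such as '512M' into bytes (K, M and G suffixes only)."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def parse_crashkernel(cmdline):
    """Return size, base, low_size and the high flag found on a command line."""
    values = [t[len(PREFIX):] for t in cmdline.split() if t.startswith(PREFIX)]

    # Plain form: crashkernel=size[@offset]. Assumed to win over ,high|low.
    plain = [v for v in values if "," not in v]
    if plain:
        amount, _, offset = plain[-1].partition("@")
        base = memparse(offset) if offset else 0
        return {"size": memparse(amount), "base": base, "low_size": 0, "high": False}

    # Suffixed forms: crashkernel=size,high and crashkernel=size,low.
    result = {"size": 0, "base": 0, "low_size": 0, "high": False}
    for value in values:
        amount, _, suffix = value.partition(",")
        if suffix == "high":
            result["size"], result["high"] = memparse(amount), True
        elif suffix == "low":
            result["low_size"] = memparse(amount)
    return result
```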

    Link: https://lkml.kernel.org/r/20230914033142.676708-4-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chen Jiahao <chenjiahao16@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:33 +08:00
Baoquan He 48bac334a7 crash_core: change the prototype of function parse_crashkernel()
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit a9e1a3d84e4a0ea560ed4d84c28d06dbfdffed22
Author: Baoquan He <bhe@redhat.com>
Date:   Thu Sep 14 11:31:35 2023 +0800

    crash_core: change the prototype of function parse_crashkernel()

    Add two parameters 'low_size' and 'high' to function parse_crashkernel();
    crashkernel=,high|low parsing will be added later.  Make adjustments in
    all call sites of parse_crashkernel() in arch.

    Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chen Jiahao <chenjiahao16@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:33 +08:00
Baoquan He 619d3c4430 crash_core.c: remove unnecessary parameter of function
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit a6304272b03ece97346f16923453f7e36ec19a5a
Author: Baoquan He <bhe@redhat.com>
Date:   Thu Sep 14 11:31:34 2023 +0800

    crash_core.c: remove unnecessary parameter of function

    Patch series "kdump: use generic functions to simplify crashkernel
    reservation in arch", v3.

    In the current arm64, crashkernel=,high support has been finished after
    several rounds of posting and careful reviewing.  The code in arm64, which
    first parses the crashkernel kernel parameters and then reserves memory,
    can be a good example for other ARCHes to refer to.

    Whereas in x86_64, the code mixing crashkernel parameter parsing and
    memory reserving is twisted, and looks messy.  Refactoring the code to
    make it more readable and maintainable is necessary.

    Here, first abstract the crashkernel parameter parsing code into
    parse_crashkernel() so that it is able to parse crashkernel=,high|low.
    Then abstract the crashkernel memory reserving code into a generic
    function reserve_crashkernel_generic().  Finally, in each ARCH where
    crashkernel=,high support is needed, a simple arch_reserve_crashkernel()
    can be added to call the above two functions.  This can remove the
    duplicated implementation code in each ARCH, like arm64, x86_64 and riscv.

    crashkernel=512M,high
    crashkernel=512M,high crashkernel=256M,low
    crashkernel=512M,high crashkernel=0M,low
    crashkernel=0M,high crashkernel=256M,low
    crashkernel=512M
    crashkernel=512M@0x4f000000
    crashkernel=1G-4G:256M,4G-64G:320M,64G-:576M
    crashkernel=0M
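
The parameter forms listed above can be told apart with a small userspace sketch. This is not the kernel's parse_crashkernel(); the helper names `parse_size` and `classify_crashkernel` are hypothetical, and the range rule (pick the first range with start <= RAM < end) is an assumption based on the documented crashkernel= syntax.

```python
import re

# Size suffixes accepted by this sketch (the kernel's memparse() accepts more).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse '512M', '1G' or '0x4f000000' into a byte count."""
    m = re.fullmatch(r"(0[xX][0-9a-fA-F]+|\d+)([KMG]?)", text)
    if not m:
        raise ValueError(f"bad size: {text!r}")
    number, unit = m.groups()
    base = 16 if number.lower().startswith("0x") else 10
    return int(number, base) * _UNITS[unit]


def classify_crashkernel(value, system_ram):
    """Return (kind, size_in_bytes, offset_or_None) for one crashkernel= value."""
    if value.endswith(",high"):
        return ("high", parse_size(value[: -len(",high")]), None)
    if value.endswith(",low"):
        return ("low", parse_size(value[: -len(",low")]), None)

    offset = None
    if "@" in value:
        value, off = value.rsplit("@", 1)
        offset = parse_size(off)

    if ":" in value:
        # Range list: start-[end]:size[,start-[end]:size...]
        for entry in value.split(","):
            rng, size = entry.split(":")
            start, end = rng.split("-")
            lo = parse_size(start)
            hi = parse_size(end) if end else None  # open-ended last range
            if system_ram >= lo and (hi is None or system_ram < hi):
                return ("range", parse_size(size), offset)
        return ("range", 0, offset)  # no range matched: nothing reserved

    return ("simple", parse_size(value), offset)
```

For example, `classify_crashkernel("1G-4G:256M,4G-64G:320M,64G-:576M", 8 << 30)` selects the 320M entry, because 8G falls in the 4G-64G range.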

    This patch (of 9):

    In all call sites of __parse_crashkernel(), the parameter 'name' is
    hardcoded as "crashkernel=".  So remove the unnecessary parameter 'name',
    and add a local variable 'name' inside __parse_crashkernel() instead.

    Link: https://lkml.kernel.org/r/20230914033142.676708-1-bhe@redhat.com
    Link: https://lkml.kernel.org/r/20230914033142.676708-2-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chen Jiahao <chenjiahao16@huawei.com>
    Cc: Zhen Lei <thunder.leizhen@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:33 +08:00
Baoquan He 201a316c8d Crash: add lock to serialize crash hotplug handling
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit e2a8f20dd8e9df695f736e51cd9115ae55be92d1
Author: Baoquan He <bhe@redhat.com>
Date:   Tue Sep 26 20:09:05 2023 +0800

    Crash: add lock to serialize crash hotplug handling

    Eric reported that handling a crash hotplug event can easily fail when
    many memory hotplug events are notified in a short period.  The handlers
    fail because they cannot take __kexec_lock.

    =======
    [   78.714569] Fallback order for Node 0: 0
    [   78.714575] Built 1 zonelists, mobility grouping on.  Total pages: 1817886
    [   78.717133] Policy zone: Normal
    [   78.724423] crash hp: kexec_trylock() failed, elfcorehdr may be inaccurate
    [   78.727207] crash hp: kexec_trylock() failed, elfcorehdr may be inaccurate
    [   80.056643] PEFILE: Unsigned PE binary
    =======

    The memory hotplug events arrive in large numbers and in quick
    succession, while the handling of crash hotplug is relatively much
    slower.  So the atomic variable __kexec_lock and kexec_trylock() can't
    guarantee the serialization of crash hotplug handling.

    Here, add a new mutex lock __crash_hotplug_lock to serialize crash hotplug
    handling specifically.  This doesn't impact the usage of __kexec_lock.
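
The effect can be modeled in userspace with threads. This is only an analogy under stated assumptions: `kexec_lock` and `hotplug_lock` are Python stand-ins for __kexec_lock and __crash_hotplug_lock, the function name `run_handlers` is hypothetical, and the sleep stands in for the slow elfcorehdr update.

```python
import threading
import time


def run_handlers(n_events, serialize):
    """Fire n_events concurrent handlers; return how many completed an update."""
    kexec_lock = threading.Lock()    # stand-in for __kexec_lock (try-lock only)
    hotplug_lock = threading.Lock()  # stand-in for __crash_hotplug_lock (mutex)
    updated = []
    start = threading.Barrier(n_events)  # release all handlers together

    def handler(event_id):
        start.wait()
        if serialize:
            hotplug_lock.acquire()  # blocks until earlier handlers finish
        try:
            if not kexec_lock.acquire(blocking=False):
                return  # models "kexec_trylock() failed": update is skipped
            try:
                time.sleep(0.005)  # models the slow update
                updated.append(event_id)
            finally:
                kexec_lock.release()
        finally:
            if serialize:
                hotplug_lock.release()

    threads = [threading.Thread(target=handler, args=(i,)) for i in range(n_events)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(updated)
```

With `serialize=True` every event is handled, because handlers queue on the mutex before trying the kexec lock. With `serialize=False` the number of completed updates depends on timing, and events that lose the try-lock race are dropped.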

    Link: https://lkml.kernel.org/r/20230926120905.392903-1-bhe@redhat.com
    Fixes: 247262756121 ("crash: add generic infrastructure for crash hotplug support")
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Tested-by: Eric DeVolder <eric.devolder@oracle.com>
    Reviewed-by: Eric DeVolder <eric.devolder@oracle.com>
    Reviewed-by: Valentin Schneider <vschneid@redhat.com>
    Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:32 +08:00
Baoquan He 3283813522 crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit a396d0f81b1c81b1fa60b5554b5c0e3cea1f27c0
Author: Eric DeVolder <eric.devolder@oracle.com>
Date:   Mon Aug 14 17:44:45 2023 -0400

    crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()

    The function crash_prepare_elf64_headers() generates the elfcorehdr which
    describes the CPUs and memory in the system for the crash kernel.  In
    particular, it writes out ELF PT_NOTEs for memory regions and the CPUs in
    the system.

    With respect to the CPUs, the current implementation utilizes
    for_each_present_cpu() which means that as CPUs are added and removed, the
    elfcorehdr must again be updated to reflect the new set of CPUs.

    The reasoning behind the move to use for_each_possible_cpu(), is:

    - At kernel boot time, all percpu crash_notes are allocated for all
      possible CPUs; that is, crash_notes are not allocated dynamically
      when CPUs are plugged/unplugged. Thus the crash_notes for each
      possible CPU are always available.

    - The crash_prepare_elf64_headers() creates an ELF PT_NOTE per CPU.
      Changing to for_each_possible_cpu() is valid as the crash_notes
      pointed to by each CPU PT_NOTE are present and always valid.

    Furthermore, examining a common crash processing path of:

     kernel panic -> crash kernel -> makedumpfile -> 'crash' analyzer
               elfcorehdr      /proc/vmcore     vmcore

    reveals how the ELF CPU PT_NOTEs are utilized:

    - Upon panic, each CPU is sent an IPI and shuts itself down, recording
     its state in its crash_notes. When all CPUs are shut down, the
     crash kernel is launched with a pointer to the elfcorehdr.

    - The crash kernel via linux/fs/proc/vmcore.c does not examine or
     use the contents of the PT_NOTEs, it exposes them via /proc/vmcore.

    - The makedumpfile utility uses /proc/vmcore and reads the CPU
     PT_NOTEs to craft a nr_cpus variable, which is reported in a
     header but otherwise generally unused. Makedumpfile creates the
     vmcore.

    - The 'crash' dump analyzer does not appear to reference the CPU
     PT_NOTEs. Instead it looks up the cpu_[possible|present|online]_mask
     symbols and directly examines those structure contents from vmcore
     memory. From that information it is able to determine which CPUs
     are present and online, and locate the corresponding crash_notes.
     Said differently, it appears that 'crash' analyzer does not rely
     on the ELF PT_NOTEs for CPUs; rather it obtains the information
     directly via kernel symbols and the memory within the vmcore.

    (There may be other vmcore generating and analysis tools that do use these
    PT_NOTEs, but 'makedumpfile' and 'crash' seem to be the most common
    solution.)

    This results in the benefit of having all CPUs described in the
    elfcorehdr, and therefore reducing the need to re-generate the elfcorehdr
    on CPU changes, at the small expense of an additional 56 bytes per PT_NOTE
    for not-present-but-possible CPUs.
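
As a rough check of the cost, the 56 bytes match the size of one ELF64 program header. The sketch below assumes the standard Elf64_Phdr field layout; the helper name `extra_elfcorehdr_bytes` is hypothetical, and reading the 56 bytes as sizeof(Elf64_Phdr) is an interpretation of the figure quoted above.

```python
import struct

# Elf64_Phdr: p_type and p_flags are 32-bit; p_offset, p_vaddr, p_paddr,
# p_filesz, p_memsz and p_align are 64-bit.
ELF64_PHDR = struct.Struct("<IIQQQQQQ")


def extra_elfcorehdr_bytes(possible_cpus, present_cpus):
    """Bytes added by describing possible-but-not-present CPUs."""
    return (possible_cpus - present_cpus) * ELF64_PHDR.size
```

On a hypothetical system with 64 possible and 16 present CPUs, `extra_elfcorehdr_bytes(64, 16)` gives the additional header space used by the 48 CPUs that are not present.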

    On systems where kexec_file_load() syscall is utilized, all the above is
    valid.  On systems where kexec_load() syscall is utilized, there may be
    the need for the elfcorehdr to be regenerated once.  The reason being that
    some archs only populate the 'present' CPUs from the
    /sys/devices/system/cpu entries, which the userspace 'kexec' utility uses
    to generate the userspace-supplied elfcorehdr.  In this situation, one
    memory or CPU change will rewrite the elfcorehdr via the
    crash_prepare_elf64_headers() function and now all possible CPUs will be
    described, just as with kexec_file_load() syscall.

    Link: https://lkml.kernel.org/r/20230814214446.6659-8-eric.devolder@oracle.com
    Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
    Suggested-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Akhil Raj <lf32.dev@gmail.com>
    Cc: Bjorn Helgaas <bhelgaas@google.com>
    Cc: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Mimi Zohar <zohar@linux.ibm.com>
    Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: "Rafael J. Wysocki" <rafael@kernel.org>
    Cc: Sean Christopherson <seanjc@google.com>
    Cc: Takashi Iwai <tiwai@suse.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Thomas Weißschuh <linux@weissschuh.net>
    Cc: Valentin Schneider <vschneid@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:32 +08:00
Baoquan He a907b5882c crash: hotplug support for kexec_load()
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: There's conflict in last hunk of include/linux/kexec.h
           because commit cbc2fe9d9cb2 ("kexec_file: add kexec_file
           flag to control debug printing") has been back ported, and
           in 1st hunk of kernel/ksysfs.c because of fuzz.
           Meanwhile move definition of kexec_dprintk down for easier
           later backporting.

commit a72bbec70da285a7e09e53fb13c2da7da2032da9
Author: Eric DeVolder <eric.devolder@oracle.com>
Date:   Mon Aug 14 17:44:44 2023 -0400

    crash: hotplug support for kexec_load()

    The hotplug support for kexec_load() requires changes to the userspace
    kexec-tools and a little extra help from the kernel.

    Given a kdump capture kernel loaded via kexec_load(), and a subsequent
    hotplug event, the crash hotplug handler finds the elfcorehdr and rewrites
    it to reflect the hotplug change.  That is the desired outcome, however,
    at kernel panic time, the purgatory integrity check fails (because the
    elfcorehdr changed), and the capture kernel does not boot and no vmcore is
    generated.

    Therefore, the userspace kexec-tools/kexec must indicate to the kernel
    that the elfcorehdr can be modified (because the kexec excluded the
    elfcorehdr from the digest, and sized the elfcorehdr memory buffer
    appropriately).

    To facilitate hotplug support with kexec_load():
     - a new kexec flag KEXEC_UPDATE_ELFCOREHDR indicates that it is
       safe for the kernel to modify the kexec_load()'d elfcorehdr
     - the /sys/kernel/crash_elfcorehdr_size node communicates the
       preferred size of the elfcorehdr memory buffer
     - The sysfs crash_hotplug nodes (ie.
       /sys/devices/system/[cpu|memory]/crash_hotplug) dynamically
       take into account kexec_file_load() vs kexec_load() and
       KEXEC_UPDATE_ELFCOREHDR.
       This is critical so that the udev rule processing of crash_hotplug
       is all that is needed to determine if the userspace unload-then-load
       of the kdump image is to be skipped, or not. The proposed udev
       rule change looks like:
       # The kernel updates the crash elfcorehdr for CPU and memory changes
       SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
       SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

    The table below indicates the behavior of kexec_load()'d kdump image
    updates (with the new udev crash_hotplug rule in place):

            | Kexec
     Kernel | Old | New
     -------+-----+-----
     Old    |  a  |  a
     -------+-----+-----
     New    |  a  |  b
     -------+-----+-----

    where kexec 'old' and 'new' delineate whether kexec-tools has the needed
    modifications for the crash hotplug feature, and kernel 'old' and 'new'
    delineate whether the kernel supports this crash hotplug feature.

    Behavior 'a' indicates the unload-then-reload of the entire kdump image.
    For the kexec 'old' column, the unload-then-reload occurs due to the
    missing flag KEXEC_UPDATE_ELFCOREHDR.  An 'old' kernel (with 'new' kexec)
    does not present the crash_hotplug sysfs node, which leads to the
    unload-then-reload of the kdump image.

    Behavior 'b' indicates the desired optimized behavior of the kernel
    directly modifying the elfcorehdr and avoiding the unload-then-reload of
    the kdump image.

    If the udev rule is not updated with the crash_hotplug node check, then no
    matter which combination of kernel and kexec is new or old, the kdump image
    continues to be unloaded and then reloaded on hotplug changes.
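
The behavior table and the udev caveat can be restated as a small decision function. This is only a restatement of the text above; the function name `kdump_update_behavior` and its parameters are hypothetical.

```python
def kdump_update_behavior(kernel_new, kexec_new, udev_rule_updated=True):
    """Return 'a' (unload-then-reload) or 'b' (kernel updates elfcorehdr).

    kernel_new:        kernel supports the crash hotplug feature
    kexec_new:         kexec-tools passes KEXEC_UPDATE_ELFCOREHDR
    udev_rule_updated: udev rule checks the crash_hotplug sysfs node
    """
    if udev_rule_updated and kernel_new and kexec_new:
        return "b"
    # Any missing piece falls back to the full unload-then-reload.
    return "a"
```

Only the combination of new kernel, new kexec-tools and an updated udev rule yields behavior 'b'; every other combination keeps the existing behavior 'a'.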

    To fully support crash hotplug feature, there needs to be a rollout of
    kernel, kexec-tools and udev rule changes.  However, the order of the
    rollout of these pieces does not matter; kexec_load()'d kdump images still
    function for hotplug as-is.

    Link: https://lkml.kernel.org/r/20230814214446.6659-7-eric.devolder@oracle.com
    Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
    Suggested-by: Hari Bathini <hbathini@linux.ibm.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Akhil Raj <lf32.dev@gmail.com>
    Cc: Bjorn Helgaas <bhelgaas@google.com>
    Cc: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Mimi Zohar <zohar@linux.ibm.com>
    Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: "Rafael J. Wysocki" <rafael@kernel.org>
    Cc: Sean Christopherson <seanjc@google.com>
    Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
    Cc: Takashi Iwai <tiwai@suse.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Thomas Weißschuh <linux@weissschuh.net>
    Cc: Valentin Schneider <vschneid@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:32 +08:00
Baoquan He e5524c12ce crash: add generic infrastructure for crash hotplug support
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflict: there's conflict in include/linux/kexec.h when defining
          arch_crash_handle_hotplug_event because below two commits have
          been back ported:
          commit 013a5d02a3 ("crash: memory and CPU hotplug sysfs attributes")
          commit ba5f45be9a ("kexec_file: add kexec_file flag to control debug printing")

commit 24726275612140af6b1c0afc7c6611ad66233207
Author: Eric DeVolder <eric.devolder@oracle.com>
Date:   Mon Aug 14 17:44:40 2023 -0400

    crash: add generic infrastructure for crash hotplug support

    To support crash hotplug, a mechanism is needed to update the crash
    elfcorehdr upon CPU or memory changes (eg.  hot un/plug or off/onlining).
    The crash elfcorehdr describes the CPUs and memory to be written into the
    vmcore.

    To track CPU changes, callbacks are registered with the cpuhp mechanism
    via cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN).  The crash hotplug
    elfcorehdr update has no explicit ordering requirement (relative to other
    cpuhp states), so meets the criteria for utilizing CPUHP_BP_PREPARE_DYN.
    CPUHP_BP_PREPARE_DYN is a dynamic state and avoids the need to introduce a
    new state for crash hotplug.  Also, CPUHP_BP_PREPARE_DYN is the last state
    in the PREPARE group, just prior to the STARTING group, which is very
    close to the CPU starting up in a plug/online situation, or stopping in an
    unplug/offline situation.  This minimizes the window of time during an
    actual plug/online or unplug/offline situation in which the elfcorehdr
    would be inaccurate.  Note that for a CPU being unplugged or offlined, the
    CPU will still be present in the list of CPUs generated by
    crash_prepare_elf64_headers().  However, there is no need to explicitly
    omit the CPU, see justification in 'crash: change
    crash_prepare_elf64_headers() to for_each_possible_cpu()'.

    To track memory changes, a notifier is registered to capture the memblock
    MEM_ONLINE and MEM_OFFLINE events via register_memory_notifier().

    The CPU callbacks and memory notifiers invoke crash_handle_hotplug_event()
    which performs needed tasks and then dispatches the event to the
    architecture specific arch_crash_handle_hotplug_event() to update the
    elfcorehdr with the current state of CPUs and memory.  During the process,
    the kexec_lock is held.

    Link: https://lkml.kernel.org/r/20230814214446.6659-3-eric.devolder@oracle.com
    Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
    Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Akhil Raj <lf32.dev@gmail.com>
    Cc: Bjorn Helgaas <bhelgaas@google.com>
    Cc: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Mimi Zohar <zohar@linux.ibm.com>
    Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: "Rafael J. Wysocki" <rafael@kernel.org>
    Cc: Sean Christopherson <seanjc@google.com>
    Cc: Takashi Iwai <tiwai@suse.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Thomas Weißschuh <linux@weissschuh.net>
    Cc: Valentin Schneider <vschneid@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:32 +08:00
Baoquan He 536f874a42 crash: move a few code bits to setup support of crash hotplug
JIRA: https://issues.redhat.com/browse/RHEL-58641

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 6f991cc363a3269866476b8ff10a112768d3d45c
Author: Eric DeVolder <eric.devolder@oracle.com>
Date:   Mon Aug 14 17:44:39 2023 -0400

    crash: move a few code bits to setup support of crash hotplug

    Patch series "crash: Kernel handling of CPU and memory hot un/plug", v28.

    Once the kdump service is loaded, if changes to CPUs or memory occur,
    either by hot un/plug or off/onlining, the crash elfcorehdr must also be
    updated.

    The elfcorehdr describes to kdump the CPUs and memory in the system, and
    any inaccuracies can result in a vmcore with missing CPU context or memory
    regions.

    The current solution utilizes udev to initiate an unload-then-reload of
    the kdump image (eg.  kernel, initrd, boot_params, purgatory and
    elfcorehdr) by the userspace kexec utility.  In the original post I
    outlined the significant performance problems related to offloading this
    activity to userspace.

    This patchset introduces a generic crash handler that registers with the
    CPU and memory notifiers.  Upon CPU or memory changes, from either hot
    un/plug or off/onlining, this generic handler is invoked and performs
    important housekeeping, for example obtaining the appropriate lock, and
    then invokes an architecture specific handler to do the appropriate
    elfcorehdr update.

    Note the description in patch 'crash: change crash_prepare_elf64_headers()
    to for_each_possible_cpu()' and 'x86/crash: optimize CPU changes' that
    enables further optimizations related to CPU plug/unplug/online/offline
    performance of elfcorehdr updates.

    In the case of x86_64, the arch specific handler generates a new
    elfcorehdr, and overwrites the old one in memory; thus no involvement with
    userspace needed.

    To realize the benefits/test this patchset, one must make a couple
    of minor changes to userspace:

     - Prevent udev from updating kdump crash kernel on hot un/plug changes.
       Add the following as the first lines to the RHEL udev rule file
       /usr/lib/udev/rules.d/98-kexec.rules:

       # The kernel updates the crash elfcorehdr for CPU and memory changes
       SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
       SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

       With this changeset applied, the two rules match for CPU and
       memory change events and thus skip the userspace
       unload-then-reload of kdump.

     - Change to the kexec_file_load for loading the kdump kernel:
       Eg. on RHEL: in /usr/bin/kdumpctl, change to:
        standard_kexec_args="-p -d -s"
       which adds the -s to select kexec_file_load() syscall.
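
The crash_hotplug attribute that the udev rules test can also be read directly from userspace. The sketch below assumes the node lives at /sys/devices/system/[cpu|memory]/crash_hotplug, as described elsewhere in this series; the helper name `crash_hotplug_enabled` and the `sysfs_root` parameter are hypothetical and exist mainly so the function can be tried against a fake directory tree.

```python
from pathlib import Path


def crash_hotplug_enabled(subsystem, sysfs_root="/sys"):
    """Return True if the kernel reports crash_hotplug == 1 for the subsystem."""
    if subsystem not in ("cpu", "memory"):
        raise ValueError(f"unsupported subsystem: {subsystem!r}")
    node = Path(sysfs_root) / "devices" / "system" / subsystem / "crash_hotplug"
    try:
        return node.read_text().strip() == "1"
    except OSError:
        # Node is absent (for example, a kernel without this feature).
        return False
```

A missing node is treated the same as a value of 0, which matches the udev rules falling through to the unload-then-reload path.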

    This kernel patchset also supports kexec_load() with a modified kexec
    userspace utility.  A working changeset to the kexec userspace utility is
    posted to the kexec-tools mailing list here:

     http://lists.infradead.org/pipermail/kexec/2023-May/027049.html

    To use the kexec-tools patch, apply, build and install kexec-tools, then
    change the kdumpctl's standard_kexec_args to replace the -s with
    --hotplug.  The removal of -s reverts to the kexec_load syscall and the
    addition of --hotplug invokes the changes put forth in the kexec-tools
    patch.

    This patch (of 8):

    The crash hotplug support leans on the work for the kexec_file_load()
    syscall.  To also support the kexec_load() syscall, a few bits of code
    need to be moved outside of CONFIG_KEXEC_FILE.  As such, these bits are
    moved out of kexec_file.c and into a common location crash_core.c.

    In addition, struct crash_mem and crash_notes were moved to new locales so
    that PROC_KCORE, which sets CRASH_CORE alone, builds correctly.

    No functionality change intended.

    Link: https://lkml.kernel.org/r/20230814214446.6659-1-eric.devolder@oracle.com
    Link: https://lkml.kernel.org/r/20230814214446.6659-2-eric.devolder@oracle.com
    Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
    Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Akhil Raj <lf32.dev@gmail.com>
    Cc: Bjorn Helgaas <bhelgaas@google.com>
    Cc: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Mimi Zohar <zohar@linux.ibm.com>
    Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: "Rafael J. Wysocki" <rafael@kernel.org>
    Cc: Sean Christopherson <seanjc@google.com>
    Cc: Takashi Iwai <tiwai@suse.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Thomas Weißschuh <linux@weissschuh.net>
    Cc: Valentin Schneider <vschneid@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-12-23 09:35:32 +08:00
Rafael Aquini 6be9f58cab mm: turn folio_test_hugetlb into a PageType
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * kernel/vmcore_info.c: hunk applied to kernel/crash_core.c instead, as
    RHEL9 misses upstream commit 443cbaf9e2fd ("crash: split vmcoreinfo
    exporting code out from crash_core.c") and related series;
  * mm/hugetlb.c: minor context difference due to RHEL9 missing upstream
    commit d67e32f26713 ("hugetlb: restructure pool allocations") and its
    related series;

This patch is a backport of the following upstream commit:
commit d99e3140a4d33e26066183ff727d8f02f56bec64
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Thu Mar 21 14:24:43 2024 +0000

    mm: turn folio_test_hugetlb into a PageType

    The current folio_test_hugetlb() can be fooled by a concurrent folio split
    into returning true for a folio which has never belonged to hugetlbfs.
    This can't happen if the caller holds a refcount on it, but we have a few
    places (memory-failure, compaction, procfs) which do not and should not
    take a speculative reference.

    Since hugetlb pages do not use individual page mapcounts (they are always
    fully mapped and use the entire_mapcount field to record the number of
    mappings), the PageType field is available now that page_mapcount()
    ignores the value in this field.

    In compaction and with CONFIG_DEBUG_VM enabled, the current implementation
    can result in an oops, as reported by Luis. This happens since 9c5ccf2db04b
    ("mm: remove HUGETLB_PAGE_DTOR") effectively added some VM_BUG_ON() checks
    in the PageHuge() testing path.

    [willy@infradead.org: update vmcoreinfo]
      Link: https://lkml.kernel.org/r/ZgGZUvsdhaT1Va-T@casper.infradead.org
    Link: https://lkml.kernel.org/r/20240321142448.1645400-6-willy@infradead.org
    Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR")
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Reported-by: Luis Chamberlain <mcgrof@kernel.org>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218227
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:52 -05:00
Rafael Aquini 9f578eff61 mm, treewide: introduce NR_PAGE_ORDERS
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * drivers/gpu/drm/*, include/drm/ttm/ttm_pool.h: all hunks dropped due to
    RHEL-only commit ca8b16c11b ("Merge DRM changes from upstream v6.7..v6.8");
  * include/linux/mmzone.h: 3rd hunk dropped due to RHEL-only commit
    afa0ca9cf7 ("Partial backport of mm, treewide: introduce NR_PAGE_ORDERS");

This patch is a backport of the following upstream commit:
commit fd37721803c6e73619108f76ad2e12a9aa5fafaf
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Thu Dec 28 17:47:03 2023 +0300

    mm, treewide: introduce NR_PAGE_ORDERS

    NR_PAGE_ORDERS defines the number of page orders supported by the page
    allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total.

    NR_PAGE_ORDERS assists in defining arrays of page orders and allows for
    more natural iteration over them.
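
The counting can be illustrated with a short sketch. The value of MAX_ORDER below is an assumption chosen for illustration only (the real value is configuration dependent), and `free_area` is just a Python list standing in for a per-order array.

```python
MAX_ORDER = 10                   # illustrative value only
NR_PAGE_ORDERS = MAX_ORDER + 1   # orders 0..MAX_ORDER inclusive

# One slot per order: sized with NR_PAGE_ORDERS, no "+ 1" at each use site.
free_area = [[] for _ in range(NR_PAGE_ORDERS)]

# Iterating over all orders needs no "<= MAX_ORDER" style bound.
block_sizes = [1 << order for order in range(NR_PAGE_ORDERS)]
```

Sizing and iterating with NR_PAGE_ORDERS avoids the off-by-one that is easy to make when the bound is written as MAX_ORDER.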

    [kirill.shutemov@linux.intel.com: fixup for kerneldoc warning]
      Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box
    Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:14 -05:00
Rafael Aquini 5f2d093c62 mm: free up a word in the first tail page
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit ebc1baf5c9b46c2240c580a2fd992b2e48606dfa
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Aug 16 16:11:58 2023 +0100

    mm: free up a word in the first tail page

    Store the folio order in the low byte of the flags word in the first tail
    page.  This frees up the word that was being used to store the order and
    dtor bytes previously.
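
Packing a small value into the low byte of a flags word works as sketched below. The names `FOLIO_ORDER_MASK`, `set_order` and `get_order` are hypothetical and do not reflect the kernel's actual helpers; the sketch only shows the bit manipulation the description implies.

```python
FOLIO_ORDER_MASK = 0xFF  # low byte of the flags word holds the order


def set_order(flags, order):
    """Return flags with the low byte replaced by order."""
    if not 0 <= order <= FOLIO_ORDER_MASK:
        raise ValueError("order does not fit in one byte")
    return (flags & ~FOLIO_ORDER_MASK) | order


def get_order(flags):
    """Extract the order stored in the low byte of flags."""
    return flags & FOLIO_ORDER_MASK
```

The upper bits of the word are left untouched, so other flags stored there survive a change of order.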

    Link: https://lkml.kernel.org/r/20230816151201.3655946-11-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Cc: Yanteng Si <siyanteng@loongson.cn>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:53 -04:00
Rafael Aquini c6fc2dab3f mm: add large_rmappable page flag
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit de53c05f2ae3d47d30db58e9c4e54e3bbc868377
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Aug 16 16:11:56 2023 +0100

    mm: add large_rmappable page flag

    Stored in the first tail page's flags, this flag replaces the destructor.
    That removes the last of the destructors, so remove all references to
    folio_dtor and compound_dtor.

    Link: https://lkml.kernel.org/r/20230816151201.3655946-9-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Cc: Yanteng Si <siyanteng@loongson.cn>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:51 -04:00
Rafael Aquini 980ab30d90 mm: remove HUGETLB_PAGE_DTOR
JIRA: https://issues.redhat.com/browse/RHEL-27743
Conflicts:
  * mm/hugetlb.c: conflict on the 4th hunk due to out-of-order backport of
      commit d8f5f7e445f0 ("hugetlb: set hugetlb page flag before optimizing vmemmap")

This patch is a backport of the following upstream commit:
commit 9c5ccf2db04b8d7c3df363fdd4856c2b79ab2c6a
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Aug 16 16:11:55 2023 +0100

    mm: remove HUGETLB_PAGE_DTOR

    We can use a bit in page[1].flags to indicate that this folio belongs to
    hugetlb instead of using a value in page[1].dtors.  That lets
    folio_test_hugetlb() become an inline function like it should be.  We can
    also get rid of NULL_COMPOUND_DTOR.

    Link: https://lkml.kernel.org/r/20230816151201.3655946-8-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Cc: Yanteng Si <siyanteng@loongson.cn>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:50 -04:00
Lucas Zampieri 5739ae2afe Merge: Rebase kexec/kdump to upstream kernel v6.5
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4053

```
JIRA: https://issues.redhat.com/browse/RHEL-32199
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

This rebases kexec/kdump in the rhel9 kernel to v6.5 of the mainline kernel, for rhel9.5. The last rebase was done in rhel9.2 and synchronized to v6.0.

Signed-off-by: Baoquan He <bhe@redhat.com>
```

Approved-by: Vladis Dronov <vdronov@redhat.com>
Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: Lenny Szubowicz <lszubowi@redhat.com>
Approved-by: Lichen Liu <lichliu@redhat.com>
Approved-by: Tao Liu <ltao@redhat.com>
Approved-by: Pingfan Liu <piliu@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-05-27 13:52:25 +00:00
Baoquan He 99a43b7a43 vmcoreinfo: warn if we exceed vmcoreinfo data size
JIRA: https://issues.redhat.com/browse/RHEL-32199

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 08fc35f31b9e14cb4e8ba5bf43f824559dbdbe88
Author: Stephen Brennan <stephen.s.brennan@oracle.com>
Date:   Thu Oct 27 13:50:08 2022 -0700

    vmcoreinfo: warn if we exceed vmcoreinfo data size

    Though vmcoreinfo is intended to be small, at just one page, useful
    information is still added to it, so we risk running out of space.
    Currently there is no runtime check to see whether the vmcoreinfo buffer
    has been exhausted.  Add a warning for this case.

    Currently, my static checking tool[1] indicates that a good upper bound
    for vmcoreinfo size is currently 3415 bytes, but the best time to add
    warnings is before the risk becomes too high.

    [1] https://github.com/brenns10/kernel_stuff/blob/master/vmcoreinfosize/vmcoreinfosize.py

    Link: https://lkml.kernel.org/r/20221027205008.312534-1-stephen.s.brennan@oracle.com
    Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Kees Cook <keescook@chromium.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-05-15 10:32:32 +08:00
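
The runtime check described in the commit above can be pictured with a small user-space model. This is a sketch under assumptions: `VmcoreinfoBuffer` and its `append` method are hypothetical names, the 4096-byte capacity reflects the "one page" mentioned in the message, and the model warns whenever a line had to be truncated, which may differ in detail from the kernel's actual condition.

```python
import warnings

VMCOREINFO_BYTES = 4096  # one page, as described in the commit message


class VmcoreinfoBuffer:
    """Hypothetical model of a fixed-size vmcoreinfo buffer."""

    def __init__(self, capacity: int = VMCOREINFO_BYTES):
        self.capacity = capacity
        self.data = bytearray()

    def append(self, line: str) -> int:
        """Append line, truncating at capacity; return bytes actually stored."""
        raw = line.encode("ascii")
        room = self.capacity - len(self.data)
        stored = min(len(raw), room)
        self.data += raw[:stored]
        if stored < len(raw):
            # The buffer is exhausted: make the truncation visible.
            warnings.warn("vmcoreinfo data exceeds allocated size, truncating")
        return stored
```

With the reported upper bound of 3415 bytes there is still headroom in a 4096-byte page, so the warning acts as an early alarm rather than a fix for an existing overflow.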
Aristeu Rozanski 51bc2d0667 mm: remove 'First tail page' members from struct page
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 1c5509be58f636afabbdaf66e7436da8ec0a1828
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Jan 11 14:29:08 2023 +0000

    mm: remove 'First tail page' members from struct page

    All former users now use the folio equivalents, so remove them from the
    definition of struct page.

    Link: https://lkml.kernel.org/r/20230111142915.1001531-23-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:06 -04:00
Paolo Bonzini 538bf6f332 mm, treewide: redefine MAX_ORDER sanely
JIRA: https://issues.redhat.com/browse/RHEL-10059

MAX_ORDER is currently defined as the number of orders the page allocator
supports: a user can ask the buddy allocator for a page order between 0 and
MAX_ORDER-1.

This definition is counter-intuitive and has led to a number of bugs all
over the kernel.

Change the definition of MAX_ORDER to be inclusive: the range of orders a
user can request from the buddy allocator is now 0..MAX_ORDER.

[kirill@shutemov.name: fix min() warning]
  Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box
[akpm@linux-foundation.org: fix another min_t warning]
[kirill@shutemov.name: fixups per Zi Yan]
  Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name
[akpm@linux-foundation.org: fix underlining in docs]
  Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/
Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 23baf831a32c04f9a968812511540b1b3e648bf5)

[RHEL: Fix conflicts by changing MAX_ORDER - 1 to MAX_ORDER,
       ">= MAX_ORDER" to "> MAX_ORDER", etc.]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-30 09:12:37 +01:00
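
The change of meaning in the commit above, and the mechanical conflict fixes noted for RHEL ("MAX_ORDER - 1" becomes "MAX_ORDER", ">= MAX_ORDER" becomes "> MAX_ORDER"), can be checked with a small model. The values 11 and 10 are illustrative assumptions, and the function names are hypothetical.

```python
OLD_MAX_ORDER = 11  # old meaning: number of orders, so valid orders are 0..10
NEW_MAX_ORDER = 10  # new meaning: largest valid order, so valid orders are 0..10


def order_too_big_old(order: int) -> bool:
    """Old-style bounds check: order >= MAX_ORDER."""
    return order >= OLD_MAX_ORDER


def order_too_big_new(order: int) -> bool:
    """New-style bounds check: order > MAX_ORDER."""
    return order > NEW_MAX_ORDER


# The largest order is MAX_ORDER - 1 before the change and MAX_ORDER after it.
largest_old = OLD_MAX_ORDER - 1
largest_new = NEW_MAX_ORDER
```

Both spellings accept the same set of orders; only the constant and the comparison operator move together.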
Baoquan He a4af720f36 vmcoreinfo: add kallsyms_num_syms symbol
Bugzilla: https://bugzilla.redhat.com/2119002
Upstream Status: Linus's tree
Conflict: None

commit f09bddbd86619bf6213c96142a3b6b6a84818798
Author: Stephen Brennan <stephen.s.brennan@oracle.com>
Date:   Mon Aug 8 13:54:10 2022 -0700

    vmcoreinfo: add kallsyms_num_syms symbol

    The rest of the kallsyms symbols are useless without knowing the number of
    symbols in the table.  In an earlier patch, I somehow dropped the
    kallsyms_num_syms symbol, so add it back in.

    Link: https://lkml.kernel.org/r/20220808205410.18590-1-stephen.s.brennan@oracle.com
    Fixes: 5fd8fea935a1 ("vmcoreinfo: include kallsyms symbols")
    Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Baoquan He <bhe@redhat.com>

Signed-off-by: Baoquan He <bhe@redhat.com>
2022-11-16 03:47:03 -05:00
Baoquan He f3ba334e62 vmcoreinfo: include kallsyms symbols
Bugzilla: https://bugzilla.redhat.com/2119002
Upstream Status: Linus's tree
Conflict: None

commit 5fd8fea935a1091083506d0b982fcc5d35062f06
Author: Stephen Brennan <stephen.s.brennan@oracle.com>
Date:   Mon May 16 17:05:08 2022 -0700

    vmcoreinfo: include kallsyms symbols

    The internal kallsyms tables contain information which could be quite
    useful to a debugging tool in the absence of other debuginfo.  If kallsyms
    is enabled, then a debugging tool could parse it and use it as a fallback
    symbol table.  Combined with BTF data, live & post-mortem debuggers can
    support basic operations without needing a large DWARF debuginfo file
    available.  As many as five symbols are necessary to properly parse
    kallsyms names and addresses.  Add these to the vmcoreinfo note.

    CONFIG_KALLSYMS_ABSOLUTE_PERCPU does impact the computation of symbol
    addresses.  However, a debugger can infer this configuration value by
    comparing the address of _stext in the vmcoreinfo with the address
    computed via kallsyms.  So there's no need to include information about
    this config value in the vmcoreinfo note.

    To verify that we're still well below the maximum of 4096 bytes, I created
    a script[1] to compute a rough upper bound on the possible size of
    vmcoreinfo.  On v5.18-rc7, the script reports 3106 bytes, and with this
    patch, the maximum becomes 3370 bytes.

    [1]: https://github.com/brenns10/kernel_stuff/blob/master/vmcoreinfosize/

    Link: https://lkml.kernel.org/r/20220517000508.777145-3-stephen.s.brennan@oracle.com
    Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Bixuan Cui <cuibixuan@huawei.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: David Vernet <void@manifault.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Nick Desaulniers <ndesaulniers@google.com>
    Cc: Sami Tolvanen <samitolvanen@google.com>
    Cc: Stephen Boyd <swboyd@chromium.org>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Baoquan He <bhe@redhat.com>

Signed-off-by: Baoquan He <bhe@redhat.com>
2022-11-16 03:47:03 -05:00
Baoquan He 193263bfe0 kernel/crash_core.c: remove redundant check of ck_cmdline
Bugzilla: https://bugzilla.redhat.com/2119002
Upstream Status: Linus's tree
Conflict: None

commit a7bd57b87f65e0e1c5d41baf51a0d0b49fb30808
Author: lizhe <sensor1010@163.com>
Date:   Thu May 12 20:38:36 2022 -0700

    kernel/crash_core.c: remove redundant check of ck_cmdline

    At the end of get_last_crashkernel(), the check of ck_cmdline is
    clearly redundant, so clean it up.

    Link: https://lkml.kernel.org/r/20220506104116.259323-1-sensor1010@163.com
    Signed-off-by: lizhe <sensor1010@163.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Acked-by: Philipp Rudo <prudo@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Baoquan He <bhe@redhat.com>

Signed-off-by: Baoquan He <bhe@redhat.com>
2022-11-16 03:47:02 -05:00
Tao Liu 633240916d kdump: round up the total memory size to 128M for crashkernel reservation
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/patch/?id=adca0f4c42c99d665a308ed44f0a09159dc93d11
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2058040

    kdump: round up the total memory size to 128M for crashkernel reservation

    The total memory size we get in kernel is usually slightly less than the
    actual memory size because BIOS/firmware will reserve some memory region.
    So it won't export all memory as usable.

    E.g., on my x86_64 kvm guest with 1G memory, the total_mem value shows:
      UEFI boot with ovmf:   0x3faef000
      Legacy boot kvm guest: 0x3ff7ec00

    When specifying crashkernel=1G-2G:128M on a machine with 1G of memory, we
    get a total size of 1023M from firmware.  That does not fall into 1G-2G,
    so no memory is reserved.  The user will never know this, and it is hard
    to let the user know the exact total value in the kernel.

    One way is to use dmi/smbios to get physical memory size, but it's not
    reliable as well.  According to Prarit hardware vendors sometimes screw
    this up.  Thus round up total size to 128M to work around this problem.

    This patch is a resend of [1] and rebased onto v5.19-rc2, and the
    original credit goes to Dave Young.

    [1]: http://lists.infradead.org/pipermail/kexec/2018-April/020568.html

    Link: https://lkml.kernel.org/r/20220627074440.187222-1-ltao@redhat.com
    Signed-off-by: Tao Liu <ltao@redhat.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Tao Liu <ltao@redhat.com>
2022-07-05 19:55:02 +08:00
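
The arithmetic in the commit above can be worked through with the two total_mem values it quotes. This is a user-space sketch, not the kernel code: `roundup` and `pick_reservation` are hypothetical helpers, and the range end is assumed to be exclusive.

```python
SZ_1M = 1 << 20
SZ_128M = 128 * SZ_1M
SZ_1G = 1 << 30


def roundup(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return -(-value // align) * align


def pick_reservation(total_mem: int, ranges) -> int:
    """Return the size of the first (start, end, size) range containing total_mem."""
    for start, end, size in ranges:
        if start <= total_mem < end:  # end assumed exclusive
            return size
    return 0


ranges = [(SZ_1G, 2 * SZ_1G, SZ_128M)]  # models crashkernel=1G-2G:128M
uefi_total = 0x3FAEF000    # value quoted in the commit message
legacy_total = 0x3FF7EC00  # value quoted in the commit message
```

Both quoted totals sit just below 1G, so neither matches the 1G-2G range as reported; rounded up to a 128M boundary, both become exactly 1G and select the 128M reservation.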
Pingfan Liu 65b5613620 kdump: return -ENOENT if required cmdline option does not exist
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2091852
Upstream Status: https://github.com/torvalds/linux.git
Tested: on ampere-mtsnow-altra-11.khw4.lab.eng.bos.redhat.com
Build-info: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=46106617
Conflicts: none

commit 2e5920bb073a4e3e69cf8e581836cafc8ba1b464
Author: Zhen Lei <thunder.leizhen@huawei.com>
Date:   Fri May 6 19:43:57 2022 +0800

    kdump: return -ENOENT if required cmdline option does not exist

    According to the current crashkernel=Y,low support in other ARCHes, it's
    an optional command-line option. When it doesn't exist, kernel will try
    to allocate minimum required memory below 4G automatically.

    However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
    can't distinguish the nonexistent option from invalid option.

    Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
    case. With this change, crashkernel,low memory will take the default
    value if crashkernel=,low is not specified; while crashkernel reservation
    will fail and bail out if an invalid option is specified.

    Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Link: https://lore.kernel.org/r/20220506114402.365-2-thunder.leizhen@huawei.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Signed-off-by: Pingfan Liu <piliu@redhat.com>
2022-06-22 11:00:20 +08:00
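
The distinction the commit above introduces, "option absent" versus "option present but invalid", can be sketched as follows. This is a simplified model and not the kernel parser: `parse_crashkernel_low` and `low_reservation` are hypothetical names, only the K/M/G suffixes are handled, and the default size is passed in by the caller.

```python
import errno
import re

_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_crashkernel_low(cmdline: str) -> int:
    """Return the crashkernel=<size>,low size in bytes, or a negative errno."""
    found = None
    for token in cmdline.split():
        if token.startswith("crashkernel=") and token.endswith(",low"):
            found = token[len("crashkernel="):-len(",low")]  # last one wins
    if found is None:
        return -errno.ENOENT  # option does not exist
    match = re.fullmatch(r"(\d+)([KMG]?)", found)
    if match is None:
        return -errno.EINVAL  # option exists but is malformed
    return int(match.group(1)) * _SUFFIX[match.group(2)]


def low_reservation(cmdline: str, default: int):
    """Use the default when the option is absent; return None to bail out."""
    ret = parse_crashkernel_low(cmdline)
    if ret == -errno.ENOENT:
        return default
    if ret < 0:
        return None
    return ret
```

With a single error code the caller could not tell these two cases apart, so it could not safely fall back to a default for one while failing for the other.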
Philipp Rudo 139263a772 kernel/crash_core: suppress unknown crashkernel parameter warning
Bugzilla: https://bugzilla.redhat.com/2026570
Upstream Status: git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Conflicts: None

commit 71d2bcec2d4d69ff109c497e6611d6c53c8926d4
Author: Philipp Rudo <prudo@redhat.com>
Date:   Fri Dec 24 21:12:39 2021 -0800

    kernel/crash_core: suppress unknown crashkernel parameter warning

    When booting with crashkernel= on the kernel command line a warning
    similar to

        Kernel command line: ro console=ttyS0 crashkernel=256M
        Unknown kernel command line parameters "crashkernel=256M", will be passed to user space.

    is printed.

    This comes from crashkernel= being parsed independent from the kernel
    parameter handling mechanism.  So the code in init/main.c doesn't know
    that crashkernel= is a valid kernel parameter and prints this incorrect
    warning.

    Suppress the warning by adding a dummy early_param handler for
    crashkernel=.

    Link: https://lkml.kernel.org/r/20211208133443.6867-1-prudo@redhat.com
    Fixes: 86d1919a4f ("init: print out unknown kernel parameters")
    Signed-off-by: Philipp Rudo <prudo@redhat.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Andrew Halaney <ahalaney@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Philipp Rudo <prudo@redhat.com>
2022-01-03 17:58:21 +01:00
Stephen Boyd 44e8a5e912 kdump: use vmlinux_build_id to simplify
We can use the vmlinux_build_id array here now instead of open coding it.
This mostly consolidates code.

Link: https://lkml.kernel.org/r/20210511003845.2429846-14-swboyd@chromium.org
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Evan Green <evgreen@chromium.org>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:22 -07:00
Mike Rapoport 43b02ba93b mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM
After removal of the DISCONTIGMEM memory model the FLAT_NODE_MEM_MAP
configuration option is equivalent to FLATMEM.

Drop CONFIG_FLAT_NODE_MEM_MAP and use CONFIG_FLATMEM instead.

Link: https://lkml.kernel.org/r/20210608091316.3622-10-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29 10:53:55 -07:00
Mike Rapoport a9ee6cf5c6 mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
After removal of DISCONTIGMEM the NEED_MULTIPLE_NODES and NUMA
configuration options are equivalent.

Drop CONFIG_NEED_MULTIPLE_NODES and use CONFIG_NUMA instead.

Done with

	$ sed -i 's/CONFIG_NEED_MULTIPLE_NODES/CONFIG_NUMA/' \
		$(git grep -wl CONFIG_NEED_MULTIPLE_NODES)
	$ sed -i 's/NEED_MULTIPLE_NODES/NUMA/' \
		$(git grep -wl NEED_MULTIPLE_NODES)

with manual tweaks afterwards.

[rppt@linux.ibm.com: fix arm boot crash]
  Link: https://lkml.kernel.org/r/YMj9vHhHOiCVN4BF@linux.ibm.com

Link: https://lkml.kernel.org/r/20210608091316.3622-9-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29 10:53:55 -07:00
Pingfan Liu 4f5aecdff2 crash_core, vmcoreinfo: append 'SECTION_SIZE_BITS' to vmcoreinfo
As mentioned in kernel commit 1d50e5d0c5 ("crash_core, vmcoreinfo:
Append 'MAX_PHYSMEM_BITS' to vmcoreinfo"), SECTION_SIZE_BITS appears in the
formula:

    #define SECTIONS_SHIFT    (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Besides SECTIONS_SHIFT, SECTION_SIZE_BITS is also used to calculate
PAGES_PER_SECTION in makedumpfile, just as in the kernel.

Unfortunately, this arch-dependent macro SECTION_SIZE_BITS changes, e.g.
recently in kernel commit f0b13ee232 ("arm64/sparsemem: reduce
SECTION_SIZE_BITS").  But user space wants a stable interface to get
this info, which cannot be deduced from a crashdump vmcore.  Hence append
SECTION_SIZE_BITS to vmcoreinfo.

Link: https://lkml.kernel.org/r/20210608103359.84907-1-kernelfans@gmail.com
Link: http://lists.infradead.org/pipermail/kexec/2021-June/022676.html
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Boris Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Anderson <anderson@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
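
A user-space tool could derive the values discussed in the commit above from the vmcoreinfo text roughly as sketched here. The sample text and its numbers (46, 27, 4096) are illustrative assumptions rather than output from a real dump, and `parse_vmcoreinfo_numbers` is a hypothetical helper, not part of makedumpfile or crash.

```python
def parse_vmcoreinfo_numbers(text: str) -> dict:
    """Collect NUMBER(name)=value entries and PAGESIZE from vmcoreinfo text."""
    numbers = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        if key.startswith("NUMBER(") and key.endswith(")"):
            numbers[key[len("NUMBER("):-1]] = int(value)
        elif key == "PAGESIZE":
            numbers["PAGESIZE"] = int(value)
    return numbers


# Illustrative sample only; real values depend on architecture and config.
SAMPLE = (
    "OSRELEASE=6.5.0\n"
    "PAGESIZE=4096\n"
    "NUMBER(MAX_PHYSMEM_BITS)=46\n"
    "NUMBER(SECTION_SIZE_BITS)=27\n"
)

info = parse_vmcoreinfo_numbers(SAMPLE)
page_shift = info["PAGESIZE"].bit_length() - 1  # log2 of a power-of-two page size
sections_shift = info["MAX_PHYSMEM_BITS"] - info["SECTION_SIZE_BITS"]
pages_per_section = 1 << (info["SECTION_SIZE_BITS"] - page_shift)
```

Reading both numbers from vmcoreinfo means the tool needs no per-architecture table that goes stale when SECTION_SIZE_BITS changes.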
Alexander Egorenkov ca4a9241cc kdump: append uts_namespace.name offset to VMCOREINFO
The offset of the field 'init_uts_ns.name' has changed since commit
9a56493f69 ("uts: Use generic ns_common::count").

Make the offset of the field 'uts_namespace.name' available in VMCOREINFO
because tools like 'crash-utility' and 'makedumpfile' must be able to read
it from crash dumps.

Link: https://lore.kernel.org/r/159644978167.604812.1773586504374412107.stgit@localhost.localdomain
Link: https://lkml.kernel.org/r/20200930102328.396488-1-egorenar@linux.ibm.com
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: lijiang <lijiang@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 22:46:18 -08:00
Eric Biggers a24d22b225 crypto: sha - split sha.h into sha1.h and sha2.h
Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2,
and <crypto/sha3.h> contains declarations for SHA-3.

This organization is inconsistent, but more importantly SHA-1 is no
longer considered to be cryptographically secure.  So to the extent
possible, SHA-1 shouldn't be grouped together with any of the other SHA
versions, and usage of it should be phased out.

Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and
<crypto/sha2.h>, and make everyone explicitly specify whether they want
the declarations for SHA-1, SHA-2, or both.

This avoids making the SHA-1 declarations visible to files that don't
want anything to do with SHA-1.  It also prepares for potentially moving
sha1.h into a new insecure/ or dangerous/ directory.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-20 14:45:33 +11:00
Vijay Balakrishna 0935288c6e kdump: append kernel build-id string to VMCOREINFO
Make kernel GNU build-id available in VMCOREINFO.  Having build-id in
VMCOREINFO facilitates presenting appropriate kernel namelist image with
debug information file to kernel crash dump analysis tools.  Currently
VMCOREINFO lacks uniquely identifiable key for crash analysis automation.

As to whether this patch is necessary, or whether the matching of
linux_banner and OSRELEASE in VMCOREINFO employed by crash(8) already meets
the need: IMO, the build-id approach is more foolproof.  In most instances
it is a cryptographic hash generated from internal code/ELF bits, unlike
the kernel version string on which linux_banner is based, which is external
to the code.  I feel each is intended for a different purpose.  Also,
OSRELEASE is not suitable when two different kernel builds from the same
version have different features enabled.

Currently for most linux (and non-linux) systems build-id can be extracted
using standard methods for file types such as user mode crash dumps,
shared libraries, loadable kernel modules, etc.  The linux kernel dump is
an exception.  Having build-id in VMCOREINFO brings some uniformity for
automation tools.

Tyler said:

: I think this is a nice improvement over today's linux_banner approach for
: correlating vmlinux to a kernel dump.
:
: The elf notes parsing in this patch lines up with what is described in
: the "Notes (Nhdr)" section of the elf(5) man page.
:
: BUILD_ID_MAX is sufficient to hold a sha1 build-id, which is the default
: build-id type today in GNU ld(2).  It is also sufficient to hold the
: "fast" build-id, which is the default build-id type today in LLVM lld(2).

Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/r/1591849672-34104-1-git-send-email-vijayb@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:01 -07:00
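
The ELF note walk that the commit above refers to (the "Notes (Nhdr)" layout from elf(5)) can be sketched in user space as below. This is not the kernel implementation: `find_build_id` is a hypothetical helper, and the sketch assumes little-endian notes with 4-byte alignment of the name and descriptor fields.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for the GNU build-id


def _align4(n: int) -> int:
    """Round n up to a multiple of 4, the padding used between note fields."""
    return (n + 3) & ~3


def find_build_id(notes: bytes):
    """Walk a notes blob and return the GNU build-id as a hex string, or None."""
    off = 0
    while off + 12 <= len(notes):
        # Nhdr: n_namesz, n_descsz, n_type (three 32-bit words)
        namesz, descsz, ntype = struct.unpack_from("<III", notes, off)
        off += 12
        name = notes[off:off + namesz]
        off += _align4(namesz)
        desc = notes[off:off + descsz]
        off += _align4(descsz)
        if len(desc) != descsz:
            break  # truncated note, stop walking
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
            return desc.hex()
    return None
```

A tool can compare the returned string with the build-id of a candidate vmlinux to decide whether the debug information matches the dump.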
Bhupesh Sharma 1d50e5d0c5 crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely
on a best-guess method of determining value of 'MAX_PHYSMEM_BITS'
supported by underlying kernel.

This value is used in user-space code to calculate the bit-space
required to store a section for SPARSEMEM (similar to the existing
calculation method used in the kernel implementation):

  #define SECTIONS_SHIFT    (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities
like 'makedumpfile' and 'crash' on arm64, with the recently added
kernel support for 52-bit physical address space, as there is
no clear method of determining this value in user-space
(other than reading kernel CONFIG flags).

As per suggestion from makedumpfile maintainer (Kazu), it makes more
sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself
rather than in arch-specific code, so that the user-space code for other
archs can also benefit from this addition to the vmcoreinfo and use it
as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

A reference 'makedumpfile' implementation which reads the
'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion
is available here:

While at it also update vmcoreinfo documentation for 'MAX_PHYSMEM_BITS'
variable being added to vmcoreinfo.

'MAX_PHYSMEM_BITS' defines the maximum supported physical address
space memory.

Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Tested-by: John Donnelly <john.p.donnelly@oracle.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Boris Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: x86@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: kexec@lists.infradead.org
Link: https://lore.kernel.org/r/1589395957-24628-2-git-send-email-bhsharma@redhat.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-02 17:56:11 +01:00
Thomas Gleixner 40b0b3f8fb treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230
Based on 2 normalized pattern(s):

  this source code is licensed under the gnu general public license
  version 2 see the file copying for more details

  this source code is licensed under general public license version 2
  see

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 52 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:06 +02:00
David Hildenbrand e04b742f74 kexec: export PG_offline to VMCOREINFO
Right now, pages inflated as part of a balloon driver will be dumped by
dump tools like makedumpfile.  While XEN is able to check in the crash
kernel whether a certain pfn is actually backed by memory in the
hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of
other balloon inflated memory will essentially result in zero pages
getting allocated by the hypervisor and the dump getting filled with
this data.

The allocation and reading of zero pages can directly be avoided if a
dumping tool could know which pages only contain stale information not
to be dumped.

We now have PG_offline which can be (and already is by virtio-balloon)
used for marking pages as logically offline.  Follow up patches will
make use of this flag also in other balloon implementations.

Let's export PG_offline via PAGE_OFFLINE_MAPCOUNT_VALUE, so makedumpfile
can directly skip pages that are logically offline and the content
therefore stale.

Please note that this is also helpful for a problem we were seeing under
Hyper-V: Dumping logically offline memory (pages kept fake offline while
onlining a section via online_page_callback) would under some conditions
result in a kernel panic when dumping them.

Link: http://lkml.kernel.org/r/20181119101616.8901-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pankaj gupta <pagupta@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
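
How a dump tool might use the exported value from the commit above can be sketched as follows. Everything here is a simplified assumption: the sample value -257, the per-page mapcount table, and the helper names are illustrative and do not reflect makedumpfile's actual data structures.

```python
def parse_offline_value(vmcoreinfo_text: str):
    """Return the exported PAGE_OFFLINE_MAPCOUNT_VALUE, or None if absent."""
    prefix = "NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)="
    for line in vmcoreinfo_text.splitlines():
        if line.startswith(prefix):
            return int(line[len(prefix):])
    return None


def pfns_to_dump(mapcounts: dict, offline_value) -> list:
    """Keep every pfn unless its mapcount marks it as logically offline."""
    if offline_value is None:
        return sorted(mapcounts)  # older kernel: nothing can be skipped
    return sorted(pfn for pfn, mc in mapcounts.items() if mc != offline_value)


SAMPLE = "PAGESIZE=4096\nNUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257\n"  # illustrative
mapcounts = {0x1000: 0, 0x1001: -257, 0x1002: -1}  # hypothetical per-page values
```

Skipping the marked pages avoids reading balloon-inflated memory at all, which is what spares the hypervisor from allocating zero pages during the dump.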
Arnd Bergmann 91bc9aaf74 kernel/crash_core.c: print timestamp using time64_t
The get_seconds() call returns a 32-bit timestamp on some architectures,
and will overflow in the future.  The newer ktime_get_real_seconds()
always returns a 64-bit timestamp that does not suffer from this problem.

Link: http://lkml.kernel.org/r/20180618150329.941903-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:47 -07:00
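
The overflow that the commit above avoids can be illustrated by truncating a seconds counter to 32 bits. This is a sketch that assumes an unsigned 32-bit value; `truncate_u32` is a hypothetical helper, not a kernel function.

```python
U32_MASK = 0xFFFFFFFF


def truncate_u32(seconds: int) -> int:
    """Model storing a seconds count in an unsigned 32-bit variable."""
    return seconds & U32_MASK


now = 1_700_000_000               # a present-day epoch value, fits in 32 bits
far_future = (1 << 32) + 5        # five seconds past the 32-bit limit
wrapped = truncate_u32(far_future)
```

A 64-bit count such as the one returned by ktime_get_real_seconds() keeps the full value, so the timestamp printed into the crash information stays monotonic.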
Omar Sandoval 23c85094fe proc/kcore: add vmcoreinfo note to /proc/kcore
The vmcoreinfo information is useful for runtime debugging tools, not just
for crash dumps.  A lot of this information can be determined by other
means, but this is much more convenient, and it only adds a page at most
to the file.

Link: http://lkml.kernel.org/r/fddbcd08eed76344863303878b12de1c1e2a04b6.1531953780.git.osandov@fb.com
Signed-off-by: Omar Sandoval <osandov@fb.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:46 -07:00