Commit Graph

524 Commits

Author SHA1 Message Date
Paulo Alcantara 82eaafecc4 smb: client: improve compound padding in encryption
JIRA: https://issues.redhat.com/browse/RHEL-76046

Conflicts:
	- Export is_vmalloc_or_module_addr() as upstream commit
	018584697533 ("netfs: Add a function to extract an iterator
	into a scatterlist") couldn't be backported due to many deps
	of unrelated netfs/fscache rewrite commits.

commit bc925c1216f0848da96ac642fba3cb92ae1f4e06
Author: Paulo Alcantara <pc@manguebit.com>
Date:   Mon Nov 18 12:35:14 2024 -0300

    smb: client: improve compound padding in encryption

    After commit f7f291e14dde ("cifs: fix oops during encryption"), the
    encryption layer can handle vmalloc'd buffers as well as kmalloc'd
    buffers, so there is no need to inefficiently squash request iovs
    into a single one to handle padding in compound requests.

    Cc: David Howells <dhowells@redhat.com>
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>

Signed-off-by: Paulo Alcantara <paalcant@redhat.com>
2025-01-28 10:33:17 -03:00
Rafael Aquini 431f70ae66 mm: vmalloc: ensure vmap_block is initialised before adding to queue
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 3e3de7947c751509027d26b679ecd243bc9db255
Author: Will Deacon <will@kernel.org>
Date:   Mon Aug 12 18:16:06 2024 +0100

    mm: vmalloc: ensure vmap_block is initialised before adding to queue

    Commit 8c61291fd850 ("mm: fix incorrect vbq reference in
    purge_fragmented_block") extended the 'vmap_block' structure to contain a
    'cpu' field which is set at allocation time to the id of the initialising
    CPU.

    When a new 'vmap_block' is being instantiated by new_vmap_block(), the
    partially initialised structure is added to the local 'vmap_block_queue'
    xarray before the 'cpu' field has been initialised.  If another CPU is
    concurrently walking the xarray (e.g.  via vm_unmap_aliases()), then it
    may perform an out-of-bounds access to the remote queue thanks to an
    uninitialised index.

    This has been observed as UBSAN errors in Android:

     | Internal error: UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP
     |
     | Call trace:
     |  purge_fragmented_block+0x204/0x21c
     |  _vm_unmap_aliases+0x170/0x378
     |  vm_unmap_aliases+0x1c/0x28
     |  change_memory_common+0x1dc/0x26c
     |  set_memory_ro+0x18/0x24
     |  module_enable_ro+0x98/0x238
     |  do_init_module+0x1b0/0x310

    Move the initialisation of 'vb->cpu' in new_vmap_block() ahead of the
    addition to the xarray.

    Link: https://lkml.kernel.org/r/20240812171606.17486-1-will@kernel.org
    Fixes: 8c61291fd850 ("mm: fix incorrect vbq reference in purge_fragmented_block")
    Signed-off-by: Will Deacon <will@kernel.org>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
    Cc: Hailong.Liu <hailong.liu@oppo.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:35 -05:00
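
The fix above boils down to an ordering rule: every field a concurrent xa_for_each() walker may read has to be written before the block is published. A minimal sketch of new_vmap_block() after the change (following the commit text, not the verbatim upstream diff):

    /* new_vmap_block(), sketch of the corrected ordering */
    vb->cpu = raw_smp_processor_id();          /* initialise before publishing */
    err = xa_insert(xa, vb_idx, vb, gfp_mask); /* block becomes visible to
                                                  concurrent xarray walkers here */
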
Rafael Aquini c65dd7c5a8 mm: fix incorrect vbq reference in purge_fragmented_block
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 8c61291fd8500e3b35c7ec0c781b273d8cc96cde
Author: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Date:   Fri Jun 7 10:31:16 2024 +0800

    mm: fix incorrect vbq reference in purge_fragmented_block

    xa_for_each() in _vm_unmap_aliases() loops through all vbs.  However,
    since commit 062eacf57ad9 ("mm: vmalloc: remove a global vmap_blocks
    xarray") the vb from xarray may not be on the corresponding CPU
    vmap_block_queue.  Consequently, purge_fragmented_block() might use the
    wrong vbq->lock to protect the free list, leading to vbq->free breakage.

    Incorrect lock protection can exhaust all vmalloc space as follows:
    CPU0                                            CPU1
    +--------------------------------------------+
    |    +--------------------+     +-----+      |
    +--> |                    |---->|     |------+
         | CPU1:vbq free_list |     | vb1 |
    +--- |                    |<----|     |<-----+
    |    +--------------------+     +-----+      |
    +--------------------------------------------+

    _vm_unmap_aliases()                             vb_alloc()
                                                    new_vmap_block()
    xa_for_each(&vbq->vmap_blocks, idx, vb)
    --> vb in CPU1:vbq->freelist

    purge_fragmented_block(vb)
    spin_lock(&vbq->lock)                           spin_lock(&vbq->lock)
    --> use CPU0:vbq->lock                          --> use CPU1:vbq->lock

    list_del_rcu(&vb->free_list)                    list_add_tail_rcu(&vb->free_list, &vbq->free)
        __list_del(vb->prev, vb->next)
            next->prev = prev
        +--------------------+
        |                    |
        | CPU1:vbq free_list |
    +---|                    |<--+
    |   +--------------------+   |
    +----------------------------+
                                                    __list_add(new, head->prev, head)
    +--------------------------------------------+
    |    +--------------------+     +-----+      |
    +--> |                    |---->|     |------+
         | CPU1:vbq free_list |     | vb2 |
    +--- |                    |<----|     |<-----+
    |    +--------------------+     +-----+      |
    +--------------------------------------------+

            prev->next = next
    +--------------------------------------------+
    |----------------------------+               |
    |    +--------------------+  |  +-----+      |
    +--> |                    |--+  |     |------+
         | CPU1:vbq free_list |     | vb2 |
    +--- |                    |<----|     |<-----+
    |    +--------------------+     +-----+      |
    +--------------------------------------------+
    Here’s a list breakdown. All vbs, which were to be added to
    ‘prev’, cannot be used by list_for_each_entry_rcu(vb, &vbq->free,
    free_list) in vb_alloc(). Thus, vmalloc space is exhausted.

    This issue affects both erofs and f2fs, the stacktrace is as follows:
    erofs:
    [<ffffffd4ffb93ad4>] __switch_to+0x174
    [<ffffffd4ffb942f0>] __schedule+0x624
    [<ffffffd4ffb946f4>] schedule+0x7c
    [<ffffffd4ffb947cc>] schedule_preempt_disabled+0x24
    [<ffffffd4ffb962ec>] __mutex_lock+0x374
    [<ffffffd4ffb95998>] __mutex_lock_slowpath+0x14
    [<ffffffd4ffb95954>] mutex_lock+0x24
    [<ffffffd4fef2900c>] reclaim_and_purge_vmap_areas+0x44
    [<ffffffd4fef25908>] alloc_vmap_area+0x2e0
    [<ffffffd4fef24ea0>] vm_map_ram+0x1b0
    [<ffffffd4ff1b46f4>] z_erofs_lz4_decompress+0x278
    [<ffffffd4ff1b8ac4>] z_erofs_decompress_queue+0x650
    [<ffffffd4ff1b8328>] z_erofs_runqueue+0x7f4
    [<ffffffd4ff1b66a8>] z_erofs_read_folio+0x104
    [<ffffffd4feeb6fec>] filemap_read_folio+0x6c
    [<ffffffd4feeb68c4>] filemap_fault+0x300
    [<ffffffd4fef0ecac>] __do_fault+0xc8
    [<ffffffd4fef0c908>] handle_mm_fault+0xb38
    [<ffffffd4ffb9f008>] do_page_fault+0x288
    [<ffffffd4ffb9ed64>] do_translation_fault[jt]+0x40
    [<ffffffd4fec39c78>] do_mem_abort+0x58
    [<ffffffd4ffb8c3e4>] el0_ia+0x70
    [<ffffffd4ffb8c260>] el0t_64_sync_handler[jt]+0xb0
    [<ffffffd4fec11588>] ret_to_user[jt]+0x0

    f2fs:
    [<ffffffd4ffb93ad4>] __switch_to+0x174
    [<ffffffd4ffb942f0>] __schedule+0x624
    [<ffffffd4ffb946f4>] schedule+0x7c
    [<ffffffd4ffb947cc>] schedule_preempt_disabled+0x24
    [<ffffffd4ffb962ec>] __mutex_lock+0x374
    [<ffffffd4ffb95998>] __mutex_lock_slowpath+0x14
    [<ffffffd4ffb95954>] mutex_lock+0x24
    [<ffffffd4fef2900c>] reclaim_and_purge_vmap_areas+0x44
    [<ffffffd4fef25908>] alloc_vmap_area+0x2e0
    [<ffffffd4fef24ea0>] vm_map_ram+0x1b0
    [<ffffffd4ff1a3b60>] f2fs_prepare_decomp_mem+0x144
    [<ffffffd4ff1a6c24>] f2fs_alloc_dic+0x264
    [<ffffffd4ff175468>] f2fs_read_multi_pages+0x428
    [<ffffffd4ff17b46c>] f2fs_mpage_readpages+0x314
    [<ffffffd4ff1785c4>] f2fs_readahead+0x50
    [<ffffffd4feec3384>] read_pages+0x80
    [<ffffffd4feec32c0>] page_cache_ra_unbounded+0x1a0
    [<ffffffd4feec39e8>] page_cache_ra_order+0x274
    [<ffffffd4feeb6cec>] do_sync_mmap_readahead+0x11c
    [<ffffffd4feeb6764>] filemap_fault+0x1a0
    [<ffffffd4ff1423bc>] f2fs_filemap_fault+0x28
    [<ffffffd4fef0ecac>] __do_fault+0xc8
    [<ffffffd4fef0c908>] handle_mm_fault+0xb38
    [<ffffffd4ffb9f008>] do_page_fault+0x288
    [<ffffffd4ffb9ed64>] do_translation_fault[jt]+0x40
    [<ffffffd4fec39c78>] do_mem_abort+0x58
    [<ffffffd4ffb8c3e4>] el0_ia+0x70
    [<ffffffd4ffb8c260>] el0t_64_sync_handler[jt]+0xb0
    [<ffffffd4fec11588>] ret_to_user[jt]+0x0

    To fix this, introduce a cpu field within vmap_block to record which
    CPU's vbq this vb belongs to.

    Link: https://lkml.kernel.org/r/20240614021352.1822225-1-zhaoyang.huang@unisoc.com
    Link: https://lkml.kernel.org/r/20240607023116.1720640-1-zhaoyang.huang@unisoc.com
    Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")
    Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
    Suggested-by: Hailong.Liu <hailong.liu@oppo.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:17 -05:00
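
With the cpu field in place, the lock that protects a block's free_list can be derived from the block itself rather than from whichever CPU happens to run the walk. A sketch of the rule (names follow the commit text; not the verbatim diff):

    /* purge_fragmented_block(), sketch: take the lock of the queue the
     * block actually lives on, not the lock of the CPU doing the walk */
    struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, vb->cpu);

    spin_lock(&vbq->lock);
    list_del_rcu(&vb->free_list);
    spin_unlock(&vbq->lock);
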
Rado Vrbovsky 91d387b6f5 Merge: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5354

JIRA: https://issues.redhat.com/browse/RHEL-58558  
CVE: CVE-2024-45022  
  
Signed-off-by: Rafael Aquini <raquini@redhat.com>

Approved-by: Waiman Long <longman@redhat.com>
Approved-by: Herton R. Krzesinski <herton@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-01 08:08:13 +00:00
Rado Vrbovsky 14b4cc02eb Merge: BPF 6.9 rebase
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5142

Rebase BPF subsystem to upstream version 6.9

JIRA: https://issues.redhat.com/browse/RHEL-23649

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>

Approved-by: Viktor Malik <vmalik@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: Mark Salter <msalter@redhat.com>
Approved-by: Toke Høiland-Jørgensen <toke@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-10-30 07:25:08 +00:00
Jerome Marchand 33482c3f06 mm: Introduce vmap_page_range() to map pages in PCI address space
JIRA: https://issues.redhat.com/browse/RHEL-23649

Conflicts: There is no loongarch arch on RHEL-9 kernel.

commit d7bca9199a27b8690ae1c71dc11f825154af7234
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Fri Mar 8 09:12:54 2024 -0800

    mm: Introduce vmap_page_range() to map pages in PCI address space

    ioremap_page_range() should be used for ranges within vmalloc range only.
    The vmalloc ranges are allocated by get_vm_area(). PCI has "resource"
    allocator that manages PCI_IOBASE, IO_SPACE_LIMIT address range, hence
    introduce vmap_page_range() to be used exclusively to map pages
    in PCI address space.

    Fixes: 3e49a866c9dc ("mm: Enforce VM_IOREMAP flag and range in ioremap_page_range.")
    Reported-by: Miguel Ojeda <ojeda@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Tested-by: Miguel Ojeda <ojeda@kernel.org>
    Link: https://lore.kernel.org/bpf/CANiq72ka4rir+RTN2FQoT=Vvprp_Ao-CvoYEkSNqtSY+RZj+AA@mail.gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:14 +02:00
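
A sketch of how the new helper is meant to be called for the PCI I/O space case described above; the exact prototype and the call-site details are assumptions here, not quoted from the patch:

    /* assumed prototype */
    int vmap_page_range(unsigned long addr, unsigned long end,
                        phys_addr_t phys_addr, pgprot_t prot);

    /* sketch: remap a PCI I/O resource into the fixed PCI_IOBASE window */
    vmap_page_range(vaddr, vaddr + resource_size(res),
                    phys_addr, pgprot_device(PAGE_KERNEL));
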
Jerome Marchand 1e25c96f20 mm: Introduce VM_SPARSE kind and vm_area_[un]map_pages().
JIRA: https://issues.redhat.com/browse/RHEL-23649

commit e6f798225a31485e47a6e4f6aa07ee9fdf80c2cb
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Mon Mar 4 19:05:16 2024 -0800

    mm: Introduce VM_SPARSE kind and vm_area_[un]map_pages().

    vmap/vmalloc APIs are used to map a set of pages into contiguous kernel
    virtual space.

    get_vm_area() with appropriate flag is used to request an area of kernel
    address range. It's used for vmalloc, vmap, ioremap, xen use cases.
    - vmalloc use case dominates the usage. Such vm areas have VM_ALLOC flag.
    - the areas created by vmap() function should be tagged with VM_MAP.
    - ioremap areas are tagged with VM_IOREMAP.

    BPF would like to extend the vmap API to implement a lazily-populated
    sparse, yet contiguous kernel virtual space. Introduce VM_SPARSE flag
    and vm_area_map_pages(area, start_addr, count, pages) API to map a set
    of pages within a given area.
    It has the same sanity checks as vmap() does.
    It also checks that get_vm_area() was created with VM_SPARSE flag
    which identifies such areas in /proc/vmallocinfo
    and returns zero pages on read through /proc/kcore.

    The next commits will introduce bpf_arena which is a sparsely populated
    shared memory region between bpf program and user space process. It will
    map privately-managed pages into a sparse vm area with the following steps:

      // request virtual memory region during bpf prog verification
      area = get_vm_area(area_size, VM_SPARSE);

      // on demand
      vm_area_map_pages(area, kaddr, kend, pages);
      vm_area_unmap_pages(area, kaddr, kend);

      // after bpf program is detached and unloaded
      free_vm_area(area);

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
    Link: https://lore.kernel.org/bpf/20240305030516.41519-3-alexei.starovoitov@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:12 +02:00
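
For reference alongside the usage sketch quoted in the commit above, the declarations it implies look roughly as follows (an assumption of the exact prototypes, not a quote from the patch):

    int  vm_area_map_pages(struct vm_struct *area, unsigned long start,
                           unsigned long end, struct page **pages);
    void vm_area_unmap_pages(struct vm_struct *area, unsigned long start,
                             unsigned long end);
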
Jerome Marchand 48fe77504d mm: Enforce VM_IOREMAP flag and range in ioremap_page_range.
JIRA: https://issues.redhat.com/browse/RHEL-23649

commit 3e49a866c9dcbd8173e4f3e491293619a9e81fa4
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Mon Mar 4 19:05:15 2024 -0800

    mm: Enforce VM_IOREMAP flag and range in ioremap_page_range.

    There are various users of get_vm_area() + ioremap_page_range() APIs.
    Enforce that get_vm_area() was requested as VM_IOREMAP type and range
    passed to ioremap_page_range() matches created vm_area to avoid
    accidentally ioremap-ing into wrong address range.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/bpf/20240305030516.41519-2-alexei.starovoitov@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:12 +02:00
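
A sketch of the kind of check the commit above describes at the start of ioremap_page_range(); the error codes and the exact bounds test are illustrative, not quoted from the patch:

    struct vm_struct *area = find_vm_area((void *)addr);

    if (!area || !(area->flags & VM_IOREMAP))
        return -EINVAL;   /* range was not created via get_vm_area(..., VM_IOREMAP) */
    if (addr != (unsigned long)area->addr ||
        end > (unsigned long)area->addr + get_vm_area_size(area))
        return -ERANGE;   /* request does not match the created vm_area */
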
Rafael Aquini ccd819d2e9 mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
JIRA: https://issues.redhat.com/browse/RHEL-58558
CVE: CVE-2024-45022
Conflicts:
  * minor context difference due to RHEL missing upstream's v6.10
    commit 88ae5fb755b0 ("mm: vmalloc: enable memory allocation profiling")

This patch is a backport of the following upstream commit:
commit 61ebe5a747da649057c37be1c37eb934b4af79ca
Author: Hailong Liu <hailong.liu@oppo.com>
Date:   Thu Aug 8 20:19:56 2024 +0800

    mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0

    The __vmap_pages_range_noflush() assumes its argument pages** contains
    pages with the same page shift.  However, since commit e9c3cda4d86e ("mm,
    vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes
    __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation
    failed for high order, the pages** may contain two different page shifts
    (high order and order-0).  This could lead __vmap_pages_range_noflush() to
    perform incorrect mappings, potentially resulting in memory corruption.

    Users might encounter this as follows (vmap_allow_huge = true, 2M is for
    PMD_SIZE):

    kvmalloc(2M, __GFP_NOFAIL|GFP_X)
        __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP)
            vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0
                vmap_pages_range()
                    vmap_pages_range_noflush()
                        __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens

    We can remove the fallback code because if a high-order allocation fails,
    __vmalloc_node_range_noprof() will retry with order-0.  It is therefore
    unnecessary to fall back to order-0 here; fix this by removing the
    fallback code.

    Link: https://lkml.kernel.org/r/20240808122019.3361-1-hailong.liu@oppo.com
    Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")
    Signed-off-by: Hailong Liu <hailong.liu@oppo.com>
    Reported-by: Tangquan Zheng <zhengtangquan@oppo.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Acked-by: Barry Song <baohua@kernel.org>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-02 13:22:52 -04:00
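
The corruption described above is easiest to see as address arithmetic: __vmap_pages_range_noflush() advances the virtual address by one fixed stride per pages[] entry, derived from the single page_shift it is given for the whole array. A sketch using the 2M example from the commit:

    unsigned long stride = 1UL << page_shift;   /* page_shift = 21 -> 2M per entry */

    /* entry backed by an order-9 page: 2M mapped, 2M allocated  -> correct    */
    /* entry backed by an order-0 page: 2M mapped, 4K allocated  -> the other
     * 2M - 4K of the mapping points at memory never allocated for this request */
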
Rafael Aquini e66e65400a mm: hugetlb: add huge page size param to set_huge_pte_at()
JIRA: https://issues.redhat.com/browse/RHEL-27743
Conflicts:
  * arch/parisc/include/asm/hugetlb.h: hunks dropped (unsupported arch)
  * arch/parisc/mm/hugetlbpage.c:  hunks dropped (unsupported arch)
  * arch/riscv/include/asm/hugetlb.h: hunks dropped (unsupported arch)
  * arch/riscv/mm/hugetlbpage.c: hunks dropped (unsupported arch)
  * arch/sparc/mm/hugetlbpage.c: hunks dropped (unsupported arch)
  * mm/rmap.c: minor context conflict on the 7th hunk due to backport of
      upstream commit 322842ea3c72 ("mm/rmap: fix missing swap_free() in
      try_to_unmap() after arch_unmap_one() failed")

This patch is a backport of the following upstream commit:
commit 935d4f0c6dc8b3533e6e39346de7389a84490178
Author: Ryan Roberts <ryan.roberts@arm.com>
Date:   Fri Sep 22 12:58:03 2023 +0100

    mm: hugetlb: add huge page size param to set_huge_pte_at()

    Patch series "Fix set_huge_pte_at() panic on arm64", v2.

    This series fixes a bug in arm64's implementation of set_huge_pte_at(),
    which can result in an unprivileged user causing a kernel panic.  The
    problem was triggered when running the new uffd poison mm selftest for
    HUGETLB memory.  This test (and the uffd poison feature) was merged for
    v6.5-rc7.

    Ideally, I'd like to get this fix in for v6.6 and I've cc'ed stable
    (correctly this time) to get it backported to v6.5, where the issue first
    showed up.

    Description of Bug
    ==================

    arm64's huge pte implementation supports multiple huge page sizes, some of
    which are implemented in the page table with multiple contiguous entries.
    So set_huge_pte_at() needs to work out how big the logical pte is, so that
    it can also work out how many physical ptes (or pmds) need to be written.
    It previously did this by grabbing the folio out of the pte and querying
    its size.

    However, there are cases when the pte being set is actually a swap entry.
    But this also used to work fine, because for huge ptes, we only ever saw
    migration entries and hwpoison entries.  And both of these types of swap
    entries have a PFN embedded, so the code would grab that and everything
    still worked out.

    But over time, more calls to set_huge_pte_at() have been added that set
    swap entry types that do not embed a PFN.  And this causes the code to go
    bang.  The triggering case is for the uffd poison test, commit
    99aa77215ad0 ("selftests/mm: add uffd unit test for UFFDIO_POISON"), which
    causes a PTE_MARKER_POISONED swap entry to be set, courtesy of commit
    8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") -
    added in v6.5-rc7.  Although review shows that there are other call sites
    that set PTE_MARKER_UFFD_WP (which also has no PFN), these don't trigger
    on arm64 because arm64 doesn't support UFFD WP.

    If CONFIG_DEBUG_VM is enabled, we do at least get a BUG(), but otherwise,
    it will dereference a bad pointer in page_folio():

        static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
        {
            VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry));

            return page_folio(pfn_to_page(swp_offset_pfn(entry)));
        }

    Fix
    ===

    The simplest fix would have been to revert the dodgy cleanup commit
    18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()"), but since
    things have moved on, this would have required an audit of all the new
    set_huge_pte_at() call sites to see if they should be converted to
    set_huge_swap_pte_at().  As per the original intent of the change, it
    would also leave us open to future bugs when people invariably get it
    wrong and call the wrong helper.

    So instead, I've added a huge page size parameter to set_huge_pte_at().
    This means that the arm64 code has the size in all cases.  It's a bigger
    change, due to needing to touch the arches that implement the function,
    but it is entirely mechanical, so in my view, low risk.

    I've compile-tested all touched arches; arm64, parisc, powerpc, riscv,
    s390, sparc (and additionally x86_64).  I've additionally booted and run
    mm selftests against arm64, where I observe the uffd poison test is fixed,
    and there are no other regressions.

    This patch (of 2):

    In order to fix a bug, arm64 needs to be told the size of the huge page
    for which the pte is being set in set_huge_pte_at().  Provide for this by
    adding an `unsigned long sz` parameter to the function.  This follows the
    same pattern as huge_pte_clear().

    This commit makes the required interface modifications to the core mm as
    well as all arches that implement this function (arm64, parisc, powerpc,
    riscv, s390, sparc).  The actual arm64 bug will be fixed in a separate
    commit.

    No behavioral changes intended.

    Link: https://lkml.kernel.org/r/20230922115804.2043771-1-ryan.roberts@arm.com
    Link: https://lkml.kernel.org/r/20230922115804.2043771-2-ryan.roberts@arm.com
    Fixes: 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs")
    Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
    Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>     [powerpc 8xx]
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>       [vmalloc change]
    Cc: Alexandre Ghiti <alex@ghiti.fr>
    Cc: Albert Ou <aou@eecs.berkeley.edu>
    Cc: Alexander Gordeev <agordeev@linux.ibm.com>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Axel Rasmussen <axelrasmussen@google.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Palmer Dabbelt <palmer@dabbelt.com>
    Cc: Paul Walmsley <paul.walmsley@sifive.com>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Qi Zheng <zhengqi.arch@bytedance.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: SeongJae Park <sj@kernel.org>
    Cc: Sven Schnelle <svens@linux.ibm.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: <stable@vger.kernel.org>    [6.5+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:22:19 -04:00
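
The interface change itself is mechanical, as the commit above says; a sketch of the prototype before and after (the new argument follows the huge_pte_clear() pattern):

    /* before */
    void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                         pte_t *ptep, pte_t pte);
    /* after */
    void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                         pte_t *ptep, pte_t pte, unsigned long sz);
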
Rafael Aquini a85223eeb8 mm: ptep_get() conversion
JIRA: https://issues.redhat.com/browse/RHEL-27742
Conflicts:
  * drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c: hunks dropped as
      these are already applied via RHEL commit 26418f1a34 ("Merge DRM
      changes from upstream v6.4..v6.5")
  * kernel/events/uprobes.c: minor context difference due to backport of upstream
      commit ec8832d007cb ("mmu_notifiers: don't invalidate secondary TLBs
      as part of mmu_notifier_invalidate_range_end()")
  * mm/gup.c: minor context difference on the 2nd hunk due to backport of upstream
      commit d74943a2f3cd ("mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT")
  * mm/hugetlb.c: hunk dropped as it's unnecessary given the proactive work done
      on the backport of upstream commit 191fcdb6c9cf ("mm/hugetlb.c: fix a bug
      within a BUG(): inconsistent pte comparison")
  * mm/ksm.c: context conflicts and differences on the 1st hunk are due to
      out-of-order backport of upstream commit 04dee9e85cf5 ("mm/various:
      give up if pte_offset_map[_lock]() fails") being compensated for only now.
  * mm/memory.c: minor context difference on the 35th hunk due to backport of
      upstream commit 04c35ab3bdae ("x86/mm/pat: fix VM_PAT handling in COW mappings")
  * mm/mempolicy.c: minor context difference on the 1st hunk due to backport of
      upstream commit 24526268f4e3 ("mm: mempolicy: keep VMA walk if both
      MPOL_MF_STRICT and MPOL_MF_MOVE are specified")
  * mm/migrate.c: minor context difference on the 2nd hunk due to backport of
      upstream commits 161e393c0f63 ("mm: Make pte_mkwrite() take a VMA"), and
      f3ebdf042df4 ("mm: don't check VMA write permissions if the PTE/PMD
      indicates write permissions")
  * mm/migrate_device.c: minor context difference on the 5th hunk due to backport
      of upstream commit ec8832d007cb ("mmu_notifiers: don't invalidate secondary
      TLBs  as part of mmu_notifier_invalidate_range_end()")
  * mm/swapfile.c: minor context differences on the 1st and 2nd hunks due to
      backport of upstream commit f985fc322063 ("mm/swapfile: fix wrong swap
      entry type for hwpoisoned swapcache page")
  * mm/vmscan.c: minor context difference on the 3rd hunk due to backport of
      upstream commit c28ac3c7eb94 ("mm/mglru: skip special VMAs in
      lru_gen_look_around()")

This patch is a backport of the following upstream commit:
commit c33c794828f21217f72ce6fc140e0d34e0d56bff
Author: Ryan Roberts <ryan.roberts@arm.com>
Date:   Mon Jun 12 16:15:45 2023 +0100

    mm: ptep_get() conversion

    Convert all instances of direct pte_t* dereferencing to instead use
    ptep_get() helper.  This means that by default, the accesses change from a
    C dereference to a READ_ONCE().  This is technically the correct thing to
    do since where pgtables are modified by HW (for access/dirty) they are
    volatile and therefore we should always ensure READ_ONCE() semantics.

    But more importantly, by always using the helper, it can be overridden by
    the architecture to fully encapsulate the contents of the pte.  Arch code
    is deliberately not converted, as the arch code knows best.  It is
    intended that arch code (arm64) will override the default with its own
    implementation that can (e.g.) hide certain bits from the core code, or
    determine young/dirty status by mixing in state from another source.

    Conversion was done using Coccinelle:

    ----

    // $ make coccicheck \
    //          COCCI=ptepget.cocci \
    //          SPFLAGS="--include-headers" \
    //          MODE=patch

    virtual patch

    @ depends on patch @
    pte_t *v;
    @@

    - *v
    + ptep_get(v)

    ----

    Then reviewed and hand-edited to avoid multiple unnecessary calls to
    ptep_get(), instead opting to store the result of a single call in a
    variable, where it is correct to do so.  This aims to negate any cost of
    READ_ONCE() and will benefit arch-overrides that may be more complex.

    Included is a fix for an issue in an earlier version of this patch that
    was pointed out by kernel test robot.  The issue arose because config
    MMU=n elides definition of the ptep helper functions, including
    ptep_get().  HUGETLB_PAGE=n configs still define a simple
    huge_ptep_clear_flush() for linking purposes, which dereferences the ptep.
    So when both configs are disabled, this caused a build error because
    ptep_get() is not defined.  Fix by continuing to do a direct dereference
    when MMU=n.  This is safe because for this config the arch code cannot be
    trying to virtualize the ptes because none of the ptep helpers are
    defined.

    Link: https://lkml.kernel.org/r/20230612151545.3317766-4-ryan.roberts@arm.com
    Reported-by: kernel test robot <lkp@intel.com>
    Link: https://lore.kernel.org/oe-kbuild-all/202305120142.yXsNEo6H-lkp@intel.com/
    Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Alex Williamson <alex.williamson@redhat.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: Dave Airlie <airlied@gmail.com>
    Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jason Gunthorpe <jgg@ziepe.ca>
    Cc: Jérôme Glisse <jglisse@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
    Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
    Cc: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: SeongJae Park <sj@kernel.org>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Cc: Yu Zhao <yuzhao@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:52 -04:00
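
In C terms, the Coccinelle rule quoted above performs this substitution at every call site outside arch code:

    /* before: plain dereference of the pte table entry */
    pte_t pte = *ptep;

    /* after: READ_ONCE() semantics by default, and the architecture can
     * override ptep_get() to fully encapsulate the pte contents */
    pte_t pte = ptep_get(ptep);
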
Rafael Aquini 614c355285 mm/vmalloc: replace the ternary conditional operator with min()
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 0e4bc271110e0c58c010071a9bbf150f39851dac
Author: Lu Hongfei <luhongfei@vivo.com>
Date:   Fri Jun 9 17:30:57 2023 +0800

    mm/vmalloc: replace the ternary conditional operator with min()

    It would be better to replace the traditional ternary conditional
    operator with min() in zero_iter

    Link: https://lkml.kernel.org/r/20230609093057.27777-1-luhongfei@vivo.com
    Signed-off-by: Lu Hongfei <luhongfei@vivo.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:10 -04:00
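
A sketch of the change in zero_iter() described above (exact variable names and min() variant in the upstream function may differ):

    /* before */
    size_t num = remains < PAGE_SIZE ? remains : PAGE_SIZE;
    /* after */
    size_t num = min(remains, PAGE_SIZE);
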
Rafael Aquini 4ff8d8912e mm: vmalloc must set pte via arch code
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit b3f78e74986546a6da5d28c24e627d95d17f79ec
Author: Ryan Roberts <ryan.roberts@arm.com>
Date:   Fri Jun 2 10:29:46 2023 +0100

    mm: vmalloc must set pte via arch code

    Patch series "Fixes for pte encapsulation bypasses", v3.

    A series to improve the encapsulation of pte entries by disallowing
    non-arch code from directly dereferencing pte_t pointers.

    This patch (of 4):

    It is bad practice to directly set pte entries within a pte table.
    Instead all modifications must go through arch-provided helpers such as
    set_pte_at() to give the arch code visibility and allow it to check (and
    potentially modify) the operation.

    Link: https://lkml.kernel.org/r/20230602092949.545577-1-ryan.roberts@arm.com
    Link: https://lkml.kernel.org/r/20230602092949.545577-2-ryan.roberts@arm.com
    Fixes: 3e9a9e256b ("mm: add a vmap_pfn function")
    Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Acked-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: SeongJae Park <sj@kernel.org>
    Cc: Yu Zhao <yuzhao@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:09 -04:00
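
A sketch of what that rule means for the pte callback used by vmap_pfn() (not the verbatim diff; the surrounding callback plumbing is omitted):

    /* before: writes the pte table entry directly, bypassing the arch */
    *pte = pte_mkspecial(pfn_pte(pfn, prot));

    /* after: the store goes through the arch-provided helper */
    set_pte_at(&init_mm, addr, pte, pte_mkspecial(pfn_pte(pfn, prot)));
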
Rafael Aquini 333153a591 mm/vmalloc: dont purge usable blocks unnecessarily
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 77e50af07f14ea7b53f82f9417ddf2fd96c78da3
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu May 25 14:57:09 2023 +0200

    mm/vmalloc: dont purge usable blocks unnecessarily

    Purging fragmented blocks is done unconditionally in several contexts:

      1) From drain_vmap_area_work(), when the number of lazy to be freed
         vmap_areas reached the threshold

      2) Reclaiming vmalloc address space from pcpu_get_vm_areas()

      3) _vm_unmap_aliases()

    #1 There is no reason to zap fragmented vmap blocks unconditionally, simply
       because reclaiming all lazy areas drains at least

          32MB * fls(num_online_cpus())

       per invocation which is plenty.

    #2 Reclaiming when running out of space or due to memory pressure makes a
       lot of sense

    #3 _vm_unmap_aliases() needs to touch everything because the caller has no
       clue which vmap_area used a particular page last and the vmap_area lost
       that information too.

       Except for the vfree + VM_FLUSH_RESET_PERMS case, which removes the
       vmap area first and then cares about the flush. That in turn requires
       a full walk of _all_ vmap areas including the one which was just
       added to the purge list.

       But as this has to be flushed anyway this is an opportunity to combine
       outstanding TLB flushes and do the housekeeping of purging freed areas,
       but like #1 there is no real good reason to zap usable vmap blocks
       unconditionally.

    Add a @force_purge argument to the newly split out block purge function and
    if not true only purge fragmented blocks which have less than 1/4 of their
    capacity left.

    Rename purge_vmap_area_lazy() to reclaim_and_purge_vmap_areas() to make it
    clear what the function does.

    [lstoakes@gmail.com: correct VMAP_PURGE_THRESHOLD check]
      Link: https://lkml.kernel.org/r/3e92ef61-b910-4576-88e7-cf43211fd4e7@lucifer.local
    Link: https://lkml.kernel.org/r/20230525124504.864005691@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:09 -04:00
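
A sketch of the resulting purge policy (the threshold name follows the fix-up note in the commit; the surrounding checks are omitted):

    #define VMAP_PURGE_THRESHOLD	(VMAP_BBMAP_BITS / 4)

    /* purge_fragmented_block(), sketch */
    if (!(force_purge || vb->free < VMAP_PURGE_THRESHOLD))
        return false;	/* still >= 1/4 of the capacity usable: keep the block */
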
Rafael Aquini 77fce32226 mm/vmalloc: add missing READ/WRITE_ONCE() annotations
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 7f48121e9fa82bdaf0bd0f7a7e49f48803c6c0e8
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu May 25 14:57:08 2023 +0200

    mm/vmalloc: add missing READ/WRITE_ONCE() annotations

    purge_fragmented_blocks() accesses vmap_block::free and vmap_block::dirty
    lockless for a quick check.

    Add the missing READ/WRITE_ONCE() annotations.

    Link: https://lkml.kernel.org/r/20230525124504.807356682@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:08 -04:00
Rafael Aquini c112ce61e4 mm/vmalloc: check free space in vmap_block lockless
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 43d7650234c62201ba3ca5b731226b0b189989a8
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu May 25 14:57:07 2023 +0200

    mm/vmalloc: check free space in vmap_block lockless

    vb_alloc() unconditionally locks a vmap_block on the free list to check
    the free space.

    This can be done locklessly because vmap_block::free never increases, it's
    only decreased on allocations.

    Check the free space lockless and only if that succeeds, recheck under the
    lock.

    Link: https://lkml.kernel.org/r/20230525124504.750481992@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:07 -04:00
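
A sketch of the resulting vb_alloc() fast path, combining this quick check with the READ/WRITE_ONCE() annotations from the companion commit above (not the verbatim diff):

    list_for_each_entry_rcu(vb, &vbq->free, free_list) {
        /* lockless quick check: vmap_block::free only ever decreases */
        if (READ_ONCE(vb->free) < (1UL << order))
            continue;

        spin_lock(&vb->lock);
        if (vb->free < (1UL << order)) {        /* recheck under the lock */
            spin_unlock(&vb->lock);
            continue;
        }
        /* ... carve the allocation out of this block under vb->lock ... */
    }
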
Rafael Aquini 84ecd7ac0b mm/vmalloc: prevent flushing dirty space over and over
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit a09fad96ffb1d0da007283727235a03b813f989b
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu May 25 14:57:05 2023 +0200

    mm/vmalloc: prevent flushing dirty space over and over

    vmap blocks which have active mappings cannot be purged.  Allocations
    which have been freed are accounted for in vmap_block::dirty_min/max, so
    that they can be detected in _vm_unmap_aliases() as potentially stale
    TLBs.

    If there are several invocations of _vm_unmap_aliases() then each of them
    will flush the dirty range.  That's pointless and just increases the
    probability of full TLB flushes.

    Avoid that by resetting the flush range after accounting for it.  That's
    safe versus other invocations of _vm_unmap_aliases() because this is all
    serialized with vmap_purge_lock.

    Link: https://lkml.kernel.org/r/20230525124504.692056496@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:07 -04:00
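
A sketch of the reset described above, in the per-block accounting done by _vm_unmap_aliases() (not the verbatim diff):

    s = va_start + (vb->dirty_min << PAGE_SHIFT);
    e = va_start + (vb->dirty_max << PAGE_SHIFT);
    start = min(s, start);
    end   = max(e, end);

    /* range is accounted for: reset it so a later invocation, serialized
     * by vmap_purge_lock, does not flush the same space again */
    vb->dirty_min = VMAP_BBMAP_BITS;
    vb->dirty_max = 0;
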
Rafael Aquini 9c69762245 mm/vmalloc: avoid iterating over per CPU vmap blocks twice
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit ca5e46c3400badc418a8fbcaeba711ad60ff4e1b
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu May 25 14:57:04 2023 +0200

    mm/vmalloc: avoid iterating over per CPU vmap blocks twice

    _vm_unmap_aliases() walks the per CPU xarrays to find partially unmapped
    blocks and then walks the per cpu free lists to purge fragmented blocks.

    Arguably that's a waste of CPU cycles and cache lines, as the full xarray
    walk already touches every block.

    Avoid this double iteration:

      - Split out the code to purge one block and the code to free the local
        purge list into helper functions.

      - Try to purge the fragmented blocks in the xarray walk before looking at
        their dirty space.

    Link: https://lkml.kernel.org/r/20230525124504.633469722@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:06 -04:00
Rafael Aquini 1c677d5471 mm/vmalloc: prevent stale TLBs in fully utilized blocks
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit fc1e0d980037e065441cd1d9a1a5e9c9117e4ba2
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu May 25 14:57:03 2023 +0200

    mm/vmalloc: prevent stale TLBs in fully utilized blocks

    Patch series "mm/vmalloc: Assorted fixes and improvements", v2.

    this series addresses the following issues:

      1) Prevent the stale TLB problem related to fully utilized vmap blocks

      2) Avoid the double per CPU list walk in _vm_unmap_aliases()

      3) Avoid flushing dirty space over and over

      4) Add a lockless quickcheck in vb_alloc() and add missing
         READ/WRITE_ONCE() annotations

      5) Prevent overeager purging of usable vmap_blocks if
         not under memory/address space pressure.

    This patch (of 6):

    _vm_unmap_aliases() is used to ensure that no unflushed TLB entries for a
    page are left in the system. This is required due to the lazy TLB flush
    mechanism in vmalloc.

    This is attempted by walking the per CPU free lists, but those do not
    contain fully utilized vmap blocks because a block is removed from the
    free list once its free space becomes zero.

    When the block is not fully unmapped then it is not on the purge list
    either.

    So neither the per CPU list iteration nor the purge list walk find the
    block and if the page was mapped via such a block and the TLB has not yet
    been flushed, the guarantee of _vm_unmap_aliases() that there are no stale
    TLBs after returning is broken:

    x = vb_alloc() // Removes vmap_block from free list because vb->free became 0
    vb_free(x)     // Unmaps page and marks in dirty_min/max range
                   // Block has still mappings and is not put on purge list

    // Page is reused
    vm_unmap_aliases() // Can't find vmap block with the dirty space -> FAIL

    So instead of walking the per CPU free lists, walk the per CPU xarrays
    which hold pointers to _all_ active blocks in the system including those
    removed from the free lists.

    Link: https://lkml.kernel.org/r/20230525122342.109672430@linutronix.de
    Link: https://lkml.kernel.org/r/20230525124504.573987880@linutronix.de
    Fixes: db64fe0225 ("mm: rewrite vmap layer")
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:05 -04:00
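
A sketch of the changed walk in _vm_unmap_aliases() (not the verbatim diff): iterate the per-CPU xarrays, which hold every active block, instead of the free lists, which never contain fully utilized blocks:

    for_each_possible_cpu(cpu) {
        struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, cpu);
        struct vmap_block *vb;
        unsigned long idx;

        rcu_read_lock();
        xa_for_each(&vbq->vmap_blocks, idx, vb) {
            /* collect this block's dirty range for the TLB flush and
             * treat it as a purge candidate where appropriate */
        }
        rcu_read_unlock();
    }
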
Lucas Zampieri dde2447c06 Merge: CVE-2024-41032: mm: vmalloc: check if a hash-index is in cpu_possible_mask
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4841

JIRA: https://issues.redhat.com/browse/RHEL-50955  
CVE: CVE-2024-41032

```
mm: vmalloc: check if a hash-index is in cpu_possible_mask

The problem is that there are systems where cpu_possible_mask has gaps
between set CPUs, for example SPARC.  In this scenario the addr_to_vb_xa()
hash function can return an index that accesses a not-possible and not set
up CPU area via the per_cpu() macro.  This results in an oops on SPARC.

A per-cpu vmap_block_queue is also used as hash table, incorrectly
assuming the cpu_possible_mask has no gaps.  Fix it by adjusting an index
to a next possible CPU.

Link: https://lkml.kernel.org/r/20240626140330.89836-1-urezki@gmail.com
Fixes: 062eacf57ad9 ("mm: vmalloc: remove a global vmap_blocks xarray")
Reported-by: Nick Bowler <nbowler@draconx.ca>
Closes: https://lore.kernel.org/linux-kernel/ZntjIE6msJbF8zTa@MiWiFi-R3L-srv/T/
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hailong.Liu <hailong.liu@oppo.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit a34acf30b19bc4ee3ba2f1082756ea2604c19138)
```

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>

Approved-by: Audra Mitchell <aubaker@redhat.com>
Approved-by: Waiman Long <longman@redhat.com>
Approved-by: Baoquan He <5820488-baoquan_he@users.noreply.gitlab.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-08-15 12:24:21 +00:00
CKI Backport Bot 4eb8a5f6ff mm: vmalloc: check if a hash-index is in cpu_possible_mask
JIRA: https://issues.redhat.com/browse/RHEL-50955
CVE: CVE-2024-41032

commit a34acf30b19bc4ee3ba2f1082756ea2604c19138
Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
Date:   Wed Jun 26 16:03:30 2024 +0200

    mm: vmalloc: check if a hash-index is in cpu_possible_mask

    The problem is that there are systems where cpu_possible_mask has gaps
    between set CPUs, for example SPARC.  In this scenario the addr_to_vb_xa()
    hash function can return an index that accesses a not-possible and not set
    up CPU area via the per_cpu() macro.  This results in an oops on SPARC.

    A per-cpu vmap_block_queue is also used as hash table, incorrectly
    assuming the cpu_possible_mask has no gaps.  Fix it by adjusting an index
    to a next possible CPU.

    Link: https://lkml.kernel.org/r/20240626140330.89836-1-urezki@gmail.com
    Fixes: 062eacf57ad9 ("mm: vmalloc: remove a global vmap_blocks xarray")
    Reported-by: Nick Bowler <nbowler@draconx.ca>
    Closes: https://lore.kernel.org/linux-kernel/ZntjIE6msJbF8zTa@MiWiFi-R3L-srv/T/
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Hailong.Liu <hailong.liu@oppo.com>
    Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-07-30 08:57:16 +00:00
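
A sketch of the adjusted hash helper described above (close to, but not necessarily identical to, the upstream addr_to_vb_xa()):

    static struct xarray *addr_to_vb_xa(unsigned long addr)
    {
        int index = (addr / VMAP_BLOCK_SIZE) % nr_cpu_ids;

        /* On systems with gaps in cpu_possible_mask (e.g. SPARC) the modulo
         * can land on a CPU that was never set up; move to the next one. */
        if (!cpu_possible(index))
            index = cpumask_next(index, cpu_possible_mask);

        return &per_cpu(vmap_block_queue, index).vmap_blocks;
    }
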
CKI Backport Bot d00be2c43a mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
JIRA: https://issues.redhat.com/browse/RHEL-46467
CVE: CVE-2024-39474

commit 8e0545c83d672750632f46e3f9ad95c48c91a0fc
Author: Hailong.Liu <hailong.liu@oppo.com>
Date:   Fri May 10 18:01:31 2024 +0800

    mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL

    commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc")
    includes support for __GFP_NOFAIL, but it presents a conflict with commit
    dd544141b9eb ("vmalloc: back off when the current task is OOM-killed").  A
    possible scenario is as follows:

    process-a
    __vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL)
        __vmalloc_area_node()
            vm_area_alloc_pages()
                    --> oom-killer send SIGKILL to process-a
            if (fatal_signal_pending(current)) break;
    --> return NULL;

    To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages()
    if __GFP_NOFAIL is set.

    This issue occurred during OPLUS KASAN TEST. Below is part of the log
    -> oom-killer sends signal to process
    [65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198

    [65731.259685] [T32454] Call trace:
    [65731.259698] [T32454]  dump_backtrace+0xf4/0x118
    [65731.259734] [T32454]  show_stack+0x18/0x24
    [65731.259756] [T32454]  dump_stack_lvl+0x60/0x7c
    [65731.259781] [T32454]  dump_stack+0x18/0x38
    [65731.259800] [T32454]  mrdump_common_die+0x250/0x39c [mrdump]
    [65731.259936] [T32454]  ipanic_die+0x20/0x34 [mrdump]
    [65731.260019] [T32454]  atomic_notifier_call_chain+0xb4/0xfc
    [65731.260047] [T32454]  notify_die+0x114/0x198
    [65731.260073] [T32454]  die+0xf4/0x5b4
    [65731.260098] [T32454]  die_kernel_fault+0x80/0x98
    [65731.260124] [T32454]  __do_kernel_fault+0x160/0x2a8
    [65731.260146] [T32454]  do_bad_area+0x68/0x148
    [65731.260174] [T32454]  do_mem_abort+0x151c/0x1b34
    [65731.260204] [T32454]  el1_abort+0x3c/0x5c
    [65731.260227] [T32454]  el1h_64_sync_handler+0x54/0x90
    [65731.260248] [T32454]  el1h_64_sync+0x68/0x6c

    [65731.260269] [T32454]  z_erofs_decompress_queue+0x7f0/0x2258
    --> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL);
            kernel panic by NULL pointer dereference.
            erofs assumes kvmalloc with __GFP_NOFAIL never returns NULL.
    [65731.260293] [T32454]  z_erofs_runqueue+0xf30/0x104c
    [65731.260314] [T32454]  z_erofs_readahead+0x4f0/0x968
    [65731.260339] [T32454]  read_pages+0x170/0xadc
    [65731.260364] [T32454]  page_cache_ra_unbounded+0x874/0xf30
    [65731.260388] [T32454]  page_cache_ra_order+0x24c/0x714
    [65731.260411] [T32454]  filemap_fault+0xbf0/0x1a74
    [65731.260437] [T32454]  __do_fault+0xd0/0x33c
    [65731.260462] [T32454]  handle_mm_fault+0xf74/0x3fe0
    [65731.260486] [T32454]  do_mem_abort+0x54c/0x1b34
    [65731.260509] [T32454]  el0_da+0x44/0x94
    [65731.260531] [T32454]  el0t_64_sync_handler+0x98/0xb4
    [65731.260553] [T32454]  el0t_64_sync+0x198/0x19c

    Link: https://lkml.kernel.org/r/20240510100131.1865-1-hailong.liu@oppo.com
    Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL")
    Signed-off-by: Hailong.Liu <hailong.liu@oppo.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Suggested-by: Barry Song <21cnbao@gmail.com>
    Reported-by: Oven <liyangouwen1@oppo.com>
    Reviewed-by: Barry Song <baohua@kernel.org>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Chao Yu <chao@kernel.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Gao Xiang <xiang@kernel.org>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-07-10 20:44:46 +00:00
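
One way to express the rule the fix describes, as a sketch (the shape of the actual upstream diff may differ):

    /* vm_area_alloc_pages(), sketch: a pending fatal signal must not abort
     * a __GFP_NOFAIL allocation, otherwise callers that rely on the nofail
     * guarantee (e.g. erofs above) dereference an unexpected NULL */
    if (fatal_signal_pending(current) && !(gfp & __GFP_NOFAIL))
        break;
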
Aristeu Rozanski 865b733db1 mm/vmalloc: eliminated the lock contention from twice to once
JIRA: https://issues.redhat.com/browse/RHEL-28501
Tested: by me
Conflicts: we lack d093602919a, so vmap_area_lock is used instead of vmap_node->busy.lock

commit aaab830ad887629156ef17097c2ad24ce6fb8177
Author: rulinhuang <rulin.huang@intel.com>
Date:   Wed Mar 6 21:14:40 2024 -0500

    mm/vmalloc: eliminated the lock contention from twice to once

    When allocating a new memory area where the mapping address range is
    known, it is observed that the vmap_node->busy.lock is acquired twice.

    The first acquisition occurs in the alloc_vmap_area() function when
    inserting the vm area into the vm mapping red-black tree.  The second
    acquisition occurs in the setup_vmalloc_vm() function when updating the
    properties of the vm, such as flags and address, etc.

    Combine these two operations together in alloc_vmap_area(), which improves
    scalability when the vmap_node->busy.lock is contended.  By doing so, the
    need to acquire the lock twice can also be eliminated to once.

    With the above change, tested on intel sapphire rapids platform(224 vcpu),
    a 4% performance improvement is gained on
    stress-ng/pthread(https://github.com/ColinIanKing/stress-ng), which is the
    stress test of thread creations.

    Link: https://lkml.kernel.org/r/20240307021440.64967-1-rulin.huang@intel.com
    Co-developed-by: "Chen, Tim C" <tim.c.chen@intel.com>
    Signed-off-by: "Chen, Tim C" <tim.c.chen@intel.com>
    Co-developed-by: "King, Colin" <colin.king@intel.com>
    Signed-off-by: "King, Colin" <colin.king@intel.com>
    Signed-off-by: rulinhuang <rulin.huang@intel.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Tim Chen <tim.c.chen@linux.intel.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Wangyang Guo <wangyang.guo@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-06-12 09:00:35 -04:00
Nico Pache c96ebb2389 mm/vmalloc: add a safer version of find_vm_area() for debug
commit 0818e739b5c061b0251c30152380600fb9b84c0c
Author: Joel Fernandes (Google) <joel@joelfernandes.org>
Date:   Mon Sep 4 18:08:04 2023 +0000

    mm/vmalloc: add a safer version of find_vm_area() for debug

    It is unsafe to dump vmalloc area information when trying to do so from
    some contexts.  Add a safer trylock version of the same function to do a
    best-effort VMA finding and use it from vmalloc_dump_obj().

    [applied test robot feedback on unused function fix.]
    [applied Uladzislau feedback on locking.]
    Link: https://lkml.kernel.org/r/20230904180806.1002832-1-joel@joelfernandes.org
    Fixes: 98f180837a ("mm: Make mem_dump_obj() handle vmalloc() memory")
    Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Zqiang <qiang.zhang1211@gmail.com>
    Cc: <stable@vger.kernel.org>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5619
Signed-off-by: Nico Pache <npache@redhat.com>
2024-04-30 17:51:28 -06:00
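
A sketch of the best-effort pattern the commit above describes; the lock and helper names here are purely illustrative, not the actual code that was added:

    /* vmalloc_dump_obj(), sketch: never spin on the vmap locks from an
     * unsafe context; give up and report nothing instead */
    if (!spin_trylock(&vmap_area_lock))         /* illustrative lock name */
        return false;
    va = find_vmap_area_locked(addr);           /* hypothetical helper */
    spin_unlock(&vmap_area_lock);
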
Nico Pache 05fc00e3ce mm: add a call to flush_cache_vmap() in vmap_pfn()
commit a50420c79731fc5cf27ad43719c1091e842a2606
Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Wed Aug 9 18:46:33 2023 +0200

    mm: add a call to flush_cache_vmap() in vmap_pfn()

    flush_cache_vmap() must be called after new vmalloc mappings are installed
    in the page table in order to allow architectures to make sure the new
    mapping is visible.

    It could lead to a panic, since on some architectures (like powerpc)
    the page table walker could see the wrong pte value and trigger a
    spurious page fault that cannot be resolved (see commit f1cb8f9beb
    ("powerpc/64s/radix: avoid ptesync after set_pte and
    ptep_set_access_flags")).

    But actually the patch is aimed at riscv: the riscv specification
    allows the caching of invalid entries in the TLB, and since we recently
    removed the vmalloc page fault handling, we now need to issue a TLB
    shootdown whenever a new vmalloc mapping is created
    (https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/).
    That's a temporary solution; there are ways to avoid it :)

    Link: https://lkml.kernel.org/r/20230809164633.1556126-1-alexghiti@rivosinc.com
    Fixes: 3e9a9e256b ("mm: add a vmap_pfn function")
    Reported-by: Dylan Jhong <dylan@andestech.com>
    Closes: https://lore.kernel.org/linux-riscv/ZMytNY2J8iyjbPPy@atctrx.andestech.com/
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Reviewed-by: Dylan Jhong <dylan@andestech.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5619
Signed-off-by: Nico Pache <npache@redhat.com>
2024-04-30 17:51:28 -06:00
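
The fix above is purely about ordering: install the page-table entries first,
then call flush_cache_vmap() over the range. A hedged sketch with an invented
map_range_in_page_table() stub standing in for the real mapping code:

    void flush_cache_vmap(unsigned long start, unsigned long end);
    int map_range_in_page_table(unsigned long start, unsigned long end);

    static int vmap_range_sketch(unsigned long start, unsigned long end)
    {
            int err = map_range_in_page_table(start, end);

            if (err)
                    return err;
            /* only now is it safe to tell the architecture about the new mapping */
            flush_cache_vmap(start, end);
            return 0;
    }
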
Nico Pache 9d6c4aad20 mm/vmalloc: fix the unchecked dereference warning in vread_iter()
commit ca6c2ce1b481996ae5b16e4589d3c0dd61899fa8
Author: Baoquan He <bhe@redhat.com>
Date:   Wed Oct 18 22:50:14 2023 +0800

    mm/vmalloc: fix the unchecked dereference warning in vread_iter()

    LKP reported smatch warning as below:

    ===================
    smatch warnings:
    mm/vmalloc.c:3689 vread_iter() error: we previously assumed 'vm' could be null (see line 3667)
    ......
    06c8994626d1b7  @3667 size = vm ? get_vm_area_size(vm) : va_size(va);
    ......
    06c8994626d1b7  @3689 else if (!(vm->flags & VM_IOREMAP))
                                     ^^^^^^^^^
    Unchecked dereference
    =====================

    This is not a runtime bug, because the possibly-null 'vm' at the
    reported location can only happen when flags == VMAP_BLOCK.  However,
    the case 'flags == VMAP_BLOCK' should never happen and is detected
    with a WARN_ON.  See the vm_map_ram() implementation and the earlier
    check in vread_iter() below:

                    ~~~~~~~~~~~~~~~~~~~~~~~~~~
                    /*
                     * VMAP_BLOCK indicates a sub-type of vm_map_ram area, need
                     * be set together with VMAP_RAM.
                     */
                    WARN_ON(flags == VMAP_BLOCK);

                    if (!vm && !flags)
                            continue;
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~

    So add a check for 'vm' being NULL before dereferencing it in
    vread_iter().  This silences the smatch complaint.

    Link: https://lkml.kernel.org/r/ZTCURc8ZQE+KrTvS@MiWiFi-R3L-srv
    Link: https://lkml.kernel.org/r/ZS/2k6DIMd0tZRgK@MiWiFi-R3L-srv
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Closes: https://lore.kernel.org/r/202310171600.WCrsOwFj-lkp@intel.com/
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Philip Li <philip.li@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5619
Signed-off-by: Nico Pache <npache@redhat.com>
2024-04-30 17:51:24 -06:00
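
The pattern added above is a defensive NULL check before dereferencing 'vm',
falling back to the vmap_area's own size otherwise. A hedged sketch with
simplified stand-in structures and flag value (not the real kernel
definitions):

    #define SK_IOREMAP 0x1

    struct sk_vm   { unsigned long flags; unsigned long size; };
    struct sk_area { unsigned long size;  struct sk_vm *vm; };

    static unsigned long readable_size(const struct sk_area *va)
    {
            const struct sk_vm *vm = va->vm;
            unsigned long size = vm ? vm->size : va->size;   /* vm may be NULL */

            if (vm && !(vm->flags & SK_IOREMAP))    /* check before dereferencing */
                    return size;
            return 0;
    }
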
Chris von Recklinghausen 8ecd5bd780 mm/vmalloc: do not output a spurious warning when huge vmalloc() fails
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 95a301eefa82057571207edd06ea36218985a75e
Author: Lorenzo Stoakes <lstoakes@gmail.com>
Date:   Mon Jun 5 21:11:07 2023 +0100

    mm/vmalloc: do not output a spurious warning when huge vmalloc() fails

    In __vmalloc_area_node() we always warn_alloc() when an allocation
    performed by vm_area_alloc_pages() fails unless it was due to a pending
    fatal signal.

    However, huge page allocations instigated either by vmalloc_huge() or
    __vmalloc_node_range() (or by a caller that invokes these, such as
    kvmalloc() or kvmalloc_node()) always fall back to order-0 allocations
    if the huge page allocation fails.

    This renders the warning useless and noisy, especially as all callers
    appear to be aware that this fallback may happen.  It has already
    resulted in at least one bug report from a user who was confused by it
    (see link).

    Therefore, simply update the code to only output this warning for order-0
    pages when no fatal signal is pending.

    Link: https://bugzilla.suse.com/show_bug.cgi?id=1211410
    Link: https://lkml.kernel.org/r/20230605201107.83298-1-lstoakes@gmail.com
    Fixes: 80b1d8fdfad1 ("mm: vmalloc: correct use of __GFP_NOWARN mask in __vmalloc_area_node()")
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:01:17 -04:00
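
The warning policy above relies on the fallback behaviour: a failed
high-order (huge) attempt silently retries with order-0 pages, so only an
order-0 failure is worth reporting. A hedged sketch with invented helpers
try_alloc_pages() and warn_alloc_failure():

    #include <stdbool.h>

    bool try_alloc_pages(unsigned int order);     /* true on success */
    void warn_alloc_failure(unsigned int order);

    static bool alloc_with_fallback(unsigned int order)
    {
            if (try_alloc_pages(order))
                    return true;
            if (order > 0 && try_alloc_pages(0))  /* huge attempt failed: fall back */
                    return true;
            warn_alloc_failure(0);                /* only the order-0 failure warns */
            return false;
    }
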
Chris von Recklinghausen 835edea07f mm: vmalloc: rename addr_to_vb_xarray() function
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit fa1c77c13ca59101c4fbf0ff8bbadd3aaba375f8
Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
Date:   Fri Mar 31 09:37:27 2023 +0200

    mm: vmalloc: rename addr_to_vb_xarray() function

    Shorten the name of the addr_to_vb_xarray() function to
    addr_to_vb_xa().  This aligns with other internal function-name
    abbreviations.

    Link: https://lkml.kernel.org/r/20230331073727.6968-1-urezki@gmail.com
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Suggested-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:51 -04:00
Chris von Recklinghausen e58a2e7614 mm: vmalloc: remove a global vmap_blocks xarray
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 062eacf57ad91b5c272f89dc964fd6dd9715ea7d
Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
Date:   Thu Mar 30 21:06:38 2023 +0200

    mm: vmalloc: remove a global vmap_blocks xarray

    The global vmap_blocks xarray can be contended under heavy usage of
    the vm_map_ram()/vm_unmap_ram() APIs.  lock_stat shows that the
    "vmap_blocks.xa_lock" lock comes second in the top list of
    contentions:

    <snip>
    ----------------------------------------
    class name con-bounces contentions ...
    ----------------------------------------
    vmap_area_lock:         2554079 2554276 ...
      --------------
      vmap_area_lock        1297948  [<00000000dd41cbaa>] alloc_vmap_area+0x1c7/0x910
      vmap_area_lock        1256330  [<000000009d927bf3>] free_vmap_block+0x4a/0xe0
      vmap_area_lock              1  [<00000000c95c05a7>] find_vm_area+0x16/0x70
      --------------
      vmap_area_lock        1738590  [<00000000dd41cbaa>] alloc_vmap_area+0x1c7/0x910
      vmap_area_lock         815688  [<000000009d927bf3>] free_vmap_block+0x4a/0xe0
      vmap_area_lock              1  [<00000000c1d619d7>] __get_vm_area_node+0xd2/0x170

    vmap_blocks.xa_lock:    862689  862698 ...
      -------------------
      vmap_blocks.xa_lock   378418    [<00000000625a5626>] vm_map_ram+0x359/0x4a0
      vmap_blocks.xa_lock   484280    [<00000000caa2ef03>] xa_erase+0xe/0x30
      -------------------
      vmap_blocks.xa_lock   576226    [<00000000caa2ef03>] xa_erase+0xe/0x30
      vmap_blocks.xa_lock   286472    [<00000000625a5626>] vm_map_ram+0x359/0x4a0
    ...
    <snip>

    This is the result of running vm_map_ram()/vm_unmap_ram() in a loop.
    The test creates 64 threads (on a 64-CPU system) and each one
    maps/unmaps 1 page.

    After this change the "xa_lock" can be considered noise under the
    same test conditions:

    <snip>
    ...
    &xa->xa_lock#1:         10333 10394 ...
      --------------
      &xa->xa_lock#1        5349      [<00000000bbbc9751>] xa_erase+0xe/0x30
      &xa->xa_lock#1        5045      [<0000000018def45d>] vm_map_ram+0x3a4/0x4f0
      --------------
      &xa->xa_lock#1        7326      [<0000000018def45d>] vm_map_ram+0x3a4/0x4f0
      &xa->xa_lock#1        3068      [<00000000bbbc9751>] xa_erase+0xe/0x30
    ...
    <snip>

    Running "test_vmalloc.sh run_test_mask=1024 nr_threads=64 nr_pages=5"
    shows around an 8 percent throughput improvement for the vm_map_ram()
    and vm_unmap_ram() APIs.

    This patch does not address the vmap_area_lock/free_vmap_area_lock and
    purge_vmap_area_lock bottlenecks; that is a separate rework.

    Link: https://lkml.kernel.org/r/20230330190639.431589-1-urezki@gmail.com
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:50 -04:00
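
The scaling idea above is to replace one globally locked structure with
several independently locked ones selected by hashing the address, so
concurrent vm_map_ram()/vm_unmap_ram() callers rarely hit the same lock. A
hedged userspace sketch (the bucket layout and hash below are invented; the
kernel hashes into per-CPU xarrays via addr_to_vb_xa()):

    #include <pthread.h>

    #define NR_BUCKETS 64    /* assumption: roughly one bucket per CPU */

    struct bucket {
            pthread_mutex_t lock;   /* initialise with pthread_mutex_init() */
            /* ... per-bucket lookup structure lives here ... */
    };

    static struct bucket buckets[NR_BUCKETS];

    static struct bucket *addr_to_bucket(unsigned long addr)
    {
            /* fold a block-sized granule of the address into a bucket index */
            return &buckets[(addr >> 21) % NR_BUCKETS];
    }
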
Chris von Recklinghausen 98840cec91 mm: vmalloc: convert vread() to vread_iter()
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 4c91c07c93bbbdd7f2d9de2beb7ee5c2a48ad8e7
Author: Lorenzo Stoakes <lstoakes@gmail.com>
Date:   Wed Mar 22 18:57:04 2023 +0000

    mm: vmalloc: convert vread() to vread_iter()

    Having previously laid the foundation for converting vread() to an
    iterator function, pull the trigger and do so.

    This patch attempts to provide minimal refactoring and to reflect the
    existing logic as best we can; for example, we continue to zero
    portions of memory not read, as before.

    Overall, there should be no functional difference other than a performance
    improvement in /proc/kcore access to vmalloc regions.

    Now that we have eliminated the need for a bounce buffer in
    read_kcore_iter(), we dispense with it and try to write to user memory
    optimistically, with faults disabled, via copy_page_to_iter_nofault().
    We already have preemption disabled by holding a spin lock.  We
    continue faulting in until the operation is complete.

    Additionally, we must account for the fact that at any point a copy
    may fail (most likely because a fault cannot be taken), in which case
    we exit indicating fewer bytes retrieved than expected.

    [sfr@canb.auug.org.au: fix sparc64 warning]
      Link: https://lkml.kernel.org/r/20230320144721.663280c3@canb.auug.org.au
    [lstoakes@gmail.com: redo Stephen's sparc build fix]
      Link: https://lkml.kernel.org/r/8506cbc667c39205e65a323f750ff9c11a463798.1679566220.git.lstoakes@gmail.com
    [akpm@linux-foundation.org: unbreak uio.h includes]
    Link: https://lkml.kernel.org/r/941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Reviewed-by: Baoquan He <bhe@redhat.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Liu Shixin <liushixin2@huawei.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:36 -04:00
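
The read path described above follows a "copy without faulting, then fault in
and retry" loop. A hedged sketch with invented stand-ins copy_nofault() and
fault_in_dest() for copy_page_to_iter_nofault() and the fault-in helpers:

    #include <stddef.h>

    size_t copy_nofault(void *dst, const void *src, size_t len); /* bytes copied */
    int fault_in_dest(void *dst, size_t len);                    /* 0 on success */

    static size_t copy_with_retry(void *dst, const void *src, size_t len)
    {
            size_t done = 0;

            while (done < len) {
                    done += copy_nofault((char *)dst + done,
                                         (const char *)src + done, len - done);
                    if (done == len)
                            break;
                    /* no progress without faulting: fault in and retry,
                     * or give up and report a short copy */
                    if (fault_in_dest((char *)dst + done, len - done))
                            break;
            }
            return done;
    }
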
Chris von Recklinghausen db9db78026 mm: prefer xxx_page() alloc/free functions for order-0 pages
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit dcc1be119071f034f3123d3c618d2ef70c80125e
Author: Lorenzo Stoakes <lstoakes@gmail.com>
Date:   Mon Mar 13 12:27:14 2023 +0000

    mm: prefer xxx_page() alloc/free functions for order-0 pages

    Update instances of alloc_pages(..., 0), __get_free_pages(..., 0) and
    __free_pages(..., 0) to use alloc_page(), __get_free_page() and
    __free_page() respectively in core code.

    Link: https://lkml.kernel.org/r/50c48ca4789f1da2a65795f2346f5ae3eff7d665.1678710232.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:13 -04:00
Chris von Recklinghausen d6e0b0591b kasan: remove PG_skip_kasan_poison flag
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 0a54864f8dfb64b64c84c9db6ff70e0e93690a33
Author: Peter Collingbourne <pcc@google.com>
Date:   Thu Mar 9 20:29:14 2023 -0800

    kasan: remove PG_skip_kasan_poison flag

    Code inspection reveals that PG_skip_kasan_poison is redundant with
    the page's KASAN tag, because the former is intended to be set iff the
    latter is the match-all tag.  It can also be observed that it's
    basically pointless to poison pages which have a KASAN tag of 0,
    because any pages with this tag would have been pointed to by pointers
    with match-all tags, so poisoning the pages would have little to no
    effect in terms of bug detection.  Therefore, change the condition in
    should_skip_kasan_poison() to check the KASAN tag instead, and remove
    PG_skip_kasan_poison and the associated flags.

    Link: https://lkml.kernel.org/r/20230310042914.3805818-3-pcc@google.com
    Link: https://linux-review.googlesource.com/id/I57f825f2eaeaf7e8389d6cf4597c8a5821359838
    Signed-off-by: Peter Collingbourne <pcc@google.com>
    Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Evgenii Stepanov <eugenis@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:13 -04:00
Aristeu Rozanski 2ad0a080eb mm: vmalloc: avoid warn_alloc noise caused by fatal signal
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit f349b15e183d6956f1b63d6ff57849ff10c7edd5
Author: Yafang Shao <laoar.shao@gmail.com>
Date:   Thu Mar 30 16:26:25 2023 +0000

    mm: vmalloc: avoid warn_alloc noise caused by fatal signal

    There are some suspicious warn_alloc reports on my test server, for
    example:

    [13366.518837] warn_alloc: 81 callbacks suppressed
    [13366.518841] test_verifier: vmalloc error: size 4096, page order 0, failed to allocate pages, mode:0x500dc2(GFP_HIGHUSER|__GFP_ZERO|__GFP_ACCOUNT), nodemask=(null),cpuset=/,mems_allowed=0-1
    [13366.522240] CPU: 30 PID: 722463 Comm: test_verifier Kdump: loaded Tainted: G        W  O       6.2.0+ #638
    [13366.524216] Call Trace:
    [13366.524702]  <TASK>
    [13366.525148]  dump_stack_lvl+0x6c/0x80
    [13366.525712]  dump_stack+0x10/0x20
    [13366.526239]  warn_alloc+0x119/0x190
    [13366.526783]  ? alloc_pages_bulk_array_mempolicy+0x9e/0x2a0
    [13366.527470]  __vmalloc_area_node+0x546/0x5b0
    [13366.528066]  __vmalloc_node_range+0xc2/0x210
    [13366.528660]  __vmalloc_node+0x42/0x50
    [13366.529186]  ? bpf_prog_realloc+0x53/0xc0
    [13366.529743]  __vmalloc+0x1e/0x30
    [13366.530235]  bpf_prog_realloc+0x53/0xc0
    [13366.530771]  bpf_patch_insn_single+0x80/0x1b0
    [13366.531351]  bpf_jit_blind_constants+0xe9/0x1c0
    [13366.531932]  ? __free_pages+0xee/0x100
    [13366.532457]  ? free_large_kmalloc+0x58/0xb0
    [13366.533002]  bpf_int_jit_compile+0x8c/0x5e0
    [13366.533546]  bpf_prog_select_runtime+0xb4/0x100
    [13366.534108]  bpf_prog_load+0x6b1/0xa50
    [13366.534610]  ? perf_event_task_tick+0x96/0xb0
    [13366.535151]  ? security_capable+0x3a/0x60
    [13366.535663]  __sys_bpf+0xb38/0x2190
    [13366.536120]  ? kvm_clock_get_cycles+0x9/0x10
    [13366.536643]  __x64_sys_bpf+0x1c/0x30
    [13366.537094]  do_syscall_64+0x38/0x90
    [13366.537554]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
    [13366.538107] RIP: 0033:0x7f78310f8e29
    [13366.538561] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 17 e0 2c 00 f7 d8 64 89 01 48
    [13366.540286] RSP: 002b:00007ffe2a61fff8 EFLAGS: 00000206 ORIG_RAX: 0000000000000141
    [13366.541031] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f78310f8e29
    [13366.541749] RDX: 0000000000000080 RSI: 00007ffe2a6200b0 RDI: 0000000000000005
    [13366.542470] RBP: 00007ffe2a620010 R08: 00007ffe2a6202a0 R09: 00007ffe2a6200b0
    [13366.543183] R10: 00000000000f423e R11: 0000000000000206 R12: 0000000000407800
    [13366.543900] R13: 00007ffe2a620540 R14: 0000000000000000 R15: 0000000000000000
    [13366.544623]  </TASK>
    [13366.545260] Mem-Info:
    [13366.546121] active_anon:81319 inactive_anon:20733 isolated_anon:0
     active_file:69450 inactive_file:5624 isolated_file:0
     unevictable:0 dirty:10 writeback:0
     slab_reclaimable:69649 slab_unreclaimable:48930
     mapped:27400 shmem:12868 pagetables:4929
     sec_pagetables:0 bounce:0
     kernel_misc_reclaimable:0
     free:15870308 free_pcp:142935 free_cma:0
    [13366.551886] Node 0 active_anon:224836kB inactive_anon:33528kB active_file:175692kB inactive_file:13752kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:59248kB dirty:32kB writeback:0kB shmem:18252kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:4616kB pagetables:10664kB sec_pagetables:0kB all_unreclaimable? no
    [13366.555184] Node 1 active_anon:100440kB inactive_anon:49404kB active_file:102108kB inactive_file:8744kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:50352kB dirty:8kB writeback:0kB shmem:33220kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:3896kB pagetables:9052kB sec_pagetables:0kB all_unreclaimable? no
    [13366.558262] Node 0 DMA free:15360kB boost:0kB min:304kB low:380kB high:456kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
    [13366.560821] lowmem_reserve[]: 0 2735 31873 31873 31873
    [13366.561981] Node 0 DMA32 free:2790904kB boost:0kB min:56028kB low:70032kB high:84036kB reserved_highatomic:0KB active_anon:1936kB inactive_anon:20kB active_file:396kB inactive_file:344kB unevictable:0kB writepending:0kB present:3129200kB managed:2801520kB mlocked:0kB bounce:0kB free_pcp:5188kB local_pcp:0kB free_cma:0kB
    [13366.565148] lowmem_reserve[]: 0 0 29137 29137 29137
    [13366.566168] Node 0 Normal free:28533824kB boost:0kB min:596740kB low:745924kB high:895108kB reserved_highatomic:28672KB active_anon:222900kB inactive_anon:33508kB active_file:175296kB inactive_file:13408kB unevictable:0kB writepending:32kB present:30408704kB managed:29837172kB mlocked:0kB bounce:0kB free_pcp:295724kB local_pcp:0kB free_cma:0kB
    [13366.569485] lowmem_reserve[]: 0 0 0 0 0
    [13366.570416] Node 1 Normal free:32141144kB boost:0kB min:660504kB low:825628kB high:990752kB reserved_highatomic:69632KB active_anon:100440kB inactive_anon:49404kB active_file:102108kB inactive_file:8744kB unevictable:0kB writepending:8kB present:33554432kB managed:33025372kB mlocked:0kB bounce:0kB free_pcp:270880kB local_pcp:46860kB free_cma:0kB
    [13366.573403] lowmem_reserve[]: 0 0 0 0 0
    [13366.574015] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15360kB
    [13366.575474] Node 0 DMA32: 782*4kB (UME) 756*8kB (UME) 736*16kB (UME) 745*32kB (UME) 694*64kB (UME) 653*128kB (UME) 595*256kB (UME) 552*512kB (UME) 454*1024kB (UME) 347*2048kB (UME) 246*4096kB (UME) = 2790904kB
    [13366.577442] Node 0 Normal: 33856*4kB (UMEH) 51815*8kB (UMEH) 42418*16kB (UMEH) 36272*32kB (UMEH) 22195*64kB (UMEH) 10296*128kB (UMEH) 7238*256kB (UMEH) 5638*512kB (UEH) 5337*1024kB (UMEH) 3506*2048kB (UMEH) 1470*4096kB (UME) = 28533784kB
    [13366.580460] Node 1 Normal: 15776*4kB (UMEH) 37485*8kB (UMEH) 29509*16kB (UMEH) 21420*32kB (UMEH) 14818*64kB (UMEH) 13051*128kB (UMEH) 9918*256kB (UMEH) 7374*512kB (UMEH) 5397*1024kB (UMEH) 3887*2048kB (UMEH) 2002*4096kB (UME) = 32141240kB
    [13366.583027] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
    [13366.584380] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
    [13366.585702] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
    [13366.587042] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
    [13366.588372] 87386 total pagecache pages
    [13366.589266] 0 pages in swap cache
    [13366.590327] Free swap  = 0kB
    [13366.591227] Total swap = 0kB
    [13366.592142] 16777082 pages RAM
    [13366.593057] 0 pages HighMem/MovableOnly
    [13366.594037] 357226 pages reserved
    [13366.594979] 0 pages hwpoisoned

    This failure really confused me, as there were still lots of available
    pages.  I finally figured out it was caused by a fatal signal.  When a
    process allocating memory via vm_area_alloc_pages() receives a fatal
    signal, it breaks out directly, even if it hasn't yet allocated the
    requested pages.  In that case, we shouldn't show this warn_alloc, as
    it is useless.  We only need to show this warning when there really
    are not enough pages.

    Link: https://lkml.kernel.org/r/20230330162625.13604-1-laoar.shao@gmail.com
    Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:26 -04:00
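
The check added above suppresses the allocation-failure warning when the loop
stopped because the task was being killed rather than because memory ran out.
A hedged sketch with invented stand-ins fatal_signal_pending_now() and
report_vmalloc_failure():

    #include <stdbool.h>

    bool fatal_signal_pending_now(void);
    void report_vmalloc_failure(unsigned long got, unsigned long want);

    static void warn_unless_killed(unsigned long got, unsigned long want)
    {
            if (got >= want)
                    return;                  /* allocation actually succeeded */
            if (fatal_signal_pending_now())
                    return;                  /* we bailed out; the allocator did not fail */
            report_vmalloc_failure(got, want);
    }
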
Aristeu Rozanski 4ed990aaa2 mm/vmalloc: skip the uninitilized vmalloc areas
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 30a7a9b17c4b0331ec67aadb4b30ff2a951b4ed5
Author: Baoquan He <bhe@redhat.com>
Date:   Mon Feb 6 16:40:18 2023 +0800

    mm/vmalloc: skip the uninitilized vmalloc areas

    For areas allocated via the vmalloc_xxx() APIs, an unmapped area is
    searched for and reserved, and new pages are allocated and mapped into
    it; see __vmalloc_node_range().  During this process the flag
    VM_UNINITIALIZED is set in vm->flags to indicate that page allocation
    and mapping haven't been done yet, until clear_vm_uninitialized_flag()
    is called to clear VM_UNINITIALIZED.

    For this kind of area, if VM_UNINITIALIZED is still set, ignore it in
    vread(), because the pages newly allocated and being mapped into that
    area contain only zero data; reading them out with aligned_vread()
    wastes time.

    Link: https://lkml.kernel.org/r/20230206084020.174506-6-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Dan Carpenter <error27@gmail.com>
    Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:18 -04:00
Aristeu Rozanski ad90bbd69b mm/vmalloc: explicitly identify vm_map_ram area when shown in /proc/vmcoreinfo
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit bba9697b42ead45687352fdd0fd498735bc4361d
Author: Baoquan He <bhe@redhat.com>
Date:   Mon Feb 6 16:40:17 2023 +0800

    mm/vmalloc: explicitly identify vm_map_ram area when shown in /proc/vmcoreinfo

    Now, by marking VMAP_RAM in vmap_area->flags for vm_map_ram areas, we
    can clearly differentiate them from other vmalloc areas.  So identify
    vm_map_ram areas by checking VMAP_RAM in vmap_area->flags when they
    are shown in /proc/vmcoreinfo.

    Meanwhile, the code comment above the vm_map_ram area check in
    s_show() is no longer needed; remove it here.

    Link: https://lkml.kernel.org/r/20230206084020.174506-5-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Dan Carpenter <error27@gmail.com>
    Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:18 -04:00
Aristeu Rozanski 21c66b5c58 mm/vmalloc.c: allow vread() to read out vm_map_ram areas
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 06c8994626d1b7d8c26dfd06992d67703a274054
Author: Baoquan He <bhe@redhat.com>
Date:   Mon Feb 6 16:40:16 2023 +0800

    mm/vmalloc.c: allow vread() to read out vm_map_ram areas

    Currently, vread can read out vmalloc areas which are associated with
    a vm_struct.  However, this doesn't work for areas created by the
    vm_map_ram() interface, because they don't have an associated
    vm_struct; in vread(), these areas are all skipped.

    Here, add a new function vmap_ram_vread() to read out vm_map_ram
    areas.  An area created directly through the vm_map_ram() interface
    can be handled like the other normal vmap areas with aligned_vread(),
    while areas which are further subdivided and managed with vmap_block
    need to be read out carefully as page-aligned small regions, with
    holes zero-filled.

    Link: https://lkml.kernel.org/r/20230206084020.174506-4-bhe@redhat.com
    Reported-by: Stephen Brennan <stephen.s.brennan@oracle.com>
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Tested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
    Cc: Dan Carpenter <error27@gmail.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:18 -04:00
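
For vmap_block-backed areas, the new read path walks the block page by page,
copying pages marked used and zero-filling the holes. A hedged sketch with a
simplified one-word bitmap (the kernel's used_map handling is more involved):

    #include <string.h>

    #define PAGE_SZ 4096UL

    struct sk_block {
            const char   *base;       /* start of the vmap_block area */
            unsigned long nr_pages;   /* pages covered by the block */
            unsigned long used_map;   /* 1 bit per page; simplified to one word */
    };

    static void read_block(const struct sk_block *b, char *out)
    {
            for (unsigned long i = 0; i < b->nr_pages &&
                                      i < 8 * sizeof(b->used_map); i++) {
                    char *dst = out + i * PAGE_SZ;

                    if (b->used_map & (1UL << i))
                            memcpy(dst, b->base + i * PAGE_SZ, PAGE_SZ);
                    else
                            memset(dst, 0, PAGE_SZ);    /* hole: zero fill */
            }
    }
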
Aristeu Rozanski 3cc2c09821 mm/vmalloc.c: add flags to mark vm_map_ram area
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 869176a096068056b338b5cc1b0af93106007f5d
Author: Baoquan He <bhe@redhat.com>
Date:   Mon Feb 6 16:40:15 2023 +0800

    mm/vmalloc.c: add flags to mark vm_map_ram area

    Through the vmalloc API, a virtual kernel area is reserved for
    physical address mapping.  A vmap_area is used to track it, while a
    vm_struct is allocated and associated with the vmap_area to store more
    information and be passed out.

    However, an area reserved via vm_map_ram() is an exception.  It
    doesn't have a vm_struct associated with its vmap_area.  And we can't
    recognize a vmap_area with '->vm == NULL' as a vm_map_ram() area,
    because the normal freeing path sets va->vm = NULL before unmapping;
    see remove_vm_area().

    Meanwhile, there are two kinds of handling for vm_map_ram areas.  In
    one, the whole vmap_area is reserved and mapped at one time through
    the vm_map_ram() interface; in the other, a whole vmap_area of
    VMAP_BLOCK_SIZE is reserved and then mapped as smaller split regions
    via vb_alloc().

    To mark areas reserved through vm_map_ram(), add a flags field to
    struct vmap_area.  Bit 0 indicates a vm_map_ram area created through
    the vm_map_ram() interface, while bit 1 marks the type of vm_map_ram
    area which uses vmap_block to manage split regions via vb_alloc/free().

    This is a preparation for later use.

    Link: https://lkml.kernel.org/r/20230206084020.174506-3-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Dan Carpenter <error27@gmail.com>
    Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:18 -04:00
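
A hedged sketch of the flag layout described above (bit 0 for vm_map_ram, bit
1 for vmap_block-managed areas); the flag names mirror the description, but
the struct below is a simplified stand-in, not the kernel's vmap_area:

    #define VMAP_RAM    0x1    /* bit 0: area comes from vm_map_ram() */
    #define VMAP_BLOCK  0x2    /* bit 1: area is managed via vmap_block */

    struct sk_vmap_area {
            unsigned long va_start, va_end;
            unsigned long flags;
    };

    static int is_vm_map_ram(const struct sk_vmap_area *va)
    {
            return va->flags & VMAP_RAM;
    }

    static int is_vmap_block(const struct sk_vmap_area *va)
    {
            /* VMAP_BLOCK is only meaningful together with VMAP_RAM */
            return (va->flags & (VMAP_RAM | VMAP_BLOCK)) == (VMAP_RAM | VMAP_BLOCK);
    }
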
Aristeu Rozanski 1a1f57ff8d mm/vmalloc.c: add used_map into vmap_block to track space of vmap_block
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit d76f99548cf0474c3bf75f25fab1778e03ade5f2
Author: Baoquan He <bhe@redhat.com>
Date:   Mon Feb 6 16:40:14 2023 +0800

    mm/vmalloc.c: add used_map into vmap_block to track space of vmap_block

    Patch series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas", v5.

    Problem:
    ***

    Stephen reported vread() will skip vm_map_ram areas when reading out
    /proc/kcore with drgn utility.  Please see below link to get more details.

      /proc/kcore reads 0's for vmap_block
      https://lore.kernel.org/all/87ilk6gos2.fsf@oracle.com/T/#u

    Root cause:
    ***

    The normal vmalloc API uses struct vmap_area to manage the allocated
    virtual kernel area, and associates a vm_struct with it to store more
    information and pass out.  However, an area reserved through the
    vm_map_ram() interface doesn't allocate a vm_struct to associate with.
    So the current code in vread() skips vm_map_ram areas via the
    'if (!va->vm)' check.

    Solution:
    ***

    To mark areas reserved through the vm_map_ram() interface, add a
    'flags' field to struct vmap_area.  Bit 0 indicates a vm_map_ram area
    created through the vm_map_ram() interface; bit 1 marks the type of
    vm_map_ram area which uses vmap_block to manage split regions via
    vb_alloc/free().

    Also add a bitmap field 'used_map' to struct vmap_block to mark the
    further subdivided regions that are in use, so they can be
    differentiated from the dirty and free regions in the vmap_block.

    With the help of above vmap_area->flags and vmap_block->used_map, we can
    recognize and handle vm_map_ram areas successfully.  All these are done in
    patch 1~3.

    Meanwhile, make some improvements related to vm_map_ram areas in
    patches 4 and 5, and change the area flag from VM_ALLOC to VM_IOREMAP
    in patches 6 and 7, because this shows them as 'ioremap' in
    /proc/vmallocinfo and excludes them from /proc/kcore.

    This patch (of 7):

    In one vmap_block area, there can be three types of regions: used
    regions allocated through vb_alloc(), dirty regions freed via
    vb_free(), and free regions.  Among them, only used regions hold
    available data, yet there is currently no way to track them.

    Here, add a bitmap field used_map to vmap_block, and set/clear it when
    allocating or freeing regions of the vmap_block area.

    This is a preparation for later use.

    Link: https://lkml.kernel.org/r/20230206084020.174506-1-bhe@redhat.com
    Link: https://lkml.kernel.org/r/20230206084020.174506-2-bhe@redhat.com
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Dan Carpenter <error27@gmail.com>
    Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:18 -04:00
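
The used_map bookkeeping described above sets the per-page bits when a region
is handed out by vb_alloc() and clears them when it is freed by vb_free(), so
a later reader knows which pages hold live data. A hedged sketch with a
simplified one-word bitmap and invented helpers:

    struct sk_vmap_block {
            unsigned long used_map;   /* 1 bit per page; simplified to one word */
    };

    /* assumes first + npages does not exceed the bits in used_map */
    static void mark_used(struct sk_vmap_block *vb,
                          unsigned int first, unsigned int npages)
    {
            for (unsigned int i = 0; i < npages; i++)
                    vb->used_map |= 1UL << (first + i);     /* vb_alloc() path */
    }

    static void mark_free(struct sk_vmap_block *vb,
                          unsigned int first, unsigned int npages)
    {
            for (unsigned int i = 0; i < npages; i++)
                    vb->used_map &= ~(1UL << (first + i));  /* vb_free() path */
    }
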
Aristeu Rozanski 952d9b4f0e mm/vmalloc: replace BUG_ON with a simple if statement
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 7e4a32c0e8adafdda6161635c9046e6c1e8b95b5
Author: Hyunmin Lee <hn.min.lee@gmail.com>
Date:   Wed Feb 1 20:51:42 2023 +0900

    mm/vmalloc: replace BUG_ON with a simple if statement

    As per the coding standards, in the event of an abnormal condition that
    should not occur under normal circumstances, the kernel should attempt
    recovery and proceed with execution, rather than halting the machine.

    Specifically, in the alloc_vmap_area() function, use a simple if()
    instead of BUG_ON(), which halts the machine.

    Link: https://lkml.kernel.org/r/20230201115142.GA7772@min-iamroot
    Co-developed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
    Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
    Co-developed-by: Jeungwoo Yoo <casionwoo@gmail.com>
    Signed-off-by: Jeungwoo Yoo <casionwoo@gmail.com>
    Co-developed-by: Sangyun Kim <sangyun.kim@snu.ac.kr>
    Signed-off-by: Sangyun Kim <sangyun.kim@snu.ac.kr>
    Signed-off-by: Hyunmin Lee <hn.min.lee@gmail.com>
    Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:17 -04:00
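
The change above swaps a machine-halting assertion for a recoverable error
return. A hedged sketch of the before/after pattern, with an invented
sanity_ok() check:

    #include <errno.h>
    #include <stdbool.h>

    bool sanity_ok(unsigned long size, unsigned long align);

    static int reserve_area(unsigned long size, unsigned long align)
    {
            /* before: BUG_ON(!sanity_ok(size, align));  -- halts the machine */
            if (!sanity_ok(size, align))
                    return -EINVAL;   /* after: fail gracefully, let the caller cope */

            /* ... proceed with the allocation ... */
            return 0;
    }
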
Aristeu Rozanski e214620cfb mm: replace vma->vm_flags direct modifications with modifier calls
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me
Conflicts: dropped stuff we don't support when not applying cleanly, left the rest for sake of saving work

commit 1c71222e5f2393b5ea1a41795c67589eea7e3490
Author: Suren Baghdasaryan <surenb@google.com>
Date:   Thu Jan 26 11:37:49 2023 -0800

    mm: replace vma->vm_flags direct modifications with modifier calls

    Replace direct modifications to vma->vm_flags with calls to modifier
    functions to be able to track flag changes and to keep vma locking
    correctness.

    [akpm@linux-foundation.org: fix drivers/misc/open-dice.c, per Hyeonggon Yoo]
    Link: https://lkml.kernel.org/r/20230126193752.297968-5-surenb@google.com
    Signed-off-by: Suren Baghdasaryan <surenb@google.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
    Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Arjun Roy <arjunroy@google.com>
    Cc: Axel Rasmussen <axelrasmussen@google.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: David Howells <dhowells@redhat.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Greg Thelen <gthelen@google.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Joel Fernandes <joelaf@google.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Kent Overstreet <kent.overstreet@linux.dev>
    Cc: Laurent Dufour <ldufour@linux.ibm.com>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Minchan Kim <minchan@google.com>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Peter Oskolkov <posk@google.com>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Punit Agrawal <punit.agrawal@bytedance.com>
    Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Soheil Hassas Yeganeh <soheil@google.com>
    Cc: Song Liu <songliubraving@fb.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:17 -04:00
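
The conversion above routes every vm_flags update through small modifier
helpers so changes can be checked against locking rules and traced. A hedged
sketch of that accessor pattern with a simplified stand-in struct (not the
real vm_flags_set()/vm_flags_clear()):

    #include <assert.h>
    #include <stdbool.h>

    struct sk_vma {
            unsigned long vm_flags;
            bool write_locked;
    };

    static void sk_vma_flags_set(struct sk_vma *vma, unsigned long flags)
    {
            assert(vma->write_locked);    /* only modify flags while the VMA is locked */
            vma->vm_flags |= flags;
    }

    static void sk_vma_flags_clear(struct sk_vma *vma, unsigned long flags)
    {
            assert(vma->write_locked);
            vma->vm_flags &= ~flags;
    }
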
Aristeu Rozanski b1b895b915 mm: refactor va_remove_mappings
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 9e5fa0ae52fc67dea86f95ea4e3909b3e10a160f
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:51 2023 +0100

    mm: refactor va_remove_mappings

    Move the VM_FLUSH_RESET_PERMS to the caller and rename the function to
    better describe what it is doing.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-11-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski 690db13e4b mm: split __vunmap
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 79311c1fe0175941298fb362ba072514e2fe5c54
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:50 2023 +0100

    mm: split __vunmap

    vunmap only needs to find and free the vmap_area and vm_struct, so
    open code that there and merge the rest of the code into vfree.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-10-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski fd568fd951 mm: move debug checks from __vunmap to remove_vm_area
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 17d3ef432dcbe80c134f1f79e2ed1ebd1076eab1
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:49 2023 +0100

    mm: move debug checks from __vunmap to remove_vm_area

    All these checks apply to the free_vm_area interface as well, so move them
    to the common routine.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-9-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski 5002a183ce mm: use remove_vm_area in __vunmap
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 75c59ce74e47d3e11aa7666f1877aa64495f7b03
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:48 2023 +0100

    mm: use remove_vm_area in __vunmap

    Use the common helper to find and remove a vmap_area instead of open
    coding it.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-8-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski 5e16801120 mm: move __remove_vm_area out of va_remove_mappings
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 39e65b7f63392d70f2f6aff5f4c5c3262f49637e
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:47 2023 +0100

    mm: move __remove_vm_area out of va_remove_mappings

    __remove_vm_area is the only part of va_remove_mappings that requires a
    vmap_area.  Move the call out to the caller and only pass the vm_struct to
    va_remove_mappings.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-7-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski 8792e72e33 mm: call vfree instead of __vunmap from delayed_vfree_work
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 5d3d31d6fb17a8eb83af50ea8a0616a3cfde3e58
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:46 2023 +0100

    mm: call vfree instead of __vunmap from delayed_vfree_work

    This adds an extra, never-taken in_interrupt() branch, but will allow
    cutting down the maze of vfree helpers.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-6-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski 4968d18ab6 mm: move vmalloc_init and free_work down in vmalloc.c
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 208162f42f958b37147d3c1c5f947c7c1a8b9c41
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:45 2023 +0100

    mm: move vmalloc_init and free_work down in vmalloc.c

    Move these two functions around a bit to avoid forward declarations.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-5-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski ad7ca24b84 mm: remove __vfree_deferred
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 01e2e8394a527644de5192f92f64e1c883a3e493
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:44 2023 +0100

    mm: remove __vfree_deferred

    Fold __vfree_deferred into vfree_atomic, and call vfree_atomic early on
    from vfree if called from interrupt context so that the extra low-level
    helper can be avoided.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-4-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski 8c559e6688 mm: remove __vfree
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit f41f036b804d0d920f9b6fd3fca9489dd7afd358
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:43 2023 +0100

    mm: remove __vfree

    __vfree is a subset of vfree that just skips a few checks and is only
    used by vfree and an error cleanup path.  Fold __vfree into vfree and
    switch the only other caller to call vfree() instead.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-3-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
Aristeu Rozanski 42b9cfdd40 mm: reject vmap with VM_FLUSH_RESET_PERMS
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 37f3605e5e7af7de12aeb670c5b94e5a3c8dbf74
Author: Christoph Hellwig <hch@lst.de>
Date:   Sat Jan 21 08:10:42 2023 +0100

    mm: reject vmap with VM_FLUSH_RESET_PERMS

    Patch series "cleanup vfree and vunmap".

    This little series untangles the vfree and vunmap code path a bit.

    This patch (of 10):

    VM_FLUSH_RESET_PERMS is just for use with vmalloc as it is tied to freeing
    the underlying pages.

    Link: https://lkml.kernel.org/r/20230121071051.1143058-1-hch@lst.de
    Link: https://lkml.kernel.org/r/20230121071051.1143058-2-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:12 -04:00
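
The rejection added above means a mapping-only API refuses a flag whose
semantics depend on owning and later freeing the pages. A hedged sketch with
an invented flag value and map_pages() stub (not the kernel's vmap()):

    #include <stddef.h>

    #define SK_FLUSH_RESET_PERMS 0x100   /* illustrative value only */

    void *map_pages(void **pages, unsigned int count, unsigned long flags);

    static void *vmap_sketch(void **pages, unsigned int count, unsigned long flags)
    {
            if (flags & SK_FLUSH_RESET_PERMS)
                    return NULL;         /* only vmalloc-owned memory may use it */
            return map_pages(pages, count, flags);
    }
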