Commit Graph

431 Commits

Author SHA1 Message Date
Rafael Aquini d7c46ee8b0 mm/compaction: fix UBSAN shift-out-of-bounds warning
JIRA: https://issues.redhat.com/browse/RHEL-84184
CVE: CVE-2025-21815

This patch is a backport of the following upstream commit:
commit d1366e74342e75555af2648a2964deb2d5c92200
Author: Liu Shixin <liushixin2@huawei.com>
Date:   Thu Jan 23 10:10:29 2025 +0800

    mm/compaction: fix UBSAN shift-out-of-bounds warning

    syzkaller reported a UBSAN shift-out-of-bounds warning of (1UL << order)
    in isolate_freepages_block().  The bogus compound_order can be any value
    because it is in a union with flags.  Add back the MAX_PAGE_ORDER check to fix
    the warning.

    Link: https://lkml.kernel.org/r/20250123021029.2826736-1-liushixin2@huawei.com
    Fixes: 3da0272a4c7d ("mm/compaction: correctly return failure with bogus compound_order in strict mode")
    Signed-off-by: Liu Shixin <liushixin2@huawei.com>
    Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Kemeng Shi <shikemeng@huaweicloud.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2025-04-18 08:40:01 -04:00
Rafael Aquini 288fab6492 mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * virt/kvm/guest_memfd.c: difference in the hunk due to RHEL missing upstream
    commit 1d23040caa8b ("KVM: guest_memfd: Use AS_INACCESSIBLE when creating
    guest_memfd inode") which would end up being reverted with this follow-up fix.

This patch is a backport of the following upstream commit:
commit 27e6a24a4cf3d25421c0f6ebb7c39f45fc14d20f
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Jul 11 13:56:54 2024 -0400

    mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE

    The flags AS_UNMOVABLE and AS_INACCESSIBLE were both added just for guest_memfd;
    AS_UNMOVABLE is already in existing versions of Linux, while AS_INACCESSIBLE was
    acked for inclusion in 6.11.

    But really, they are the same thing: only guest_memfd uses them, at least for
    now, and guest_memfd pages are unmovable because they should not be
    accessed by the CPU.

    So merge them into one; use the AS_INACCESSIBLE name which is more comprehensive.
    At the same time, this fixes an embarrassing bug where AS_INACCESSIBLE was used
    as a bit mask, despite it being just a bit index.

    The bug was mostly benign, because AS_INACCESSIBLE's bit representation (1010)
    corresponded to setting AS_UNEVICTABLE (which is already set) and AS_ENOSPC
    (except no async writes can happen on the guest_memfd).  So the AS_INACCESSIBLE
    flag simply had no effect.

    Fixes: 1d23040caa8b ("KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode")
    Fixes: c72ceafbd12c ("mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory")
    Cc: linux-mm@kvack.org
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: David Hildenbrand <david@redhat.com>
    Tested-by: Michael Roth <michael.roth@amd.com>
    Reviewed-by: Michael Roth <michael.roth@amd.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:25 -05:00
Rafael Aquini c642587310 mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 803de9000f334b771afacb6ff3e78622916668b0
Author: Vlastimil Babka <vbabka@suse.cz>
Date:   Wed Feb 21 12:43:58 2024 +0100

    mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations

    Sven reports an infinite loop in __alloc_pages_slowpath() for costly order
    __GFP_RETRY_MAYFAIL allocations that are also GFP_NOIO.  Such combination
    can happen in a suspend/resume context where a GFP_KERNEL allocation can
    have __GFP_IO masked out via gfp_allowed_mask.

    Quoting Sven:

    1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER)
       with __GFP_RETRY_MAYFAIL set.

    2. page alloc's __alloc_pages_slowpath tries to get a page from the
       freelist. This fails because there is nothing free of that costly
       order.

    3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim,
       which bails out because a zone is ready to be compacted; it pretends
       to have made a single page of progress.

    4. page alloc tries to compact, but this always bails out early because
       __GFP_IO is not set (it's not passed by the snd allocator, and even
       if it were, we are suspending so the __GFP_IO flag would be cleared
       anyway).

    5. page alloc believes reclaim progress was made (because of the
       pretense in item 3) and so it checks whether it should retry
       compaction. The compaction retry logic thinks it should try again,
       because:
        a) reclaim is needed because of the early bail-out in item 4
        b) a zonelist is suitable for compaction

    6. goto 2. indefinite stall.

    (end quote)

    The immediate root cause is that the COMPACT_SKIPPED returned from
    __alloc_pages_direct_compact() (step 4) due to the lack of __GFP_IO is
    mistaken for an indication of a lack of order-0 pages, and in step 5
    should_compact_retry() evaluates it as a reason to retry before
    incrementing and limiting the number of retries.  There are, however,
    other places that wrongly assume that compaction can happen while we lack
    __GFP_IO.

    To fix this, introduce gfp_compaction_allowed() to abstract the __GFP_IO
    evaluation and switch the open-coded test in try_to_compact_pages() to use
    it.

    Also use the new helper in:
    - compaction_ready(), which will make reclaim not bail out in step 3, so
      there's at least one attempt to actually reclaim, even if chances are
      small for a costly order
    - in_reclaim_compaction() which will make should_continue_reclaim()
      return false and we don't over-reclaim unnecessarily
    - in __alloc_pages_slowpath() to set a local variable can_compact,
      which is then used to avoid retrying reclaim/compaction for costly
      allocations (step 5) if we can't compact and also to skip the early
      compaction attempt that we do in some cases

    Link: https://lkml.kernel.org/r/20240221114357.13655-2-vbabka@suse.cz
    Fixes: 3250845d05 ("Revert "mm, oom: prevent premature OOM killer invocation for high order request"")
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Reported-by: Sven van Ashbrook <svenva@chromium.org>
    Closes: https://lore.kernel.org/all/CAG-rBihs_xMKb3wrMO1%2B-%2Bp4fowP9oy1pa_OTkfxBzPUVOZF%2Bg@mail.gmail.com/
    Tested-by: Karthikeyan Ramasubramanian <kramasub@chromium.org>
    Cc: Brian Geffon <bgeffon@google.com>
    Cc: Curtis Malainey <cujomalainey@chromium.org>
    Cc: Jaroslav Kysela <perex@perex.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Takashi Iwai <tiwai@suse.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:27 -05:00
Rafael Aquini c8c9c0b259 mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * arch/*/Kconfig: all hunks dropped as there were only text blurbs and comments
     being changed with no functional changes whatsoever, and RHEL9 is missing
     several (unrelated) commits to these arches that transform the text blurbs in
     the way these non-functional hunks were expecting;
  * drivers/accel/qaic/qaic_data.c: hunk dropped due to RHEL-only commit
     083c0cdce2 ("Merge DRM changes from upstream v6.8..v6.9");
  * drivers/gpu/drm/i915/gem/selftests/huge_pages.c: hunk dropped due to RHEL-only
     commit ca8b16c11b ("Merge DRM changes from upstream v6.7..v6.8");
  * drivers/gpu/drm/ttm/tests/ttm_pool_test.c: all hunks dropped due to RHEL-only
     commit ca8b16c11b ("Merge DRM changes from upstream v6.7..v6.8");
  * drivers/video/fbdev/vermilion/vermilion.c: hunk dropped as RHEL9 misses
     commit dbe7e429fe ("vmlfb: framebuffer driver for Intel Vermilion Range");
  * include/linux/pageblock-flags.h: differences due to out-of-order backport
    of upstream commits 72801513b2bf ("mm: set pageblock_order to HPAGE_PMD_ORDER
    in case with !CONFIG_HUGETLB_PAGE but THP enabled"), and 3a7e02c040b1
    ("minmax: avoid overly complicated constant expressions in VM code");
  * mm/mm_init.c: differences on the 3rd, and 4th hunks are due to RHEL
     backport commit 1845b92dcf ("mm: move most of core MM initialization to
     mm/mm_init.c") ignoring the out-of-order backport of commit 3f6dac0fd1b8
     ("mm/page_alloc: make deferred page init free pages in MAX_ORDER blocks")
     thus partially reverting the changes introduced by the latter;

This patch is a backport of the following upstream commit:
commit 5e0a760b44417f7cadd79de2204d6247109558a0
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Thu Dec 28 17:47:04 2023 +0300

    mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER

    commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has
    changed the definition of MAX_ORDER to be inclusive.  This has caused
    issues with code that was not yet upstream and depended on the previous
    definition.

    To draw attention to the altered meaning of the define, rename MAX_ORDER
    to MAX_PAGE_ORDER.

    Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:17 -05:00
Rafael Aquini 9f578eff61 mm, treewide: introduce NR_PAGE_ORDERS
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * drivers/gpu/drm/*, include/drm/ttm/ttm_pool.h: all hunks dropped due to
    RHEL-only commit ca8b16c11b ("Merge DRM changes from upstream v6.7..v6.8");
  * include/linux/mmzone.h: 3rd hunk dropped due to RHEL-only commit
    afa0ca9cf7 ("Partial backport of mm, treewide: introduce NR_PAGE_ORDERS");

This patch is a backport of the following upstream commit:
commit fd37721803c6e73619108f76ad2e12a9aa5fafaf
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Thu Dec 28 17:47:03 2023 +0300

    mm, treewide: introduce NR_PAGE_ORDERS

    NR_PAGE_ORDERS defines the number of page orders supported by the page
    allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total.

    NR_PAGE_ORDERS assists in defining arrays of page orders and allows for
    more natural iteration over them.

    [kirill.shutemov@linux.intel.com: fixup for kerneldoc warning]
      Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box
    Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:14 -05:00
Rafael Aquini de463c7f50 mm/compaction: factor out code to test if we should run compaction for target order
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit e19a3f595ae47bd8c034b98eb0b28a3877413387
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Sep 1 23:51:41 2023 +0800

    mm/compaction: factor out code to test if we should run compaction for target order

    We always do the zone_watermark_ok check and the compaction_suitable check
    together to test whether compaction should be run for the target order.
    Factor this code out to remove the repetition.

    Link: https://lkml.kernel.org/r/20230901155141.249860-7-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:09 -05:00
Rafael Aquini aec7fb84d2 mm/compaction: improve comment of is_via_compact_memory
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 9cc17ede5125933ab47f8f359c2cce3aca8ee757
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Sep 1 23:51:40 2023 +0800

    mm/compaction: improve comment of is_via_compact_memory

    We do proactive compaction with order == -1 via
    1. /proc/sys/vm/compact_memory
    2. /sys/devices/system/node/nodex/compact
    3. /proc/sys/vm/compaction_proactiveness
    Add the missed situations in which order == -1.

    Link: https://lkml.kernel.org/r/20230901155141.249860-6-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:09 -05:00
Rafael Aquini 2487ddb148 mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 8df4e28c64188911fba33789bf2cb882b3ae524e
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Sep 1 23:51:39 2023 +0800

    mm/compaction: remove repeat compact_blockskip_flush check in reset_isolation_suitable

    The compact_blockskip_flush check is already done in
    __reset_isolation_suitable, so just remove the repeated check before the
    call to __reset_isolation_suitable in reset_isolation_suitable.

    Link: https://lkml.kernel.org/r/20230901155141.249860-5-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:08 -05:00
Rafael Aquini 5a52eeec38 mm/compaction: correctly return failure with bogus compound_order in strict mode
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 3da0272a4c7d0d37b47b28e87014f421296fc2be
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Sep 1 23:51:38 2023 +0800

    mm/compaction: correctly return failure with bogus compound_order in strict mode

    In strict mode, we should return 0 if there is any hole in the pageblock.
    If we successfully isolate pages at the beginning of the pageblock and
    then hit a bogus compound_order outside the pageblock on the next page,
    we abort the search loop with blockpfn > end_pfn.  Although blockpfn is
    then limited to end_pfn, strict mode treats this as a successful
    isolation because blockpfn is not < end_pfn, and the partially isolated
    pages are returned.  isolate_freepages_range may then succeed
    unexpectedly despite a hole in the isolated range.

    Link: https://lkml.kernel.org/r/20230901155141.249860-4-shikemeng@huaweicloud.com
    Fixes: 9fcd6d2e05 ("mm, compaction: skip compound pages by order in free scanner")
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:07 -05:00
Rafael Aquini 655e9698db mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 4c17989116cb0a6a91f4184077c342a9097b748e
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Sep 1 23:51:37 2023 +0800

    mm/compaction: call list_is_{first}/{last} more intuitively in move_freelist_{head}/{tail}

    We use move_freelist_head after list_for_each_entry_reverse to skip
    recently scanned pages, and there is no need to do an actual move if all
    freepages were searched in list_for_each_entry_reverse, i.e. freepage
    points to the first page in the freelist.  It is more intuitive to call
    list_is_first with the list entry as the first argument and the list
    head as the second to check whether the entry is the first one, rather
    than calling list_is_last with the entry and head passed in reverse.

    Similarly, calling list_is_last in move_freelist_tail is more intuitive.

    Link: https://lkml.kernel.org/r/20230901155141.249860-3-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:07 -05:00
Rafael Aquini df42558a88 mm/compaction: use correct list in move_freelist_{head}/{tail}
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit bbefa0fc04bab21e85f6b2ee7984c59694366f6a
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Sep 1 23:51:36 2023 +0800

    mm/compaction: use correct list in move_freelist_{head}/{tail}

    Patch series "Fixes and cleanups to compaction", v3.

    This is a series of fixes and cleanups to compaction.
    Patches 1-2 fix and clean up the freepage list operations.
    Patches 3-4 fix and clean up the isolation of freepages.
    Patch 7 factors out the code that checks whether compaction is needed
    for a given allocation order.

    More details can be found in respective patches.

    This patch (of 6):

    Freepages are chained through buddy_list on the freelist head.  Use
    buddy_list instead of lru to correct the list operations.

    Link: https://lkml.kernel.org/r/20230901155141.249860-1-shikemeng@huaweicloud.com
    Link: https://lkml.kernel.org/r/20230901155141.249860-2-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:06 -05:00
Rafael Aquini 2166b1dd78 mm/compaction: remove unused parameter pgdata of fragmentation_score_wmark
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 8fbb92bd10be26d0feec6bc35332159145c27cc0
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Wed Aug 9 17:49:10 2023 +0800

    mm/compaction: remove unused parameter pgdata of fragmentation_score_wmark

    Parameter pgdat is not used in fragmentation_score_wmark. Just remove it.

    Link: https://lkml.kernel.org/r/20230809094910.3092446-1-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:12 -04:00
Rafael Aquini a02338e73c mm/compaction: only set skip flag if cc->no_set_skip_hint is false
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 18c59d58baa60a8bfaec58d29b6b94877664eed8
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Aug 4 19:04:54 2023 +0800

    mm/compaction: only set skip flag if cc->no_set_skip_hint is false

    Keep the same logic as update_pageblock_skip: only set the skip flag if
    no_set_skip_hint is false, which is more reasonable.

    Link: https://lkml.kernel.org/r/20230804110454.2935878-9-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:11 -04:00
Rafael Aquini 17e151666e mm/compaction: remove unnecessary return for void function
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit f82024cbfa3a410d947b588658949a8a391da8a7
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Aug 4 19:04:53 2023 +0800

    mm/compaction: remove unnecessary return for void function

    Remove an unnecessary return statement from a void function.

    Link: https://lkml.kernel.org/r/20230804110454.2935878-8-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:10 -04:00
Rafael Aquini 73201fd7d5 mm/compaction: correct comment to complete migration failure
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit c3750cc7725af8da06f2f36ddce7adc52a3a51d6
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Aug 4 19:04:52 2023 +0800

    mm/compaction: correct comment to complete migration failure

    Commit cfccd2e63e7e0 ("mm, compaction: finish pageblocks on complete
    migration failure") converted the cc->order-aligned check into a
    pageblock-order-aligned check.  Correct the comment to match.

    Link: https://lkml.kernel.org/r/20230804110454.2935878-7-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:09 -04:00
Rafael Aquini c38020f8e0 mm/compaction: correct comment of cached migrate pfn update
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit cf043a007e00ae7fe5a4aa5447068fcd13ce031b
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Aug 4 19:04:51 2023 +0800

    mm/compaction: correct comment of cached migrate pfn update

    Commit e380bebe47 ("mm, compaction: keep migration source private to a
    single compaction instance") moved update of async and sync
    compact_cached_migrate_pfn from update_pageblock_skip to
    update_cached_migrate but left the comment behind.  Move the relevant
    comment to correct this.

    Link: https://lkml.kernel.org/r/20230804110454.2935878-6-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:09 -04:00
Rafael Aquini c9c3bae165 mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 0aa8ea3c5d353d5f0aa1e607f8dc5f43bf6cdf05
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Aug 4 19:04:50 2023 +0800

    mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages

    After 90ed667c03fe5 ("Revert "Revert "mm/compaction: fix set skip in
    fast_find_migrateblock"""), the skip flag is no longer set in
    fast_find_migrateblock.  The existing comment, which says that
    fast_find_block is used to avoid the isolation_suitable check for a
    pageblock returned from fast_find_migrateblock because
    fast_find_migrateblock marks the found pageblock as skipped, is
    therefore stale.

    Instead, comment that fast_find_block is used to avoid a redundant check
    of a fast-found pageblock whose skip flag was already checked inside
    fast_find_migrateblock.

    Link: https://lkml.kernel.org/r/20230804110454.2935878-5-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:07 -04:00
Rafael Aquini 437ee02605 mm/compaction: skip page block marked skip in isolate_migratepages_block
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 7545e2f20aebf4da413be00384c4245eda5beb4d
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Aug 4 19:04:49 2023 +0800

    mm/compaction: skip page block marked skip in isolate_migratepages_block

    Move migrate_pfn to the pageblock end when the block is marked skip, to
    avoid an unnecessary scan retry of that block from the upper caller.
    For example, compact_zone may wrongly rescan a skipped pageblock with
    finish_pageblock set, as follows:

    1. cc->migrate_pfn points to the start of the pageblock

    2. compact_zone records last_migrated_pfn from cc->migrate_pfn

    3. compact_zone->isolate_migratepages->isolate_migratepages_block
       tries to scan the block.  low_pfn may be moved forward to the middle
       of the block because of free pages at the beginning of the block.

    4. We find the first lru page that could be isolated, but the block was
       exclusively marked skip.

    5. isolate_migratepages_block aborts and makes cc->migrate_pfn point to
       the found lru page in the middle of the block.

    6. compact_zone finds cc->migrate_pfn and last_migrated_pfn in the same
       block and wrongly rescans the block with finish_pageblock set.

    Link: https://lkml.kernel.org/r/20230804110454.2935878-4-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:07 -04:00
Rafael Aquini 2da0307d2a mm/compaction: correct last_migrated_pfn update in compact_zone
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 7c0a84bd0dc214a710305fbc0f407b8e7c410762
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Fri Aug 4 19:04:48 2023 +0800

    mm/compaction: correct last_migrated_pfn update in compact_zone

    We record the start pfn of the last isolated pageblock in
    last_migrated_pfn.  And then:

    1. We check whether we marked the pageblock skip for exclusive access in
       isolate_migratepages_block by testing if the next migrate pfn is
       still in the last isolated pageblock.  If so, we set finish_pageblock
       to do the rescan.

    2. We check whether a full cc->order block was scanned by testing if the
       last scan range passes the cc->order block boundary.  If so, we flush
       the pages that were freed.

    We treat cc->migrate_pfn before isolate_migratepages as the start pfn of
    the last isolated page range.  However, we always align migrate_pfn to a
    pageblock boundary, or move to another pageblock, in
    fast_find_migrateblock or in the linear forward scan in
    isolate_migratepages, before doing page isolation in
    isolate_migratepages_block.

    Update last_migrated_pfn with pageblock_start_pfn(cc->migrate_pfn - 1)
    after the scan to correctly set the start pfn of the last isolated page
    range.  This avoids:

    1. Missing a rescan with finish_pageblock set because last_migrate_pfn
       does not point to the right pageblock, so the migrate pfn is not in
       the pageblock of last_migrate_pfn as it should be.

    2. Wrongly issuing a flush by testing the cc->order block boundary
       against the wrong last_migrate_pfn.

    Link: https://lkml.kernel.org/r/20230804110454.2935878-3-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:06 -04:00
Rafael Aquini 124b37880e mm/compaction: remove unnecessary "else continue" at end of loop in isolate_freepages_block
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 13cfd63f3fec403ca8966079972aac4565fcf379
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Thu Aug 3 17:49:01 2023 +0800

    mm/compaction: remove unnecessary "else continue" at end of loop in isolate_freepages_block

    Removing the "else continue" at the end of the scan loop causes no
    behavior change.  Remove it to make the code cleaner.

    Link: https://lkml.kernel.org/r/20230803094901.2915942-5-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Kemeng Shi <shikemeng@huawei.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:05 -04:00
Rafael Aquini b5e0f4318d mm/compaction: remove unnecessary cursor page in isolate_freepages_block
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit dc13292cccfd50916af00a471208fb48deb4d72f
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Thu Aug 3 17:49:00 2023 +0800

    mm/compaction: remove unnecessary cursor page in isolate_freepages_block

    The cursor is currently only used to advance the page.  We can simply
    advance the page directly and remove the unnecessary cursor.

    Link: https://lkml.kernel.org/r/20230803094901.2915942-4-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Kemeng Shi <shikemeng@huawei.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:04 -04:00
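The shape of the cleanup above can be sketched in plain C (the struct and names are made up for illustration, not `struct page`): a cursor that only mirrors the page pointer can be removed by stepping the pointer itself.

```c
/* Hypothetical stand-in for struct page */
struct fake_page { int pfn; };

static int last_pfn_scanned(struct fake_page *page, int nr)
{
    /* before the cleanup a separate cursor shadowed page:
     *   struct fake_page *cursor = page; ... cursor++;
     * stepping page directly does the same work */
    for (int i = 0; i < nr - 1; i++)
        page++;                 /* was: cursor++ */
    return page->pfn;
}
```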
Rafael Aquini da93af95ff mm/compaction: merge end_pfn boundary check in isolate_freepages_range
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit a2864a67452ec6e378e57cbe151aad62ccdcc03f
Author: Kemeng Shi <shikemeng@huawei.com>
Date:   Thu Aug 3 17:48:59 2023 +0800

    mm/compaction: merge end_pfn boundary check in isolate_freepages_range

    Merge the end_pfn boundary checks for the single page block forward and
    multiple page blocks forward cases, to avoid doing the boundary check
    twice when moving multiple page blocks forward.

    Link: https://lkml.kernel.org/r/20230803094901.2915942-3-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:04 -04:00
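A minimal sketch of the merged boundary check, with illustrative names and a made-up stride (not the kernel's pageblock arithmetic): clamping the advanced position once with a min() covers both the single-block and multi-block forward paths.

```c
static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* one clamped computation replaces two separate end_pfn checks */
static unsigned long next_block_end(unsigned long block_end_pfn,
                                    unsigned long stride,
                                    unsigned long end_pfn)
{
    return min_ul(block_end_pfn + stride, end_pfn);
}
```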
Rafael Aquini 0d58c189d4 mm/compaction: set compact_cached_free_pfn correctly in update_pageblock_skip
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 16951789008dc0029b1e073fb1c20c1abb4c6504
Author: Kemeng Shi <shikemeng@huaweicloud.com>
Date:   Thu Aug 3 17:48:58 2023 +0800

    mm/compaction: set compact_cached_free_pfn correctly in update_pageblock_skip

    Patch series "Fixes and cleanups to compaction", v2.

    This series contains random fixes and cleanups to free page isolation in
    compaction.  This is based on another compact series[1].  More details can
    be found in respective patches.

    This patch (of 4):

    We set the skip flag on the page block of block_start_pfn, so it is more
    reasonable to set compact_cached_free_pfn to the page block before
    block_start_pfn.

    Link: https://lkml.kernel.org/r/20230803094901.2915942-1-shikemeng@huaweicloud.com
    Link: https://lkml.kernel.org/r/20230803094901.2915942-2-shikemeng@huaweicloud.com
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Kemeng Shi <shikemeng@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:03 -04:00
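The fix above amounts to moving the cached free-scanner position back one pageblock. A hedged sketch, with a made-up pageblock size constant (the kernel uses pageblock_nr_pages):

```c
#define FAKE_PAGEBLOCK_NR_PAGES 512UL

/* the skip flag was just set on the block at block_start_pfn, so the
 * cached free-scanner pfn should point at the block *before* it,
 * rather than restarting on an already-skipped block */
static unsigned long cached_free_pfn_after_skip(unsigned long block_start_pfn)
{
    return block_start_pfn - FAKE_PAGEBLOCK_NR_PAGES;
}
```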
Rafael Aquini 0697042407 mm/compaction: avoid unneeded pageblock_end_pfn when no_set_skip_hint is set
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 3c099a2b0b53d98552cd69d19fd76049bcbafe38
Author: Kemeng Shi <shikemeng@huawei.com>
Date:   Fri Jul 21 23:09:57 2023 +0800

    mm/compaction: avoid unneeded pageblock_end_pfn when no_set_skip_hint is set

    Move the pageblock_end_pfn() call after the no_set_skip_hint check to
    avoid an unneeded pageblock_end_pfn() computation when no_set_skip_hint
    is set.

    Link: https://lkml.kernel.org/r/20230721150957.2058634-3-shikemeng@huawei.com
    Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:02 -04:00
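The micro-optimization above is just "check first, compute later". A userspace sketch under assumed names (the alignment math stands in for pageblock_end_pfn(), it is not the kernel's helper):

```c
static unsigned long skip_hint_block_end(int no_set_skip_hint,
                                         unsigned long pfn,
                                         unsigned long nr_pages)
{
    /* bail out first ... */
    if (no_set_skip_hint)
        return 0;

    /* ... so the block-end computation runs only when it is needed
     * (round pfn up to the next nr_pages boundary; nr_pages must be
     * a power of two for this mask trick) */
    return (pfn + nr_pages) & ~(nr_pages - 1);
}
```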
Rafael Aquini 9602b7a9a8 mm/compaction: correct comment of candidate pfn in fast_isolate_freepages
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit e6bd14eca207bf822b7c743818ba6e04889348ec
Author: Kemeng Shi <shikemeng@huawei.com>
Date:   Fri Jul 21 23:09:56 2023 +0800

    mm/compaction: correct comment of candidate pfn in fast_isolate_freepages

    Patch series "Two minor cleanups for compaction", v2.

    This series contains two random cleanups for compaction.

    This patch (of 2):

    If no preferred page was found, we use the candidate page with the
    maximum pfn > min_pfn, which is saved in high_pfn.  Correct "minimum"
    to "maximum candidate" in the comment.

    Link: https://lkml.kernel.org/r/20230721150957.2058634-1-shikemeng@huawei.com
    Link: https://lkml.kernel.org/r/20230721150957.2058634-2-shikemeng@huawei.com
    Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:01 -04:00
Rafael Aquini 0480504544 mm: compaction: skip the memory hole rapidly when isolating free pages
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit e6e0c7673012f42c6fb8d89af71cd7607c93e0a5
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Fri Jul 7 16:51:47 2023 +0800

    mm: compaction: skip the memory hole rapidly when isolating free pages

    Just like commit 9721fd82351d ("mm: compaction: skip memory hole
    rapidly when isolating migratable pages"), I can see it will also take
    more time to skip the larger memory hole (range: 0x1000000000 -
    0x1800000000) when isolating free pages on my machine with the memory
    layout below.  So, like commit 9721fd82351d, add a new helper to skip
    the memory hole rapidly, which reduces the time consumed from about
    70us to less than 1us.

    [    0.000000] Zone ranges:
    [    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
    [    0.000000]   DMA32    empty
    [    0.000000]   Normal   [mem 0x0000000100000000-0x0000001fa7ffffff]
    [    0.000000] Movable zone start for each node
    [    0.000000] Early memory node ranges
    [    0.000000]   node   0: [mem 0x0000000040000000-0x0000000fffffffff]
    [    0.000000]   node   0: [mem 0x0000001800000000-0x0000001fa3c7ffff]
    [    0.000000]   node   0: [mem 0x0000001fa3c80000-0x0000001fa3ffffff]
    [    0.000000]   node   0: [mem 0x0000001fa4000000-0x0000001fa402ffff]
    [    0.000000]   node   0: [mem 0x0000001fa4030000-0x0000001fa40effff]
    [    0.000000]   node   0: [mem 0x0000001fa40f0000-0x0000001fa73cffff]
    [    0.000000]   node   0: [mem 0x0000001fa73d0000-0x0000001fa745ffff]
    [    0.000000]   node   0: [mem 0x0000001fa7460000-0x0000001fa746ffff]
    [    0.000000]   node   0: [mem 0x0000001fa7470000-0x0000001fa758ffff]
    [    0.000000]   node   0: [mem 0x0000001fa7590000-0x0000001fa7ffffff]

    [shikemeng@huaweicloud.com: avoid missing last page block in section after skip offline sections]
      Link: https://lkml.kernel.org/r/20230804110454.2935878-1-shikemeng@huaweicloud.com
      Link: https://lkml.kernel.org/r/20230804110454.2935878-2-shikemeng@huaweicloud.com
    Link: https://lkml.kernel.org/r/d2ba7e41ee566309b594311207ffca736375fc16.1688715750.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:00 -04:00
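The idea behind both hole-skipping commits can be sketched in userspace C. The section table and sizes here are fake stand-ins (the kernel consults the sparse memory model's online sections instead): rather than walking a hole one pageblock at a time, jump to the start of the next online section.

```c
#define FAKE_SECTION_PFNS 4096UL
#define FAKE_NR_SECTIONS  16

/* fake online map: sections 2..7 form a large hole */
static const int fake_online[FAKE_NR_SECTIONS] = {
    1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1,
};

static unsigned long skip_hole_pfn(unsigned long pfn)
{
    unsigned long sec = pfn / FAKE_SECTION_PFNS;

    if (fake_online[sec])
        return pfn;             /* not in a hole, nothing to skip */

    /* one step per section, not one step per pageblock */
    while (sec < FAKE_NR_SECTIONS && !fake_online[sec])
        sec++;
    return sec * FAKE_SECTION_PFNS;
}
```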
Rafael Aquini d07ff5f921 mm: compaction: use the correct type of list for free pages
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 94ec20035b05f842dc08277a5a90fba757088f39
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Fri Jul 7 16:51:46 2023 +0800

    mm: compaction: use the correct type of list for free pages

    Use the page->buddy_list instead of page->lru to clarify the correct type
    of list for free pages.

    Link: https://lkml.kernel.org/r/b21cd8e2e32b9a1d9bc9e43ebf8acaf35e87f8df.1688715750.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Cc: Huang, Ying <ying.huang@intel.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:00 -04:00
Rafael Aquini 321560fd9a mm: compaction: convert to use a folio in isolate_migratepages_block()
JIRA: https://issues.redhat.com/browse/RHEL-27742
Conflicts:
  * context conflict on the 2nd hunk due to out-of-order backport of
    upstream' v6.5 commit 493614da0d4e ("mm: compaction: fix endless
    looping over same migrate block"); and conflicts on the 7th, 8th,
    and 9th hunks due to out-of-order backport of upstream's v6.8
    commit 0003e2a41468 ("mm: Add AS_UNMOVABLE to mark mapping as
    completely unmovable")

This patch is a backport of the following upstream commit:
commit 56ae0bb349b4eeb172674d4876f2b6290d505a25
Author: Kefeng Wang <wangkefeng.wang@huawei.com>
Date:   Mon Jun 19 19:07:17 2023 +0800

    mm: compaction: convert to use a folio in isolate_migratepages_block()

    Directly use a folio instead of page_folio() when page successfully
    isolated (hugepage and movable page) and after folio_get_nontail_page(),
    which removes several calls to compound_head().

    Link: https://lkml.kernel.org/r/20230619110718.65679-1-wangkefeng.wang@huawei.com
    Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: James Gowans <jgowans@amazon.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Yu Zhao <yuzhao@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:37:23 -04:00
Rafael Aquini 41ba2401b7 mm: compaction: skip memory hole rapidly when isolating migratable pages
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 9721fd82351d47a37ba982272e128101f24efd7c
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Wed Jun 14 16:40:20 2023 +0800

    mm: compaction: skip memory hole rapidly when isolating migratable pages

    On some machines, the normal zone can have a large memory hole, as in
    the memory layout below, where the range from 0x100000000 to 0x1800000000
    is a hole.  So when isolating some migratable pages, the scanner can meet
    the hole and it will take more time to skip the large hole.  From my
    measurement, I can see the isolation scanner will take 80us ~ 100us to
    skip the large hole [0x100000000 - 0x1800000000].

    So adding a new helper to fast search next online memory section to skip
    the large hole can help to find next suitable pageblock efficiently.  With
    this patch, I can see the large hole scanning only takes < 1us.

    [    0.000000] Zone ranges:
    [    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
    [    0.000000]   DMA32    empty
    [    0.000000]   Normal   [mem 0x0000000100000000-0x0000001fa7ffffff]
    [    0.000000] Movable zone start for each node
    [    0.000000] Early memory node ranges
    [    0.000000]   node   0: [mem 0x0000000040000000-0x0000000fffffffff]
    [    0.000000]   node   0: [mem 0x0000001800000000-0x0000001fa3c7ffff]
    [    0.000000]   node   0: [mem 0x0000001fa3c80000-0x0000001fa3ffffff]
    [    0.000000]   node   0: [mem 0x0000001fa4000000-0x0000001fa402ffff]
    [    0.000000]   node   0: [mem 0x0000001fa4030000-0x0000001fa40effff]
    [    0.000000]   node   0: [mem 0x0000001fa40f0000-0x0000001fa73cffff]
    [    0.000000]   node   0: [mem 0x0000001fa73d0000-0x0000001fa745ffff]
    [    0.000000]   node   0: [mem 0x0000001fa7460000-0x0000001fa746ffff]
    [    0.000000]   node   0: [mem 0x0000001fa7470000-0x0000001fa758ffff]
    [    0.000000]   node   0: [mem 0x0000001fa7590000-0x0000001fa7ffffff]

    [baolin.wang@linux.alibaba.com: limit next_ptn to not exceed cc->free_pfn]
      Link: https://lkml.kernel.org/r/a1d859c28af0c7e85e91795e7473f553eb180a9d.1686813379.git.baolin.wang@linux.alibaba.com
    Link: https://lkml.kernel.org/r/75b4c8ca36bf44ad8c42bf0685ac19d272e426ec.1686705221.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Suggested-by: David Hildenbrand <david@redhat.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: "Huang, Ying" <ying.huang@intel.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:59 -04:00
Rafael Aquini d17cc0e446 mm: compaction: mark kcompactd_run() and kcompactd_stop() __meminit
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 833dfc0090b3f8017ddac82d818b2d8e5ceb61db
Author: Miaohe Lin <linmiaohe@huawei.com>
Date:   Sat Jun 10 11:46:15 2023 +0800

    mm: compaction: mark kcompactd_run() and kcompactd_stop() __meminit

    Add __meminit to kcompactd_run() and kcompactd_stop() to ensure they
    default to __init when memory hotplug is not enabled.

    Link: https://lkml.kernel.org/r/20230610034615.997813-1-linmiaohe@huawei.com
    Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:44 -04:00
Rafael Aquini 1b09465dae mm: compaction: have compaction_suitable() return bool
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 3cf04937529020e149666f56a41ebdeb226b69ed
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Fri Jun 2 11:12:04 2023 -0400

    mm: compaction: have compaction_suitable() return bool

    Since it only returns COMPACT_CONTINUE or COMPACT_SKIPPED now, a bool
    return value simplifies the callsites.

    Link: https://lkml.kernel.org/r/20230602151204.GD161817@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Suggested-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:26 -04:00
Rafael Aquini 10de7f49da mm: compaction: skip fast freepages isolation if enough freepages are isolated
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit a8d13355c660255266ece529e81e6cb26754941a
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu May 25 20:54:01 2023 +0800

    mm: compaction: skip fast freepages isolation if enough freepages are isolated

    I've observed that fast isolation often isolates more pages than
    cc->migratepages, and the excess freepages will be released back to the
    buddy system.  So skip fast freepages isolation if enough freepages are
    isolated to save some CPU cycles.

    Link: https://lkml.kernel.org/r/f39c2c07f2dba2732fd9c0843572e5bef96f7f67.1685018752.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:04 -04:00
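The early exit described above boils down to one comparison. A hedged sketch with a stand-in struct (mirroring the idea of compact_control's counters, not its real definition):

```c
/* illustrative stand-in for the relevant compact_control fields */
struct fake_cc { unsigned int nr_freepages, nr_migratepages; };

/* keep fast-isolating only while fewer free pages are isolated than
 * there are migrate pages to place; any excess would just be handed
 * back to the buddy allocator */
static int want_more_freepages(const struct fake_cc *cc)
{
    return cc->nr_freepages < cc->nr_migratepages;
}
```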
Rafael Aquini 11d3f01673 mm: compaction: add trace event for fast freepages isolation
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 447ba88658faa8dbfd29d557daa38b7d88f460ec
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu May 25 20:54:00 2023 +0800

    mm: compaction: add trace event for fast freepages isolation

    fast_isolate_freepages() can also isolate free pages, but we have no
    way to gauge the fast isolation efficiency or the fast isolation
    pressure.  So add a trace event that reports some numbers to help
    understand the efficiency of fast free page isolation.

    Link: https://lkml.kernel.org/r/78d2932d0160d122c15372aceb3f2c45460a17fc.1685018752.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:04 -04:00
Rafael Aquini 6b13b52514 mm: compaction: only set skip flag if cc->no_set_skip_hint is false
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 8b71b499ff98fdcda7efefc146841a8b4d26813d
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu May 25 20:53:59 2023 +0800

    mm: compaction: only set skip flag if cc->no_set_skip_hint is false

    To keep the same logic as test_and_set_skip(), only set the skip flag if
    cc->no_set_skip_hint is false, which makes code more reasonable.

    Link: https://lkml.kernel.org/r/0eb2cd2407ffb259ae6e3071e10f70f2d41d0f3e.1685018752.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:03 -04:00
Rafael Aquini 145a40a7b5 mm: compaction: skip more fully scanned pageblock
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit cf650342f83ae655c6d05a1a74ae1672459973d0
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu May 25 20:53:58 2023 +0800

    mm: compaction: skip more fully scanned pageblock

    In fast_isolate_around(), it assumes the pageblock is fully scanned if
    cc->nr_freepages < cc->nr_migratepages after trying to isolate some free
    pages, and will set skip flag to avoid scanning in future.  However this
    can miss setting the skip flag for a fully scanned pageblock (returned
    'start_pfn' is equal to 'end_pfn') in the case where cc->nr_freepages is
    larger than cc->nr_migratepages.

    So using the returned 'start_pfn' from isolate_freepages_block() and
    'end_pfn' to decide if a pageblock is fully scanned makes more sense.  It
    can also cover the case where cc->nr_freepages < cc->nr_migratepages,
    which means the 'start_pfn' is usually equal to 'end_pfn' except some
    uncommon fatal error occurs after non-strict mode isolation.

    Link: https://lkml.kernel.org/r/f4efd2fa08735794a6d809da3249b6715ba6ad38.1685018752.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:02 -04:00
Rafael Aquini ee920384e3 mm: compaction: change fast_isolate_freepages() to void type
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 2dbd90054f965c899b9adb62b2d0d215f687d04b
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu May 25 20:53:57 2023 +0800

    mm: compaction: change fast_isolate_freepages() to void type

    No caller cares about the return value of fast_isolate_freepages(), so
    make it void.

    Link: https://lkml.kernel.org/r/759fca20b22ebf4c81afa30496837b9e0fb2e53b.1685018752.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:01 -04:00
Rafael Aquini 3ede930663 mm: compaction: drop the redundant page validation in update_pageblock_skip()
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 75990f6459b9cf61a94e8a08d0f6a4aa0b8cf3b5
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu May 25 20:53:56 2023 +0800

    mm: compaction: drop the redundant page validation in update_pageblock_skip()

    Patch series "Misc cleanups and improvements for compaction".

    This series contains some cleanups and improvements for compaction.

    This patch (of 6):

    The caller has validated the page before calling
    update_pageblock_skip(), thus drop the redundant page validation in
    update_pageblock_skip().

    Link: https://lkml.kernel.org/r/5142e15b9295fe8c447dbb39b7907a20177a1413.1685018752.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:01 -04:00
Rafael Aquini 2bf1cb67b4 mm: compaction: drop redundant watermark check in compaction_zonelist_suitable()
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 1c9568e806a589da84b7afbdf0619b2c1f6c102a
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Fri May 19 14:39:59 2023 +0200

    mm: compaction: drop redundant watermark check in compaction_zonelist_suitable()

    The watermark check in compaction_zonelist_suitable(), called from
    should_compact_retry(), is sandwiched between two watermark checks
    already: before, there are freelist attempts as part of direct reclaim and
    direct compaction; after, there is a last-minute freelist attempt in
    __alloc_pages_may_oom().

    The check in compaction_zonelist_suitable() isn't necessary. Kill it.

    Link: https://lkml.kernel.org/r/20230519123959.77335-6-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:45 -04:00
Rafael Aquini 9dca1b8a23 mm: compaction: remove unnecessary is_via_compact_memory() checks
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit f98a497e1f16ee411df72629e32e31cba4cfa9cf
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Fri May 19 14:39:58 2023 +0200

    mm: compaction: remove unnecessary is_via_compact_memory() checks

    Remove the is_via_compact_memory() checks from all paths that are not
    reachable via /proc/sys/vm/compact_memory.

    Link: https://lkml.kernel.org/r/20230519123959.77335-5-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:44 -04:00
Rafael Aquini 7544c82b76 mm: compaction: refactor __compaction_suitable()
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit e8606320e9af9774fd879e71c940fc9e5fd9b901
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Fri May 19 14:39:57 2023 +0200

    mm: compaction: refactor __compaction_suitable()

    __compaction_suitable() is supposed to check for available migration
    targets.  However, it also checks whether the operation was requested via
    /proc/sys/vm/compact_memory, and whether the original allocation request
    can already succeed.  These don't apply to all callsites.

    Move the checks out to the callers, so that later patches can deal with
    them one by one.  No functional change intended.

    [hannes@cmpxchg.org: fix comment, per Vlastimil]
      Link: https://lkml.kernel.org/r/20230602144942.GC161817@cmpxchg.org
    Link: https://lkml.kernel.org/r/20230519123959.77335-4-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:43 -04:00
Rafael Aquini 58572d1eb0 mm: compaction: avoid GFP_NOFS ABBA deadlock
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 4fbbb3fde3c69879ceebb33a8edd9d867008728b
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Fri May 19 13:13:59 2023 +0200

    mm: compaction: avoid GFP_NOFS ABBA deadlock

    During stress testing with higher-order allocations, a deadlock scenario
    was observed in compaction: One GFP_NOFS allocation was sleeping on
    mm/compaction.c::too_many_isolated(), while all CPUs in the system were
    busy with compactors spinning on buffer locks held by the sleeping
    GFP_NOFS allocation.

    Reclaim is susceptible to this same deadlock; we fixed it by granting
    GFP_NOFS allocations additional LRU isolation headroom, to ensure it makes
    forward progress while holding fs locks that other reclaimers might
    acquire.  Do the same here.

    This code has been like this since compaction was initially merged, and I
    only managed to trigger this with out-of-tree patches that dramatically
    increase the contexts that do GFP_NOFS compaction.  While the issue is
    real, it seems theoretical in nature given existing allocation sites.
    Worth fixing now, but no Fixes tag or stable CC.

    Link: https://lkml.kernel.org/r/20230519111359.40475-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Michal Hocko <mhocko@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:41 -04:00
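The shape of the deadlock fix above can be sketched in userspace C. The flag bit and the 50% headroom are illustrative assumptions, not the kernel's actual constants or the real too_many_isolated() signature:

```c
#define FAKE_GFP_FS 0x1u   /* stand-in for __GFP_FS */

static int fake_too_many_isolated(unsigned long isolated,
                                  unsigned long limit,
                                  unsigned int gfp_mask)
{
    /* a GFP_NOFS caller may hold fs locks that other compactors spin
     * on; extra isolation headroom lets it keep making forward
     * progress instead of sleeping while holding those locks */
    if (!(gfp_mask & FAKE_GFP_FS))
        limit += limit / 2;

    return isolated > limit;
}
```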
Rafael Aquini 553573f4b1 mm: convert migrate_pages() to work on folios
JIRA: https://issues.redhat.com/browse/RHEL-27742
Conflicts:
  * dropped hunk for Documentation/translations/zh_CN/mm/page_migration.rst.
    This doc file was introduced upstream via pre-v6.0 (v6.0-rc1) merge
    commit 6614a3c3164a ("Merge tag 'mm-stable-2022-08-03' of
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm") which was never
    picked by previous backport attempts.

This patch is a backport of the following upstream commit:
commit 4e096ae1801e24b338e02715c65c3ffa8883ba5d
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat May 13 01:11:01 2023 +0100

    mm: convert migrate_pages() to work on folios

    Almost all of the callers & implementors of migrate_pages() were already
    converted to use folios.  compaction_alloc() & compaction_free() are
    trivial to convert as part of this patch and not worth splitting out.

    Link: https://lkml.kernel.org/r/20230513001101.276972-1-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:25 -04:00
Rafael Aquini 3d948c1b2f mm: compaction: optimize compact_memory to comply with the admin-guide
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 8b9167cd9ef039d95b65ef9600a7507795173121
Author: Wen Yang <wenyang.linux@foxmail.com>
Date:   Tue Apr 25 23:52:35 2023 +0800

    mm: compaction: optimize compact_memory to comply with the admin-guide

    For the /proc/sys/vm/compact_memory file, the admin-guide states: When 1
    is written to the file, all zones are compacted such that free memory is
    available in contiguous blocks where possible.  This can be important for
    example in the allocation of huge pages although processes will also
    directly compact memory as required

    But this was not strictly followed; writing any value would cause all
    zones to be compacted.

    It has been slightly optimized to comply with the admin-guide.  Enforce
    the 1 on the unlikely chance that the sysctl handler is ever extended to
    do something different.

    Commit ef49843841 ("mm/compaction: remove unused variable
    sysctl_compact_memory") has also been optimized a bit here, as the
    declaration in the external header file has been eliminated, and
    sysctl_compact_memory also needs to be verified.

    [akpm@linux-foundation.org: add __read_mostly, per Mel]
    Link: https://lkml.kernel.org/r/tencent_DFF54DB2A60F3333F97D3F6B5441519B050A@qq.com
    Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: William Lam <william.lam@bytedance.com>
    Cc: Pintu Kumar <pintu@codeaurora.org>
    Cc: Fu Wei <wefu@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:15 -04:00
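The enforcement described above can be sketched as a tiny stand-in handler (the signature and the hard-coded -22 for -EINVAL are illustrative; the real sysctl handler takes a ctl_table and user buffer):

```c
/* only the value 1 compacts all zones, as the admin-guide documents;
 * anything else is rejected */
static int fake_compact_memory_write(int value, int *compacted)
{
    if (value != 1)
        return -22;             /* stand-in for -EINVAL */
    *compacted = 1;             /* the kernel would compact all zones */
    return 0;
}
```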
Chris von Recklinghausen e0152f958e mm: compaction: remove incorrect #ifdef checks
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit b3f312c4815d7acf6c7f88f921544626c31bd24a
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Wed Mar 29 10:02:41 2023 +0200

    mm: compaction: remove incorrect #ifdef checks

    Without CONFIG_SYSCTL, the compiler warns about a few unused functions:

    mm/compaction.c:3076:12: error: 'proc_dointvec_minmax_warn_RT_change' defined but not used [-Werror=unused-function]
    mm/compaction.c:2780:12: error: 'sysctl_compaction_handler' defined but not used [-Werror=unused-function]
    mm/compaction.c:2750:12: error: 'compaction_proactiveness_sysctl_handler' defined but not used [-Werror=unused-function]

    The #ifdef is actually not necessary here, as the alternative
    register_sysctl_init() stub function does not use its argument, which
    lets the compiler drop the rest implicitly, while avoiding the warning.

    Fixes: c521126610c3 ("mm: compaction: move compaction sysctl to its own file")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:47 -04:00
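Why the #ifdef was unnecessary can be shown with a toy version of the pattern (names are made up; register_sysctl_init() in the kernel behaves analogously): a stub that takes the handler as an argument keeps the "unused function" warning away, and the optimizer can still discard the dead handler body.

```c
static int handler_calls;

static void fake_handler(void) { handler_calls++; }

/* stand-in for the !CONFIG_SYSCTL registration stub: it ignores its
 * argument, but merely referencing fake_handler through it silences
 * -Wunused-function without any #ifdef around the handler itself */
static void fake_register_sysctl_init(void (*fn)(void))
{
    (void)fn;                   /* stub: drop the reference */
}

static int fake_init(void)
{
    fake_register_sysctl_init(fake_handler);
    return handler_calls;      /* stub path: handler never ran */
}
```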
Chris von Recklinghausen 8f4c875305 mm: compaction: move compaction sysctl to its own file
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 48fe8ab8d5a39c7bc49cb41d0ad92c75f48a9550
Author: Minghao Chi <chi.minghao@zte.com.cn>
Date:   Tue Mar 28 14:46:28 2023 +0800

    mm: compaction: move compaction sysctl to its own file

    This moves all compaction sysctls to its own file.

    Move sysctl to where the functionality truly belongs to improve
    readability, reduce merge conflicts, and facilitate maintenance.

    I use x86_defconfig and linux-next-20230327 branch
    $ make defconfig;make all -jn
    CONFIG_COMPACTION=y

    add/remove: 1/0 grow/shrink: 1/1 up/down: 350/-256 (94)
    Function                                     old     new   delta
    vm_compaction                                  -     320    +320
    kcompactd_init                               180     210     +30
    vm_table                                    2112    1856    -256
    Total: Before=21119987, After=21120081, chg +0.00%

    Despite the addition of 94 bytes the patch still seems a worthwhile
    cleanup.

    Link: https://lore.kernel.org/lkml/067f7347-ba10-5405-920c-0f5f985c84f4@suse.cz/
    Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:47 -04:00
Chris von Recklinghausen d331dd9134 mm: compaction: fix the possible deadlock when isolating hugetlb pages
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 1c06b6a599b5b7be74a6baffafa00b0f70cbe523
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu Mar 16 19:06:47 2023 +0800

    mm: compaction: fix the possible deadlock when isolating hugetlb pages

    When trying to isolate a migratable pageblock, it can contain several
    normal pages or several hugetlb pages (e.g. CONT-PTE 64K hugetlb on arm64)
    in a pageblock. That means we may hold the lru lock of a normal page to
    continue to isolate the next hugetlb page by isolate_or_dissolve_huge_page()
    in the same migratable pageblock.

    However in the isolate_or_dissolve_huge_page(), it may allocate a new hugetlb
    page and dissolve the old one by alloc_and_dissolve_hugetlb_folio() if the
    hugetlb's refcount is zero. That means we can still enter the direct compaction
    path to allocate a new hugetlb page under the current lru lock, which
    may cause possible deadlock.

    To avoid this possible deadlock, we should release the lru lock when
    trying to isolate a hugetlb page.  Moreover it does not make sense to take
    the lru lock to isolate a hugetlb page, which is not on the lru list.

    Link: https://lkml.kernel.org/r/7ab3bffebe59fb419234a68dec1e4572a2518563.1678962352.git.baolin.wang@linux.alibaba.com
    Fixes: 369fa227c2 ("mm: make alloc_contig_range handle free hugetlb pages")
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: William Lam <william.lam@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:29 -04:00
Chris von Recklinghausen f015131161 mm: compaction: consider the number of scanning compound pages in isolate fail path
JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 56d48d8dbefb1cae3aeae54284f7d6f52a41ec23
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Thu Mar 16 19:06:46 2023 +0800

    mm: compaction: consider the number of scanning compound pages in isolate fail path

    commit b717d6b93b54 ("mm: compaction: include compound page count for
    scanning in pageblock isolation") added compound page statistics for
    scanning in pageblock isolation, to make sure the number of scanned pages
    is always larger than the number of isolated pages when isolating
    migratable or free pageblocks.

    However, when page isolation fails while scanning the migratable or
    free pageblocks, the isolation failure path did not consider the scanning
    statistics of the compound pages, which results in showing an incorrect
    number of scanned pages in tracepoints or in vmstats, which will confuse
    people about the page scanning pressure in memory compaction.

    Thus we should take into account the number of scanning pages when failing
    to isolate the compound pages to make the statistics accurate.

    Link: https://lkml.kernel.org/r/73d6250a90707649cc010731aedc27f946d722ed.1678962352.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: William Lam <william.lam@bytedance.com>

    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:29 -04:00
Aristeu Rozanski 837cf9f325 mm: change to return bool for isolate_movable_page()
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit cd7755800eb54e8522f5e51f4e71e6494c1f1572
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Wed Feb 15 18:39:37 2023 +0800

    mm: change to return bool for isolate_movable_page()

    Now the isolate_movable_page() can only return 0 or -EBUSY, and no users
    will care about the negative return value, thus we can convert the
    isolate_movable_page() to return a boolean value to make the code more
    clear when checking the movable page isolation state.

    No functional changes intended.

    [akpm@linux-foundation.org: remove unneeded comment, per Matthew]
    Link: https://lkml.kernel.org/r/cb877f73f4fff8d309611082ec740a7065b1ade0.1676424378.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
    Reviewed-by: SeongJae Park <sj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:24 -04:00
Aristeu Rozanski 354389c93f mm: compaction: avoid fragmentation score calculation for empty zones
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 9e5522715e6941bcfdc08d066a79d6da0f8cec8e
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Tue Jan 10 21:36:22 2023 +0800

    mm: compaction: avoid fragmentation score calculation for empty zones

    There is no need to calculate the fragmentation score for empty zones.

    Link: https://lkml.kernel.org/r/100331ad9d274a9725e687b00d85d75d7e4a17c7.1673342761.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:05 -04:00
Aristeu Rozanski d20f424ebe mm: compaction: add missing kcompactd wakeup trace event
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 8fff8b6f8d0ef7620e06f3f4cfb912171aef6cd5
Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Tue Jan 10 21:36:21 2023 +0800

    mm: compaction: add missing kcompactd wakeup trace event

    Add missing kcompactd wakeup trace event for proactive compaction,
    meanwhile use order = -1 and the highest zone index of the pgdat for the
    kcompactd wakeup trace event by proactive compaction.

    Link: https://lkml.kernel.org/r/cbf8097a2d8a1b6800991f2a21575550d3613ce6.1673342761.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:05 -04:00