172 Commits

Author SHA1 Message Date
Rafael Aquini 82bfd3e75b mm: ignore data-race in __swap_writepage
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 7b7aca6d7c0f9b2d9400bfc57cb2b23cfbd5134d
Author: Pei Li <peili.dev@gmail.com>
Date:   Thu Jul 11 09:32:30 2024 -0700

    mm: ignore data-race in __swap_writepage

    Syzbot reported a possible data race:

    BUG: KCSAN: data-race in __swap_writepage / scan_swap_map_slots

    read-write to 0xffff888102fca610 of 8 bytes by task 7106 on cpu 1.
    read to 0xffff888102fca610 of 8 bytes by task 7080 on cpu 0.

    While we are in __swap_writepage to read sis->flags, scan_swap_map_slots
    is trying to update it with SWP_SCANNING.

    value changed: 0x0000000000008083 -> 0x0000000000004083.

    While this can be updated non-atomically, this won't affect
    SWP_SYNCHRONOUS_IO, so we consider this data-race safe.

    This is possibly introduced by commit 3222d8c2a7f8 ("block: remove
    ->rw_page"), where this if branch is introduced.

    Link: https://lkml.kernel.org/r/20240711-bug13-v1-1-cea2b8ae8d76@gmail.com
    Fixes: 3222d8c2a7f8 ("block: remove ->rw_page")
    Signed-off-by: Pei Li <peili.dev@gmail.com>
    Reported-by: syzbot+da25887cc13da6bf3b8c@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=da25887cc13da6bf3b8c
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:26 -05:00
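
The bit reasoning in the __swap_writepage data-race entry above can be checked with a short sketch. It uses the two flag words from the KCSAN report and tests whether the changed bits overlap the SWP_SYNCHRONOUS_IO bit. The bit position used here (bit 12) is an assumption for illustration and should be confirmed against include/linux/swap.h in the tree at hand; the helper names are hypothetical. The sketch only compares the two observed values and does not prove the race is benign in general.

```python
# Flag words reported by KCSAN in the commit message above.
BEFORE = 0x8083
AFTER = 0x4083

# Assumption: bit 12 is used here for illustration only; check
# include/linux/swap.h in the tree at hand for the real value.
SWP_SYNCHRONOUS_IO = 1 << 12


def changed_bits(old: int, new: int) -> int:
    """Return a mask of the bits that differ between two flag words."""
    return old ^ new


def is_unaffected(old: int, new: int, mask: int) -> bool:
    """True if no bit selected by `mask` differs between `old` and `new`."""
    return (changed_bits(old, new) & mask) == 0


if __name__ == "__main__":
    print(hex(changed_bits(BEFORE, AFTER)))
    print(is_unaffected(BEFORE, AFTER, SWP_SYNCHRONOUS_IO))
```

Under that assumption, only the high bits differ between the two reported values, which is consistent with the commit's claim that the SWP_SYNCHRONOUS_IO test reads the same result either way.
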
Rafael Aquini fe6b91357e zswap: memcontrol: implement zswap writeback disabling
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 501a06fe8e4c185bbda371b8cedbdf1b23a633d8
Author: Nhat Pham <nphamcs@gmail.com>
Date:   Thu Dec 7 11:24:06 2023 -0800

    zswap: memcontrol: implement zswap writeback disabling

    During our experiment with zswap, we sometimes observe swap IOs due to
    occasional zswap store failures and writebacks-to-swap.  These swapping
    IOs prevent many users who cannot tolerate swapping from adopting zswap to
    save memory and improve performance where possible.

    This patch adds the option to disable this behavior entirely: do not
    write back to the backing swap device when a zswap store attempt fails, and
    do not write pages in the zswap pool back to the backing swap device (both
    when the pool is full, and when the new zswap shrinker is called).

    This new behavior can be opted in or out of on a per-cgroup basis via a new
    cgroup file.  By default, writeback to the swap device is enabled, which is
    the previous behavior.  Initially, writeback is enabled for the root
    cgroup, and a newly created cgroup will inherit the current setting of its
    parent.

    Note that this is subtly different from setting memory.swap.max to 0, as
    it still allows for pages to be stored in the zswap pool (which itself
    consumes swap space in its current form).

    This patch should be applied on top of the zswap shrinker series:

    https://lore.kernel.org/linux-mm/20231130194023.4102148-1-nphamcs@gmail.com/

    as it also disables the zswap shrinker, a major source of zswap
    writebacks.

    For the most part, this feature is motivated by internal parties who
    have already established their opinions regarding swapping - the
    workloads that are highly sensitive to IO, and especially those who are
    using servers with really slow disk performance (for instance, massive
    but slow HDDs).  For these folks, it's impossible to convince them to
    even entertain zswap if swapping also comes as a packaged deal.
    Writeback disabling is quite a useful feature in these situations - in
    a mixed-workload deployment, they can disable writeback for the more
    IO-sensitive workloads, and enable writeback for other background
    workloads.

    For instance, on a server with HDD, I allocate memory and populate
    it with random values (so that zswap store will always fail), and
    specify memory.high low enough to trigger reclaim.  The time it takes
    to allocate the memory and just read through it a couple of times
    (doing silly things like computing the values' average etc.):

    zswap.writeback disabled:
    real 0m30.537s
    user 0m23.687s
    sys 0m6.637s
    0 pages swapped in
    0 pages swapped out

    zswap.writeback enabled:
    real 0m45.061s
    user 0m24.310s
    sys 0m8.892s
    712686 pages swapped in
    461093 pages swapped out

    (the last two lines are from vmstat -s).

    [nphamcs@gmail.com: add a comment about recurring zswap store failures leading to reclaim inefficiency]
      Link: https://lkml.kernel.org/r/20231221005725.3446672-1-nphamcs@gmail.com
    Link: https://lkml.kernel.org/r/20231207192406.3809579-1-nphamcs@gmail.com
    Signed-off-by: Nhat Pham <nphamcs@gmail.com>
    Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
    Acked-by: Chris Li <chrisl@kernel.org>
    Cc: Dan Streetman <ddstreet@ieee.org>
    Cc: David Heidelberg <david@ixit.cz>
    Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Seth Jennings <sjenning@redhat.com>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Cc: Zefan Li <lizefan.x@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:12 -05:00
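
As a rough illustration of the per-cgroup switch described in the zswap writeback entry above, the sketch below toggles the setting by writing to a control file inside a cgroup directory. It assumes the file is named memory.zswap.writeback and accepts "0" or "1"; the commit message only says "a new cgroup file", so the name and format should be confirmed against the cgroup v2 documentation for the kernel in use. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: control file name and "0"/"1" format; confirm against the
# cgroup v2 documentation for the kernel in use.
ZSWAP_WRITEBACK_FILE = "memory.zswap.writeback"


def set_zswap_writeback(cgroup_dir: str, enabled: bool) -> None:
    """Write the writeback setting for one cgroup directory."""
    path = Path(cgroup_dir) / ZSWAP_WRITEBACK_FILE
    path.write_text("1" if enabled else "0")


def get_zswap_writeback(cgroup_dir: str) -> bool:
    """Read the setting back; True means writeback to swap is enabled."""
    path = Path(cgroup_dir) / ZSWAP_WRITEBACK_FILE
    return path.read_text().strip() == "1"
```

On a real system the directory would be something like /sys/fs/cgroup/<group> (the group name is up to the administrator), and writing there normally requires root privileges. Per the commit message, a newly created child cgroup inherits its parent's current setting, so only the groups that should differ need an explicit write.
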
Rafael Aquini 0a546fc1e9 mm: convert swap_readpage() to swap_read_folio()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit c9bdf768dd9319d2d80a334646e2c8116af9e430
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:39 2023 +0000

    mm: convert swap_readpage() to swap_read_folio()

    All callers have a folio, so pass it in, saving two calls to
    compound_head().

    Link: https://lkml.kernel.org/r/20231213215842.671461-11-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:06 -05:00
Rafael Aquini 2f1b94a81c mm: convert swap_page_sector() to swap_folio_sector()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 3a61e6f668120ee2c7840b91891c858d575d07e2
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:38 2023 +0000

    mm: convert swap_page_sector() to swap_folio_sector()

    All callers have a folio, so pass it in.  Saves a couple of calls to
    compound_head().

    Link: https://lkml.kernel.org/r/20231213215842.671461-10-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:05 -05:00
Rafael Aquini 1c190f39c9 mm: pass a folio to swap_readpage_bdev_async()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 3c3ebd82e0d1e77df7a3906e79b42d8f0793bdd7
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:37 2023 +0000

    mm: pass a folio to swap_readpage_bdev_async()

    Make it plain that this takes the head page (which before this point
    was just an assumption, but is now enforced by the compiler).

    Link: https://lkml.kernel.org/r/20231213215842.671461-9-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:04 -05:00
Rafael Aquini 907e640e0d mm: pass a folio to swap_readpage_bdev_sync()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 2c184d821eec55f9ea3c98c67dc2b0c5ec827c87
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:36 2023 +0000

    mm: pass a folio to swap_readpage_bdev_sync()

    Make it plain that this takes the head page (which before this point
    was just an assumption, but is now enforced by the compiler).

    Link: https://lkml.kernel.org/r/20231213215842.671461-8-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:04 -05:00
Rafael Aquini bab9660698 mm: pass a folio to swap_readpage_fs()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 64a24e55e3f462836ee618be480bd1b0b018e557
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:35 2023 +0000

    mm: pass a folio to swap_readpage_fs()

    Saves a call to compound_head().

    Link: https://lkml.kernel.org/r/20231213215842.671461-7-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:03 -05:00
Rafael Aquini 7c9c39ca32 mm: pass a folio to swap_writepage_bdev_async()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit ee1b1d9b46f206ffdef5ebe4086d925a5c43805b
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:34 2023 +0000

    mm: pass a folio to swap_writepage_bdev_async()

    Saves a call to compound_head().

    Link: https://lkml.kernel.org/r/20231213215842.671461-6-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:02 -05:00
Rafael Aquini 108f1f4414 mm: pass a folio to swap_writepage_bdev_sync()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 6de62c7bc4bc3444ce63490640efae965b637fe6
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:33 2023 +0000

    mm: pass a folio to swap_writepage_bdev_sync()

    Saves a call to compound_head().

    Link: https://lkml.kernel.org/r/20231213215842.671461-5-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:02 -05:00
Rafael Aquini 5ab0000f8c mm: pass a folio to swap_writepage_fs()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit bfcd44d5f816b442feb27f59e9312ce38ac4b3cf
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:32 2023 +0000

    mm: pass a folio to swap_writepage_fs()

    Saves several calls to compound_head().

    Link: https://lkml.kernel.org/r/20231213215842.671461-4-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:01 -05:00
Rafael Aquini c52617d6b5 mm: pass a folio to __swap_writepage()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit b99b4e0d9d7f29b428bacd7a61188b2abf340c1e
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Dec 13 21:58:31 2023 +0000

    mm: pass a folio to __swap_writepage()

    Both callers now have a folio, so pass that in instead of the page.
    Removes a few hidden calls to compound_head().

    Link: https://lkml.kernel.org/r/20231213215842.671461-3-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:00 -05:00
Rafael Aquini 468870e459 mm: memcg: add THP swap out info for anonymous reclaim
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 811244a501b967b00fecb1ae906d5dc6329c91e0
Author: Xin Hao <vernhao@tencent.com>
Date:   Thu Sep 14 00:49:37 2023 +0800

    mm: memcg: add THP swap out info for anonymous reclaim

    At present, we support a per-memcg reclaim strategy; however, we do not
    know the number of transparent huge pages being reclaimed.  Transparent
    huge pages need to be split before they are reclaimed, and splitting
    them can become a performance bottleneck.  For example, when two memcgs
    (A & B) are reclaiming anonymous pages at the same time, and memcg 'A'
    is reclaiming a large number of transparent huge pages, we can better
    determine that the performance bottleneck is caused by memcg 'A'.
    Therefore, in order to better analyze such problems, add THP swap out
    info per memcg.

    [akpm@linux-foundation.org: fix swap_writepage_fs(), per Johannes]
      Link: https://lkml.kernel.org/r/20230913213343.GB48476@cmpxchg.org
    Link: https://lkml.kernel.org/r/20230913164938.16918-1-vernhao@tencent.com
    Signed-off-by: Xin Hao <vernhao@tencent.com>
    Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Muchun Song <songmuchun@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:17 -05:00
Rafael Aquini 627b169e30 mm/page_io: convert bio_associate_blkg_from_page() to take in a folio
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 98630cfdc4221e1455e13c1bd423d029c888dca6
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:51 2023 +0800

    mm/page_io: convert bio_associate_blkg_from_page() to take in a folio

    Convert bio_associate_blkg_from_page() to take in a folio. We can remove
    two implicit calls to compound_head() by taking in a folio.

    Link: https://lkml.kernel.org/r/20230721034451.16412-11-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:21 -04:00
Rafael Aquini b98b0aab3a mm/page_io: convert count_swpout_vm_event() to take in a folio
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 9b72b134eedc6fbdf7b59c9c4764a57d14b2fea7
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:50 2023 +0800

    mm/page_io: convert count_swpout_vm_event() to take in a folio

    Convert count_swpout_vm_event() to take in a folio. We can remove five
    implicit calls to compound_head() by taking in a folio.

    Link: https://lkml.kernel.org/r/20230721034451.16412-10-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:20 -04:00
Rafael Aquini a77f34bd6e mm/page_io: use a folio in swap_writepage_bdev_async()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 2675251d5037c308a03f8ad1545b4169522cb950
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:49 2023 +0800

    mm/page_io: use a folio in swap_writepage_bdev_async()

    Saves one implicit call to compound_head().

    Link: https://lkml.kernel.org/r/20230721034451.16412-9-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:20 -04:00
Rafael Aquini cb0d21dfcb mm/page_io: use a folio in swap_writepage_bdev_sync()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit f54fcaabd34b98921ec12501d0507e1fa1ae831b
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:48 2023 +0800

    mm/page_io: use a folio in swap_writepage_bdev_sync()

    Saves one implicit call to compound_head().

    Link: https://lkml.kernel.org/r/20230721034451.16412-8-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:19 -04:00
Rafael Aquini 621c1bebb0 mm/page_io: use a folio in sio_read_complete()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 6a8c068774ad7634b43bebd97182141765398835
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:47 2023 +0800

    mm/page_io: use a folio in sio_read_complete()

    Saves one implicit call to compound_head().

    Link: https://lkml.kernel.org/r/20230721034451.16412-7-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:18 -04:00
Rafael Aquini 38e6e3a019 mm/page_io: use a folio in __end_swap_bio_read()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit bc74b53f29e1025a08e97f8e507968608a567f26
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:46 2023 +0800

    mm/page_io: use a folio in __end_swap_bio_read()

    Saves one implicit call to compound_head().

    Link: https://lkml.kernel.org/r/20230721034451.16412-6-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:17 -04:00
Rafael Aquini 94e6e3fe4a mm/page_io: use a folio in __end_swap_bio_write()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit a3ed1e9b63a2703caab4fe63ddb560991a5f618c
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:45 2023 +0800

    mm/page_io: use a folio in __end_swap_bio_write()

    Saves two implicit calls to compound_head().

    Link: https://lkml.kernel.org/r/20230721034451.16412-5-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:16 -04:00
Rafael Aquini 8fb189acc1 mm/page_io: remove unneeded SetPageError()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 9962ed64bd2154863ab3b63b15a2b55e39dc7117
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:43 2023 +0800

    mm/page_io: remove unneeded SetPageError()

    Nobody checks the PageError()/folio_test_error() for the page/folio in
    __end_swap_bio_read/write() and sio_write_complete(). Therefore, we
    don't need to set the error flag. Just drop it.

    Link: https://lkml.kernel.org/r/20230721034451.16412-3-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:16 -04:00
Rafael Aquini 4fe0cc0f8c mm/page_io: remove unneeded ClearPageUptodate()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 479c33049116f2d138b4dfec328961881cc26b33
Author: ZhangPeng <zhangpeng362@huawei.com>
Date:   Fri Jul 21 11:44:42 2023 +0800

    mm/page_io: remove unneeded ClearPageUptodate()

    Patch series "Convert several functions in page_io.c to use a folio", v4.

    Convert several functions in page_io.c to use a folio, which can remove
    several implicit calls to compound_head().

    This patch (of 10):

    The VM_BUG_ON_FOLIO in swap_readpage() ensures that the page is already
    !uptodate in __end_swap_bio_read() and sio_read_complete().  Just remove
    unneeded ClearPageUptodate().

    Link: https://lkml.kernel.org/r/20230721034451.16412-1-zhangpeng362@huawei.com
    Link: https://lkml.kernel.org/r/20230721034451.16412-2-zhangpeng362@huawei.com
    Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
    Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:15 -04:00
Rafael Aquini 38d194482b swap: use __bio_add_page to add page to bio
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit cb58bf91b138c1a8b18cca9503308789e26e3522
Author: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Date:   Wed May 31 04:50:24 2023 -0700

    swap: use __bio_add_page to add page to bio

    The swap code only adds a single page to a newly created bio. So use
    __bio_add_page() to add the page which is guaranteed to succeed in this
    case.

    This brings us closer to marking bio_add_page() as __must_check.

    Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Link: https://lore.kernel.org/r/5bdafd9de806b2dab92302b30eb7a3a5f10c37d9.1685532726.git.johannes.thumshirn@wdc.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:36:18 -04:00
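
The distinction drawn in the __bio_add_page entry above can be sketched with a toy model. It is not the block layer API: Bio, bio_add_page and bio_add_page_unchecked are hypothetical stand-ins, with bio_add_page_unchecked playing the role of __bio_add_page(). The point is only that a checked add can report "no room", whereas an unchecked add is appropriate when the caller already knows there is room, as with a single page going into a freshly created bio.

```python
class Bio:
    """Toy bio: a bounded list of (page, length, offset) segments."""

    def __init__(self, max_vecs: int):
        self.max_vecs = max_vecs
        self.vecs = []


def bio_full(bio: Bio) -> bool:
    """True when no further segment fits."""
    return len(bio.vecs) >= bio.max_vecs


def bio_add_page(bio: Bio, page, length: int, offset: int) -> int:
    """Checked add: returns the bytes added, or 0 if the bio is full."""
    if bio_full(bio):
        return 0
    bio.vecs.append((page, length, offset))
    return length


def bio_add_page_unchecked(bio: Bio, page, length: int, offset: int) -> None:
    """Unchecked add: the caller guarantees there is room."""
    if bio_full(bio):
        # Models a caller bug; the toy raises instead of returning a status.
        raise RuntimeError("caller violated the 'room available' guarantee")
    bio.vecs.append((page, length, offset))
```

With the unchecked variant there is no return value to ignore, which is why moving such callers over helps when the checked function is later marked as must-check.
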
Rafael Aquini 76f43ee2b0 zswap: make zswap_load() take a folio
JIRA: https://issues.redhat.com/browse/RHEL-40684

This patch is a backport of the following upstream commit:
commit ca54f6d89d60abf3e7dea68c95dfd442eeece212
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Jul 15 05:23:43 2023 +0100

    zswap: make zswap_load() take a folio

    Only convert a few easy parts of this function to use the folio passed in;
    convert back to struct page for the majority of it.  Removes three hidden
    calls to compound_head().

    Link: https://lkml.kernel.org/r/20230715042343.434588-6-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Nhat Pham <nphamcs@gmail.com>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Cc: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2024-06-28 12:24:10 -04:00
Rafael Aquini aa7c382435 swap: remove some calls to compound_head() in swap_readpage()
JIRA: https://issues.redhat.com/browse/RHEL-40684

This patch is a backport of the following upstream commit:
commit fbcec6a3a09b309900f1ecef8954721d93555abd
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Jul 15 05:23:42 2023 +0100

    swap: remove some calls to compound_head() in swap_readpage()

    Replace six implicit calls to compound_head() with one call to
    page_folio().

    Link: https://lkml.kernel.org/r/20230715042343.434588-5-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Nhat Pham <nphamcs@gmail.com>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Cc: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2024-06-28 12:24:09 -04:00
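
The saving described in the swap_readpage() entry above, and in the other folio conversions in this log, can be pictured with a toy userspace model. Nothing here is kernel code: Page, Folio and the helpers are hypothetical stand-ins, and the counter simply records how often a page is resolved to its head. Page-based helpers repeat that lookup on every call, while a caller that resolves the folio once can test flags on it directly.

```python
class Folio:
    """Toy folio: flags live on the head, not on each page."""

    def __init__(self):
        self.locked = True
        self.uptodate = False


class Page:
    """Toy page that may be a tail page of a folio."""

    def __init__(self, folio: Folio):
        self.folio = folio


head_lookups = 0


def page_folio(page: Page) -> Folio:
    """Resolve a page to its folio, counting each lookup."""
    global head_lookups
    head_lookups += 1
    return page.folio


def page_locked(page: Page) -> bool:
    # Page-based helper: has to find the head on every call.
    return page_folio(page).locked


def read_page_based(page: Page, checks: int) -> None:
    """Run `checks` flag tests through the page-based helper."""
    for _ in range(checks):
        page_locked(page)


def read_folio_based(page: Page, checks: int) -> None:
    """Resolve the folio once, then test flags on it directly."""
    folio = page_folio(page)
    for _ in range(checks):
        _ = folio.locked


def count_lookups(reader, checks: int) -> int:
    """Reset the counter, run one reader, and report the lookups it made."""
    global head_lookups
    head_lookups = 0
    reader(Page(Folio()), checks)
    return head_lookups
```

In this model six page-based checks cost six lookups and the folio-based path costs one, mirroring the "six implicit calls replaced with one" wording of the commit.
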
Rafael Aquini 1d5ae1a6ef zswap: make zswap_store() take a folio
JIRA: https://issues.redhat.com/browse/RHEL-40684

This patch is a backport of the following upstream commit:
commit 34f4c198bfbe86612c368eb122002787acecaa93
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Jul 15 05:23:40 2023 +0100

    zswap: make zswap_store() take a folio

    Patch series "Followup folio conversions for zswap".

    With frontswap killed, it's worth converting the zswap_load() and
    zswap_store() functions to take a folio instead of a page pointer.  They
    aren't converted to support large folios, but there are a lot of
    unnecessary calls to compound_head() that are removed by these patches.

    This patch (of 4):

    Only convert a few easy parts of this function to use the folio passed in;
    convert back to struct page for the majority of it.  This does remove a
    few hidden calls to compound_head().

    Link: https://lkml.kernel.org/r/20230715042343.434588-1-willy@infradead.org
    Link: https://lkml.kernel.org/r/20230715042343.434588-3-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Nhat Pham <nphamcs@gmail.com>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Cc: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2024-06-28 12:24:07 -04:00
Rafael Aquini b2ca727888 mm: kill frontswap
JIRA: https://issues.redhat.com/browse/RHEL-40684
Conflicts:
    * mm/zswap.c: minor context differences due to missing upstream v6.6
          commit b8cf32dc6e8c ("mm: zswap: multiple zpools support")

This patch is a backport of the following upstream commit:
commit 42c06a0e8ebe95b81e5fb41c6556ff22d9255b0c
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Mon Jul 17 12:02:27 2023 -0400

    mm: kill frontswap

    The only user of frontswap is zswap, and has been for a long time.  Have
    swap call into zswap directly and remove the indirection.

    [hannes@cmpxchg.org: remove obsolete comment, per Yosry]
      Link: https://lkml.kernel.org/r/20230719142832.GA932528@cmpxchg.org
    [fengwei.yin@intel.com: don't warn if non-swapcache folio is passed to zswap_load]
      Link: https://lkml.kernel.org/r/20230810095652.3905184-1-fengwei.yin@intel.com
    Link: https://lkml.kernel.org/r/20230717160227.GA867137@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
    Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: Nhat Pham <nphamcs@gmail.com>
    Acked-by: Yosry Ahmed <yosryahmed@google.com>
    Acked-by: Christoph Hellwig <hch@lst.de>
    Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2024-06-28 12:24:06 -04:00
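
The change of shape described in the "mm: kill frontswap" entry above can be sketched as follows. This is a toy model, not the kernel interfaces: Zswap, frontswap_register_ops, swap_store_indirect and swap_store_direct are hypothetical names. The first path dispatches through a registered ops object, the second calls the only backend directly.

```python
class Zswap:
    """Toy compressed cache keyed by swap offset."""

    def __init__(self):
        self.pool = {}

    def store(self, offset: int, data: bytes) -> bool:
        self.pool[offset] = data
        return True

    def load(self, offset: int):
        return self.pool.get(offset)


# Old shape: swap only knows about a registered ops object.
_frontswap_ops = None


def frontswap_register_ops(ops) -> None:
    """Install (or clear, with None) the backend used by the indirect path."""
    global _frontswap_ops
    _frontswap_ops = ops


def swap_store_indirect(offset: int, data: bytes) -> bool:
    """Dispatch through whatever backend was registered, if any."""
    if _frontswap_ops is None:
        return False
    return _frontswap_ops.store(offset, data)


# New shape: there is exactly one backend, so call it directly.
zswap = Zswap()


def swap_store_direct(offset: int, data: bytes) -> bool:
    """Call the single backend without a registration step."""
    return zswap.store(offset, data)
```

With a single user, the registration step and the "is anything registered" check add nothing, which is the indirection the commit removes.
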
Lucas Zampieri 5739ae2afe Merge: Rebase kexec/kdump to upstream kernel v6.5
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4053

```
JIRA: https://issues.redhat.com/browse/RHEL-32199
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

This rebases kexec/kdump in the RHEL 9 kernel to v6.5 of the mainline kernel, for RHEL 9.5. The last rebase was done in RHEL 9.2 and synchronized to v6.0.

Signed-off-by: Baoquan He <bhe@redhat.com>
```

Approved-by: Vladis Dronov <vdronov@redhat.com>
Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: Lenny Szubowicz <lszubowi@redhat.com>
Approved-by: Lichen Liu <lichliu@redhat.com>
Approved-by: Tao Liu <ltao@redhat.com>
Approved-by: Pingfan Liu <piliu@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-05-27 13:52:25 +00:00
Baoquan He 78ef223d06 use less confusing names for iov_iter direction initializers
JIRA: https://issues.redhat.com/browse/RHEL-32199

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Note: The core of this patch was already backported in commit 0d33f8e1f3,
      and a later commit, a9e6d7970e, backported the cifs part. This patch
      backports the remaining parts that don't have conflicts. This
      change eases code reading and understanding and is not related to
      functionality or features, hence the conflicting parts are left to
      module developers to backport when their dependencies are met.

commit de4eda9de2d957ef2d6a8365a01e26a435e958cb
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Thu Sep 15 20:25:47 2022 -0400

    use less confusing names for iov_iter direction initializers

    READ/WRITE proved to be actively confusing - the meanings are
    "data destination, as used with read(2)" and "data source, as
    used with write(2)", but people keep interpreting those as
    "we read data from it" and "we write data to it", i.e. exactly
    the wrong way.

    Call them ITER_DEST and ITER_SOURCE - at least that is harder
    to misinterpret...

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Baoquan He <bhe@redhat.com>
2024-05-15 13:56:26 +08:00
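To make the renaming above concrete, here is a minimal userspace sketch, not kernel code. The names `ITER_DEST` and `ITER_SOURCE` come from the commit message; the values 0 and 1 are assumed to mirror the kernel's `READ` and `WRITE`, and `struct toy_iter` with its `toy_*` helpers are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Direction names from the commit; values assumed to mirror READ (0) and WRITE (1). */
#define ITER_DEST   0   /* iterator is a data destination, as used with read(2) */
#define ITER_SOURCE 1   /* iterator is a data source, as used with write(2) */

/* Hypothetical, heavily simplified stand-in for struct iov_iter. */
struct toy_iter {
	int direction;
	char *buf;
	size_t count;   /* bytes remaining */
};

static void toy_iter_init(struct toy_iter *it, int direction, void *buf, size_t count)
{
	assert(it != NULL);
	it->direction = direction;
	it->buf = buf;
	it->count = count;
}

/* Copy data INTO the iterator: only valid when it is a destination. */
static size_t toy_copy_to_iter(const void *src, size_t len, struct toy_iter *it)
{
	if (it->direction != ITER_DEST)
		return 0;
	if (len > it->count)
		len = it->count;
	memcpy(it->buf, src, len);
	it->buf += len;
	it->count -= len;
	return len;
}

/* Copy data OUT OF the iterator: only valid when it is a source. */
static size_t toy_copy_from_iter(void *dst, size_t len, struct toy_iter *it)
{
	if (it->direction != ITER_SOURCE)
		return 0;
	if (len > it->count)
		len = it->count;
	memcpy(dst, it->buf, len);
	it->buf += len;
	it->count -= len;
	return len;
}
```

In this sketch a read(2)-style path would initialize its iterator with `ITER_DEST` and fill it with `toy_copy_to_iter()`, which is the reading the old `READ` name tended to obscure.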
Aristeu Rozanski ec5aa488a7 swap: use bvec_set_page to initialize bvecs
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit 8976fa6d79d70502181fa16b5e023645c0f44ec4
Author: Christoph Hellwig <hch@lst.de>
Date:   Fri Feb 3 16:06:30 2023 +0100

    swap: use bvec_set_page to initialize bvecs

    Use the bvec_set_page helper to initialize bvecs.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20230203150634.3199647-20-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:13 -04:00
Aristeu Rozanski 39148fe173 page_io: remove buffer_head include
JIRA: https://issues.redhat.com/browse/RHEL-27740
Tested: by me

commit c10d91194d5d630a0befc7bc116aba3cfda8a226
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Thu Dec 15 21:43:56 2022 +0000

    page_io: remove buffer_head include

    page_io never uses buffer_heads to do I/O.

    Link: https://lkml.kernel.org/r/20221215214402.3522366-7-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-04-29 14:33:00 -04:00
Ming Lei 58bc3d322c block: remove ->rw_page
JIRA: https://issues.redhat.com/browse/RHEL-29262
Conflicts: small context difference for brd & zram, because we have
        ported commit 3f89ac587baa ("block/drivers: remove dead clear of random flag")

commit 3222d8c2a7f888bf38b845b125e9470b12108a4d
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jan 25 14:34:36 2023 +0100

    block: remove ->rw_page

    The ->rw_page method is a special purpose bypass of the usual bio handling
    path that is limited to synchronous single-page reads and writes, which
    causes a lot of extra code in the drivers, callers and the block layer.

    The only remaining user is the MM swap code.  Switch that swap code to
    simply submit a single-vec on-stack bio and synchronously wait on it based
    on a newly added QUEUE_FLAG_SYNCHRONOUS flag set by the drivers that
    currently implement ->rw_page instead.  While this touches one extra cache
    line and executes extra code, it simplifies the block layer and drivers
    and ensures that all features are properly supported by all drivers, e.g.
    right now ->rw_page bypassed cgroup writeback entirely.

    [akpm@linux-foundation.org: fix comment typo, per Dan]
    Link: https://lkml.kernel.org/r/20230125133436.447864-8-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2024-03-19 10:06:25 +08:00
Ming Lei ff54e751c1 mm: factor out a swap_writepage_bdev helper
JIRA: https://issues.redhat.com/browse/RHEL-29262

commit 05cda97ecb7046f4192a921741aae33b300dd628
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jan 25 14:34:35 2023 +0100

    mm: factor out a swap_writepage_bdev helper

    Split the block device case from __swap_writepage into a separate helper,
    following the abstraction for file based swap.

    Link: https://lkml.kernel.org/r/20230125133436.447864-7-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2024-03-15 21:14:58 +08:00
Ming Lei 67d23d2b08 mm: remove the __swap_writepage return value
JIRA: https://issues.redhat.com/browse/RHEL-29262

commit e3e2762bd3c5e02780618fc42f5b0049a3bedb30
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jan 25 14:34:34 2023 +0100

    mm: remove the __swap_writepage return value

    __swap_writepage always returns 0.

    Link: https://lkml.kernel.org/r/20230125133436.447864-6-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2024-03-15 21:14:58 +08:00
Ming Lei 3fb812998f mm: use an on-stack bio for synchronous swapin
JIRA: https://issues.redhat.com/browse/RHEL-29262

commit 9b4e30bd7309222f74a5198f44bd45feea024b00
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jan 25 14:34:33 2023 +0100

    mm: use an on-stack bio for synchronous swapin

    Optimize the synchronous swap in case by using an on-stack bio instead of
    allocating one using bio_alloc.

    Link: https://lkml.kernel.org/r/20230125133436.447864-5-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2024-03-15 21:14:58 +08:00
Ming Lei 177bf3100c mm: factor out a swap_readpage_bdev helper
JIRA: https://issues.redhat.com/browse/RHEL-29262

commit 14bd75f57400dba0e75eaee4dcb44ac52a46253f
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jan 25 14:34:32 2023 +0100

    mm: factor out a swap_readpage_bdev helper

    Split the block device case from swap_readpage into a separate helper,
    following the abstraction for file based swap and frontswap.

    Link: https://lkml.kernel.org/r/20230125133436.447864-4-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2024-03-15 21:14:58 +08:00
Ming Lei 3cfe604280 mm: remove the swap_readpage return value
JIRA: https://issues.redhat.com/browse/RHEL-29262

commit a8c1408f870ef5308088b02c76082136b2c514ad
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jan 25 14:34:31 2023 +0100

    mm: remove the swap_readpage return value

    swap_readpage always returns 0, and no caller checks the return value.

    [akpm@linux-foundation.org: fix void-returning swap_readpage() stub, per Keith]
    Link: https://lkml.kernel.org/r/20230125133436.447864-3-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2024-03-15 21:14:58 +08:00
Chris von Recklinghausen 5e960286bd swap: convert swap_writepage() to use a folio
JIRA: https://issues.redhat.com/browse/RHEL-1848

commit 71fa1a533d2e027a3df98fd065605bebab42d7bf
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Fri Sep 2 20:46:36 2022 +0100

    swap: convert swap_writepage() to use a folio

    Removes many calls to compound_head().

    Link: https://lkml.kernel.org/r/20220902194653.1739778-41-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-10-20 06:14:01 -04:00
Chris von Recklinghausen 996da1ae64 mm/swap: remove the end_write_func argument to __swap_writepage
JIRA: https://issues.redhat.com/browse/RHEL-1848

commit cf1e3fe4975c4bd6a6a14428700c5a2c36528a7b
Author: Christoph Hellwig <hch@lst.de>
Date:   Thu Aug 11 16:17:41 2022 +0200

    mm/swap: remove the end_write_func argument to __swap_writepage

    The argument is always set to end_swap_bio_write, so remove the argument
    and mark end_swap_bio_write static.

    Link: https://lkml.kernel.org/r/20220811141741.660214-1-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Cc: Seth Jennings <sjenning@redhat.com>
    Cc: Dan Streetman <ddstreet@ieee.org>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-10-20 06:13:26 -04:00
Nico Pache 957d3c1160 mm/page_io: count submission time as thrashing delay for delayacct
commit 3a9bb7b1879bef057a5dbff1dac1fa1411638064
Author: Yang Yang <yang.yang29@zte.com.cn>
Date:   Mon Aug 15 07:28:37 2022 +0000

    mm/page_io: count submission time as thrashing delay for delayacct

    Once upon a time, we only supported accounting thrashing of page cache.
    Then Joonsoo introduced workingset detection for anonymous pages and we
    gained the ability to account thrashing of them[1].

    Like PSI, we count submission time as thrashing delay because when the
    device is congested, or the submitting cgroup IO-throttled, submission can
    be a significant part of overall IO time.

    Without this patch, swap thrashing through frontswap or some block
    device supporting rw_page operation isn't measured correctly.

    This patch is based on "delayacct: support re-entrance detection of
    thrashing accounting".

    [1] commit aae466b005 ("mm/swap: implement workingset detection for anonymous LRU")

    Link: https://lkml.kernel.org/r/20220815072835.74876-1-yang.yang29@zte.com.cn
    Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
    Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
    Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
    Reviewed-by: wangyong <wang.yong12@zte.com.cn>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168372
Signed-off-by: Nico Pache <npache@redhat.com>
2023-06-14 15:11:00 -06:00
Chris von Recklinghausen ea466a5c0a MM: handle THP in swap_*page_fs() - count_vm_events()
Bugzilla: https://bugzilla.redhat.com/2160210

commit 6341a446a0e66355d729b663d7c8ca28ad6d1442
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:49 2022 -0700

    MM: handle THP in swap_*page_fs() - count_vm_events()

    We need to use count_swpout_vm_event() for sio_write_complete() to get
    correct counting.

    Note that THP swap in (if it ever happens) is currently accounted as 1 for each
    page, whether HUGE or normal.  This is different from swap-out accounting.

    This patch should be squashed into
        MM: handle THP in swap_*page_fs()

    Link: https://lkml.kernel.org/r/165146948934.24404.5909750610552745025@noble.neil.brown.name
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reported-by: Miaohe Lin <linmiaohe@huawei.com>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Huang Ying <ying.huang@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:01 -04:00
Chris von Recklinghausen 0b9d657d68 mm: handle THP in swap_*page_fs()
Bugzilla: https://bugzilla.redhat.com/2160210

commit a1a0dfd56f97738c1974976309bbf38bb5a21132
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:49 2022 -0700

    mm: handle THP in swap_*page_fs()

    Pages passed to swap_readpage()/swap_writepage() are not necessarily all
    the same size - there may be transparent-huge-pages involved.

    The BIO paths of swap_*page() handle this correctly, but the SWP_FS_OPS
    path does not.

    So we need to use thp_size() to find the size, not just assume PAGE_SIZE,
    and we need to track the total length of the request, not just assume it
    is "page * PAGE_SIZE".

    Link: https://lkml.kernel.org/r/165119301488.15698.9457662928942765453.stgit@noble.brown
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reported-by: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: David Howells <dhowells@redhat.com>
    Cc: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:01 -04:00
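The length computation described above can be sketched in userspace C. This is an illustration only: `thp_size()` is the kernel helper named in the commit, while `struct toy_page`, `TOY_PAGE_SIZE`, `toy_thp_size()` and `toy_request_len()` are hypothetical names, and a 4 KiB base page is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL   /* assumed base page size for the example */

/* Hypothetical stand-in for a page: order 0 is a base page, order 9 a 2 MiB THP. */
struct toy_page {
	unsigned int order;
};

/* Mirrors the idea of thp_size(): bytes covered by this (possibly huge) page. */
static size_t toy_thp_size(const struct toy_page *page)
{
	assert(page != NULL);
	return TOY_PAGE_SIZE << page->order;
}

/* Total request length: the sum of per-page sizes, not nr_pages * PAGE_SIZE. */
static size_t toy_request_len(const struct toy_page *pages, size_t nr_pages)
{
	size_t len = 0;

	for (size_t i = 0; i < nr_pages; i++)
		len += toy_thp_size(&pages[i]);
	return len;
}
```

With one order-9 page between two base pages, the summed length is 4096 + 2097152 + 4096 bytes, whereas multiplying the page count by the base page size would report only 12288.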
Chris von Recklinghausen ae948f1a01 mm: submit multipage write for SWP_FS_OPS swap-space
Bugzilla: https://bugzilla.redhat.com/2160210

commit 2282679fb20bf036a714ed49fadd0230c278a203
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:49 2022 -0700

    mm: submit multipage write for SWP_FS_OPS swap-space

    swap_writepage() is given one page at a time, but may be called repeatedly
    in succession.

    For block-device swap-space, the blk_plug functionality allows the multiple
    pages to be combined together at lower layers.  That cannot be used for
    SWP_FS_OPS as blk_plug may not exist - it is only active when
    CONFIG_BLOCK=y.  Consequently all swap writes over NFS are single page
    writes.

    With this patch we pass a pointer-to-pointer via the wbc.  swap_writepage
    can store state between calls - much like the pointer passed explicitly to
    swap_readpage.  After calling swap_writepage() some number of times, the
    state will be passed to swap_write_unplug() which can submit the combined
    request.

    Link: https://lkml.kernel.org/r/164859778128.29473.5191868522654408537.stgit@noble.brown
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Tested-by: David Howells <dhowells@redhat.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:00 -04:00
Chris von Recklinghausen 9ecb054eeb mm: submit multipage reads for SWP_FS_OPS swap-space
Bugzilla: https://bugzilla.redhat.com/2160210

commit 5169b844b7dd5934cd4f22ab66de0cc669abf0b0
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:49 2022 -0700

    mm: submit multipage reads for SWP_FS_OPS swap-space

    swap_readpage() is given one page at a time, but may be called repeatedly
    in succession.

    For block-device swap-space, the blk_plug functionality allows the
    multiple pages to be combined together at lower layers.  That cannot be
    used for SWP_FS_OPS as blk_plug may not exist - it is only active when
    CONFIG_BLOCK=y.  Consequently all swap reads over NFS are single page
    reads.

    With this patch we pass in a pointer-to-pointer where swap_readpage can
    store state between calls - much like the effect of blk_plug.  After
    calling swap_readpage() some number of times, the state will be passed to
    swap_read_unplug() which can submit the combined request.

    Link: https://lkml.kernel.org/r/164859778127.29473.14059420492644907783.stgit@noble.brown
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Tested-by: David Howells <dhowells@redhat.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:00 -04:00
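The pointer-to-pointer batching described above can be sketched as a small userspace model. It is not the kernel implementation: `struct toy_plug`, `toy_readpage()`, `toy_read_unplug()`, the contiguity rule and the `TOY_BATCH_MAX` cap are assumptions made for illustration, and "submitting" a request is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BATCH_MAX 4   /* hypothetical cap on pages per combined request */

/* Hypothetical per-batch state, playing the role of the plug. */
struct toy_plug {
	long start;   /* offset of the first page in the batch */
	int nr;       /* pages queued so far */
};

static int toy_requests_submitted;   /* counts combined requests, for the demo */

/* Submit whatever is queued as one request and release the state. */
static void toy_read_unplug(struct toy_plug *plug)
{
	if (!plug)
		return;
	toy_requests_submitted++;
	free(plug);
}

/* Queue one page; *plugp carries the batch state between calls. */
static void toy_readpage(long offset, struct toy_plug **plugp)
{
	struct toy_plug *plug = *plugp;

	/* Flush if the page does not extend the batch or the batch is full. */
	if (plug && (plug->start + plug->nr != offset || plug->nr == TOY_BATCH_MAX)) {
		toy_read_unplug(plug);
		plug = NULL;
	}
	if (!plug) {
		plug = malloc(sizeof(*plug));
		assert(plug != NULL);
		plug->start = offset;
		plug->nr = 0;
	}
	plug->nr++;
	*plugp = plug;
}
```

A caller starts with a NULL plug pointer, calls `toy_readpage()` for each page, and finishes with `toy_read_unplug()` so the last partial batch is also submitted.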
Chris von Recklinghausen 6757ac11cf mm: perform async writes to SWP_FS_OPS swap-space using ->swap_rw
Bugzilla: https://bugzilla.redhat.com/2160210

commit 7eadabc05d45ecedc0e8906d1db46bc8cfeb02af
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:48 2022 -0700

    mm: perform async writes to SWP_FS_OPS swap-space using ->swap_rw

    This patch switches swap-out to SWP_FS_OPS swap-spaces to use ->swap_rw
    and makes the writes asynchronous, like they are for other swap spaces.

    To make it async we need to allocate the kiocb struct from a mempool.
    This may block, but won't block as long as waiting for the write to
    complete.  At most it will wait for some previous swap IO to complete.

    Link: https://lkml.kernel.org/r/164859778126.29473.12399585304843922231.stgit@noble.brown
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Tested-by: David Howells <dhowells@redhat.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:00 -04:00
Chris von Recklinghausen 2fd69ebf77 mm: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space
Conflicts: mm/page_io.c - We already have
	0f312591d656 ("mm: Convert swap_readpage to call read_folio instead of readpage")
	so this causes a merge conflict. Just replace the chunk with
	"ret = swap_readpage_fs(page);"

Bugzilla: https://bugzilla.redhat.com/2160210

commit e1209d3a7a67c281260ba9989621060ba7328b8c
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:48 2022 -0700

    mm: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space

    swap currently uses ->readpage to read swap pages.  This can only request
    one page at a time from the filesystem, which is not the most efficient.

    swap uses ->direct_IO for writes, which, while adequate, is an
    inappropriate over-loading.  ->direct_IO may need to handle allocating
    space for holes or other details that are not relevant for swap.

    So this patch introduces a new address_space operation: ->swap_rw.  In
    this patch it is used for reads, and a subsequent patch will switch writes
    to use it.

    No filesystem yet supports ->swap_rw, but that is not a problem because
    no filesystem actually works with filesystem-based swap.
    Only two filesystems set SWP_FS_OPS:
    - cifs sets the flag, but ->direct_IO always fails so swap cannot work.
    - nfs sets the flag, but ->direct_IO calls generic_write_checks()
      which has failed on swap files for several releases.

    To ensure that a NULL ->swap_rw isn't called, ->swap_activate() for both
    NFS and cifs is changed to fail if ->swap_rw is not set.  This can be
    removed if/when the function is added.

    Future patches will restore swap-over-NFS functionality.

    To submit an async read with ->swap_rw() we need to allocate a structure
    to hold the kiocb and other details.  swap_readpage() cannot handle
    transient failure, so we create a mempool to provide the structures.

    Link: https://lkml.kernel.org/r/164859778125.29473.13430559328221330589.stgit@noble.brown
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Tested-by: David Howells <dhowells@redhat.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:00 -04:00
Chris von Recklinghausen 8e4de79ff9 mm: drop swap_dirty_folio
Bugzilla: https://bugzilla.redhat.com/2160210

commit 4c4a763406ef903b78334bd2ccea168d2f7a741a
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:47 2022 -0700

    mm: drop swap_dirty_folio

    folios that are written to swap are owned by the MM subsystem - not any
    filesystem.

    When such a folio is passed to a filesystem to be written out to a
    swap-file, the filesystem handles the data, but the folio itself does not
    belong to the filesystem.  So calling the filesystem's ->dirty_folio()
    address_space operation makes no sense.  This is for folios in the given
    address space, and a folio to be written to swap does not exist in the
    given address space.

    So drop swap_dirty_folio() which calls the address-space's
    ->dirty_folio(), and always use noop_dirty_folio(), which is appropriate
    for folios being swapped out.

    Link: https://lkml.kernel.org/r/164859778123.29473.6900942583784889976.stgit@noble.brown
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Tested-by: David Howells <dhowells@redhat.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:00 -04:00
Chris von Recklinghausen f942ace7a2 mm: create new mm/swap.h header file
Bugzilla: https://bugzilla.redhat.com/2160210

commit 014bb1de4fc17d54907d54418126a9a9736f4aff
Author: NeilBrown <neilb@suse.de>
Date:   Mon May 9 18:20:47 2022 -0700

    mm: create new mm/swap.h header file

    Patch series "MM changes to improve swap-over-NFS support".

    Assorted improvements for swap-via-filesystem.

    This is a resend of these patches, rebased on current HEAD.  The only
    substantial change is that swap_dirty_folio has replaced
    swap_set_page_dirty.

    Currently swap-via-fs (SWP_FS_OPS) doesn't work for any filesystem.  It
    has previously worked for NFS but that broke a few releases back.  This
    series changes to use a new ->swap_rw rather than ->readpage and
    ->direct_IO.  It also makes other improvements.

    There is a companion series already in linux-next which fixes various
    issues with NFS.  Once both series land, a final patch is needed which
    changes NFS over to use ->swap_rw.

    This patch (of 10):

    Many functions declared in include/linux/swap.h are only used within mm/

    Create a new "mm/swap.h" and move some of these declarations there.
    Remove the redundant 'extern' from the function declarations.

    [akpm@linux-foundation.org: mm/memory-failure.c needs mm/swap.h]
    Link: https://lkml.kernel.org/r/164859751830.29473.5309689752169286816.stgit@noble.brown
    Link: https://lkml.kernel.org/r/164859778120.29473.11725907882296224053.stgit@noble.brown
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Tested-by: David Howells <dhowells@redhat.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:00 -04:00
Chris von Recklinghausen 842856c4cc mm: Convert swap_readpage to call read_folio instead of readpage
Bugzilla: https://bugzilla.redhat.com/2160210

commit 0f312591d656c1d81bf2cf2a5642af478397a5dc
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Fri Apr 29 11:51:22 2022 -0400

    mm: Convert swap_readpage to call read_folio instead of readpage

    This commit is split out so it can be dropped when resolving
    conflicts with Neil Brown's series to stop calling ->readpage in
    the swap code.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:18:59 -04:00
Chris von Recklinghausen 8075102dcc mm: fix unexpected zeroed page mapping with zram swap
Bugzilla: https://bugzilla.redhat.com/2120352

commit e914d8f00391520ecc4495dd0ca0124538ab7119
Author: Minchan Kim <minchan@kernel.org>
Date:   Thu Apr 14 19:13:46 2022 -0700

    mm: fix unexpected zeroed page mapping with zram swap

    With two processes sharing memory via CLONE_VM, a user process can be
    corrupted by unexpectedly seeing a zeroed page.

          CPU A                        CPU B

      do_swap_page                do_swap_page
      SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
      swap_readpage valid data
        swap_slot_free_notify
          delete zram entry
                                  swap_readpage zeroed(invalid) data
                                  pte_lock
                                  map the *zero data* to userspace
                                  pte_unlock
      pte_lock
      if (!pte_same)
        goto out_nomap;
      pte_unlock
      return and next refault will
      read zeroed data

    swap_slot_free_notify is bogus in the CLONE_VM case since the refcount
    of the swap slot is not increased at copy_mm, so it cannot tell whether
    it is safe to discard data from the backing device.  In that case, the
    only lock it could rely on to synchronize swap slot freeing is the page
    table lock.  Thus, this patch gets rid of the swap_slot_free_notify
    function.  With this patch, CPU A will see correct data.

          CPU A                        CPU B

      do_swap_page                do_swap_page
      SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
                                  swap_readpage original data
                                  pte_lock
                                  map the original data
                                  swap_free
                                    swap_range_free
                                      bd_disk->fops->swap_slot_free_notify
      swap_readpage read zeroed data
                                  pte_unlock
      pte_lock
      if (!pte_same)
        goto out_nomap;
      pte_unlock
      return
      on next refault will see mapped data by CPU B

    The concern with this patch is increased memory consumption, since it
    could keep wasted memory in compressed form in zram as well as in
    uncompressed form in the address space.  However, most zram use cases
    have no readahead and do_swap_page is followed by swap_free, so the
    compressed form will be freed from zram quickly.

    Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com
    Fixes: 0bcac06f27 ("mm, swap: skip swapcache for swapin of synchronous device")
    Reported-by: Ivan Babrou <ivan@cloudflare.com>
    Tested-by: Ivan Babrou <ivan@cloudflare.com>
    Signed-off-by: Minchan Kim <minchan@kernel.org>
    Cc: Nitin Gupta <ngupta@vflare.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: <stable@vger.kernel.org>    [4.14+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:28:06 -04:00
Chris von Recklinghausen 416b14e6c8 mm: page_io: fix psi memory pressure error on cold swapins
Bugzilla: https://bugzilla.redhat.com/2120352

commit d8c47cc7bf602ef73384a00869a70148146c1191
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Tue Mar 22 14:46:30 2022 -0700

    mm: page_io: fix psi memory pressure error on cold swapins

    Once upon a time, all swapins counted toward memory pressure[1].  Then
    Joonsoo introduced workingset detection for anonymous pages and we gained
    the ability to distinguish hot from cold swapins[2][3].  But we failed to
    update swap_readpage() accordingly, and now we account partial memory
    pressure in the swapin path of cold memory.

    Not for all situations - which adds more inconsistency: paths using the
    conventional submit_bio() and lock_page() route will not see much pressure
    - unless storage itself is heavily congested and the bio submissions
    stall.  ZRAM and ZSWAP do most of the work directly from swap_readpage()
    and will see all swapins reflected as pressure.

    IOW, a workload doing cold swapins could see little to no pressure
    reported with on-disk swap, but potentially high pressure with a zram or
    zswap backend.  That confuses any psi-based health monitoring, load
    shedding, proactive reclaim, or userspace OOM killing schemes that might
    be in place for the workload.

    Restore consistency by making all swapin stall accounting conditional on
    the page actually being part of the workingset.

    [1] commit 937790699b ("mm/page_io.c: annotate refault stalls from swap_readpage")
    [2] commit aae466b005 ("mm/swap: implement workingset detection for anonymous LRU")
    [3] commit cad8320b4b ("mm/swap: don't SetPageWorkingset unconditionally during swapin")

    Link: https://lkml.kernel.org/r/20220214214921.419687-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Reported-by: CGEL <cgel.zte@gmail.com>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Yu Zhao <yuzhao@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:54 -04:00
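The rule stated above, that swapin stall accounting is made conditional on the page being part of the workingset, can be modeled with a small userspace sketch. The names here (`struct toy_page`, `toy_memstall_enter()`, `toy_memstall_leave()`, `toy_swap_readpage()`) are hypothetical stand-ins, and the counters merely record what a pressure-accounting layer would have seen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with its workingset flag. */
struct toy_page {
	bool workingset;
};

static int toy_memstall_depth;   /* currently open stall sections */
static int toy_memstall_count;   /* stall sections ever entered */

static void toy_memstall_enter(void)
{
	toy_memstall_depth++;
	toy_memstall_count++;
}

static void toy_memstall_leave(void)
{
	assert(toy_memstall_depth > 0);
	toy_memstall_depth--;
}

/* Account the read as a memory stall only for workingset (hot) pages. */
static void toy_swap_readpage(const struct toy_page *page)
{
	/* Sample the flag once so enter and leave always pair up. */
	bool workingset = page->workingset;

	if (workingset)
		toy_memstall_enter();

	/* ... the actual read from the swap backend would happen here ... */

	if (workingset)
		toy_memstall_leave();
}
```

In this model a cold swapin leaves both counters untouched regardless of backend, which is the consistency the commit message asks for.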