Commit Graph

191 Commits

Rafael Aquini e010e31c9b mm: Fix missing folio invalidation calls during truncation
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 0aa2e1b2fb7a75aa4b5b4347055ccfea6f091769
Author: David Howells <dhowells@redhat.com>
Date:   Fri Aug 23 21:08:09 2024 +0100

    mm: Fix missing folio invalidation calls during truncation

    When AS_RELEASE_ALWAYS is set on a mapping, the ->release_folio() and
    ->invalidate_folio() calls should be invoked even if PG_private and
    PG_private_2 aren't set.  This is used by netfslib to keep track of the
    point above which reads can be skipped in favour of just zeroing pagecache
    locally.

    There are a couple of places in truncation in which invalidation is only
    called when folio_has_private() is true.  Fix these to check
    folio_needs_release() instead.

    Without this, the generic/075 and generic/112 xfstests (both fsx-based
    tests) fail with minimum folio size patches applied[1].

    Fixes: b4fa966f03b7 ("mm, netfs, fscache: stop read optimisation when folio removed from pagecache")
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/20240815090849.972355-1-kernel@pankajraghav.com/ [1]
    Link: https://lore.kernel.org/r/20240823200819.532106-2-dhowells@redhat.com
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    cc: Pankaj Raghav <p.raghav@samsung.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org
    cc: netfs@lists.linux.dev
    cc: linux-mm@kvack.org
    cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:34 -05:00
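
A minimal sketch of the folio_needs_release() helper this fix switches to
(the upstream definition lives in mm/internal.h; shown here for reference):

    static inline bool folio_needs_release(struct folio *folio)
    {
            struct address_space *mapping = folio_mapping(folio);

            /* release even without PG_private when AS_RELEASE_ALWAYS is set */
            return folio_has_private(folio) ||
                    (mapping && mapping_release_always(mapping));
    }
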
Rafael Aquini 288fab6492 mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * virt/kvm/guest_memfd.c: difference in the hunk due to RHEL missing upstream
    commit 1d23040caa8b ("KVM: guest_memfd: Use AS_INACCESSIBLE when creating
    guest_memfd inode") which would end up being reverted with this follow-up fix.

This patch is a backport of the following upstream commit:
commit 27e6a24a4cf3d25421c0f6ebb7c39f45fc14d20f
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Jul 11 13:56:54 2024 -0400

    mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE

    The flags AS_UNMOVABLE and AS_INACCESSIBLE were both added just for guest_memfd;
    AS_UNMOVABLE is already in existing versions of Linux, while AS_INACCESSIBLE was
    acked for inclusion in 6.11.

    But really, they are the same thing: only guest_memfd uses them, at least for
    now, and guest_memfd pages are unmovable because they should not be
    accessed by the CPU.

    So merge them into one; use the AS_INACCESSIBLE name which is more comprehensive.
    At the same time, this fixes an embarrassing bug where AS_INACCESSIBLE was used
    as a bit mask, despite it being just a bit index.

    The bug was mostly benign, because AS_INACCESSIBLE's bit representation (1010)
    corresponded to setting AS_UNEVICTABLE (which is already set) and AS_ENOSPC
    (except no async writes can happen on the guest_memfd).  So the AS_INACCESSIBLE
    flag simply had no effect.

    Fixes: 1d23040caa8b ("KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode")
    Fixes: c72ceafbd12c ("mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory")
    Cc: linux-mm@kvack.org
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: David Hildenbrand <david@redhat.com>
    Tested-by: Michael Roth <michael.roth@amd.com>
    Reviewed-by: Michael Roth <michael.roth@amd.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:25 -05:00
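
A short illustration of the bit-index vs. bit-mask mix-up described above
(flag values are assumptions taken from the "1010" note in the message; the
real indices live in enum mapping_flags in include/linux/pagemap.h):

    /* address_space flags are bit indices, not masks:
     * AS_ENOSPC == 1, AS_UNEVICTABLE == 3, AS_INACCESSIBLE == 10 (0b1010) */

    /* buggy: index used as a mask, which really sets bits 1 and 3 */
    mapping->flags |= AS_INACCESSIBLE;

    /* correct: set_bit() converts the bit index into a mask */
    set_bit(AS_INACCESSIBLE, &mapping->flags);
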
Rafael Aquini 9af1ce9e46 mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * include/linux/pagemap.h: context differences due to RHEL out-of-order
    backports of commits 762321dab9a7 ("filemap: add a per-mapping stable
    writes flag"), and 0003e2a41468 ("mm: Add AS_UNMOVABLE to mark mapping
    as completely unmovable"), causing the latter to end up missing the
    conflict resolution with the former and its fix through merge commit
    136292522e43 (which we apply here).

This patch is a backport of the following upstream commit:
commit c72ceafbd12cf95e088681ae5e535ef1a78bf0ed
Author: Michael Roth <michael.roth@amd.com>
Date:   Fri Mar 29 16:24:42 2024 -0500

    mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory

    filemap users like guest_memfd may use page cache pages to
    allocate/manage memory that is only intended to be accessed by guests
    via hardware protections like encryption. Writes to memory of this sort
    in common paths like truncation may cause unexpected behavior such as
    writing garbage instead of zeros when attempting to zero pages, or
    worse, triggering hardware protections that are considered fatal as far
    as the kernel is concerned.

    Introduce a new address_space flag, AS_INACCESSIBLE, and use this
    initially to prevent zeroing of pages during truncation, with the
    understanding that it is up to the owner of the mapping to handle this
    specially if needed.

    This is admittedly a rather blunt solution, but it seems there are
    no other places that need to take the flag into account in order to
    keep its promise.

    Link: https://lore.kernel.org/lkml/ZR9LYhpxTaTk6PJX@google.com/
    Cc: Matthew Wilcox <willy@infradead.org>
    Suggested-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Michael Roth <michael.roth@amd.com>
    Message-ID: <20240329212444.395559-5-michael.roth@amd.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:10 -05:00
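
A sketch of how the flag is consumed (helper names as they look after the
follow-up merge above; the truncation hunk is paraphrased):

    static inline void mapping_set_inaccessible(struct address_space *mapping)
    {
            /* the owner of the mapping drops the cache itself; common paths
             * such as truncation must not write to the folio contents */
            set_bit(AS_INACCESSIBLE, &mapping->flags);
    }

    static inline bool mapping_inaccessible(struct address_space *mapping)
    {
            return test_bit(AS_INACCESSIBLE, &mapping->flags);
    }

    /* in truncate_inode_partial_folio(), roughly: */
    if (!mapping_inaccessible(folio->mapping))
            folio_zero_range(folio, offset, length);
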
Rafael Aquini d2f2fe5ae8 mm: remove invalidate_inode_page()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 2033c98cce666b0d125ae956613ab5111bb8d202
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Nov 8 18:28:09 2023 +0000

    mm: remove invalidate_inode_page()

    All callers are now converted to call mapping_evict_folio().

    Link: https://lkml.kernel.org/r/20231108182809.602073-7-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:38 -05:00
Rafael Aquini 3c11b7e193 mm: make mapping_evict_folio() the preferred way to evict clean folios
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 1e12cbb9f69541181afab6b1ff358b4f1dd3e253
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Nov 8 18:28:04 2023 +0000

    mm: make mapping_evict_folio() the preferred way to evict clean folios

    Patch series "Fix fault handler's handling of poisoned tail pages".

    Since introducing the ability to have large folios in the page cache, it's
    been possible to have a hwpoisoned tail page returned from the fault
    handler.  We handle this situation poorly, failing to remove the affected
    page from use.

    This isn't a minimal patch to fix it, it's a full conversion of all the
    code surrounding it.

    This patch (of 6):

    invalidate_inode_page() does very little beyond calling
    mapping_evict_folio().  Move the check for mapping being NULL into
    mapping_evict_folio() and make it available to the rest of the MM for use
    in the next few patches.

    Link: https://lkml.kernel.org/r/20231108182809.602073-1-willy@infradead.org
    Link: https://lkml.kernel.org/r/20231108182809.602073-2-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:34 -05:00
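
A condensed sketch of mapping_evict_folio() as described, with the NULL
mapping check folded in (modeled on the upstream mm/truncate.c shape):

    long mapping_evict_folio(struct address_space *mapping, struct folio *folio)
    {
            /* the folio may have been truncated before it was locked */
            if (!mapping)
                    return 0;
            if (folio_test_dirty(folio) || folio_test_writeback(folio))
                    return 0;
            /* the refcount is elevated if any page in the folio is mapped */
            if (folio_ref_count(folio) >
                            folio_nr_pages(folio) + folio_has_private(folio) + 1)
                    return 0;
            if (!filemap_release_folio(folio, 0))
                    return 0;

            return remove_mapping(mapping, folio);
    }
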
Rafael Aquini 5d6754d7f7 mm: invalidation check mapping before folio_contains
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit aa5b9178c01905d7691512b366cf2886dfe2680c
Author: Hugh Dickins <hughd@google.com>
Date:   Tue Aug 8 21:36:12 2023 -0700

    mm: invalidation check mapping before folio_contains

    Enabling tmpfs "direct IO" exposes it to invalidate_inode_pages2_range(),
    which when swapping can hit the VM_BUG_ON_FOLIO(!folio_contains()): the
    folio has been moved from page cache to swap cache (with folio->mapping
    reset to NULL), but the folio_index() embedded in folio_contains() sees
    swapcache, and so returns the swapcache_index() - whereas folio->index
    would be the right one to check against the index from mapping's xarray.

    There are different ways to fix this, but my preference is just to order
    the checks in invalidate_inode_pages2_range() the same way that they are
    in __filemap_get_folio() and find_lock_entries() and filemap_fault():
    check folio->mapping before folio_contains().

    Signed-off-by: Hugh Dickins <hughd@google.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Message-Id: <f0b31772-78d7-f198-6482-9f25aab8c13f@google.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:21:35 -04:00
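
The reordering boils down to checking folio->mapping first; a paraphrased
fragment of the invalidate_inode_pages2_range() loop after the fix:

    folio_lock(folio);
    if (unlikely(folio->mapping != mapping)) {
            /* truncated, or moved to the swap cache: nothing to do here */
            folio_unlock(folio);
            continue;
    }
    /* safe now: folio->index is the pagecache index, not a swapcache index */
    VM_BUG_ON_FOLIO(!folio_contains(folio, indices[i]), folio);
    folio_wait_writeback(folio);
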
Rafael Aquini 89b7c01962 mm: increase usage of folio_next_index() helper
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 87b11f862254396a93636f0998377ac3f6648f5f
Author: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Date:   Tue Jun 27 10:43:49 2023 -0700

    mm: increase usage of folio_next_index() helper

    Simplify code pattern of 'folio->index + folio_nr_pages(folio)' by using
    the existing helper folio_next_index().

    Link: https://lkml.kernel.org/r/20230627174349.491803-1-sidhartha.kumar@oracle.com
    Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Suggested-by: Christoph Hellwig <hch@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Andreas Dilger <adilger.kernel@dilger.ca>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:17:23 -04:00
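
The helper and the pattern it replaces (include/linux/pagemap.h):

    /* return the index of the folio that follows this one in the file */
    static inline pgoff_t folio_next_index(struct folio *folio)
    {
            return folio->index + folio_nr_pages(folio);
    }

    /* before */
    index = folio->index + folio_nr_pages(folio);
    /* after */
    index = folio_next_index(folio);
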
Rafael Aquini 25e4aa840e mm: remove references to pagevec
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 1fec6890bf2247ecc93f5491c2d3f33c333d5c6e
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Jun 21 17:45:56 2023 +0100

    mm: remove references to pagevec

    Most of these should just refer to the LRU cache rather than the data
    structure used to implement the LRU cache.

    Link: https://lkml.kernel.org/r/20230621164557.3510324-13-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:37:32 -04:00
Rafael Aquini d0df95d609 mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 1a0fc811f5f5addf54499826bd1b6e34e917491c
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Jun 21 17:45:55 2023 +0100

    mm: rename invalidate_mapping_pagevec to mapping_try_invalidate

    We don't use pagevecs for the LRU cache any more, and we don't know that
    the failed invalidations were due to the folio being in an LRU cache.  So
    rename it to be more accurate.

    Link: https://lkml.kernel.org/r/20230621164557.3510324-12-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:37:31 -04:00
Chris von Recklinghausen c0446613df mm: return an ERR_PTR from __filemap_get_folio
Conflicts:
	fs/nilfs2/page.c - We already have
		f6e0e1734424 ("nilfs2: Convert nilfs_copy_back_pages() to use filemap_get_folios()")
		so use folios instead of pages
	fs/smb/client/cifsfs.c - The backport of
		7b2404a886f8 ("cifs: Fix flushing, invalidation and file size with copy_file_range()")
		cited the lack of this patch as a conflict. Fix it.

JIRA: https://issues.redhat.com/browse/RHEL-27741

commit 66dabbb65d673aef40dd17bf62c042be8f6d4a4b
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Mar 7 15:34:10 2023 +0100

    mm: return an ERR_PTR from __filemap_get_folio

    Instead of returning NULL for all errors, distinguish between:

     - no entry found and not asked to allocate (-ENOENT)
     - failed to allocate memory (-ENOMEM)
     - would block (-EAGAIN)

    so that callers don't have to guess the error based on the passed in
    flags.

    Also pass the error through to the direct callers: filemap_get_folio,
    filemap_lock_folio, filemap_grab_folio and filemap_get_incore_folio.

    [hch@lst.de: fix null-pointer deref]
      Link: https://lkml.kernel.org/r/20230310070023.GA13563@lst.de
      Link: https://lkml.kernel.org/r/20230310043137.GA1624890@u2004
    Link: https://lkml.kernel.org/r/20230307143410.28031-8-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nilfs2]
    Cc: Andreas Gruenbacher <agruenba@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-04-30 07:00:24 -04:00
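
Callers switch from NULL checks to IS_ERR() checks; a sketch of the adjusted
pattern (the surrounding code is illustrative, not taken from any one caller):

    struct folio *folio;

    folio = filemap_lock_folio(mapping, index);
    if (IS_ERR(folio))
            return PTR_ERR(folio);  /* -ENOENT when nothing is cached */
    /* ... work on the locked folio ... */
    folio_unlock(folio);
    folio_put(folio);
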
Audra Mitchell fb208bc6ad filemap: find_get_entries() now updates start offset
JIRA: https://issues.redhat.com/browse/RHEL-27739

This patch is a backport of the following upstream commit:
commit 9fb6beea79c6e7c959adf4fb7b94cf9a6028b941
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Mon Oct 17 09:18:00 2022 -0700

    filemap: find_get_entries() now updates start offset

    Initially, find_get_entries() was being passed in the start offset as a
    value.  That left the calculation of the offset to the callers.  This led
    to complexity in the callers trying to keep track of the index.

    Now find_get_entries() takes in a pointer to the start offset and updates
    the value to be directly after the last entry found.  If no entry is
    found, the offset is not changed.  This gets rid of multiple hacky
    calculations that kept track of the start offset.

    Link: https://lkml.kernel.org/r/20221017161800.2003-3-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Audra Mitchell <audra@redhat.com>
2024-04-09 09:42:50 -04:00
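
With the new convention the caller passes a pointer and no longer recomputes
the index itself; roughly:

    pgoff_t indices[PAGEVEC_SIZE];
    struct folio_batch fbatch;
    pgoff_t index = start;

    folio_batch_init(&fbatch);
    while (find_get_entries(mapping, &index, end, &fbatch, indices)) {
            /* process the batch; 'index' now points past the last entry found */
            folio_batch_release(&fbatch);
            cond_resched();
    }
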
Audra Mitchell 765c2fd97b filemap: find_lock_entries() now updates start offset
JIRA: https://issues.redhat.com/browse/RHEL-27739
Conflicts:
    Context conflict due to out of order backport:
    9efa394ef3 ("tmpfs: fix data loss from failed fallocate")

This patch is a backport of the following upstream commit:
commit 3392ca121872dd8c33015c7703d4981c78819be3
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Mon Oct 17 09:17:59 2022 -0700

    filemap: find_lock_entries() now updates start offset

    Patch series "Rework find_get_entries() and find_lock_entries()", v3.

    Originally the callers of find_get_entries() and find_lock_entries() were
    keeping track of the start index themselves as they traverse the search
    range.

    This resulted in hacky code such as in shmem_undo_range():

                            index = folio->index + folio_nr_pages(folio) - 1;

    where the - 1 is only present to stay in the right spot after incrementing
    index later.  This sort of calculation was also being done on every folio
    despite not even using index later within that function.

    These patches change find_get_entries() and find_lock_entries() to
    calculate the new index instead of leaving it to the callers so we can
    avoid all these complications.

    This patch (of 2):

    Initially, find_lock_entries() was being passed in the start offset as a
    value.  That left the calculation of the offset to the callers.  This led
    to complexity in the callers trying to keep track of the index.

    Now find_lock_entries() takes in a pointer to the start offset and updates
    the value to be directly after the last entry found.  If no entry is
    found, the offset is not changed.  This gets rid of multiple hacky
    calculations that kept track of the start offset.

    Link: https://lkml.kernel.org/r/20221017161800.2003-1-vishal.moola@gmail.com
    Link: https://lkml.kernel.org/r/20221017161800.2003-2-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Audra Mitchell <audra@redhat.com>
2024-04-09 09:42:50 -04:00
Chris von Recklinghausen d680c6a64b folio-compat: remove lru_cache_add()
JIRA: https://issues.redhat.com/browse/RHEL-1848

commit 6e1ca48d0669b0f5efcbaa051b23cd8e651a1614
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Tue Nov 1 10:53:26 2022 -0700

    folio-compat: remove lru_cache_add()

    There are no longer any callers of lru_cache_add(), so remove it.  This
    saves 79 bytes of kernel text.  Also clean up some comments such that
    they reference the new folio_add_lru() instead.

    Link: https://lkml.kernel.org/r/20221101175326.13265-6-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Miklos Szeredi <mszeredi@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-10-20 06:15:30 -04:00
Chris von Recklinghausen a2c59e7f9d mm: add split_folio()
JIRA: https://issues.redhat.com/browse/RHEL-1848

commit d788f5b374c2ba204fed57e39acf2452acc24812
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Fri Sep 2 20:46:00 2022 +0100

    mm: add split_folio()

    This wrapper removes a need to use split_huge_page(&folio->page).  Convert
    two callers.

    Link: https://lkml.kernel.org/r/20220902194653.1739778-5-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-10-20 06:13:51 -04:00
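
The wrapper itself amounts to little more than (exact form and header
location assumed; it simply forwards to the existing huge page split):

    static inline int split_folio(struct folio *folio)
    {
            return split_huge_page(&folio->page);
    }
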
Dave Wysochanski 924daddc03 mm: merge folio_has_private()/filemap_release_folio() call pairs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2209756

Patch series "mm, netfs, fscache: Stop read optimisation when folio
removed from pagecache", v7.

This fixes an optimisation in fscache whereby we don't read from the cache
for a particular file until we know that there's data there that we don't
have in the pagecache.  The problem is that I'm no longer using PG_fscache
(aka PG_private_2) to indicate that the page is cached and so I don't get
a notification when a cached page is dropped from the pagecache.

The first patch merges some folio_has_private() and
filemap_release_folio() pairs and introduces a helper,
folio_needs_release(), to indicate if a release is required.

The second patch is the actual fix.  Following Willy's suggestions[1], it
adds an AS_RELEASE_ALWAYS flag to an address_space that will make
filemap_release_folio() always call ->release_folio(), even if
PG_private/PG_private_2 aren't set.  folio_needs_release() is altered to
add a check for this.

This patch (of 2):

Make filemap_release_folio() check folio_has_private().  Then, in most
cases, where a call to folio_has_private() is immediately followed by a
call to filemap_release_folio(), we can get rid of the test in the pair.

There are a couple of sites in mm/vmscan.c where this can't so easily be
done.  In shrink_folio_list(), there are actually three cases (something
different is done for incompletely invalidated buffers), but
filemap_release_folio() elides two of them.

In shrink_active_list(), we don't have the folio lock yet, so the
check allows us to avoid locking the page unnecessarily.

A wrapper function to check if a folio needs release is provided for those
places that still need to do it in the mm/ directory.  This will acquire
additional parts to the condition in a future patch.

After this, the only remaining caller of folio_has_private() outside of
mm/ is a check in fuse.

Link: https://lkml.kernel.org/r/20230628104852.3391651-1-dhowells@redhat.com
Link: https://lkml.kernel.org/r/20230628104852.3391651-2-dhowells@redhat.com
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steve French <sfrench@samba.org>
Cc: Shyam Prasad N <nspmangalore@gmail.com>
Cc: Rohith Surabattula <rohiths.msft@gmail.com>
Cc: Dave Wysochanski <dwysocha@redhat.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 0201ebf274a306a6ebb95e5dc2d6a0a27c737cac)
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2023-09-13 18:19:41 -04:00
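
The typical before/after shape of the merged call pairs (paraphrased):

    /* before: callers tested for private data themselves */
    if (folio_has_private(folio) &&
        !filemap_release_folio(folio, GFP_KERNEL))
            return 0;

    /* after: filemap_release_folio() returns true when there is nothing
     * to release, so the explicit folio_has_private() test goes away */
    if (!filemap_release_folio(folio, GFP_KERNEL))
            return 0;
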
Chris von Recklinghausen c991a31fce mm: Remove __delete_from_page_cache()
Bugzilla: https://bugzilla.redhat.com/2160210

commit 6ffcd825e7d0416d78fd41cd5b7856a78122cc8c
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Tue Jun 28 20:41:40 2022 -0400

    mm: Remove __delete_from_page_cache()

    This wrapper is no longer used.  Remove it and all references to it.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:19:15 -04:00
Chris von Recklinghausen 50de8bd114 fs: Remove aops->invalidatepage
Bugzilla: https://bugzilla.redhat.com/2160210

commit f50015a596fa106bf642bd85fbf6e6b52cc913d0
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Feb 9 20:21:51 2022 +0000

    fs: Remove aops->invalidatepage

    With all users migrated to ->invalidate_folio, remove the old operation.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
    Tested-by: David Howells <dhowells@redhat.com> # afs

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:18:47 -04:00
Chris von Recklinghausen ff89a3a83f fs: Turn block_invalidatepage into block_invalidate_folio
Bugzilla: https://bugzilla.redhat.com/2120352

commit 7ba13abbd31ee9265e88d7dc029c0f786e665192
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Feb 9 20:21:34 2022 +0000

    fs: Turn block_invalidatepage into block_invalidate_folio

    Remove special-casing of a NULL invalidatepage, since there is no
    more block_invalidatepage.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
    Tested-by: David Howells <dhowells@redhat.com> # afs

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:48 -04:00
Chris von Recklinghausen f053c9cc69 mm: remove cleancache
Conflicts: Drop changes to fs/btrfs/extent_io.c, fs/ntfs3/ntfs_fs.h - We don't
	build them

Bugzilla: https://bugzilla.redhat.com/2120352

commit 0a4ee518185e902758191d968600399f3bc2be31
Author: Christoph Hellwig <hch@lst.de>
Date:   Fri Jan 21 22:14:34 2022 -0800

    mm: remove cleancache

    Patch series "remove Xen tmem leftovers".

    Since the removal of the Xen tmem driver in 2019, the cleancache hooks
    are entirely unused, as are large parts of frontswap.  This series
    against linux-next (with the folio changes included) removes
    cleancaches, and cuts down frontswap to the bits actually used by zswap.

    This patch (of 13):

    The cleancache subsystem is unused since the removal of Xen tmem driver
    in commit 814bbf49dc ("xen: remove tmem driver").

    [akpm@linux-foundation.org: remove now-unreachable code]

    Link: https://lkml.kernel.org/r/20211224062246.1258487-1-hch@lst.de
    Link: https://lkml.kernel.org/r/20211224062246.1258487-2-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Juergen Gross <jgross@suse.com>
    Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Seth Jennings <sjenning@redhat.com>
    Cc: Dan Streetman <ddstreet@ieee.org>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:42 -04:00
Chris von Recklinghausen 3a3ee2bede vfs: keep inodes with page cache off the inode shrinker LRU
Conflicts:
	mm/filemap.c - We already have
		452e9e6992fe ("filemap: Add filemap_remove_folio and __filemap_remove_folio")
		so just add the spin_lock call.
	mm/truncate.c - The backport of
		51dcbdac28d4 ("mm: Convert find_lock_entries() to use a folio_batch")
		listed the lack of this patch as a conflict. Keep the
		'fbatch->nr = j;' line.
	mm/vmscan.c - We already have
		be7c07d60e13 ("mm/vmscan: Convert __remove_mapping() to take a folio")
		so change a couple of lines from 'if (!PageSwapCache(page))'
		to 'if (!folio_test_swapcache(folio))'

Bugzilla: https://bugzilla.redhat.com/2120352

commit 51b8c1fe250d1bd70c1722dc3c414f5cff2d7cca
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Mon Nov 8 18:31:24 2021 -0800

    vfs: keep inodes with page cache off the inode shrinker LRU

    Historically (pre-2.5), the inode shrinker used to reclaim only empty
    inodes and skip over those that still contained page cache.  This caused
    problems on highmem hosts: struct inode could put fill lowmem zones
    before the cache was getting reclaimed in the highmem zones.

    To address this, the inode shrinker started to strip page cache to
    facilitate reclaiming lowmem.  However, this comes with its own set of
    problems: the shrinkers may drop actively used page cache just because
    the inodes are not currently open or dirty - think working with a large
    git tree.  It further doesn't respect cgroup memory protection settings
    and can cause priority inversions between containers.

    Nowadays, the page cache also holds non-resident info for evicted cache
    pages in order to detect refaults.  We've come to rely heavily on this
    data inside reclaim for protecting the cache workingset and driving swap
    behavior.  We also use it to quantify and report workload health through
    psi.  The latter in turn is used for fleet health monitoring, as well as
    driving automated memory sizing of workloads and containers, proactive
    reclaim and memory offloading schemes.

    The consequence of dropping page cache prematurely is that we're seeing
    subtle and not-so-subtle failures in all of the above-mentioned
    scenarios, with the workload generally entering unexpected thrashing
    states while losing the ability to reliably detect it.

    To fix this on non-highmem systems at least, going back to rotating
    inodes on the LRU isn't feasible.  We've tried (commit a76cf1a474
    ("mm: don't reclaim inodes with many attached pages")) and failed
    (commit 69056ee6a8 ("Revert "mm: don't reclaim inodes with many
    attached pages"")).

    The issue is mostly that shrinker pools attract pressure based on their
    size, and when objects get skipped the shrinkers remember this as
    deferred reclaim work.  This accumulates excessive pressure on the
    remaining inodes, and we can quickly eat into heavily used ones, or
    dirty ones that require IO to reclaim, when there potentially is plenty
    of cold, clean cache around still.

    Instead, this patch keeps populated inodes off the inode LRU in the
    first place - just like an open file or dirty state would.  An otherwise
    clean and unused inode then gets queued when the last cache entry
    disappears.  This solves the problem without reintroducing the reclaim
    issues, and generally is a bit more scalable than having to wade through
    potentially hundreds of thousands of busy inodes.

    Locking is a bit tricky because the locks protecting the inode state
    (i_lock) and the inode LRU (lru_list.lock) don't nest inside the
    irq-safe page cache lock (i_pages.xa_lock).  Page cache deletions are
    serialized through i_lock, taken before the i_pages lock, to make sure
    depopulated inodes are queued reliably.  Additions may race with
    deletions, but we'll check again in the shrinker.  If additions race
    with the shrinker itself, we're protected by the i_lock: if find_inode()
    or iput() win, the shrinker will bail on the elevated i_count or
    I_REFERENCED; if the shrinker wins and goes ahead with the inode, it
    will set I_FREEING and inhibit further igets(), which will cause the
    other side to create a new instance of the inode instead.

    Link: https://lkml.kernel.org/r/20210614211904.14420-4-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Dave Chinner <david@fromorbit.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:31 -04:00
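
The resulting lock ordering on the deletion side, sketched from the upstream
shape of filemap_remove_folio() after this change:

    void filemap_remove_folio(struct folio *folio)
    {
            struct address_space *mapping = folio->mapping;

            BUG_ON(!folio_test_locked(folio));
            /* i_lock nests outside the irq-safe i_pages lock, so an inode
             * whose cache just became empty is queued on the LRU reliably */
            spin_lock(&mapping->host->i_lock);
            xa_lock_irq(&mapping->i_pages);
            __filemap_remove_folio(folio, NULL);
            xa_unlock_irq(&mapping->i_pages);
            if (mapping_shrinkable(mapping))
                    inode_add_lru(mapping->host);
            spin_unlock(&mapping->host->i_lock);

            filemap_free_folio(mapping, folio);
    }
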
Aristeu Rozanski 528faa0405 mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit c56109dd35c9204cd6c49d2116ef36e5044ef867
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sun Feb 13 17:22:10 2022 -0500

    mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages()

    We can save a function call by combining these two functions, which
    are identical except for the return value.  Also move the prototype
    to mm/internal.h.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:17 -04:00
Aristeu Rozanski b86ef54bdd mm: Turn deactivate_file_page() into deactivate_file_folio()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 261b6840ed10419ac2f554e515592d59dd5c82cf
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sun Feb 13 16:40:24 2022 -0500

    mm: Turn deactivate_file_page() into deactivate_file_folio()

    This function has one caller which already has a reference to the
    page, so we don't need to use get_page_unless_zero().  Also move the
    prototype to mm/internal.h.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:17 -04:00
Aristeu Rozanski fc7fcd9b05 mm/truncate: Convert __invalidate_mapping_pages() to use a folio
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit b4545f46533b7e69cb20e05c9fe987be76e1a3da
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sun Feb 13 16:38:07 2022 -0500

    mm/truncate: Convert __invalidate_mapping_pages() to use a folio

    Now we can call mapping_evict_folio() instead of invalidate_inode_page()
    and save a few calls to compound_head().

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:17 -04:00
Aristeu Rozanski 5a8509634f mm/truncate: Split invalidate_inode_page() into mapping_evict_folio()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit d6c75dc22c755c567838f12f12a16f2a323ebd4e
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sun Feb 13 15:22:28 2022 -0500

    mm/truncate: Split invalidate_inode_page() into mapping_evict_folio()

    Some of the callers already have the address_space and can avoid calling
    folio_mapping() and checking if the folio was already truncated.  Also
    add kernel-doc and fix the return type (in case we ever support folios
    larger than 4TB).

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:17 -04:00
Aristeu Rozanski 275becb2f3 mm: Convert remove_mapping() to take a folio
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 5100da38ef3c33d9ad8b60b29c2b671249bf7e1d
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Feb 12 22:48:55 2022 -0500

    mm: Convert remove_mapping() to take a folio

    Add kernel-doc and return the number of pages removed in order to
    get the statistics right in __invalidate_mapping_pages().

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:17 -04:00
Aristeu Rozanski 88d103c7f7 mm/truncate: Replace page_mapped() call in invalidate_inode_page()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit e41c81d0d30e1a6ebf408feaf561f80cac4457dc
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Feb 12 17:43:16 2022 -0500

    mm/truncate: Replace page_mapped() call in invalidate_inode_page()

    folio_mapped() is expensive because it has to check each page's mapcount
    field.  A cheaper check is whether there are any extra references to
    the page, other than the one we own, one from the page private data and
    the ones held by the page cache.

    The call to remove_mapping() will fail in any case if it cannot freeze
    the refcount, but failing here avoids cycling the i_pages spinlock.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:17 -04:00
Aristeu Rozanski 8ce712a2b5 mm/truncate: Convert invalidate_inode_page() to use a folio
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 4418481396b2caeded6d0eed11ac9052ab4c0763
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Feb 12 17:39:10 2022 -0500

    mm/truncate: Convert invalidate_inode_page() to use a folio

    This saves a number of calls to compound_head().

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:16 -04:00
Aristeu Rozanski e7d8da2114 mm/truncate: Inline invalidate_complete_page() into its one caller
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites
Conflicts: differences due to missing simplification from 43b93121056c52

commit 1b8ddbeeb9b819e62b7190115023ce858a159f5c
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Feb 12 15:27:42 2022 -0500

    mm/truncate: Inline invalidate_complete_page() into its one caller

    invalidate_inode_page() is the only caller of invalidate_complete_page()
    and inlining it reveals that the first check is unnecessary (because we
    hold the page locked, and we just retrieved the mapping from the page).
    Actually, it does make a difference, in that tail pages no longer fail
    at this check, so it's now possible to remove a tail page from a mapping.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: John Hubbard <jhubbard@nvidia.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:16 -04:00
Aristeu Rozanski 36d1439564 fs: Add aops->launder_folio
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites
Conflicts: we didn't backport is_partially_uptodate() (8ab22b9abb)

commit affa80e8c6a1df473694c2087259901872309cc4
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Feb 9 20:21:52 2022 +0000

    fs: Add aops->launder_folio

    Since the only difference between ->launder_page and ->launder_folio
    is the type of the pointer, these can safely use a union without
    affecting bisectability.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
    Tested-by: David Howells <dhowells@redhat.com> # afs

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:14 -04:00
Aristeu Rozanski 0f8e8a5ec4 fs: Add invalidate_folio() aops method
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 128d1f8241d62ab014eef6dd4ef9bb977dbeadb2
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Feb 9 20:21:32 2022 +0000

    fs: Add invalidate_folio() aops method

    This is used in preference to invalidatepage, if defined.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
    Tested-by: David Howells <dhowells@redhat.com> # afs

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:13 -04:00
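
The preference is implemented in the common invalidation helper, roughly
(both callbacks coexist during the conversion):

    void folio_invalidate(struct folio *folio, size_t offset, size_t length)
    {
            const struct address_space_operations *aops = folio->mapping->a_ops;

            if (aops->invalidate_folio)
                    aops->invalidate_folio(folio, offset, length);
            else if (aops->invalidatepage)
                    aops->invalidatepage(&folio->page, offset, length);
    }
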
Aristeu Rozanski 2858b612a7 fs: Turn do_invalidatepage() into folio_invalidate()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites
Conflicts: context due to missing 0a4ee518185e9027

commit 5ad6b2bdaaea712486145fa5a78ec24d25289071
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Feb 9 20:21:28 2022 +0000

    fs: Turn do_invalidatepage() into folio_invalidate()

    Take a folio instead of a page, fix the types of the offset & length,
    and export it to filesystems.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
    Tested-by: David Howells <dhowells@redhat.com> # afs

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:13 -04:00
Aristeu Rozanski 4a1432e31d truncate,shmem: Handle truncates that split large folios
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit b9a8a4195c7d3a51235a4fc974a46ad4e9689ffd
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed May 27 17:59:22 2020 -0400

    truncate,shmem: Handle truncates that split large folios

    Handle folio splitting in the parts of the truncation functions which
    already handle partial pages.  Factor all that code out into a new
    function called truncate_inode_partial_folio().

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:11 -04:00
Aristeu Rozanski 07d2b88c7e truncate: Convert invalidate_inode_pages2_range to folios
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit f6357c3a9d3ea5a00c5bf52845b633d649da6722
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Thu May 20 08:17:44 2021 -0400

    truncate: Convert invalidate_inode_pages2_range to folios

    If we're going to unmap a folio, we have to be sure to unmap the entire
    folio, not just the part of it which lies after the search index.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:11 -04:00
Aristeu Rozanski cc07877b40 mm: Remove pagevec_remove_exceptionals()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 1613fac9aaf840af76faa747ea428a714af98dbd
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Tue Dec 7 14:28:49 2021 -0500

    mm: Remove pagevec_remove_exceptionals()

    All of its callers now call folio_batch_remove_exceptionals().

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:11 -04:00
Aristeu Rozanski 59e51bcf24 mm: Convert find_lock_entries() to use a folio_batch
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites
Conflicts: context due to missing 51b8c1fe250d1bd70c17

commit 51dcbdac28d4dde915f78adf08bb3fac87f516e9
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Tue Dec 7 14:15:07 2021 -0500

    mm: Convert find_lock_entries() to use a folio_batch

    find_lock_entries() already only returned the head page of folios, so
    convert it to return a folio_batch instead of a pagevec.  That cascades
    through converting truncate_inode_pages_range() to
    delete_from_page_cache_batch() and page_cache_delete_batch().

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:11 -04:00
Aristeu Rozanski 433ab58b7a filemap: Return only folios from find_get_entries()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 0e499ed3d7a216706e02eeded562627d3e69dcfd
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Tue Sep 1 23:17:50 2020 -0400

    filemap: Return only folios from find_get_entries()

    The callers have all been converted to work on folios, so convert
    find_get_entries() to return a batch of folios instead of pages.
    We also now return multiple large folios in a single call.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:10 -04:00
Aristeu Rozanski 2e4f1700b5 truncate: Add invalidate_complete_folio2()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites
Conflicts: context due to missing 51b8c1fe250d1bd70c17

commit 78f426608f21c997975adb96641b7ac82d4d15b1
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Wed Jul 28 15:52:34 2021 -0400

    truncate: Add invalidate_complete_folio2()

    Convert invalidate_complete_page2() to invalidate_complete_folio2().
    Use filemap_free_folio() to free the page instead of calling ->freepage
    manually.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:10 -04:00
Aristeu Rozanski 59c269217b truncate: Convert invalidate_inode_pages2_range() to use a folio
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit fae9bc4a90176868cbbbecc693acb0ff2607818d
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Thu Dec 2 23:25:01 2021 -0500

    truncate: Convert invalidate_inode_pages2_range() to use a folio

    If we're going to unmap a folio, we have to be sure to unmap the entire
    folio, not just the part of it which lies after the search index.

    We cannot yet remove the struct page from invalidate_inode_pages2_range()
    because the page pointer in the pvec might be a shadow/dax/swap entry
    instead of actually a page.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:10 -04:00
Aristeu Rozanski 7b60f6f55b truncate: Skip known-truncated indices
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit ccbbf761d440b0d5afcbf232db37435dc38d6161
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Fri Nov 26 13:25:38 2021 -0500

    truncate: Skip known-truncated indices

    If we've truncated an entire folio, we can skip over all the indices
    covered by this folio.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:10 -04:00
Aristeu Rozanski 0a91f1bd33 truncate,shmem: Add truncate_inode_folio()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 1e84a3d997b74c33491899e31d48774f252213ab
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Thu Dec 2 16:01:55 2021 -0500

    truncate,shmem: Add truncate_inode_folio()

    Convert all callers of truncate_inode_page() to call
    truncate_inode_folio() instead, and move the declaration to mm/internal.h.
    Move the assertion that the caller is not passing in a tail page to
    generic_error_remove_page().  We can't entirely remove the struct page
    from the callers yet because the page pointer in the pvec might be a
    shadow/dax/swap entry instead of actually a page.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:10 -04:00
Aristeu Rozanski 4f77d61363 mm: Add unmap_mapping_folio()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit 3506659e18a61ae525f3b9b4f5af23b4b149d4db
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sun Nov 28 14:53:35 2021 -0500

    mm: Add unmap_mapping_folio()

    Convert both callers of unmap_mapping_page() to call unmap_mapping_folio()
    instead.  Also move zap_details from linux/mm.h to mm/memory.c

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:10 -04:00
Aristeu Rozanski 2f0cf8ca83 truncate: Add truncate_cleanup_folio()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083861
Tested: by me with multiple test suites

commit efe99bba2862aef24f1b05b786f6bf5acb076209
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Fri Nov 26 13:58:10 2021 -0500

    truncate: Add truncate_cleanup_folio()

    Convert both callers of truncate_cleanup_page() to use
    truncate_cleanup_folio() instead.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2022-07-10 10:44:09 -04:00
Rafael Aquini 5b5c28b990 fs: inode: count invalidated shadow pages in pginodesteal
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2023396

This patch is a backport of the following upstream commit:
commit 7ae12c809f6a31d3da7b96339dbefa141884c711
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Thu Sep 2 14:53:24 2021 -0700

    fs: inode: count invalidated shadow pages in pginodesteal

    pginodesteal is supposed to capture the impact that inode reclaim has on
    the page cache state.  Currently, it doesn't consider shadow pages that
    get dropped this way, even though this can have a significant impact on
    paging behavior, memory pressure calculations etc.

    To improve visibility into these effects, make sure shadow pages get
    counted when they get dropped through inode reclaim.

    This changes the return value semantics of invalidate_mapping_pages()
    slightly, but the only two users are the inode shrinker itself and a USB
    driver that logs it for debugging purposes.

    Link: https://lkml.kernel.org/r/20210614211904.14420-3-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2021-11-29 11:40:53 -05:00
Rafael Aquini 7ac6d2bbef mm: remove irqsave/restore locking from contexts with irqs enabled
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2023396

This patch is a backport of the following upstream commit:
commit 3047250972ff935b1d7a0629fa3acb04c12dcc07
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Thu Sep 2 14:53:18 2021 -0700

    mm: remove irqsave/restore locking from contexts with irqs enabled

    The page cache deletion paths all have interrupts enabled, so no need to
    use irqsafe/irqrestore locking variants.

    They used to have irqs disabled by the memcg lock added in commit
    c4843a7593 ("memcg: add per cgroup dirty page accounting"), but that has
    since been replaced by memcg taking the page lock instead, commit
    0a31bc97c8 ("mm: memcontrol: rewrite uncharge API").

    Link: https://lkml.kernel.org/r/20210614211904.14420-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2021-11-29 11:40:44 -05:00
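
The change is mechanical; the deletion paths drop the irq-saving variants in
favour of the plain irq-disabling ones, for example:

    /* before */
    xa_lock_irqsave(&mapping->i_pages, flags);
    /* ... delete the page from the xarray ... */
    xa_unlock_irqrestore(&mapping->i_pages, flags);

    /* after: these paths always run with irqs enabled */
    xa_lock_irq(&mapping->i_pages);
    /* ... delete the page from the xarray ... */
    xa_unlock_irq(&mapping->i_pages);
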
Rafael Aquini 4d9de0c3d3 mm: Protect operations adding pages to page cache with invalidate_lock
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2023396

This patch is a backport of the following upstream commit:
commit 730633f0b7f951726e87f912a6323641f674ae34
Author: Jan Kara <jack@suse.cz>
Date:   Thu Jan 28 19:19:45 2021 +0100

    mm: Protect operations adding pages to page cache with invalidate_lock

    Currently, serializing operations such as page fault, read, or readahead
    against hole punching is rather difficult. The basic race scheme is
    like:

    fallocate(FALLOC_FL_PUNCH_HOLE)                 read / fault / ..
      truncate_inode_pages_range()
                                                      <create pages in page
                                                       cache here>
      <update fs block mapping and free blocks>

    Now the problem is in this way read / page fault / readahead can
    instantiate pages in page cache with potentially stale data (if blocks
    get quickly reused). Avoiding this race is not simple - page locks do
    not work because we want to make sure there are *no* pages in given
    range. inode->i_rwsem does not work because page fault happens under
    mmap_sem which ranks below inode->i_rwsem. Also using it for reads makes
    the performance for mixed read-write workloads suffer.

    So create a new rw_semaphore in the address_space - invalidate_lock -
    that protects adding of pages to page cache for page faults / reads /
    readahead.

    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jan Kara <jack@suse.cz>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2021-11-29 11:40:22 -05:00
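
The intended usage pattern, as a sketch (helpers from the same series;
filesystem-specific details omitted):

    /* hole punching: exclude page cache fills while blocks are freed */
    filemap_invalidate_lock(inode->i_mapping);
    truncate_inode_pages_range(inode->i_mapping, start, end);
    /* ... update the fs block mapping and free the blocks ... */
    filemap_invalidate_unlock(inode->i_mapping);

    /* read / fault / readahead: taken shared around page cache instantiation */
    filemap_invalidate_lock_shared(inode->i_mapping);
    /* ... look up or create pages in the page cache ... */
    filemap_invalidate_unlock_shared(inode->i_mapping);
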
Rafael Aquini 5a88d17b6c mm: Fix comments mentioning i_mutex
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2023396

This patch is a backport of the following upstream commit:
commit 9608703e488cf7a711c42c7ccd981c32377f7b78
Author: Jan Kara <jack@suse.cz>
Date:   Mon Apr 12 15:50:21 2021 +0200

    mm: Fix comments mentioning i_mutex

    inode->i_mutex has been replaced with inode->i_rwsem long ago. Fix
    comments still mentioning i_mutex.

    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Acked-by: Hugh Dickins <hughd@google.com>
    Signed-off-by: Jan Kara <jack@suse.cz>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2021-11-29 11:40:21 -05:00
Hugh Dickins 22061a1ffa mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
There is a race between THP unmapping and truncation, when truncate sees
pmd_none() and skips the entry, after munmap's zap_huge_pmd() cleared
it, but before its page_remove_rmap() gets to decrement
compound_mapcount: generating false "BUG: Bad page cache" reports that
the page is still mapped when deleted.  This commit fixes that, but not
in the way I hoped.

The first attempt used try_to_unmap(page, TTU_SYNC|TTU_IGNORE_MLOCK)
instead of unmap_mapping_range() in truncate_cleanup_page(): it has
often been an annoyance that we usually call unmap_mapping_range() with
no pages locked, but there apply it to a single locked page.
try_to_unmap() looks more suitable for a single locked page.

However, try_to_unmap_one() contains a VM_BUG_ON_PAGE(!pvmw.pte,page):
it is used to insert THP migration entries, but not used to unmap THPs.
Copy zap_huge_pmd() and add THP handling now? Perhaps, but their TLB
needs are different, I'm too ignorant of the DAX cases, and couldn't
decide how far to go for anon+swap.  Set that aside.

The second attempt took a different tack: make no change in truncate.c,
but modify zap_huge_pmd() to insert an invalidated huge pmd instead of
clearing it initially, then pmd_clear() between page_remove_rmap() and
unlocking at the end.  Nice.  But powerpc blows that approach out of the
water, with its serialize_against_pte_lookup(), and interesting pgtable
usage.  It would need serious help to get working on powerpc (with a
minor optimization issue on s390 too).  Set that aside.

Just add an "if (page_mapped(page)) synchronize_rcu();" or other such
delay, after unmapping in truncate_cleanup_page()? Perhaps, but though
that's likely to reduce or eliminate the number of incidents, it would
give less assurance of whether we had identified the problem correctly.

This successful iteration introduces "unmap_mapping_page(page)" instead
of try_to_unmap(), and goes the usual unmap_mapping_range_tree() route,
with an addition to details.  Then zap_pmd_range() watches for this
case, and does spin_unlock(pmd_lock) if so - just like
page_vma_mapped_walk() now does in the PVMW_SYNC case.  Not pretty, but
safe.

Note that unmap_mapping_page() is doing a VM_BUG_ON(!PageLocked) to
assert its interface; but currently that's only used to make sure that
page->mapping is stable, and zap_pmd_range() doesn't care if the page is
locked or not.  Along these lines, in invalidate_inode_pages2_range()
move the initial unmap_mapping_range() out from under page lock, before
then calling unmap_mapping_page() under page lock if still mapped.

Link: https://lkml.kernel.org/r/a2a4a148-cdd8-942c-4ef8-51b77f643dbe@google.com
Fixes: fc127da085 ("truncate: handle file thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Matthew Wilcox (Oracle) 46be67b424 mm: stop accounting shadow entries
We no longer need to keep track of how many shadow entries are present in
a mapping.  This saves a few writes to the inode and memory barriers.

Link: https://lkml.kernel.org/r/20201026151849.24232-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:19 -07:00
Matthew Wilcox (Oracle) 7716506ada mm: introduce and use mapping_empty()
Patch series "Remove nrexceptional tracking", v2.

We actually use nrexceptional for very little these days.  It's a minor
pain to keep in sync with nrpages, but the pain becomes much bigger with
the THP patches because we don't know how many indices a shadow entry
occupies.  It's easier to just remove it than keep it accurate.

Also, we save 8 bytes per inode which is nothing to sneeze at; on my
laptop, it would improve shmem_inode_cache from 22 to 23 objects per
16kB, and inode_cache from 26 to 27 objects.  Combined, that saves
a megabyte of memory from a combined usage of 25MB for both caches.
Unfortunately, ext4 doesn't cross a magic boundary, so it doesn't save
any memory for ext4.

This patch (of 4):

Instead of checking the two counters (nrpages and nrexceptional), we can
just check whether i_pages is empty.

Link: https://lkml.kernel.org/r/20201026151849.24232-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20201026151849.24232-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:19 -07:00
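
The helper introduced here is a one-liner over the xarray
(include/linux/pagemap.h):

    static inline bool mapping_empty(struct address_space *mapping)
    {
            return xa_empty(&mapping->i_pages);
    }
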
Matthew Wilcox (Oracle) a656a20241 mm: remove pagevec_lookup_entries
pagevec_lookup_entries() is now just a wrapper around find_get_entries()
so remove it and convert all its callers.

Link: https://lkml.kernel.org/r/20201112212641.27837-15-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:40:59 -08:00