Centos-kernel-stream-9

Commit Graph

Author	SHA1	Message	Date
Bill O'Donnell	d7fddd5eaa	mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind JIRA: https://issues.redhat.com/browse/RHEL-12888 Conflicts: difference from upstream mm/memory-failure.c commit fa422b353d212373fb2b2857a5ea5a6fa4876f9c Author: Shiyang Ruan <ruansy.fnst@fujitsu.com> Date: Mon Oct 23 15:20:46 2023 +0800 mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind Now, if we suddenly remove a PMEM device(by calling unbind) which contains FSDAX while programs are still accessing data in this device, e.g.: ``` $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 & # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 & echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind ``` it could come into an unacceptable state: 1. device has gone but mount point still exists, and umount will fail with "target is busy" 2. programs will hang and cannot be killed 3. may crash with NULL pointer dereference To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we are going to remove the whole device, and make sure all related processes could be notified so that they could end up gracefully. This patch is inspired by Dan's "mm, dax, pmem: Introduce dev_pagemap_failure()"[1]. With the help of dax_holder and ->notify_failure() mechanism, the pmem driver is able to ask filesystem on it to unmap all files in use, and notify processes who are using those files. Call trace: trigger unbind -> unbind_store() -> ... (skip) -> devres_release_all() -> kill_dax() -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE) -> xfs_dax_notify_failure() `-> freeze_super() // freeze (kernel call) `-> do xfs rmap ` -> mf_dax_kill_procs() ` -> collect_procs_fsdax() // all associated processes ` -> unmap_and_kill() ` -> invalidate_inode_pages2_range() // drop file's cache `-> thaw_super() // thaw (both kernel & user call) Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove event. Use the exclusive freeze/thaw[2] to lock the filesystem to prevent new dax mapping from being created. Do not shutdown filesystem directly if configuration is not supported, or if failure range includes metadata area. Make sure all files and processes(not only the current progress) are handled correctly. Also drop the cache of associated files before pmem is removed. [1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@dwillia2-desk3.amr.corp.intel.com/ [2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/ Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2024-04-05 11:58:49 -05:00
Bill O'Donnell	a4a194c4b0	xfs: correct calculation for agend and blockcount JIRA: https://issues.redhat.com/browse/RHEL-12888 commit 3c90c01e49342b166e5c90ec2c85b220be15a20e Author: Shiyang Ruan <ruansy.fnst@fujitsu.com> Date: Wed Sep 13 18:29:42 2023 +0800 xfs: correct calculation for agend and blockcount The agend should be "start + length - 1", then, blockcount should be "end + 1 - start". Correct 2 calculation mistakes. Also, rename "agend" to "range_agend" because it's not the end of the AG per se; it's the end of the dead region within an AG's agblock space. Fixes: 5cf32f63b0f4 ("xfs: fix the calculation for "end" and "length"") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2024-04-05 11:58:48 -05:00
Bill O'Donnell	6e5032d556	xfs: fix the calculation for "end" and "length" JIRA: https://issues.redhat.com/browse/RHEL-12888 commit 5cf32f63b0f4c520460c1a5dd915dc4f09085f29 Author: Shiyang Ruan <ruansy.fnst@fujitsu.com> Date: Thu Jun 29 17:40:30 2023 -0700 xfs: fix the calculation for "end" and "length" The value of "end" should be "start + length - 1". Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2024-04-05 11:45:50 -05:00
Bill O'Donnell	26a4936de4	xfs: fix up for "xfs: pass perag to xfs_alloc_read_agf()" Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2218635 Conflicts: pick only the change to xfs_notify_failure.c that was included in this merge commit. commit 6614a3c3164a5df2b54abb0b3559f51041cf705b Merge: 74cae210a335 360614c01f81 ... [ XFS merge from hell as per Darrick Wong in https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ] Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2023-07-05 15:34:25 -05:00
Bill O'Donnell	3871255f2c	xfs: on memory failure, only shut down fs after scanning all mappings Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192730 commit e033f40be262c4d227f8fbde52856e1d8646872b Author: Darrick J. Wong <djwong@kernel.org> Date: Tue Oct 4 16:40:01 2022 +1100 xfs: on memory failure, only shut down fs after scanning all mappings xfs_dax_failure_fn is used to scan the filesystem during a memory failure event to look for memory mappings to revoke. Unfortunately, if it encounters an rmap record for filesystem metadata, it will shut down the filesystem and the scan immediately. This means that we don't complete the mapping revocation scan and instead leave live mappings to failed memory. Fix the function to defer the shutdown until after we've finished culling mappings. While we're at it, add the usual "xfs_" prefix to struct failure_info, and actually initialize mf_flags. Fixes: 6f643c57d57c ("xfs: implement ->notify_failure() for XFS") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2023-06-16 10:35:48 -05:00
Bill O'Donnell	be9bfce15f	xfs: fix SB_BORN check in xfs_dax_notify_failure() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192730 commit fd63612ae81159bd7e59762de478889315463ee8 Author: Dan Williams <dan.j.williams@intel.com> Date: Fri Aug 26 10:18:01 2022 -0700 xfs: fix SB_BORN check in xfs_dax_notify_failure() The SB_BORN flag is stored in the vfs superblock, not xfs_sb. Link: https://lkml.kernel.org/r/166153428094.2758201.7936572520826540019.stgit@dwillia2-xfh.jf.intel.com Fixes: 6f643c57d57c ("xfs: implement ->notify_failure() for XFS") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Cc: Jane Chu <jane.chu@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2023-06-16 10:35:47 -05:00
Bill O'Donnell	5c90d99915	xfs: quiet notify_failure EOPNOTSUPP cases Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192730 commit b14d067e850c19921cec2200bd8d179edf6a1aa6 Author: Dan Williams <dan.j.williams@intel.com> Date: Fri Aug 26 10:17:54 2022 -0700 xfs: quiet notify_failure EOPNOTSUPP cases Patch series "mm, xfs, dax: Fixes for memory_failure() handling". I failed to run the memory error injection section of the ndctl test suite on linux-next prior to the merge window and as a result some bugs were missed. While the new enabling targeted reflink enabled XFS filesystems the bugs cropped up in the surrounding cases of DAX error injection on ext4-fsdax and device-dax. One new assumption / clarification in this set is the notion that if a filesystem's ->notify_failure() handler returns -EOPNOTSUPP, then it must be the case that the fsdax usage of page->index and page->mapping are valid. I am fairly certain this is true for xfs_dax_notify_failure(), but would appreciate another set of eyes. This patch (of 4): XFS always registers dax_holder_operations regardless of whether the filesystem is capable of handling the notifications. The expectation is that if the notify_failure handler cannot run then there are no scenarios where it needs to run. In other words the expected semantic is that page->index and page->mapping are valid for memory_failure() when the conditions that cause -EOPNOTSUPP in xfs_dax_notify_failure() are present. A fallback to the generic memory_failure() path is expected so do not warn when that happens. Link: https://lkml.kernel.org/r/166153426798.2758201.15108211981034512993.stgit@dwillia2-xfh.jf.intel.com Link: https://lkml.kernel.org/r/166153427440.2758201.6709480562966161512.stgit@dwillia2-xfh.jf.intel.com Fixes: 6f643c57d57c ("xfs: implement ->notify_failure() for XFS") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Cc: Jane Chu <jane.chu@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2023-06-16 10:35:47 -05:00
Bill O'Donnell	c6108fc126	xfs: implement ->notify_failure() for XFS Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192730 Conflicts: change to xfs_perag_get() and xfs_perag_put() api from previous out of order patch from upstream fa044ae70 xfs: pass perag to xfs_read_agf required changes to xfs_notify_failure.c commit 6f643c57d57c56d4677bc05f1fca2ef3f249797c Author: Shiyang Ruan <ruansy.fnst@fujitsu.com> Date: Fri Jun 3 13:37:30 2022 +0800 xfs: implement ->notify_failure() for XFS Introduce xfs_notify_failure.c to handle failure related works, such as implement ->notify_failure(), register/unregister dax holder in xfs, and so on. If the rmap feature of XFS enabled, we can query it to find files and metadata which are associated with the corrupt data. For now all we do is kill processes with that file mapped into their address spaces, but future patches could actually do something about corrupt metadata. After that, the memory failure needs to notify the processes who are using those files. Link: https://lkml.kernel.org/r/20220603053738.1218681-7-ruansy.fnst@fujitsu.com Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.wiliams@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Cc: Jane Chu <jane.chu@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>	2023-06-16 10:34:27 -05:00
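
Commits a4a194c4b0 and 6e5032d556 both correct off-by-one errors in inclusive range arithmetic: the end of a range is "start + length - 1", and the block count recovered from an inclusive range is "end + 1 - start". The following is a minimal illustration of that arithmetic only. It is not the kernel code, and the function names range_end and block_count are hypothetical.

```python
def range_end(start: int, length: int) -> int:
    """Inclusive last block of a range that begins at `start` and spans
    `length` blocks (hypothetical helper, not a kernel function)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return start + length - 1


def block_count(start: int, end: int) -> int:
    """Number of blocks in the inclusive range [start, end]
    (hypothetical helper, not a kernel function)."""
    if end < start:
        raise ValueError("end must not precede start")
    return end + 1 - start


# The two formulas are inverses of each other for any positive length.
start, length = 100, 8
end = range_end(start, length)
assert block_count(start, end) == length
```

Using "start + length" as the end, or "end - start" as the count, would each be off by one block, which is the class of mistake the two commits describe.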

8 Commits
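
Commit 3871255f2c describes deferring the filesystem shutdown until the mapping scan has finished, so that a metadata record found partway through no longer leaves later mappings live. The pattern can be sketched as below under simplifying assumptions: RmapRecord, scan_failed_range and the revoke callback are hypothetical stand-ins, not the XFS data structures or functions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class RmapRecord:
    """Hypothetical stand-in for a reverse-mapping record."""
    owner: int          # file identifier (illustrative only)
    is_metadata: bool   # True if the record covers filesystem metadata


def scan_failed_range(records: Iterable[RmapRecord],
                      revoke: Callable[[int], None]) -> bool:
    """Revoke mappings for every file record and report whether a
    shutdown is wanted. The shutdown decision is only recorded during
    the scan; the caller acts on it after the loop has completed."""
    want_shutdown = False
    for rec in records:
        if rec.is_metadata:
            # Remember the metadata hit, but keep scanning.
            want_shutdown = True
            continue
        revoke(rec.owner)
    return want_shutdown
```

A caller would run the full scan first and only then shut down if the returned flag is set, which mirrors the ordering the commit message describes.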