Commit Graph

277 Commits

Author SHA1 Message Date
Benjamin Marzinski b40a482beb dm raid: fix spelling errors in raid_ctr()
JIRA: https://issues.redhat.com/browse/RHEL-84906
Upstream Status: kernel/git/torvalds/linux.git

commit 193700b9b218a77222a1b5cd53206c17c40f786b
Author: liujing <liujing@cmss.chinamobile.com>
Date:   Mon Dec 9 11:10:55 2024 +0800

    dm raid: fix spelling errors in raid_ctr()

    Fix the respective spelling errors in raid_ctr() function.

    Signed-off-by: liujing <liujing@cmss.chinamobile.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2025-04-08 11:46:32 -04:00
Benjamin Marzinski 381ff09e66 dm: fix spelling errors
JIRA: https://issues.redhat.com/browse/RHEL-84906
Upstream Status: kernel/git/torvalds/linux.git

commit 0a92e5cdeef9fa4cba8bef6cd1d91cff6b5d300b
Author: Shen Lichuan <shenlichuan@vivo.com>
Date:   Tue Sep 24 15:21:11 2024 +0200

    dm: fix spelling errors

    Fixed some confusing spelling errors that were currently identified,
    the details are as follows:

    -in the code comments:
            dm-cache-target.c: 1371:        exclussive      ==> exclusive
            dm-raid.c: 2522:                repective       ==> respective

    Signed-off-by: Shen Lichuan <shenlichuan@vivo.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2025-04-08 11:46:30 -04:00
Rado Vrbovsky 819dc55a1f Merge: CVE-2024-43820: dm-raid: Fix WARN_ON_ONCE check for sync_thread in raid_resume
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5037

JIRA: https://issues.redhat.com/browse/RHEL-54875
CVE: CVE-2024-43820

```
dm-raid: Fix WARN_ON_ONCE check for sync_thread in raid_resume

dm-raid devices will occasionally trigger the following warning when
being resumed after a table load because MD_RECOVERY_RUNNING is set:

WARNING: CPU: 7 PID: 5660 at drivers/md/dm-raid.c:4105 raid_resume+0xee/0x100 [dm_raid]

The failing check is:
WARN_ON_ONCE(test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));

This check is designed to make sure that the sync thread isn't
registered, but md_check_recovery can set MD_RECOVERY_RUNNING without
the sync_thread ever getting registered. Instead of checking if
MD_RECOVERY_RUNNING is set, check if sync_thread is non-NULL.

Fixes: 16c4770c75b1 ("dm-raid: really frozen sync_thread during suspend")
Suggested-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
(cherry picked from commit 3199a34bfaf7561410e0be1e33a61eba870768fc)
```

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>

Approved-by: Heinz Mauelshagen <heinzm@redhat.com>
Approved-by: Xiao Ni <xni@redhat.com>
Approved-by: Nigel Croxon <ncroxon@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-27 11:19:26 +00:00
Benjamin Marzinski 5318b764b4 dm raid: fix stripes adding reshape size issues
JIRA: https://issues.redhat.com/browse/RHEL-34750
Upstream Status: kernel/git/torvalds/linux.git

commit d176fadb9e783c152d0820a50f84882b6c5ae314
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Tue Jul 9 13:56:38 2024 +0200

    dm raid: fix stripes adding reshape size issues

    Adding stripes to an existing raid4/5/6/10 mapped device grows its
    capacity though it'll be only made available _after_ the respective
    reshape finished as of MD kernel reshape semantics.  Such reshaping
    involves moving a window forward starting at BOD reading content
    from previous lesser stripes and writing them back in the new
    layout with more stripes.  Once that process finishes at end of
    previous data, the grown size may be announced and used.  In order
    to avoid writing over any existing data in place, out-of-place space
    is added to the beginning of each data device by lvm2 before starting
    the reshape process. That reshape space wasn't taken into account for
    data device size calculation.

    Fixes resulting from above:

    - correct event handling conditions in do_table_event() to set
      the device's capacity after the stripe adding reshape ended

    - subtract mentioned out-of-place space doing data device and
      array size calculations

    - conditionally set capacity as of superblock in preresume

    Testing:

    - passes all LVM2 RAID tests including new lvconvert-raid-reshape-size.sh one

    Tested-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2024-10-21 15:58:34 -04:00
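
The size arithmetic described in the commit message above can be illustrated with a small userspace sketch. It is not the dm-raid implementation: `sketch_sector_t`, `usable_dev_sectors()` and `array_sectors_sketch()` are hypothetical names, and the simple "usable sectors times data stripes" formula is an assumption that ignores the per-level layout details the kernel has to handle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's sector type. */
typedef uint64_t sketch_sector_t;

/*
 * Sectors of one data device that may hold array data once the
 * out-of-place reshape space at its beginning is excluded.
 */
static sketch_sector_t usable_dev_sectors(sketch_sector_t data_dev_sectors,
                                          sketch_sector_t reshape_sectors)
{
    if (reshape_sectors >= data_dev_sectors)
        return 0; /* nothing left for data */
    return data_dev_sectors - reshape_sectors;
}

/* Simplified array capacity: usable sectors per device times data stripes. */
static sketch_sector_t array_sectors_sketch(sketch_sector_t data_dev_sectors,
                                            sketch_sector_t reshape_sectors,
                                            unsigned int data_stripes)
{
    return usable_dev_sectors(data_dev_sectors, reshape_sectors) * data_stripes;
}
```

With 1000-sector data devices, 100 sectors of reshape space and 3 data stripes, the sketch yields 2700 sectors, whereas ignoring the reshape space would report 3000.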
Benjamin Marzinski 6a19c3c346 dm raid: move _get_reshape_sectors() as prerequisite to fixing reshape size issues
JIRA: https://issues.redhat.com/browse/RHEL-34750
Upstream Status: kernel/git/torvalds/linux.git

commit 453496b899b5f62ff193bca46097f0f7211cec46
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Tue Jul 9 13:56:12 2024 +0200

    dm raid: move _get_reshape_sectors() as prerequisite to fixing reshape size issues

    rs_set_dev_and_array_sectors() needs this function to
    calculate device and array size properly in case leg data
    devices have out-of-place reshape space allocated.

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2024-10-21 15:58:34 -04:00
Benjamin Marzinski f10ee92d9a dm: stop using blk_limits_io_{min,opt}
JIRA: https://issues.redhat.com/browse/RHEL-59523
Upstream Status: kernel/git/torvalds/linux.git
Conflicts: Hunk patching drivers/md/dm-vdo/dm-vdo-target.c removed due
           to missing upstream commit 03d1e20fa16e0 ("dm vdo: add the
           top-level DM target"). dm-vdo is not part of the kernel
           package in rhel-9.

commit 0a94a469a4f02bdcc223517fd578810ffc21c548
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jul 3 15:12:08 2024 +0200

    dm: stop using blk_limits_io_{min,opt}

    Remove use of the blk_limits_io_{min,opt} and assign the values directly
    to the queue_limits structure.  For the io_opt this is a completely
    mechanical change, for io_min it removes flooring the limit to the
    physical and logical block size in the particular caller.  But as
    blk_validate_limits will do the same later when actually applying the
    limits, there still is no change in overall behavior.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2024-10-21 15:58:33 -04:00
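
Why assigning io_min directly leaves behavior unchanged can be sketched as follows. This is a userspace illustration only: `struct limits_sketch` and all three functions are hypothetical stand-ins, and the flooring rule is taken from the commit message's description rather than copied from the block layer.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for the kernel's struct queue_limits. */
struct limits_sketch {
    unsigned int logical_block_size;
    unsigned int physical_block_size;
    unsigned int io_min;
    unsigned int io_opt;
};

/* Floor io_min to the logical and physical block size. */
static void floor_io_min(struct limits_sketch *lim)
{
    if (lim->io_min < lim->logical_block_size)
        lim->io_min = lim->logical_block_size;
    if (lim->io_min < lim->physical_block_size)
        lim->io_min = lim->physical_block_size;
}

/* Old style: the setter helper floored the value in the caller. */
static void old_style_set_io_min(struct limits_sketch *lim, unsigned int min)
{
    lim->io_min = min;
    floor_io_min(lim);
}

/* New style: callers assign io_min directly; validation floors it later. */
static void validate_limits_sketch(struct limits_sketch *lim)
{
    floor_io_min(lim);
}
```

Either path ends with the same io_min once the validation step has run, which is the "no change in overall behavior" the commit message refers to.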
Benjamin Marzinski 9b3de17d0b dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list
JIRA: https://issues.redhat.com/browse/RHEL-59523
Upstream Status: kernel/git/torvalds/linux.git

commit fa34e5893ff2d5b0174c124a29e1be6d0426a169
Author: Mike Snitzer <snitzer@kernel.org>
Date:   Wed Feb 7 15:51:24 2024 -0500

    dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list

    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2024-10-21 15:58:31 -04:00
Nigel Croxon 55c5b015d1 md/md-bitmap: merge md_bitmap_resize() into bitmap_operations
JIRA: https://issues.redhat.com/browse/RHEL-61196

commit 77c09640eea56dbfed069ac67b1cd79397d41be8
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Mon Aug 26 15:44:45 2024 +0800

    md/md-bitmap: merge md_bitmap_resize() into bitmap_operations

    So that the implementation won't be exposed, and it'll be possible
    to invent a new bitmap by replacing bitmap_operations.

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Link: https://lore.kernel.org/r/20240826074452.1490072-36-yukuai1@huaweicloud.com
    Signed-off-by: Song Liu <song@kernel.org>

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-10-01 15:13:38 -04:00
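
The idea of hiding the bitmap implementation behind an operations table can be sketched in userspace C. Every name here (`struct array_sketch`, `struct bitmap_ops_sketch`, `demo_load`, `demo_resize`, `demo_ops`) is hypothetical, and the signatures are simplified assumptions rather than the kernel's actual bitmap_operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct array_sketch;

/* Callers see only this table, never the bitmap internals. */
struct bitmap_ops_sketch {
    int (*load)(struct array_sketch *a);
    int (*resize)(struct array_sketch *a, unsigned long blocks, bool init);
};

struct array_sketch {
    const struct bitmap_ops_sketch *bitmap_ops;
    void *bitmap;          /* opaque to callers */
    unsigned long blocks;
};

static int demo_load(struct array_sketch *a)
{
    return a->bitmap ? 0 : -1;
}

static int demo_resize(struct array_sketch *a, unsigned long blocks, bool init)
{
    (void)init;
    if (!a->bitmap)
        return 0; /* the "is there a bitmap?" check lives inside the op */
    a->blocks = blocks;
    return 0;
}

/* A different bitmap implementation would supply a different table. */
static const struct bitmap_ops_sketch demo_ops = {
    .load = demo_load,
    .resize = demo_resize,
};
```

A caller would invoke `a->bitmap_ops->resize(a, blocks, false)` without testing `a->bitmap` first, since the operation performs that check itself.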
Nigel Croxon 1210b6c5b5 md/md-bitmap: pass in mddev directly for md_bitmap_resize()
JIRA: https://issues.redhat.com/browse/RHEL-61196

commit e1791dae6cbd65e5102dca40b8adadef1d89c1b9
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Mon Aug 26 15:44:44 2024 +0800

    md/md-bitmap: pass in mddev directly for md_bitmap_resize()

    And move the condition "if (mddev->bitmap)" into md_bitmap_resize() as
    well, on the one hand make code cleaner, on the other hand try not to
    access bitmap directly.

    Since we are here, also change the parameter 'init' from int to bool.

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Link: https://lore.kernel.org/r/20240826074452.1490072-35-yukuai1@huaweicloud.com
    Signed-off-by: Song Liu <song@kernel.org>

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-10-01 15:13:38 -04:00
Nigel Croxon 6dcbf10abf md/md-bitmap: merge md_bitmap_load() into bitmap_operations
JIRA: https://issues.redhat.com/browse/RHEL-61196

commit e1e490805958617327be14eaf0ed31d71adc2c54
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Mon Aug 26 15:44:24 2024 +0800

    md/md-bitmap: merge md_bitmap_load() into bitmap_operations

    So that the implementation won't be exposed, and it'll be possible
    to invent a new bitmap by replacing bitmap_operations.

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Link: https://lore.kernel.org/r/20240826074452.1490072-15-yukuai1@huaweicloud.com
    Signed-off-by: Song Liu <song@kernel.org>

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-10-01 15:11:49 -04:00
Nigel Croxon 2b75f88396 md: replace last_sync_action with new enum type
JIRA: https://issues.redhat.com/browse/RHEL-46615

commit d249e541887a966df37544f7c4d301cdee0f0e27
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Jun 11 21:22:48 2024 +0800

md: replace last_sync_action with new enum type

The only difference is that "none" is removed and initial
last_sync_action will be idle.

On the one hand, this value is introduced by commit c4a3955145
("MD: Remember the last sync operation that was performed"), and the
usage described in commit message is not affected. On the other hand,
last_sync_action is not used in mdadm or mdmon, and none of the tests
that I can find.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240611132251.1967786-10-yukuai1@huaweicloud.com
(cherry picked from commit d249e541887a966df37544f7c4d301cdee0f0e27)
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-08-28 13:25:03 -04:00
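
Replacing a string-valued field with an enum can be sketched as below. The enumerator list is an assumption for illustration; the commit message itself only states that "none" is gone and that the initial value is idle. All identifiers are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum; the member set is assumed, not taken from the patch. */
enum sync_action_sketch {
    SKETCH_ACTION_RESYNC,
    SKETCH_ACTION_RECOVER,
    SKETCH_ACTION_CHECK,
    SKETCH_ACTION_REPAIR,
    SKETCH_ACTION_RESHAPE,
    SKETCH_ACTION_FROZEN,
    SKETCH_ACTION_IDLE,
    SKETCH_NR_ACTIONS
};

/* Names used when the value is reported to userspace. */
static const char *const action_names[SKETCH_NR_ACTIONS] = {
    [SKETCH_ACTION_RESYNC]  = "resync",
    [SKETCH_ACTION_RECOVER] = "recover",
    [SKETCH_ACTION_CHECK]   = "check",
    [SKETCH_ACTION_REPAIR]  = "repair",
    [SKETCH_ACTION_RESHAPE] = "reshape",
    [SKETCH_ACTION_FROZEN]  = "frozen",
    [SKETCH_ACTION_IDLE]    = "idle",
};

struct md_state_sketch {
    enum sync_action_sketch last_sync_action;
};

/* There is no "none" member: a fresh array starts out as idle. */
static void md_state_init(struct md_state_sketch *s)
{
    s->last_sync_action = SKETCH_ACTION_IDLE;
}

static const char *last_action_name(const struct md_state_sketch *s)
{
    return action_names[s->last_sync_action];
}
```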
CKI Backport Bot cf897e0d77 dm-raid: Fix WARN_ON_ONCE check for sync_thread in raid_resume
JIRA: https://issues.redhat.com/browse/RHEL-54875
CVE: CVE-2024-43820

commit 3199a34bfaf7561410e0be1e33a61eba870768fc
Author: Benjamin Marzinski <bmarzins@redhat.com>
Date:   Tue Jul 2 17:02:48 2024 +0200

    dm-raid: Fix WARN_ON_ONCE check for sync_thread in raid_resume

    dm-raid devices will occasionally trigger the following warning when
    being resumed after a table load because MD_RECOVERY_RUNNING is set:

    WARNING: CPU: 7 PID: 5660 at drivers/md/dm-raid.c:4105 raid_resume+0xee/0x100 [dm_raid]

    The failing check is:
    WARN_ON_ONCE(test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));

    This check is designed to make sure that the sync thread isn't
    registered, but md_check_recovery can set MD_RECOVERY_RUNNING without
    the sync_thread ever getting registered. Instead of checking if
    MD_RECOVERY_RUNNING is set, check if sync_thread is non-NULL.

    Fixes: 16c4770c75b1 ("dm-raid: really frozen sync_thread during suspend")
    Suggested-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
    Reviewed-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-08-19 13:44:27 +00:00
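
The difference between the two checks discussed in the commit message above can be sketched in plain userspace C. This is not the dm-raid code: `struct mddev_sketch`, `SKETCH_RECOVERY_RUNNING` and both helpers are hypothetical stand-ins that only model "flag set" versus "thread registered".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the kernel's struct mddev. */
struct mddev_sketch {
    unsigned long recovery; /* bit flags, modelled on mddev->recovery */
    void *sync_thread;      /* non-NULL only once a sync thread is registered */
};

enum { SKETCH_RECOVERY_RUNNING = 0 }; /* illustrative bit number */

/* Old check: warns whenever the flag is set, registered thread or not. */
static bool old_check_warns(const struct mddev_sketch *m)
{
    return (m->recovery >> SKETCH_RECOVERY_RUNNING) & 1UL;
}

/* New check: warns only if a sync thread is actually registered. */
static bool new_check_warns(const struct mddev_sketch *m)
{
    return m->sync_thread != NULL;
}
```

With the flag set but no thread registered, the old check reports a problem and the new one does not, which matches the false warning described in the message.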
Benjamin Marzinski 54081cf105 dm raid: fix false positive for requeue needed during reshape
JIRA: https://issues.redhat.com/browse/RHEL-34599
Upstream Status: kernel/git/torvalds/linux.git
Conflict: Context change due to upstream patch 41425f96d7aa
          ("dm-raid456, md/raid456: fix a deadlock for dm-raid456 while
          io concurrent with reshape") having already been applied as
          5266d3b3fb.

commit b25b8f4b8ecef0f48c05f0c3572daeabefe16526
Author: Ming Lei <ming.lei@redhat.com>
Date:   Mon Mar 11 13:42:55 2024 -0400

    dm raid: fix false positive for requeue needed during reshape

    An empty flush doesn't have a payload, so it should never be looked at
    when considering to possibly requeue a bio for the case when a reshape
    is in progress.

    Fixes: 9dbd1aa3a8 ("dm raid: add reshaping support to the target")
    Reported-by: Patrick Plenefisch <simonpatp@gmail.com>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2024-04-29 17:59:40 -04:00
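
The reasoning in the commit message above, that a payload-less flush must be excluded from a position-based requeue test, can be sketched like this. The names `struct bio_sketch`, `bio_has_payload()` and `should_requeue_sketch()` are hypothetical, and `boundary_sector` is a placeholder: the message does not say which position the real check compares against.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for a block I/O request. */
struct bio_sketch {
    uint64_t start_sector;
    uint32_t nr_sectors; /* 0 for an empty flush */
};

static bool bio_has_payload(const struct bio_sketch *bio)
{
    return bio->nr_sectors != 0;
}

/* Only bios that carry data are compared against the boundary. */
static bool should_requeue_sketch(const struct bio_sketch *bio,
                                  uint64_t boundary_sector)
{
    if (!bio_has_payload(bio))
        return false; /* empty flush: its sector range is meaningless */
    return bio->start_sector + bio->nr_sectors > boundary_sector;
}
```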
Benjamin Marzinski 9be36d29bb dm raid: Annotate struct raid_set with __counted_by
JIRA: https://issues.redhat.com/browse/RHEL-34599
Upstream Status: kernel/git/torvalds/linux.git

commit e3260d90c8f35c03ce182bfd2eeea75805586c25
Author: Kees Cook <keescook@chromium.org>
Date:   Fri Sep 15 13:03:36 2023 -0700

    dm raid: Annotate struct raid_set with __counted_by

    Prepare for the coming implementation by GCC and Clang of the __counted_by
    attribute. Flexible array members annotated with __counted_by can have
    their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
    (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
    functions).

    As found with Coccinelle[1], add __counted_by for struct raid_set.

    [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

    Cc: Alasdair Kergon <agk@redhat.com>
    Cc: Mike Snitzer <snitzer@kernel.org>
    Cc: dm-devel@redhat.com
    Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
    Link: https://lore.kernel.org/r/20230915200335.never.098-kees@kernel.org
    Signed-off-by: Kees Cook <keescook@chromium.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2024-04-29 15:11:26 -04:00
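
A minimal userspace sketch of annotating a flexible array member with its element count follows. `SKETCH_COUNTED_BY`, `struct dev_sketch`, `struct raid_set_sketch` and `alloc_set()` are hypothetical names; the macro expands to nothing on compilers that lack the `counted_by` attribute, so the code builds either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(counted_by)
#define SKETCH_COUNTED_BY(member) __attribute__((counted_by(member)))
#else
#define SKETCH_COUNTED_BY(member) /* attribute unavailable: no annotation */
#endif

struct dev_sketch {
    int id;
};

struct raid_set_sketch {
    int raid_disks; /* number of valid elements in dev[] */
    struct dev_sketch dev[] SKETCH_COUNTED_BY(raid_disks);
};

static struct raid_set_sketch *alloc_set(int n)
{
    struct raid_set_sketch *rs;

    if (n < 0)
        return NULL;
    rs = calloc(1, sizeof(*rs) + (size_t)n * sizeof(rs->dev[0]));
    if (rs)
        rs->raid_disks = n; /* set the count before touching dev[] */
    return rs;
}
```

The count member is assigned immediately after allocation because, where the attribute is supported, bounds checks on `dev[]` are evaluated against the current value of `raid_disks`.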
Benjamin Marzinski 86a2ed3f5c dm-raid: delay flushing event_work() after reconfig_mutex is released
JIRA: https://issues.redhat.com/browse/RHEL-30951
Upstream Status: kernel/git/torvalds/linux.git

commit db29d79b34d9593179de5f868be45c650923e7b4
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Fri Nov 24 15:59:53 2023 +0800

    dm-raid: delay flushing event_work() after reconfig_mutex is released

    After commit db5e653d7c9f ("md: delay choosing sync action to
    md_start_sync()"), md_start_sync() will hold 'reconfig_mutex', however,
    in order to make sure event_work is done, __md_stop() will flush
    workqueue with reconfig_mutex grabbed, hence if sync_work is still
    pending, deadlock will be triggered.

    Fortunately, former patches to fix stopping sync_thread already make sure
    all sync_work is done already, hence such deadlock is not possible
    anymore. However, in order not to cause confusions for people by this
    implicit dependency, delay flushing event_work to dm-raid where
    'reconfig_mutex' is not held, and add some comments to emphasize that
    the workqueue can't be flushed with 'reconfig_mutex'.

    Fixes: db5e653d7c9f ("md: delay choosing sync action to md_start_sync()")
    Depends-on: f52f5c71f3d4 ("md: fix stopping sync thread")
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Acked-by: Xiao Ni <xni@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2024-04-03 13:56:11 -04:00
Nigel Croxon 36cc045a20 dm-raid: fix lockdep waring in "pers->hot_add_disk"
JIRA: https://issues.redhat.com/browse/RHEL-26279

commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Mar 5 15:23:06 2024 +0800

dm-raid: fix lockdep waring in "pers->hot_add_disk"

The lockdep assert is added by commit a448af25becf ("md/raid10: remove
rcu protection to access rdev from conf") in print_conf(). And I didn't
notice that dm-raid is calling "pers->hot_add_disk" without holding
'reconfig_mutex'.

"pers->hot_add_disk" reads and writes many fields that are protected by
'reconfig_mutex', and raid_resume() already grabs the lock in another
context. Hence fix this problem by protecting "pers->hot_add_disk"
with the lock.

Fixes: 9092c02d94 ("DM RAID: Add ability to restore transiently failed devices on resume")
Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-03-22 12:17:16 -04:00
Nigel Croxon 942ac57c00 md/dm-raid: don't call md_reap_sync_thread() directly
JIRA: https://issues.redhat.com/browse/RHEL-26279

commit cd32b27a66db8776d8b8e82ec7d7dde97a8693b0
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Mar 5 15:23:03 2024 +0800

md/dm-raid: don't call md_reap_sync_thread() directly

Currently md_reap_sync_thread() is called from raid_message() directly
without holding 'reconfig_mutex'. This is definitely unsafe because
md_reap_sync_thread() can change many fields that are protected by
'reconfig_mutex'.

However, holding 'reconfig_mutex' here is still problematic because it
can cause a deadlock; see, for example, commit 130443d60b1b ("md: refactor
idle/frozen_sync_thread() to fix deadlock").

Fix this problem by using stop_sync_thread() to unregister sync_thread,
like md/raid did.

Fixes: be83651f00 ("DM RAID: Add message/status support for changing sync action")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240305072306.2562024-7-yukuai1@huaweicloud.com
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-03-22 12:17:16 -04:00
Nigel Croxon 5266d3b3fb dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape
JIRA: https://issues.redhat.com/browse/RHEL-26279

commit 41425f96d7aa59bc865f60f5dda3d7697b555677
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Mar 5 15:23:05 2024 +0800

dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape

For raid456, if reshape is still in progress, then IO across reshape
position will wait for reshape to make progress. However, for dm-raid,
in following cases reshape will never make progress hence IO will hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f079f ("md/raid6: use valid sector values to determine
if an I/O should wait on the reshape") fix the problem that IO across
reshape position doesn't wait for reshape, the dm-raid test
shell/lvconvert-raid-reshape.sh start to hang:

[root@fedora ~]# cat /proc/979/stack
[<0>] wait_woken+0x7d/0x90
[<0>] raid5_make_request+0x929/0x1d70 [raid456]
[<0>] md_handle_request+0xc2/0x3b0 [md_mod]
[<0>] raid_map+0x2c/0x50 [dm_raid]
[<0>] __map_bio+0x251/0x380 [dm_mod]
[<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
[<0>] __submit_bio+0xc2/0x1c0
[<0>] submit_bio_noacct_nocheck+0x17f/0x450
[<0>] submit_bio_noacct+0x2bc/0x780
[<0>] submit_bio+0x70/0xc0
[<0>] mpage_readahead+0x169/0x1f0
[<0>] blkdev_readahead+0x18/0x30
[<0>] read_pages+0x7c/0x3b0
[<0>] page_cache_ra_unbounded+0x1ab/0x280
[<0>] force_page_cache_ra+0x9e/0x130
[<0>] page_cache_sync_ra+0x3b/0x110
[<0>] filemap_get_pages+0x143/0xa30
[<0>] filemap_read+0xdc/0x4b0
[<0>] blkdev_read_iter+0x75/0x200
[<0>] vfs_read+0x272/0x460
[<0>] ksys_read+0x7a/0x170
[<0>] __x64_sys_read+0x1c/0x30
[<0>] do_syscall_64+0xc6/0x230
[<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74

This is because reshape can't make progress.

For md/raid, the problem doesn't exist because register new sync_thread
doesn't rely on the IO to be done any more:

1) If array is read-only, it can switch to read-write by ioctl/sysfs;
2) md/raid never set MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
   'reconfig_mutex', hence it can be cleared and reshape can continue by
   sysfs api 'sync_action'.

However, I'm not sure yet how to avoid the problem in dm-raid. This
patch on the one hand makes sure sync_thread can't be changed through
raid_message() after presuspend(), and on the other hand detects the
above 3 cases before waiting for IO to be done in dm_suspend(), and
lets dm-raid requeue those IO.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
(cherry picked from commit 41425f96d7aa59bc865f60f5dda3d7697b555677)
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-03-22 12:17:16 -04:00
Nigel Croxon 52927dcd01 dm-raid: add a new helper prepare_suspend() in md_personality
JIRA: https://issues.redhat.com/browse/RHEL-26279

commit 5625ff8b72b0e5c13b0fc1fc1f198155af45f729
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Mar 5 15:23:04 2024 +0800

    dm-raid: add a new helper prepare_suspend() in md_personality

    There are no functional changes for now, prepare to fix a deadlock for
    dm-raid456.

    Cc: stable@vger.kernel.org # v6.7+
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Xiao Ni <xni@redhat.com>
    Acked-by: Mike Snitzer <snitzer@kernel.org>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20240305072306.2562024-8-yukuai1@huaweicloud.com

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-03-22 12:17:15 -04:00
Nigel Croxon ac151291d8 dm-raid: really frozen sync_thread during suspend
JIRA: https://issues.redhat.com/browse/RHEL-26279

commit 16c4770c75b1223998adbeb7286f9a15c65fba73
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Mar 5 15:23:02 2024 +0800

    dm-raid: really frozen sync_thread during suspend

    1) commit f52f5c71f3d4 ("md: fix stopping sync thread") remove
       MD_RECOVERY_FROZEN from __md_stop_writes() and doesn't realize that
       dm-raid relies on __md_stop_writes() to frozen sync_thread
       indirectly. Fix this problem by adding MD_RECOVERY_FROZEN in
       md_stop_writes(), and since stop_sync_thread() is only used for
       dm-raid in this case, also move stop_sync_thread() to
       md_stop_writes().
    2) The flag MD_RECOVERY_FROZEN doesn't mean that sync thread is frozen,
       it only prevents a new sync_thread from starting, and it can't stop the
       running sync thread; in order to freeze sync_thread, after setting the
       flag, stop_sync_thread() should be used.
    3) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, use
       it as condition for md_stop_writes() in raid_postsuspend() doesn't
       look correct. Consider that reentrant stop_sync_thread() do nothing,
       always call md_stop_writes() in raid_postsuspend().
    4) raid_message can set/clear the flag MD_RECOVERY_FROZEN at anytime,
       and if MD_RECOVERY_FROZEN is cleared while the array is suspended,
       new sync_thread can start unexpected. Fix this by disallow
       raid_message() to change sync_thread status during suspend.

    Note that after commit f52f5c71f3d4 ("md: fix stopping sync thread"), the
    test shell/lvconvert-raid-reshape.sh start to hang in stop_sync_thread(),
    and with previous fixes, the test won't hang there anymore, however, the
    test will still fail and complain that ext4 is corrupted. And with this
    patch, the test won't hang due to stop_sync_thread() or fail due to ext4
    is corrupted anymore. However, there is still a deadlock related to
    dm-raid456 that will be fixed in following patches.

    Reported-by: Mikulas Patocka <mpatocka@redhat.com>
    Closes: https://lore.kernel.org/all/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@redhat.com/
    Fixes: 1af2048a3e ("dm raid: fix deadlock caused by premature md_stop_writes()")
    Fixes: 9dbd1aa3a8 ("dm raid: add reshaping support to the target")
    Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
    Cc: stable@vger.kernel.org # v6.7+
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Xiao Ni <xni@redhat.com>
    Acked-by: Mike Snitzer <snitzer@kernel.org>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20240305072306.2562024-6-yukuai1@huaweicloud.com

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-03-22 12:17:15 -04:00
Nigel Croxon aa7df82c65 md: rename __mddev_suspend/resume() back to mddev_suspend/resume()
JIRA: https://issues.redhat.com/browse/RHEL-26279

commit 2b16a52549d51937a98d82b07b4d83dce6c43683
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Oct 10 23:19:58 2023 +0800

    md: rename __mddev_suspend/resume() back to mddev_suspend/resume()

    Now that the old apis are removed, __mddev_suspend/resume() can be
    renamed to their original names.

    This is done by:

    sed -i "s/__mddev_suspend/mddev_suspend/g" *.[ch]
    sed -i "s/__mddev_resume/mddev_resume/g" *.[ch]

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20231010151958.145896-20-yukuai1@huaweicloud.com

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-03-22 11:19:07 -04:00
Nigel Croxon 596237ac5d md/dm-raid: use new apis to suspend array
JIRA: https://issues.redhat.com/browse/RHEL-26279

commit 4eb3327aa28f3a737c2d3f7e35e83575f1d52283
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue Oct 10 23:19:45 2023 +0800

md/dm-raid: use new apis to suspend array

Convert to use new apis, the old apis will be removed eventually.

These are not hot path, so performance is not concerned.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20231010151958.145896-7-yukuai1@huaweicloud.com
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2024-03-22 11:19:07 -04:00
Benjamin Marzinski f046860c74 dm raid: protect md_stop() with 'reconfig_mutex'
JIRA: https://issues.redhat.com/browse/RHEL-12342
JIRA: https://issues.redhat.com/browse/RHEL-12435
Upstream Status: kernel/git/torvalds/linux.git

commit 7d5fff8982a2199d49ec067818af7d84d4f95ca0
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Sat Jul 8 17:21:53 2023 +0800

    dm raid: protect md_stop() with 'reconfig_mutex'

    __md_stop_writes() and __md_stop() will modify many fields that are
    protected by 'reconfig_mutex', and all the callers will grab
    'reconfig_mutex' except for md_stop().

    Also, update md_stop() to make certain 'reconfig_mutex' is held using
    lockdep_assert_held().

    Fixes: 9d09e663d5 ("dm: raid456 basic support")
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-11-07 11:27:43 -05:00
Benjamin Marzinski 5906dbe5a9 dm raid: clean up four equivalent goto tags in raid_ctr()
JIRA: https://issues.redhat.com/browse/RHEL-12342
JIRA: https://issues.redhat.com/browse/RHEL-12435
Upstream Status: kernel/git/torvalds/linux.git

commit e74c874eabe2e9173a8fbdad616cd89c70eb8ffd
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Sat Jul 8 17:21:52 2023 +0800

    dm raid: clean up four equivalent goto tags in raid_ctr()

    There are four equivalent goto tags in raid_ctr(), clean them up to
    use just one.

    There is no functional change and this is preparation to fix
    raid_ctr()'s unprotected md_stop().

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-11-07 11:27:43 -05:00
Benjamin Marzinski 2a0af6a81f dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
JIRA: https://issues.redhat.com/browse/RHEL-12342
Upstream Status: kernel/git/torvalds/linux.git

commit bae3028799dc4f1109acc4df37c8ff06f2d8f1a0
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Sat Jul 8 17:21:51 2023 +0800

    dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths

    In the error paths 'bad_stripe_cache' and 'bad_check_reshape',
    'reconfig_mutex' is still held after raid_ctr() returns.

    Fixes: 9dbd1aa3a8 ("dm raid: add reshaping support to the target")
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-11-07 11:27:43 -05:00
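
The error-path pattern behind the raid_ctr() fixes above, where every failure after taking the lock leaves through one label that releases it, can be sketched in userspace C. `struct lock_sketch`, `ctr_sketch()` and the label name `bad_unlock` are hypothetical; a boolean stands in for 'reconfig_mutex' so the sketch has no threading dependency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex: just records whether it is held. */
struct lock_sketch {
    bool held;
};

static void sketch_lock(struct lock_sketch *l)   { l->held = true; }
static void sketch_unlock(struct lock_sketch *l) { l->held = false; }

/* Constructor-like function with two failure points after locking. */
static int ctr_sketch(struct lock_sketch *l, bool fail_stripe_cache,
                      bool fail_check_reshape)
{
    int r = 0;

    sketch_lock(l);

    if (fail_stripe_cache) {
        r = -1;
        goto bad_unlock;
    }
    if (fail_check_reshape) {
        r = -2;
        goto bad_unlock;
    }

    sketch_unlock(l);
    return 0;

bad_unlock:
    /* Single exit for all failures that happen with the lock held. */
    sketch_unlock(l);
    return r;
}
```

Whichever failure point is hit, the lock is no longer held when the function returns, which is the property the buggy error paths lacked.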
Nigel Croxon 5bdac57635 md: initialize 'active_io' while allocating mddev
JIRA: https://issues.redhat.com/browse/RHEL-12455

commit d58eff83bd3c6166944f6b159544438385d48549
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Fri Aug 25 11:09:50 2023 +0800

    md: initialize 'active_io' while allocating mddev

    'active_io' is used for mddev_suspend() and it's initialized in
    md_run(), this restrict that 'reconfig_mutex' must be held and
    "mddev->pers" must be set before calling mddev_suspend().

    Initialize 'active_io' early so that mddev_suspend() is safe to call
    once mddev is allocated, this will be helpful to refactor
    mddev_suspend() in following patches.

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20230825030956.1527023-2-yukuai1@huaweicloud.com

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2023-10-10 11:46:57 -04:00
Nigel Croxon 1aed99b3aa Revert "md: unlock mddev before reap sync_thread in action_store"
JIRA: https://issues.redhat.com/browse/RHEL-3359

commit a865b96c513bcaeec49669010d67c40aa8e58619
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Mon May 29 21:20:32 2023 +0800

    Revert "md: unlock mddev before reap sync_thread in action_store"

    This reverts commit 9dfbdafda3b34e262e43e786077bab8e476a89d1.

    Because it will introduce a defect that sync_thread can be running while
    MD_RECOVERY_RUNNING is cleared, which will cause some unexpected problems,
    for example:

    list_add corruption. prev->next should be next (ffff0001ac1daba0), but was ffff0000ce1a02a0. (prev=ffff0000ce1a02a0).
    Call trace:
     __list_add_valid+0xfc/0x140
     insert_work+0x78/0x1a0
     __queue_work+0x500/0xcf4
     queue_work_on+0xe8/0x12c
     md_check_recovery+0xa34/0xf30
     raid10d+0xb8/0x900 [raid10]
     md_thread+0x16c/0x2cc
     kthread+0x1a4/0x1ec
     ret_from_fork+0x10/0x18

    This is because work is requeued while it's still inside workqueue:

    t1:                     t2:
    action_store
     mddev_lock
      if (mddev->sync_thread)
       mddev_unlock
       md_unregister_thread
       // first sync_thread is done
                            md_check_recovery
                             mddev_try_lock
                             /*
                              * once MD_RECOVERY_DONE is set, new sync_thread
                              * can start.
                              */
                             set_bit(MD_RECOVERY_RUNNING, &mddev->recovery)
                             INIT_WORK(&mddev->del_work, md_start_sync)
                             queue_work(md_misc_wq, &mddev->del_work)
                              test_and_set_bit(WORK_STRUCT_PENDING_BIT, ...)
                              // set pending bit
                              insert_work
                               list_add_tail
                             mddev_unlock
       mddev_lock_nointr
       md_reap_sync_thread
       // MD_RECOVERY_RUNNING is cleared
     mddev_unlock

    t3:

    // before queued work started from t2
    md_check_recovery
     // MD_RECOVERY_RUNNING is not set, a new sync_thread can be started
     INIT_WORK(&mddev->del_work, md_start_sync)
      work->data = 0
      // work pending bit is cleared
     queue_work(md_misc_wq, &mddev->del_work)
      insert_work
       list_add_tail
       // list is corrupted

    The above commit is reverted to fix the problem; the deadlock that
    commit tried to fix will be fixed in the following patches.

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20230529132037.2124527-2-yukuai1@huaweicloud.com
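
Illustration (not part of the upstream patch): a minimal user-space sketch of the t2/t3 sequence above, assuming a toy list and a toy work item. All `toy_*` names are hypothetical stand-ins, and `toy_init_work()` is only assumed to mirror what INIT_WORK() does to the pending state and the list entry. Re-initializing an entry that is still linked makes the next tail insert fail the adjacency check, which is the condition the "list_add corruption" report describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct toy_list {
	struct toy_list *next, *prev;
};

static void toy_list_init(struct toy_list *n)
{
	n->next = n;
	n->prev = n;
}

/* Same idea as the list-debug check: prev and next must be adjacent. */
static bool toy_list_add_valid(const struct toy_list *prev,
			       const struct toy_list *next)
{
	return next->prev == prev && prev->next == next;
}

/* Hypothetical stand-in for a work item: a pending flag plus a list entry. */
struct toy_work {
	bool pending;
	struct toy_list entry;
};

/* Assumed analogue of INIT_WORK(): clears pending, re-initializes the entry. */
static void toy_init_work(struct toy_work *w)
{
	w->pending = false;
	toy_list_init(&w->entry);
}

/* Returns 1 if queued, 0 if already pending, -1 if the list is corrupted. */
static int toy_queue_work(struct toy_list *queue, struct toy_work *w)
{
	struct toy_list *prev = queue->prev;

	if (w->pending)
		return 0;
	w->pending = true;
	if (!toy_list_add_valid(prev, queue))
		return -1;
	w->entry.next = queue;
	w->entry.prev = prev;
	prev->next = &w->entry;
	queue->prev = &w->entry;
	return 1;
}

/* Replays t2 (queue) followed by t3 (re-init, then queue again). */
static int toy_replay_race(void)
{
	struct toy_list queue;
	struct toy_work del_work;

	toy_list_init(&queue);
	toy_init_work(&del_work);
	assert(toy_queue_work(&queue, &del_work) == 1);	/* t2 queues the work */

	toy_init_work(&del_work);	/* t3: entry is still linked in queue */
	return toy_queue_work(&queue, &del_work);
}
```

In this sketch the second insert is rejected because the old tail (the work entry itself) now points back at itself instead of at the queue head, which matches the "prev->next ... but was ... (prev=...)" values in the report, where prev and prev->next are the same address.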

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2023-09-26 14:39:59 -04:00
Ming Lei 4d86232764 dm-raid: remove useless checking in raid_message()
JIRA: https://issues.redhat.com/browse/RHEL-1516

commit 955a257d69e44cea09b0375b8f2f3d4d9fcf7b4e
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Tue May 23 10:10:14 2023 +0800

    dm-raid: remove useless checking in raid_message()

    md_wakeup_thread() already handles the case where the md_thread passed
    in is NULL, so there is no need to check for it here.

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20230523021017.3048783-3-yukuai1@huaweicloud.com

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2023-09-18 17:59:23 +08:00
Benjamin Marzinski 21a9d77466 dm: add helper macro for simple DM target module init and exit
Bugzilla: https://bugzilla.redhat.com/2189971
Upstream Status: kernel/git/torvalds/linux.git

commit 3664ff82dae1ef9f14f7763d3dd30565e7ef9e14
Author: Yangtao Li <frank.li@vivo.com>
Date:   Mon Apr 10 00:43:37 2023 +0800

    dm: add helper macro for simple DM target module init and exit

    Eliminate duplicate boilerplate code for simple modules that contain
    a single DM target driver without any additional setup code.

    Add a new module_dm() macro, which replaces the module_init() and
    module_exit() with template functions that call dm_register_target()
    and dm_unregister_target() respectively.

    Signed-off-by: Yangtao Li <frank.li@vivo.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
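
Illustration (not part of the upstream patch): a user-space sketch of the kind of boilerplate such a helper macro can generate. `TOY_MODULE_DM`, `toy_register_target()`, `toy_unregister_target()` and `struct toy_target_type` are hypothetical stand-ins; the real module_dm() additionally hooks the generated functions into module_init() and module_exit(), which a user-space example cannot do.

```c
#include <assert.h>

/* Hypothetical stand-in for a DM target description. */
struct toy_target_type {
	const char *name;
	int registered;
};

/* Stub registration: fails if the target is already registered. */
static int toy_register_target(struct toy_target_type *tt)
{
	if (tt->registered)
		return -1;
	tt->registered = 1;
	return 0;
}

/* Stub unregistration: only valid for a registered target. */
static void toy_unregister_target(struct toy_target_type *tt)
{
	assert(tt->registered);
	tt->registered = 0;
}

/*
 * Generates an init/exit pair for a module whose only job is to register
 * and unregister one target named <name>_target.
 */
#define TOY_MODULE_DM(name)						\
static int toy_dm_##name##_init(void)					\
{									\
	return toy_register_target(&name##_target);			\
}									\
static void toy_dm_##name##_exit(void)					\
{									\
	toy_unregister_target(&name##_target);				\
}

static struct toy_target_type demo_target = { .name = "demo" };

/* One line replaces the hand-written init and exit functions. */
TOY_MODULE_DM(demo)
```

The macro relies on a naming convention (`<name>_target`), which is why it only suits modules with a single target and no extra setup code.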

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-04-26 16:47:43 -05:00
Benjamin Marzinski 055bf33463 dm raid: remove unused d variable
Bugzilla: https://bugzilla.redhat.com/2189971
Upstream Status: kernel/git/torvalds/linux.git

commit 306fbc2e041c227be7c934efe8a49ddb87bd31f1
Author: Tom Rix <trix@redhat.com>
Date:   Thu Mar 30 17:27:53 2023 -0400

    dm raid: remove unused d variable

    clang with W=1 reports
    drivers/md/dm-raid.c:2212:15: error: variable
      'd' set but not used [-Werror,-Wunused-but-set-variable]
            unsigned int d;
                         ^
    This variable is not used so remove it.

    Signed-off-by: Tom Rix <trix@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-04-26 16:47:43 -05:00
Benjamin Marzinski 6598f6e422 dm: fix suspect indent whitespace
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 23fda2effbb1f2f1d2fb0640a4729e6d08ad6e6e
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Tue Feb 7 23:06:47 2023 +0100

    dm: fix suspect indent whitespace

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:19 -05:00
Benjamin Marzinski 1a6d4d95c2 dm: add missing blank line after declarations/fix those
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit b30f1607146c736684c069fe92dc39607d77d15f
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Tue Feb 7 20:48:51 2023 +0100

    dm: add missing blank line after declarations/fix those

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:16 -05:00
Benjamin Marzinski 0bcf7b0eae dm: correct block comments format.
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit a4a82ce3d24d4409143a7b7b980072ada6e20b2a
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Thu Jan 26 15:48:30 2023 +0100

    dm: correct block comments format.

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:12 -05:00
Benjamin Marzinski 3c14ee06f1 dm: address indent/space issues
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 255e2646496fcbf836a3dfe1b535692f09f11b45
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Jan 25 23:31:55 2023 +0100

    dm: address indent/space issues

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:12 -05:00
Benjamin Marzinski 58109e01be dm: avoid initializing static variables
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 2f06cd12e11422e4a44ad4cb856c3ef0be9bd208
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Mon Jan 30 21:28:24 2023 +0100

    dm: avoid initializing static variables

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:12 -05:00
Benjamin Marzinski d8a60f4024 dm: change "unsigned" to "unsigned int"
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 86a3238c7b9b759cb864f4f768ab2e24687dc0e6
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Jan 25 21:14:58 2023 +0100

    dm: change "unsigned" to "unsigned int"

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:11 -05:00
Benjamin Marzinski da3466f503 dm: add missing SPDX-License-Indentifiers
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 3bd940030752a33ff665eefdd74a1cdb74a4f9b0
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Jan 25 21:00:44 2023 +0100

    dm: add missing SPDX-License-Indentifiers

    'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has
    deprecated its use.

    Suggested-by: John Wiele <jwiele@redhat.com>
    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:10 -05:00
Benjamin Marzinski 1b1dcba695 dm raid: fix some spelling mistakes in comments
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit efdd3c3375aa09dd52284b645cbf0eb367f0e258
Author: Yu Zhe <yuzhe@nfschina.com>
Date:   Mon Feb 6 11:27:39 2023 +0800

    dm raid: fix some spelling mistakes in comments

    Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:09 -05:00
Benjamin Marzinski ac5e333b66 dm raid: fix typo in analyse_superblocks code comment
Bugzilla: https://bugzilla.redhat.com/2138462
Upstream Status: kernel/git/torvalds/linux.git

commit 96fccdce97ce647d5c7bf1db0d3159cc90774054
Author: Jiangshan Yi <yijiangshan@kylinos.cn>
Date:   Mon Sep 5 10:45:52 2022 +0800

    dm raid: fix typo in analyse_superblocks code comment

    Reported-by: k2ci <kernel-bot@kylinos.cn>
    Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-11-04 11:16:12 -05:00
Benjamin Marzinski 1925da6fe2 dm raid: delete the redundant word 'that' in comment
Bugzilla: https://bugzilla.redhat.com/2138462
Upstream Status: kernel/git/torvalds/linux.git

commit cea446630feab57f49d47abccf206e9725019cce
Author: Jilin Yuan <yuanjilin@cdjrlc.com>
Date:   Tue Aug 30 23:33:45 2022 +0800

    dm raid: delete the redundant word 'that' in comment

    Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-11-04 11:16:11 -05:00
Benjamin Marzinski 7a319ea092 dm: fix dm-raid crash if md_handle_request() splits bio
Bugzilla: https://bugzilla.redhat.com/2138462
Upstream Status: kernel/git/torvalds/linux.git

commit 9dd1cd3220eca534f2d47afad7ce85f4c40118d8
Author: Mike Snitzer <snitzer@kernel.org>
Date:   Wed Jul 20 13:58:04 2022 -0400

    dm: fix dm-raid crash if md_handle_request() splits bio

    Commit ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone")
    introduced the optimization to _not_ perform bio_associate_blkg()'s
    relatively costly work when DM core clones its bio. But in doing so it
    exposed the possibility for DM's cloned bio to alter DM target
    behavior (e.g. crash) if a target were to issue IO without first
    calling bio_set_dev().

    The DM raid target can trigger an MD crash due to its need to split
    the DM bio that is passed to md_handle_request(). The split will
    recurse to submit_bio_noacct() using a bio with an uninitialized
    ->bi_blkg. This NULL bio->bi_blkg causes blk_throtl_bio() to
    dereference a NULL blkg_to_tg(bio->bi_blkg).

    Fix this in DM core by adding a new 'needs_bio_set_dev' target flag that
    will make alloc_tio() call bio_set_dev() on behalf of the target.
    dm-raid is the only target that requires this flag. bio_set_dev()
    initializes the DM cloned bio's ->bi_blkg, using bio_associate_blkg,
    before passing the bio to md_handle_request().

    A long-term fix would be to audit and refactor MD code to rely on DM to
    split its bio, using dm_accept_partial_bio(), but there are MD raid
    personalities (e.g. raid1 and raid10) whose implementations are tightly
    coupled to handling the bio splitting inline.

    Fixes: ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone")
    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-11-04 11:16:01 -05:00
Benjamin Marzinski 72c19cadbd dm raid: remove redundant "the" in parse_raid_params() comment
Bugzilla: https://bugzilla.redhat.com/2138462
Upstream Status: kernel/git/torvalds/linux.git

commit ce92fc4b8bc077b562ca945adbde0bca21caefb3
Author: Jiang Jian <jiangjian@cdjrlc.com>
Date:   Tue Jun 21 19:32:34 2022 +0800

    dm raid: remove redundant "the" in parse_raid_params() comment

    Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-11-04 11:15:57 -05:00
Nigel Croxon dc8a76d0bb md: unlock mddev before reap sync_thread in action_store
Bugzilla: https://bugzilla.redhat.com/2113822

commit 9dfbdafda3b34e262e43e786077bab8e476a89d1
Author: Guoqing Jiang <guoqing.jiang@linux.dev>
Date:   Tue Jun 21 11:11:29 2022 +0800

    md: unlock mddev before reap sync_thread in action_store

    Since the bug which commit 8b48ec23cc51a ("md: don't unregister sync_thread
    with reconfig_mutex held") fixed is related to the action_store path, other
    callers which reap sync_thread don't need to be changed.

    Let's pull md_unregister_thread out of md_reap_sync_thread, then fix the
    previous bug as follows:

    1. unlock mddev before md_reap_sync_thread in action_store.
    2. save reshape_position before unlock, then restore it to ensure position
       not changed accidentally by others.

    Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
    Signed-off-by: Song Liu <song@kernel.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2022-10-27 14:41:55 -04:00
Nigel Croxon 4d65742213 Revert "md: don't unregister sync_thread with reconfig_mutex held"
Bugzilla: https://bugzilla.redhat.com/2113822

commit d0a180341fe00cd0bd1cc259d196dc255c13f229
Author: Guoqing Jiang <guoqing.jiang@linux.dev>
Date:   Tue Jun 7 10:03:56 2022 +0800

    Revert "md: don't unregister sync_thread with reconfig_mutex held"

    The 07reshape5intr test is broken because of the path below.

        md_reap_sync_thread
                -> mddev_unlock
                -> md_unregister_thread(&mddev->sync_thread)

    And md_check_recovery is triggered by,

    mddev_unlock -> md_wakeup_thread(mddev->thread)

    then mddev->reshape_position is set to MaxSector in raid5_finish_reshape
    since MD_RECOVERY_INTR is cleared in md_check_recovery, which means
    feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's
    reshape_position can't be updated accordingly.

    Fixes: 8b48ec23cc51a ("md: don't unregister sync_thread with reconfig_mutex held")
    Reported-by: Logan Gunthorpe <logang@deltatee.com>
    Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
    Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2022-10-27 14:41:52 -04:00
Ming Lei dc66a6e5b1 md/core: Combine two sync_page_io() arguments
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2118511

commit 4ce4c73f662bdb0ae5bfb058bc7ec6f6829ca078
Author: Bart Van Assche <bvanassche@acm.org>
Date:   Thu Jul 14 11:06:57 2022 -0700

    md/core: Combine two sync_page_io() arguments

    Improve uniformity in the kernel of handling of request operation and
    flags by passing these as a single argument.

    Cc: Song Liu <song@kernel.org>
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Link: https://lore.kernel.org/r/20220714180729.1065367-32-bvanassche@acm.org
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2022-10-12 09:20:20 +08:00
Benjamin Marzinski d038a43f34 dm raid: fix address sanitizer warning in raid_resume
Bugzilla: https://bugzilla.redhat.com/2115117
Upstream Status: kernel/git/torvalds/linux.git

commit 7dad24db59d2d2803576f2e3645728866a056dab
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Sun Jul 24 14:33:52 2022 -0400

    dm raid: fix address sanitizer warning in raid_resume

    There is a KASAN warning in raid_resume when running the lvm test
    lvconvert-raid.sh. The reason for the warning is that mddev->raid_disks
    is greater than rs->raid_disks, so the loop touches one entry beyond
    the allocated length.

    Cc: stable@vger.kernel.org
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-08-04 12:30:48 -05:00
Benjamin Marzinski e6ffe0c9b3 dm raid: fix address sanitizer warning in raid_status
Bugzilla: https://bugzilla.redhat.com/2115117
Upstream Status: kernel/git/torvalds/linux.git

commit 1fbeea217d8f297fe0e0956a1516d14ba97d0396
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Sun Jul 24 14:31:35 2022 -0400

    dm raid: fix address sanitizer warning in raid_status

    There is this warning when using a kernel with the address sanitizer
    and running this testsuite:
    https://gitlab.com/cki-project/kernel-tests/-/tree/main/storage/swraid/scsi_raid

    ==================================================================
    BUG: KASAN: slab-out-of-bounds in raid_status+0x1747/0x2820 [dm_raid]
    Read of size 4 at addr ffff888079d2c7e8 by task lvcreate/13319
    CPU: 0 PID: 13319 Comm: lvcreate Not tainted 5.18.0-0.rc3.<snip> #1
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    Call Trace:
     <TASK>
     dump_stack_lvl+0x6a/0x9c
     print_address_description.constprop.0+0x1f/0x1e0
     print_report.cold+0x55/0x244
     kasan_report+0xc9/0x100
     raid_status+0x1747/0x2820 [dm_raid]
     dm_ima_measure_on_table_load+0x4b8/0xca0 [dm_mod]
     table_load+0x35c/0x630 [dm_mod]
     ctl_ioctl+0x411/0x630 [dm_mod]
     dm_ctl_ioctl+0xa/0x10 [dm_mod]
     __x64_sys_ioctl+0x12a/0x1a0
     do_syscall_64+0x5b/0x80

    The warning is caused by reading conf->max_nr_stripes in raid_status. The
    code in raid_status reads mddev->private, casts it to struct r5conf and
    reads the entry max_nr_stripes.

    However, if we have a different raid type than 4/5/6, mddev->private
    doesn't point to struct r5conf; it may point to struct r0conf, struct
    r1conf, struct r10conf or struct mpconf. If we cast a pointer to one
    of these structs to struct r5conf, we will be reading invalid memory
    and KASAN warns about it.

    Fix this bug by reading struct r5conf only if raid type is 4, 5 or 6.

    Cc: stable@vger.kernel.org
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
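
Illustration (not part of the upstream patch): a user-space sketch of the pattern the fix describes, where the private pointer is interpreted as the raid4/5/6 configuration only after the raid level has been checked. The `toy_*` types and helpers are hypothetical, and the field is named `priv` here rather than `private`; the real structures carry far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-personality private data with different layouts. */
struct toy_r5conf {
	int max_nr_stripes;
};

struct toy_r1conf {
	char small;	/* deliberately smaller than struct toy_r5conf */
};

struct toy_mddev {
	int level;	/* RAID level: 0, 1, 4, 5, 6, 10, ... */
	void *priv;	/* points to the personality's own conf struct */
};

static bool toy_is_raid456(const struct toy_mddev *mddev)
{
	return mddev->level >= 4 && mddev->level <= 6;
}

/* Only treat priv as a toy_r5conf when the level says it is one. */
static int toy_max_nr_stripes(const struct toy_mddev *mddev)
{
	const struct toy_r5conf *conf;

	assert(mddev != NULL);
	conf = toy_is_raid456(mddev) ? mddev->priv : NULL;
	return conf ? conf->max_nr_stripes : 0;
}

/* raid1: priv is a toy_r1conf, so it must not be read as a toy_r5conf. */
static int toy_demo_raid1(void)
{
	struct toy_r1conf conf = { .small = 2 };
	struct toy_mddev mddev = { .level = 1, .priv = &conf };

	return toy_max_nr_stripes(&mddev);
}

/* raid5: priv really is a toy_r5conf, so the field can be read. */
static int toy_demo_raid5(void)
{
	struct toy_r5conf conf = { .max_nr_stripes = 256 };
	struct toy_mddev mddev = { .level = 5, .priv = &conf };

	return toy_max_nr_stripes(&mddev);
}
```

Reporting 0 for the other levels is an assumption of this sketch; the point is only that the cast happens after the level check.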

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-08-04 12:30:48 -05:00
Nigel Croxon 62d99b9ee3 md: don't unregister sync_thread with reconfig_mutex held
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2105293
Upstream Status: kernel.org Linus tree
Tested: ran QE MD Test suite and has passed the sanity test.

Unregistering sync_thread doesn't need to hold reconfig_mutex since it
doesn't reconfigure the array.

It could also cause a deadlock for raid5 as follows:

1. process A tried to reap the sync thread with reconfig_mutex held after
   echoing idle to sync_action.
2. the raid5 sync thread was blocked because there were too many active
   stripes.
3. SB_CHANGE_PENDING was set (because write IO came from the upper layer),
   which prevents the number of active stripes from decreasing.
4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not able
   to hold reconfig_mutex.

More details in the link:
https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

And add one parameter to md_reap_sync_thread since it could be called by
dm-raid which doesn't hold reconfig_mutex.

Reported-and-tested-by: Donald Buczek <buczek@molgen.mpg.de>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <song@kernel.org>
(cherry picked from commit 8b48ec23cc51a4e7c8dbaef5f34ebe67e1a80934)
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
2022-07-13 12:07:17 -04:00
Benjamin Marzinski 3a0ccae4a8 dm raid: fix accesses beyond end of raid member array
Bugzilla: https://bugzilla.redhat.com/2090507
Upstream Status: kernel/git/device-mapper/linux-dm.git

commit 332bd0778775d0cf105c4b9e03e460b590749916
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Tue Jun 28 00:37:22 2022 +0200

    dm raid: fix accesses beyond end of raid member array

    On dm-raid table load (using raid_ctr), dm-raid allocates an array
    rs->devs[rs->raid_disks] for the raid device members. rs->raid_disks
    is defined by the number of raid metadata and image tuples passed
    into the target's constructor.

    In the case of RAID layout changes being requested, that number can be
    different from the current number of members for existing raid sets as
    defined in their superblocks. Example RAID layout changes include:
    - raid1 legs being added/removed
    - raid4/5/6/10 number of stripes changed (stripe reshaping)
    - takeover to higher raid level (e.g. raid5 -> raid6)

    When accessing array members, rs->raid_disks must be used in control
    loops instead of the potentially larger value in rs->md.raid_disks.
    Otherwise it will cause memory access beyond the end of the rs->devs
    array.

    Fix this by changing code that is prone to out-of-bounds access.
    Also fix validate_raid_redundancy() to validate all devices that are
    added. Also, use braces to help clean up raid_iterate_devices().

    The out-of-bounds memory accesses were discovered using KASAN.

    This commit was verified to pass all LVM2 RAID tests (with KASAN
    enabled).

    Cc: stable@vger.kernel.org
    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
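
Illustration (not part of the upstream patch): a user-space sketch of the loop-bound rule described above. The `toy_*` names are hypothetical; `md_raid_disks` stands in for rs->md.raid_disks and may exceed the number of entries actually allocated in `devs`.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_raid_dev {
	int in_sync;
};

struct toy_raid_set {
	unsigned int raid_disks;	/* entries allocated in devs[] */
	unsigned int md_raid_disks;	/* member count from the superblocks */
	struct toy_raid_dev *devs;
};

/* Iterate over the allocated array length, never over md_raid_disks. */
static unsigned int toy_count_in_sync(const struct toy_raid_set *rs)
{
	unsigned int i, n = 0;

	for (i = 0; i < rs->raid_disks; i++)
		if (rs->devs[i].in_sync)
			n++;
	return n;
}

/* A reshape-like case: three devices passed in, four in the superblocks. */
static unsigned int toy_demo_reshape(void)
{
	struct toy_raid_set rs = { .raid_disks = 3, .md_raid_disks = 4 };
	unsigned int i, n;

	rs.devs = calloc(rs.raid_disks, sizeof(*rs.devs));
	assert(rs.devs != NULL);
	for (i = 0; i < rs.raid_disks; i++)
		rs.devs[i].in_sync = 1;

	n = toy_count_in_sync(&rs);
	free(rs.devs);
	return n;
}
```

Had the loop been bounded by `md_raid_disks`, it would have read `devs[3]`, one entry past the allocation, which is the class of access KASAN flags.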

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-06-27 19:12:41 -05:00
Ming Lei b144a611f6 block: remove QUEUE_FLAG_DISCARD
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083917
Conflicts: deal with an xfs conflict because we don't backport commit
0560f31a09e5 ("xfs: convert mount flags to features"); all we need
is to replace blk_queue_discard() with bdev_max_discard_sectors().

commit 70200574cc229f6ba038259e8142af2aa09e6976
Author: Christoph Hellwig <hch@lst.de>
Date:   Fri Apr 15 06:52:55 2022 +0200

    block: remove QUEUE_FLAG_DISCARD

    Just use a non-zero max_discard_sectors as an indicator for discard
    support, similar to what is done for write zeroes.

    The only place that needs special attention is the RAID5 driver,
    which must clear discard support for security reasons by default,
    even if the default stacking rules would allow for it.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
    Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
    Acked-by: Coly Li <colyli@suse.de> [bcache]
    Acked-by: David Sterba <dsterba@suse.com> [btrfs]
    Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
    Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2022-06-22 08:58:02 +08:00