Commit Graph

171 Commits

Author SHA1 Message Date
Benjamin Marzinski 9aeb838b32 dm: push error reporting down to dm_register_target()
Bugzilla: https://bugzilla.redhat.com/2189971
Upstream Status: kernel/git/torvalds/linux.git

commit b362c733ed7bf312ed729847bc26ba89febc556e
Author: Yangtao Li <frank.li@vivo.com>
Date:   Sat Mar 18 21:16:33 2023 +0800

    dm: push error reporting down to dm_register_target()

    Simplifies each DM target's init method by making dm_register_target()
    responsible for its error reporting (on behalf of targets).

    Signed-off-by: Yangtao Li <frank.li@vivo.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-04-26 16:47:42 -05:00
Benjamin Marzinski 50de73e147 dm cache: add cond_resched() to various workqueue loops
Bugzilla: https://bugzilla.redhat.com/2153270
Upstream Status: kernel/git/torvalds/linux.git

commit 76227f6dc805e9e960128bcc6276647361e0827c
Author: Mike Snitzer <snitzer@kernel.org>
Date:   Thu Feb 16 15:31:08 2023 -0500

    dm cache: add cond_resched() to various workqueue loops

    Otherwise on resource constrained systems these workqueues may be too
    greedy.

    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:21 -05:00
Benjamin Marzinski 5c8e990196 dm: declare variables static when sensible
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 774f13ac2b567207f04eb34d25188f5daec57f9e
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Tue Feb 7 23:15:36 2023 +0100

    dm: declare variables static when sensible

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:19 -05:00
Benjamin Marzinski 6598f6e422 dm: fix suspect indent whitespace
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 23fda2effbb1f2f1d2fb0640a4729e6d08ad6e6e
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Tue Feb 7 23:06:47 2023 +0100

    dm: fix suspect indent whitespace

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:19 -05:00
Benjamin Marzinski 4f64393275 dm: add missing empty lines
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git
Conflicts: Context changes due to missing upstream commit e511c4a3d2a1f
           ("dax: introduce DAX_RECOVERY_WRITE dax access mode")

commit 0ef0b4717aa6849d251b23ae1efe93ca93af540b
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Feb 1 23:42:29 2023 +0100

    dm: add missing empty lines

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:14 -05:00
Benjamin Marzinski 509b95cfb9 dm: avoid spaces before function arguments or in favour of tabs
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 8ca817c43e12847be182e0bbff9b59398373a3b8
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Feb 1 22:31:43 2023 +0100

    dm: avoid spaces before function arguments or in favour of tabs

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:14 -05:00
Benjamin Marzinski 0bcf7b0eae dm: correct block comments format.
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit a4a82ce3d24d4409143a7b7b980072ada6e20b2a
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Thu Jan 26 15:48:30 2023 +0100

    dm: correct block comments format.

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:12 -05:00
Benjamin Marzinski d8a60f4024 dm: change "unsigned" to "unsigned int"
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 86a3238c7b9b759cb864f4f768ab2e24687dc0e6
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Jan 25 21:14:58 2023 +0100

    dm: change "unsigned" to "unsigned int"

    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:11 -05:00
Benjamin Marzinski da3466f503 dm: add missing SPDX-License-Indentifiers
Bugzilla: https://bugzilla.redhat.com/2179168
Upstream Status: kernel/git/torvalds/linux.git

commit 3bd940030752a33ff665eefdd74a1cdb74a4f9b0
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Jan 25 21:00:44 2023 +0100

    dm: add missing SPDX-License-Indentifiers

    'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has
    deprecated its use.

    Suggested-by: John Wiele <jwiele@redhat.com>
    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-03-16 17:15:10 -05:00
Benjamin Marzinski 3eb6f26a44 dm cache: set needs_check flag after aborting metadata
Bugzilla: https://bugzilla.redhat.com/2162536
Upstream Status: kernel/git/torvalds/linux.git

commit 6b9973861cb2e96dcd0bb0f1baddc5c034207c5c
Author: Mike Snitzer <snitzer@kernel.org>
Date:   Wed Nov 30 14:02:47 2022 -0500

    dm cache: set needs_check flag after aborting metadata

    Otherwise the commit that will be aborted will be associated with the
    metadata objects that will be torn down.  Must write needs_check flag
    to metadata with a reset block manager.

    Found through code-inspection (and compared against dm-thin.c).

    Cc: stable@vger.kernel.org
    Fixes: 028ae9f76f ("dm cache: add fail io mode and needs_check flag")
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-01-30 09:13:57 -06:00
Benjamin Marzinski 2387a2e88a dm cache: Fix UAF in destroy()
Bugzilla: https://bugzilla.redhat.com/2162536
Upstream Status: kernel/git/torvalds/linux.git

commit 6a459d8edbdbe7b24db42a5a9f21e6aa9e00c2aa
Author: Luo Meng <luomeng12@huawei.com>
Date:   Tue Nov 29 10:48:49 2022 +0800

    dm cache: Fix UAF in destroy()

    Dm_cache also has the same UAF problem when dm_resume()
    and dm_destroy() are concurrent.

    Therefore, cancel the timer again in destroy().

    Cc: stable@vger.kernel.org
    Fixes: c6b4fcbad0 ("dm: add cache target")
    Signed-off-by: Luo Meng <luomeng12@huawei.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2023-01-30 09:13:55 -06:00
Benjamin Marzinski 7965010d5f dm cache: fix typo in 2 comment blocks
Bugzilla: https://bugzilla.redhat.com/2138462
Upstream Status: kernel/git/torvalds/linux.git

commit 5c29e784738c25be0f4ab188a88bf47697ca28fb
Author: Steven Lung <1030steven@gmail.com>
Date:   Tue Jun 21 15:12:59 2022 +0800

    dm cache: fix typo in 2 comment blocks

    Replace neccessarily with necessarily.

    Signed-off-by: Steven Lung <1030steven@gmail.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-11-04 11:15:56 -05:00
Benjamin Marzinski e596a22767 dm cache: use dm_submit_bio_remap
Bugzilla: https://bugzilla.redhat.com/2090507
Upstream Status: kernel/git/torvalds/linux.git

commit 69596f555b81942b41b10669e5faefc6ce883abb
Author: Mike Snitzer <snitzer@redhat.com>
Date:   Tue Mar 8 20:01:37 2022 -0500

    dm cache: use dm_submit_bio_remap

    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-06-27 11:00:57 -05:00
Benjamin Marzinski 2cbf07af95 dm: stop using bdevname
Bugzilla: https://bugzilla.redhat.com/2090507
Upstream Status: kernel/git/torvalds/linux.git

commit 385411ffba0c3305491346b98ba4d2cd8063f002
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Mar 1 10:38:15 2022 +0200

    dm: stop using bdevname

    Just use the %pg format specifier instead.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2022-06-27 11:00:53 -05:00
Ming Lei b144a611f6 block: remove QUEUE_FLAG_DISCARD
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083917
Conflicts: deal with xfs conflict because we don't backport commit ("
0560f31a09e5 xfs: convert mount flags to features"), and what we need
is just to replace blk_queue_discard() with bdev_max_discard_sectors().

commit 70200574cc229f6ba038259e8142af2aa09e6976
Author: Christoph Hellwig <hch@lst.de>
Date:   Fri Apr 15 06:52:55 2022 +0200

    block: remove QUEUE_FLAG_DISCARD

    Just use a non-zero max_discard_sectors as an indicator for discard
    support, similar to what is done for write zeroes.

    The only place that needs special attention is the RAID5 driver,
    which must clear discard support for security reasons by default,
    even if the default stacking rules would allow for it.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
    Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
    Acked-by: Coly Li <colyli@suse.de> [bcache]
    Acked-by: David Sterba <dsterba@suse.com> [btrfs]
    Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
    Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2022-06-22 08:58:02 +08:00
Ming Lei 9d3c40c012 block: pass a block_device to bio_clone_fast
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083917

commit abfc426d1b2fb2176df59851a64223b58ddae7e7
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Feb 2 17:01:09 2022 +0100

    block: pass a block_device to bio_clone_fast

    Pass a block_device to bio_clone_fast and __bio_clone_fast and give
    the functions more suitable names.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Mike Snitzer <snitzer@redhat.com>
    Link: https://lore.kernel.org/r/20220202160109.108149-14-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2022-06-22 08:56:21 +08:00
Ming Lei b994c3c7b1 dm-cache: remove __remap_to_origin_clear_discard
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083917

commit 3c4b455ef8acdacd0e5ecd33428d4f32f861637a
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Feb 2 17:01:05 2022 +0100

    dm-cache: remove __remap_to_origin_clear_discard

    Fold __remap_to_origin_clear_discard into the two callers to prepare
    for bio cloning refactoring.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Mike Snitzer <snitzer@redhat.com>
    Link: https://lore.kernel.org/r/20220202160109.108149-10-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2022-06-22 08:56:21 +08:00
Benjamin Marzinski 7cbdbc8f5e dm: update target status functions to support IMA measurement
Bugzilla: https://bugzilla.redhat.com/2031198
Upstream Status: kernel/git/torvalds/linux.git

commit 8ec456629d0bf051e41ef2c87a60755f941dd11c
Author: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Date:   Mon Jul 12 17:49:03 2021 -0700

    dm: update target status functions to support IMA measurement

    For device mapper targets to take advantage of IMA's measurement
    capabilities, the status functions for the individual targets need to be
    updated to handle the status_type_t case for value STATUSTYPE_IMA.

    Update status functions for the following target types, to log their
    respective attributes to be measured using IMA.
     01. cache
     02. crypt
     03. integrity
     04. linear
     05. mirror
     06. multipath
     07. raid
     08. snapshot
     09. striped
     10. verity

    For the rest of the targets, handle the STATUSTYPE_IMA case by setting the
    measurement buffer to NULL.

    For IMA to measure the data on a given system, the IMA policy on the
    system needs to be updated to have the following line, and the system
    needs to be restarted for the measurements to take effect.

    /etc/ima/ima-policy
     measure func=CRITICAL_DATA label=device-mapper template=ima-buf

    The measurements will be reflected in the IMA logs, which are located at:

    /sys/kernel/security/integrity/ima/ascii_runtime_measurements
    /sys/kernel/security/integrity/ima/binary_runtime_measurements

    These IMA logs can later be consumed by various attestation clients
    running on the system, which send them to external services for
    attesting the system.

    The DM target data measured by IMA subsystem can alternatively
    be queried from userspace by setting DM_IMA_MEASUREMENT_FLAG with
    DM_TABLE_STATUS_CMD.

    Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2021-12-21 17:51:23 -06:00
Ming Lei cadc5bb346 dm: use bdev_nr_sectors and bdev_nr_bytes instead of open coding them
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018403

commit 6dcbb52cddd9e50c8f6625b02a31f6dffc0d1a7b
Author: Christoph Hellwig <hch@lst.de>
Date:   Mon Oct 18 12:11:05 2021 +0200

    dm: use bdev_nr_sectors and bdev_nr_bytes instead of open coding them

    Use the proper helpers to read the block device size.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Acked-by: Mike Snitzer <snitzer@redhat.com>
    Link: https://lore.kernel.org/r/20211018101130.1838532-6-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2021-12-06 16:45:25 +08:00
Mike Snitzer dc4fa29fe4 dm io tracker: factor out IO tracker
Allow other code to use dm_io_tracker.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-06-25 15:28:59 -04:00
Xu Wang 63508e38c1 dm cache: remove needless request_queue NULL pointer checks
Since commit ff9ea32381 ("block, bdi: an active gendisk always has a
request_queue associated with it") the request_queue pointer returned
from bdev_get_queue() shall never be NULL.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-03-26 14:53:42 -04:00
Zheng Yongjun b77709237e dm cache: simplify the return expression of load_mapping()
Simplify the return expression.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-12-22 09:54:48 -05:00
Nick Desaulniers 35d2835d2a Revert "dm cache: fix arm link errors with inline"
This reverts commit 43aeaa2957.

Since commit 0bddd227f3 ("Documentation: update for gcc 4.9 requirement")
the minimum supported version of GCC is gcc-4.9. It's now safe to remove
this code.

Link: https://github.com/ClangBuiltLinux/linux/issues/427
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-12-01 15:43:36 -05:00
Mike Snitzer d4a512edcc dm: use dm_table_get_device_name() where appropriate in targets
dm_table_get_device_name() avoids calling dm_table_get_md() followed by
dm_device_name() -- saves intermediate dm_table_get_md() call.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-09-29 16:33:08 -04:00
Christoph Hellwig 21cf866145 writeback: remove bdi->congested_fn
Except for pktdvd, the only places setting congested bits are file
systems that allocate their own backing_dev_info structures.  And
pktdvd is a deprecated driver that isn't useful in stack setup
either.  So remove the dead congested_fn stacking infrastructure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Song Liu <song@kernel.org>
Acked-by: David Sterba <dsterba@suse.com>
[axboe: fixup unused variables in bcache/request.c]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08 17:20:46 -06:00
Christoph Hellwig ed00aabd5e block: rename generic_make_request to submit_bio_noacct
generic_make_request has always been very confusingly misnamed, so rename
it to submit_bio_noacct to make it clear that it is submit_bio minus
accounting and a few checks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 07:27:24 -06:00
Mike Snitzer 636be4241b dm: bump version of core and various targets
Changes made during the 5.6 cycle warrant bumping the version number
for DM core and the targets modified by this commit.

It should be noted that dm-thin, dm-crypt and dm-raid already had
their target version bumped during the 5.6 merge window.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-03 11:10:21 -05:00
Mikulas Patocka 7cdf6a0aae dm cache: fix a crash due to incorrect work item cancelling
The crash can be reproduced by running the lvm2 testsuite test
lvconvert-thin-external-cache.sh for several minutes, e.g.:
  while :; do make check T=shell/lvconvert-thin-external-cache.sh; done

The crash happens in this call chain:
do_waker -> policy_tick -> smq_tick -> end_hotspot_period -> clear_bitset
-> memset -> __memset -- which accesses an invalid pointer in the vmalloc
area.

The work entry on the workqueue is executed even after the bitmap was
freed. The problem is that cancel_delayed_work doesn't wait for the
running work item to finish, so the work item can continue running and
re-submitting itself even after cache_postsuspend. In order to make sure
that the work item won't be running, we must use cancel_delayed_work_sync.

Also, change flush_workqueue to drain_workqueue, so that if some work item
submits itself or another work item, we are properly waiting for both of
them.

Fixes: c6b4fcbad0 ("dm: add cache target")
Cc: stable@vger.kernel.org # v3.9
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-27 12:00:52 -05:00
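
The difference between cancelling a self-rearming work item and cancelling it synchronously (7cdf6a0aae above) can be illustrated outside the kernel. The following is a minimal userspace sketch in Python, not the dm-cache code: `SelfRearmingWork`, `cancel_sync()` and `demo()` are hypothetical names, and `threading.Timer` merely stands in for a delayed work item. The point it shows is that a safe cancel must both stop the item from re-arming and wait for an instance that is already running.

```python
import threading
import time


class SelfRearmingWork:
    """Toy stand-in for a delayed work item that re-queues itself."""

    def __init__(self, interval=0.001):
        self._interval = interval
        self._lock = threading.Lock()
        self._timer = None
        self._stopping = False
        self.runs = 0

    def _arm(self):
        # Arm the next run unless a synchronous cancel has started.
        with self._lock:
            if self._stopping:
                return
            self._timer = threading.Timer(self._interval, self._run)
            self._timer.daemon = True
            self._timer.start()

    def _run(self):
        self.runs += 1
        self._arm()  # the work item re-submits itself

    def start(self):
        self._arm()

    def cancel_sync(self):
        """Block re-arming, cancel the pending timer, wait for a running one."""
        with self._lock:
            self._stopping = True
            timer = self._timer
        if timer is not None:
            timer.cancel()
            timer.join()


def demo():
    work = SelfRearmingWork()
    work.start()
    while work.runs == 0:   # wait until the work item has run at least once
        time.sleep(0.001)
    work.cancel_sync()
    at_cancel = work.runs
    time.sleep(0.05)        # a still-armed timer would bump the counter here
    return at_cancel, work.runs
```

Calling only `timer.cancel()` without the `_stopping` flag and the `join()` would correspond to the non-waiting cancel: a callback that is already executing could still touch state that the caller frees next.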
Mikulas Patocka 26b924b93c dm cache: replace spin_lock_irqsave with spin_lock_irq
If we are in a place where it is known that interrupts are enabled,
functions spin_lock_irq/spin_unlock_irq should be used instead of
spin_lock_irqsave/spin_unlock_irqrestore.

spin_lock_irq and spin_unlock_irq are faster because they don't need to
push and pop the flags register.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-05 14:53:04 -05:00
Mikulas Patocka 13bd677a47 dm cache: fix bugs when a GFP_NOWAIT allocation fails
GFP_NOWAIT allocation can fail anytime - it doesn't wait for memory being
available and it fails if the mempool is exhausted and there is not enough
memory.

If we go down this path:
  map_bio -> mg_start -> alloc_migration -> mempool_alloc(GFP_NOWAIT)
we can see that map_bio() doesn't check the return value of mg_start(),
and the bio is leaked.

If we go down this path:
  map_bio -> mg_start -> mg_lock_writes -> alloc_prison_cell ->
  dm_bio_prison_alloc_cell_v2 -> mempool_alloc(GFP_NOWAIT) ->
  mg_lock_writes -> mg_complete
the bio is ended with an error - it is unacceptable because it could
cause filesystem corruption if the machine ran out of memory
temporarily.

Change GFP_NOWAIT to GFP_NOIO, so that the mempool code will properly
wait until memory becomes available. mempool_alloc with GFP_NOIO can't
fail, so remove the code paths that deal with allocation failure.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-10-17 11:13:50 -04:00
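
The allocation behaviour that 13bd677a47 relies on can be sketched in userspace. This is a simplified Python model, not kernel code: `TinyPool` and its methods are hypothetical, and a real mempool also tries the underlying allocator before falling back to its reserve. It only contrasts a non-waiting allocation, which can return nothing, with a waiting one, which returns once an element is freed.

```python
import queue


class TinyPool:
    """Toy stand-in for a mempool with a fixed number of reserved elements."""

    def __init__(self, reserved):
        self._free = queue.Queue()
        for _ in range(reserved):
            self._free.put(object())

    def alloc_nowait(self):
        """Like a GFP_NOWAIT allocation: return None instead of sleeping."""
        try:
            return self._free.get_nowait()
        except queue.Empty:
            return None

    def alloc_blocking(self):
        """Like a GFP_NOIO mempool allocation: sleep until an element is free."""
        return self._free.get()

    def free(self, element):
        self._free.put(element)
```

With the non-waiting variant every caller needs a failure path; with the waiting variant the failure paths disappear, which is the simplification the commit describes.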
Mike Snitzer de7180ff90 dm cache: add support for discard passdown to the origin device
DM cache now defaults to passing discards down to the origin device.
User may disable this using the "no_discard_passdown" feature when
creating the cache device.

If the cache's underlying origin device doesn't support discards then
passdown is disabled (with warning).  Similarly, if the underlying
origin device's max_discard_sectors is less than a cache block discard
passdown will be disabled (this is required because sizing of the cache
internal discard bitset depends on it).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05 14:53:52 -05:00
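
The passdown rules described in de7180ff90 amount to a small decision function. The sketch below is a Python restatement of the commit text under stated assumptions: the function name, parameter names and reason strings are hypothetical, and the kernel's actual checks and warning messages may differ.

```python
def discard_passdown(requested_off, origin_supports_discard,
                     origin_max_discard_sectors, cache_block_sectors):
    """Return (enabled, reason) for passing discards down to the origin."""
    if requested_off:
        # User supplied the "no_discard_passdown" feature.
        return False, "disabled by no_discard_passdown"
    if not origin_supports_discard:
        return False, "origin does not support discards"
    if origin_max_discard_sectors < cache_block_sectors:
        # The discard bitset is sized in cache blocks, so smaller
        # discards cannot be passed down.
        return False, "origin max_discard_sectors smaller than a cache block"
    return True, "passdown enabled"
```

For example, an origin that accepts discards of at most 64 sectors under a cache with 128-sector blocks would have passdown disabled by the third check.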
Mike Snitzer 61697a6abd dm: eliminate 'split_discard_bios' flag from DM target interface
There is no need to have DM core split discards on behalf of a DM target
now that blk_queue_split() handles splitting discards based on the
queue_limits.  A DM target just needs to set max_discard_sectors,
discard_granularity, etc, in queue_limits.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-02-20 23:24:55 -05:00
Shenghui Wang c7cd55504a dm cache: destroy migration_cache if cache target registration failed
Commit 7e6358d244 ("dm: fix various targets to dm_register_target
after module __init resources created") inadvertently introduced this
bug when it moved dm_register_target() after the call to KMEM_CACHE().

Fixes: 7e6358d244 ("dm: fix various targets to dm_register_target after module __init resources created")
Cc: stable@vger.kernel.org
Signed-off-by: Shenghui Wang <shhuiw@foxmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-10-09 13:53:03 -04:00
Mike Snitzer 5d07384a66 dm cache: fix resize crash if user doesn't reload cache table
A reload of the cache's DM table is needed during resize because
otherwise a crash will occur when attempting to access smq policy
entries associated with the portion of the cache that was recently
extended.

The reason is that cache-size based data structures in the policy will not
be resized; the only way to safely extend the cache is to allow for a
proper cache policy initialization that occurs when the cache table is
loaded.  For example the smq policy's space_init(), init_allocator(),
calc_hotspot_params() must be sized based on the extended cache size.

The fix for this is to disallow cache resizes of this pattern:
1) suspend "cache" target's device
2) resize the fast device used for the cache
3) resume "cache" target's device

Instead, the last step must be a full reload of the cache's DM table.

Fixes: 66a636356 ("dm cache: add stochastic-multi-queue (smq) policy")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-10-04 15:20:52 -04:00
Mike Snitzer 7209049d40 dm kcopyd: return void from dm_kcopyd_copy()
dm_kcopyd_copy() only ever returns 0 so there is no need for callers to
account for possible failure.  Same goes for dm_kcopyd_zero().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-07-31 17:33:21 -04:00
John Pittman af9313c32c dm cache: only allow a single io_mode cache feature to be requested
More than one io_mode feature can be requested when creating a dm cache
device (as is: last one wins).  The io_mode selections are incompatible
with one another, we should force them to be selected exclusively.  Add
a counter to check for more than one io_mode selection.

Fixes: 629d0a8a1a ("dm cache metadata: add "metadata2" feature")
Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-07-27 15:24:18 -04:00
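
The counter-based check described in af9313c32c can be sketched as follows. This is a Python model rather than the kernel parser: `parse_features` and the error text are hypothetical, and the sketch assumes the three io_mode names `writeback`, `writethrough` and `passthrough`, with `writeback` as the default when none is given.

```python
IO_MODES = {"writeback", "writethrough", "passthrough"}


def parse_features(args):
    """Return (io_mode, other_features); reject more than one io_mode."""
    io_mode = "writeback"   # assumed default when no io_mode is requested
    io_mode_count = 0
    others = []
    for arg in args:
        name = arg.lower()
        if name in IO_MODES:
            io_mode = name
            io_mode_count += 1
        else:
            others.append(name)
    if io_mode_count > 1:
        # Previously the last io_mode silently won; now it is an error.
        raise ValueError("only one io_mode feature may be requested")
    return io_mode, others
```

Counting instead of rejecting on the second match keeps the loop simple and lets the error be reported once, after all arguments have been seen.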
Mike Snitzer 72d711c876 dm: adjust structure members to improve alignment
Eliminate most holes in DM data structures that were modified by
commit 6f1c819c21 ("dm: convert to bioset_init()/mempool_init()").
Also prevent structure members from unnecessarily spanning cache
lines.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-06-08 11:53:14 -04:00
Kent Overstreet 6f1c819c21 dm: convert to bioset_init()/mempool_init()
Convert dm to embedded bio sets.

Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30 15:33:32 -06:00
Mike Snitzer 1eb5fa849f dm: allow targets to return output from messages they are sent
Could be useful for a target to return stats or other information.
If a target does DMEMIT() anything to @result from its .message method
then it must return 1 to the caller.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-04-03 15:04:10 -04:00
monty_pavel@sina.com 7e6358d244 dm: fix various targets to dm_register_target after module __init resources created
A NULL pointer is seen if two concurrent "vgchange -ay -K <vg name>"
processes race to load the dm-thin-pool module:

 PID: 25992 TASK: ffff883cd7d23500 CPU: 4 COMMAND: "vgchange"
  #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9
  #1 [ffff883cd743d660] crash_kexec at ffffffff810c5992
  #2 [ffff883cd743d730] oops_end at ffffffff81515c90
  #3 [ffff883cd743d760] no_context at ffffffff81049f1b
  #4 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5
  #5 [ffff883cd743d800] bad_area at ffffffff8104a2ce
  #6 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f
  #7 [ffff883cd743d950] do_page_fault at ffffffff81517bae
  #8 [ffff883cd743d980] page_fault at ffffffff81514f95
     [exception RIP: kmem_cache_alloc+108]
     RIP: ffffffff8116ef3c RSP: ffff883cd743da38 RFLAGS: 00010046
     RAX: 0000000000000004 RBX: ffffffff81121b90 RCX: ffff881bf1e78cc0
     RDX: 0000000000000000 RSI: 00000000000000d0 RDI: 0000000000000000
     RBP: ffff883cd743da68 R8: ffff881bf1a4eb00 R9: 0000000080042000
     R10: 0000000000002000 R11: 0000000000000000 R12: 00000000000000d0
     R13: 0000000000000000 R14: 00000000000000d0 R15: 0000000000000246
     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
  #9 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5
 #10 [ffff883cd743da80] mempool_create_node at ffffffff81122083
 #11 [ffff883cd743dad0] mempool_create at ffffffff811220f4
 #12 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool]
 #13 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod]
 #14 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod]
 #15 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod]

The race results in a NULL pointer because:

Process A (vgchange -ay -K):
 	a. send DM_LIST_VERSIONS_CMD ioctl;
 	b. pool_target not registered;
 	c. modprobe dm_thin_pool and wait until end.

Process B (vgchange -ay -K):
 	a. send DM_LIST_VERSIONS_CMD ioctl;
 	b. pool_target registered;
 	c. table_load->dm_table_add_target->pool_ctr;
 	d. _new_mapping_cache is NULL and panic.
Note:
 	1. process A and process B are two concurrent processes.
 	2. pool_target can be detected by process B but
 	_new_mapping_cache initialization has not ended.

To fix dm-thin-pool, and other targets (cache, multipath, and snapshot)
with the same problem, simply dm_register_target() after all resources
created during module init (as labelled with __init) are finished.

Cc: stable@vger.kernel.org
Signed-off-by: monty <monty_pavel@sina.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-12-04 10:23:10 -05:00
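
The ordering fix in 7e6358d244 (create resources first, register last) and the follow-up cleanup in c7cd55504a (tear resources down if registration fails) can be modelled in a few lines. This is a userspace Python sketch with hypothetical names (`TargetRegistry`, `module_init`, `mapping_cache`), not the device-mapper API.

```python
class TargetRegistry:
    """Toy stand-in for the table of registered targets."""

    def __init__(self):
        self._targets = {}

    def register(self, name, ctr):
        if name in self._targets:
            raise ValueError(f"target {name!r} already registered")
        self._targets[name] = ctr

    def construct(self, name):
        return self._targets[name]()


def module_init(registry, name="example"):
    # 1. Create everything the constructor depends on.
    resources = {"mapping_cache": object()}

    def ctr():
        # Would fail if the target were visible before step 1 finished.
        return resources["mapping_cache"]

    try:
        # 2. Only now make the target visible to other callers.
        registry.register(name, ctr)
    except ValueError:
        resources.clear()   # undo step 1 when registration fails
        raise
    return resources
```

Because registration is the last step, a concurrent caller either does not see the target at all or sees one whose constructor can already run.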
Mike Snitzer ef7afb3656 dm cache: lift common migration preparation code to alloc_migration()
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-10 15:45:07 -05:00
Joe Thornber ede6507d67 dm cache: remove usused deferred_cells member from struct cache
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-10 15:45:06 -05:00
Mike Snitzer 693b960ea8 dm cache: simplify get_per_bio_data() by removing data_size argument
There is only one per_bio_data size now that writethrough-specific data
was removed from the per_bio_data structure.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-10 15:44:49 -05:00
Mike Snitzer 9958f1d9a0 dm cache: remove all obsolete writethrough-specific code
Now that the writethrough code is much simpler there is no need to track
so much state or cascade bio submission (as was done, via
writethrough_endio(), to issue origin then cache IO in series).

As such the obsolete writethrough list and workqueue is also removed.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-10 15:44:48 -05:00
Mike Snitzer 2df3bae9a6 dm cache: submit writethrough writes in parallel to origin and cache
Discontinue issuing writethrough write IO in series to the origin and
then cache.

Use bio_clone_fast() to create a new origin clone bio that will be
mapped to the origin device and then bio_chain() it to the bio that gets
remapped to the cache device.  The origin clone bio does _not_ have a
copy of the per_bio_data -- as such check_if_tick_bio_needed() will not
be called.

The cache bio (parent bio) will not complete until the origin bio has
completed -- this fulfills bio_clone_fast()'s requirements as well as
the requirement to not complete the original IO until the write IO has
completed to both the origin and cache device.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-10 15:44:47 -05:00
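
The completion rule that 2df3bae9a6 depends on, where the parent bio does not complete until the chained origin clone has completed, can be modelled with a reference count. This is a simplified Python sketch, not the block layer: the `Bio` class and `chain()` are hypothetical stand-ins for a bio and for chaining a child to its parent.

```python
class Bio:
    """Toy bio: completes when its remaining count drops to zero."""

    def __init__(self, name):
        self.name = name
        self.remaining = 1      # one reference for the bio's own I/O
        self.parent = None
        self.ended = False

    def endio(self):
        self.remaining -= 1
        if self.remaining == 0:
            self.ended = True
            if self.parent is not None:
                self.parent.endio()   # drop the reference held on the parent


def chain(child, parent):
    """Make parent wait for child, in the spirit of bio_chain()."""
    child.parent = parent
    parent.remaining += 1
```

With `chain(origin, cache)`, the two writes can be issued in parallel and finish in either order; the original I/O is reported complete only after both have ended.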
Mike Snitzer 8e3c382777 dm cache: pass cache structure to mode functions
No functional changes, just a bit cleaner than passing cache_features
structure.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-10 15:44:42 -05:00
Joe Thornber d1260e2a3f dm cache: fix race condition in the writeback mode overwrite_bio optimisation
When a DM cache in writeback mode moves data between the slow and fast
device it can often avoid a copy if the triggering bio either:

i) covers the whole block (no point copying if we're about to overwrite it)
ii) the migration is a promotion and the origin block is currently discarded

Prior to this fix there was a race with case (ii).  The discard status
was checked with a shared lock held (rather than exclusive).  This meant
another bio could run in parallel and write data to the origin, removing
the discard state.  After the promotion the parallel write would have
been lost.

With this fix the discard status is re-checked once the exclusive lock
has been acquired.  If the block is no longer discarded it falls back to
the slower full copy path.

Fixes: b29d4986d ("dm cache: significant rework to leverage dm-bio-prison-v2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-10 15:43:39 -05:00
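
The fix in d1260e2a3f is an instance of re-checking a condition after taking the exclusive lock. The following Python sketch illustrates only that pattern: `Block`, `choose_promotion_path` and the `racing_write` hook are hypothetical, and a plain `threading.Lock` stands in for the exclusive lock on the block.

```python
import threading


class Block:
    def __init__(self):
        self.lock = threading.Lock()   # stands in for the exclusive lock
        self.discarded = True


def choose_promotion_path(block, racing_write=None):
    """Return "overwrite" if the copy can be skipped, else "copy"."""
    looked_discarded = block.discarded   # first check, no exclusive lock held
    if racing_write is not None:
        racing_write()                   # simulate a parallel write to the origin
    with block.lock:
        # Re-check now that nothing else can change the discard state.
        if looked_discarded and block.discarded:
            return "overwrite"
        return "copy"
```

Without the second check inside the locked region, the racing write in the second scenario would be lost because the promotion would skip the copy based on stale state.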
Linus Torvalds dff4d1f6fe Merge tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - Some request-based DM core and DM multipath fixes and cleanups

 - Constify a few variables in DM core and DM integrity

 - Add bufio optimization and checksum failure accounting to DM
   integrity

 - Fix DM integrity to avoid checking integrity of failed reads

 - Fix DM integrity to use init_completion

 - A couple DM log-writes target fixes

 - Simplify DAX flushing by eliminating the unnecessary flush
   abstraction that was stood up for DM's use.

* tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dax: remove the pmem_dax_ops->flush abstraction
  dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK
  dm integrity: make blk_integrity_profile structure const
  dm integrity: do not check integrity for failed read operations
  dm log writes: fix >512b sectorsize support
  dm log writes: don't use all the cpu while waiting to log blocks
  dm ioctl: constify ioctl lookup table
  dm: constify argument arrays
  dm integrity: count and display checksum failures
  dm integrity: optimize writing dm-bufio buffers that are partially changed
  dm rq: do not update rq partially in each ending bio
  dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior
  dm mpath: complain about unsupported __multipath_map_bio() return values
  dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through
2017-09-14 13:43:16 -07:00
Eric Biggers 5916a22b83 dm: constify argument arrays
The arrays of 'struct dm_arg' are never modified by the device-mapper
core, so constify them so that they are placed in .rodata.

(Exception: the args array in dm-raid cannot be constified because it is
allocated on the stack and modified.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-08-28 11:47:18 -04:00
Christoph Hellwig 74d46992e0 block: replace bi_bdev with a gendisk pointer and partitions index
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23 12:49:55 -06:00