Centos-kernel-stream-9

Commit Graph

Author	SHA1	Message	Date
Ewan D. Milne	67b4c32b01	scsi: scsi_debug: Skip host/bus reset settle delay JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Skip the reset settle delay during error handling since the scsi_debug driver doesn't need this delay. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241216184852.2626339-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 29081c21a7064cdbf29295d07f0e44776280918e) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-13 14:10:18 -04:00
Ewan D. Milne	fc56a68284	scsi: scsi_debug: Fix hrtimer support for ndelay JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Since commit `771f712ba5` ("scsi: scsi_debug: Fix cmd duration calculation"), ns_from_boot value is only evaluated in schedule_resp() for polled requests. However, ns_from_boot is also required for hrtimer support for when ndelay is less than INCLUSIVE_TIMING_MAX_NS, so fix up the logic to decide when to evaluate ns_from_boot. Fixes: `771f712ba5` ("scsi: scsi_debug: Fix cmd duration calculation") Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241202130045.2335194-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 6918141d815acef056a0d10e966a027d869a922d) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-13 14:10:17 -04:00
Ewan D. Milne	ee7ecbbb49	scsi: scsi_debug: Fix do_device_access() handling of unexpected SG copy length JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline If the sg_copy_buffer() call returns less than sdebug_sector_size, then we drop out of the copy loop. However, we still report that we copied the full expected amount, which is not proper. Fix by keeping a running total and return that value. Fixes: 84f3a3c01d70 ("scsi: scsi_debug: Atomic write support") Reported-by: Colin Ian King <colin.i.king@gmail.com> Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241018101655.4207-1-john.g.garry@oracle.com Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit d28d17a845600dd9f7de241de9b1528a1b138716) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-13 14:10:17 -04:00
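The running-total fix described in the ee7ecbbb49 row can be pictured with a small userspace model. This is only a sketch in Python, not the driver's C code: `copy_sectors`, `copy_chunk` and `SECTOR_SIZE` are names made up for illustration, `copy_chunk` stands in for sg_copy_buffer(), and 512 is an assumed value for sdebug_sector_size.

```python
SECTOR_SIZE = 512  # assumed value; the driver uses sdebug_sector_size


def copy_sectors(num_sectors, copy_chunk):
    """Copy sector by sector and return the number of bytes actually copied.

    copy_chunk(index) is a hypothetical stand-in for sg_copy_buffer(); it
    returns how many bytes it managed to copy for the given sector.
    """
    total = 0
    for index in range(num_sectors):
        copied = copy_chunk(index)
        total += copied  # keep a running total instead of assuming success
        if copied < SECTOR_SIZE:
            break  # short copy: leave the loop, report only what was copied
    return total
```

With a `copy_chunk` that comes up short on the third sector, the function returns the two full sectors plus the partial count rather than the full expected amount.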
Ewan D. Milne	08eda0014e	scsi: scsi_debug: Maintain write statistics per group number JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Conflicts: Merge differences due to out-of-order commit 84f3a3c01d70 ("scsi: scsi_debug: Atomic write support") Track per GROUP NUMBER how many write commands have been processed. Make this information available in sysfs. Reset these statistics if any data is written into the sysfs attribute. Note: SCSI devices should only interpret the information in the GROUP NUMBER field as a stream identifier if the ST_ENBLE bit has been set to one. This patch follows a simpler approach: count the number of writes per GROUP NUMBER whether or not the group number represents a stream identifier. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-20-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit af180c0880f9df14be31807f0bb0fa6f0d34a943) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-13 14:07:03 -04:00
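The per-group bookkeeping described in the 08eda0014e row amounts to a counter keyed by GROUP NUMBER that is cleared on any write to the attribute. The following Python model is a sketch under that reading; the class name and the `show`/`store` method names are assumptions chosen to echo sysfs conventions, not the driver's identifiers.

```python
from collections import Counter


class GroupWriteStats:
    """Userspace model of per-GROUP-NUMBER write counters (hypothetical)."""

    def __init__(self):
        self._writes = Counter()

    def record_write(self, group_number):
        # Count every write, whether or not the group number is a stream ID.
        self._writes[group_number] += 1

    def show(self):
        # Models reading the sysfs attribute.
        return dict(self._writes)

    def store(self, _data):
        # Models writing the sysfs attribute: any data resets the statistics.
        self._writes.clear()
```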
Ewan D. Milne	957381f636	scsi: scsi_debug: Implement GET STREAM STATUS JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Implement the GET STREAM STATUS SCSI command. Report that the first five stream indexes correspond to permanent streams. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-19-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit ad620becda436fc02e50e5f6fe01de1d1f3794c9) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-11 15:20:17 -04:00
Ewan D. Milne	41334f02e7	scsi: scsi_debug: Implement the IO Advice Hints Grouping mode page JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Implement an IO Advice Hints Grouping mode page with three permanent streams. A permanent stream is a stream for which the device server does not allow closing or otherwise modifying the configuration of that stream. The stream identifier enable (ST_ENBLE) bit specifies whether the stream identifier may be used in the GROUP NUMBER field of SCSI WRITE commands. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-18-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit f8ab2710177a762ce0f9b8426f9fc292394949df) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-11 15:16:47 -04:00
Ewan D. Milne	110eb6ffb8	scsi: scsi_debug: Allocate the MODE SENSE response from the heap JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Make the MODE SENSE response buffer larger and allocate it from the heap. This patch prepares for adding support for the IO Advice Hints Grouping mode page. Suggested-by: Douglas Gilbert <dgilbert@interlog.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-17-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit b952eb270df38bc0d930a1ef965666ecf54a2097) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-11 15:14:56 -04:00
Ewan D. Milne	630c356aad	scsi: scsi_debug: Rework subpage code error handling JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Move the subpage code checks into the switch statement to make it easier to add support for new page code / subpage code combinations. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-16-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit f19c3e4fe2542d7b145d294386666958c9fabe17) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-11 15:14:27 -04:00
Ewan D. Milne	77b9efe948	scsi: scsi_debug: Rework page code error handling JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Instead of tracking whether or not the page code is valid in a boolean variable, jump to error handling code if an unsupported page code is encountered. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-15-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit b2f860903fe9774f755a917edc674ba6e879fa55) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-11 15:13:57 -04:00
Ewan D. Milne	b87edb7b5a	scsi: scsi_debug: Support the block limits extension VPD page JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline From SBC-5 r05: "Reduced stream control: a) reduces the maximum number of streams that the device server supports; and b) increases the number of write commands that are able to specify a stream to be written in any write command that contains the GROUP NUMBER field in its CDB. If the RSCS bit (see 6.6.5) is set to one, then the device server shall: a) support per group stream identifier usage as described in 4.32.2; b) support the IO Advice Hints Grouping mode page (see 6.5.7); and c) set the MAXIMUM NUMBER OF STREAMS field (see 6.6.5) to a value that is less than 64. Device servers that set the RSCS bit to one may support other features (e.g., permanent streams (see 4.32.4)). 4.32.4 Permanent streams A permanent stream is a stream for which the device server does not allow closing or otherwise modifying the configuration of that stream. The PERM bit (see 5.9.2.3) indicates whether a stream is a permanent stream. If a STREAM CONTROL command (see 5.32) specifies the closing of a permanent stream, the device server terminates that command with CHECK CONDITION status instead of closing the specified stream. A permanent stream is always an open stream. Device servers should assign the lowest numbered stream identifiers to permanent streams." Report that reduced stream control is supported. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-14-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit b1e5c0b34db8e7dac04af618e53c64e70c86aac8) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-11 15:12:44 -04:00
Ewan D. Milne	823eacd626	scsi: scsi_debug: Reduce code duplication JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline All VPD pages have the page code in byte one. Reduce code duplication by storing the VPD page code once. Reviewed-by: Avri Altman <avri.altman@wdc.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240130214911.1863909-13-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit a5fe98eb8f630f3ad3d1d5c16374621e8c0cd702) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-11 14:42:30 -04:00
Ewan D. Milne	e631137821	scsi: scsi_debug: Remove a useless memset() JIRA: https://issues.redhat.com/browse/RHEL-62151 Upstream Status: From upstream linux mainline 'arr' is kzalloc()'ed, so there is no need to call memset(.., 0, ...) on it. It is already cleared. This is a follow up of commit b952eb270df3 ("scsi: scsi_debug: Allocate the MODE SENSE response from the heap"). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/6296722174e39a51cac74b7fc68b0d75bd0db2a3.1725690433.git.christophe.jaillet@wanadoo.fr Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit bba20b894e3c2e20f1ac914561b9ac241e0e359e) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-11-15 17:58:05 -05:00
Ewan D. Milne	b43f91a3e8	scsi: scsi_debug: Fix create target debugfs failure JIRA: https://issues.redhat.com/browse/RHEL-62151 Upstream Status: From upstream linux mainline Target debugfs entry is removed via async_schedule() which isn't drained when adding same name target, so failure of "Directory 'target11:0:0' with parent 'scsi_debug' already present!" can be triggered easily. Fix it by switching to domain async schedule, and draining it before adding new target debugfs entry. Cc: Wenchao Hao <haowenchao2@huawei.com> Fixes: f084fe52c640 ("scsi: scsi_debug: Add debugfs interface to fail target reset") Signed-off-by: Ming Lei <ming.lei@redhat.com> Acked-by: Wenchao Hao <haowenchao22@gmail.com> Link: https://lore.kernel.org/r/20240619013803.3008857-1-ming.lei@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit b402a0dce64aa3e14a9bd15ab1dd87a93967f90c) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-11-15 17:58:02 -05:00
Ming Lei	a5718e446f	scsi: scsi_debug: Atomic write support JIRA: https://issues.redhat.com/browse/RHEL-56837 Conflicts: context difference because we don't backport patchset of "Pass data lifetime information to SCSI disk devices" commit 84f3a3c01d70efba736bc42155cf32722067b327 Author: John Garry <john.g.garry@oracle.com> Date: Thu Jun 20 12:53:58 2024 +0000 scsi: scsi_debug: Atomic write support Add initial support for atomic writes. As is the standard method, feed device properties via module params, those being: - atomic_max_size_blks - atomic_alignment_blks - atomic_granularity_blks - atomic_max_size_with_boundary_blks - atomic_max_boundary_blks These just match sbc4r22 section 6.6.4 - Block limits VPD page. We just support ATOMIC WRITE (16). The major change in the driver is how we lock the device for RW accesses. Currently the driver uses a per-device lock for accessing device metadata and "media" data (calls to do_device_access()) atomically for the duration of the whole read/write command. This should not suit verifying atomic writes. Reason being that currently all reads/writes are atomic, so using atomic writes does not prove anything. Change device access model to basis that regular writes are only atomic on a per-sector basis, while reads and atomic writes are fully atomic. As mentioned, since accessing metadata and device media is atomic, continue to have regular writes involving metadata - like discard or PI - as atomic. We can improve this later. Currently we only support model where overlapping ongoing reads or writes wait for current access to complete before commencing an atomic write. This is described in 4.29.3.2 section of the SBC. However, we simplify things and wait for all accesses to complete (when issuing an atomic write). Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20240620125359.2684798-10-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2024-09-27 11:19:10 +08:00
Ewan D. Milne	ecb7089d84	scsi: scsi_debug: Make pseudo_lld_bus const JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Now that the driver core can properly handle constant struct bus_type, move the pseudo_lld_bus variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Link: https://lore.kernel.org/r/20240203-bus_cleanup-scsi-v1-3-6f552fb24f71@marliere.net Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit ac0dd0f33adb804b8301ae415a91f56f97f40bae) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:16 -04:00
Ewan D. Milne	a6ec9c196e	scsi: scsi_debug: Delete some bogus error checking JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Smatch complains that "dentry" is never initialized. These days everyone initializes all their stack variables to zero so this means that it will trigger a warning every time this function is run. Really, debugfs functions are not supposed to be checked for errors in normal code. For example, if we updated this code to check the correct variable then it would print a warning if CONFIG_DEBUGFS was disabled. We don't want that. Just delete the check. Fixes: f084fe52c640 ("scsi: scsi_debug: Add debugfs interface to fail target reset") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/c602c9ad-5e35-4e18-a47f-87ed956a9ec2@moroto.mountain Reviewed-by: Wenchao Hao <haowenchao2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 037fbd3fcfbd99145f9310d93f6637012807cfd0) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:07 -04:00
Ewan D. Milne	49dadb705b	scsi: scsi_debug: Fix some bugs in sdebug_error_write() JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline There are two bugs in this code: 1) If count is zero, then it will lead to a NULL dereference. The kmalloc() will successfully allocate zero bytes and the test for "if (buf[0] == '-')" will read beyond the end of the zero size buffer and Oops. 2) The code does not ensure that the user's string is properly NUL terminated which could lead to a read overflow. Fixes: a9996d722b11 ("scsi: scsi_debug: Add interface to manage error injection for a single device") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/7733643d-e102-4581-8d29-769472011c97@moroto.mountain Reviewed-by: Wenchao Hao <haowenchao2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 860c3d03bbc3f17aef8600662c488f27fd093142) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:07 -04:00
Ewan D. Milne	c5cb0883c7	scsi: scsi_debug: Add param to control sdev's allow_restart JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Conflicts: Merge differences Add new module param "allow_restart" to control scsi_device's allow_restart flag. This flag determines if EH is triggered after a command completes with sense_key 0x6, ASC 0x4 and ASCQ 0x2. EH would be triggered if allow_restart=1 in this condition. The new param can be used with the error injection capability to test how commands completing with sense_key 0x6, ASC 0x4 and ASCQ 0x2 are handled. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-11-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 573c2d066eb950dd9bd6e8735d3a859bbc21b3cc) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:05 -04:00
Ewan D. Milne	1babffc555	scsi: scsi_debug: Add debugfs interface to fail target reset JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline The interface is found at /sys/kernel/debug/scsi_debug/target<h:c:t>/fail_reset where <h:c:t> identifies the target to inject errors on. It's a simple bool type interface which would make this target's reset fail if set to 'Y'. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-10-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit f084fe52c640775de51056670f568ec7104b922a) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:05 -04:00
Ewan D. Milne	de43835a93	scsi: scsi_debug: Add new error injection type: Reset LUN failed JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Add error injection type 4 to make scsi_debug_device_reset() return FAILED. Fail reset LUN command format:
+--------+------+-------------------------------------------------------+
| Column | Type | Description                                           |
+--------+------+-------------------------------------------------------+
|   1    |  u8  | Error type, fixed to 0x4                              |
+--------+------+-------------------------------------------------------+
|   2    | s32  | Error count                                           |
|        |      |   0: this rule will be ignored                        |
|        |      |   positive: the rule will always take effect          |
|        |      |   negative: the rule takes effect n times where -n is |
|        |      |             the value given. Ignored after n times    |
+--------+------+-------------------------------------------------------+
|   3    |  x8  | SCSI command opcode, 0xff for all commands            |
+--------+------+-------------------------------------------------------+
Examples:
    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "4 -10 0x12" > ${error}
will make the device return FAILED when trying to reset LUN with inquiry command 10 times.
    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "4 -10 0xff" > ${error}
will make the device return FAILED when trying to reset LUN 10 times. Usually we do not care about what command it is when trying to perform reset LUN, so 0xff could be applied.
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-9-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 0267811625e13a7743eeb6072b57509bf909f484) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:05 -04:00
Ewan D. Milne	a3cef1b4fa	scsi: scsi_debug: Add new error injection type: Abort Failed JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Add error injection type 3 to make scsi_debug_abort() return FAILED. Fail abort command format:
+--------+------+-------------------------------------------------------+
| Column | Type | Description                                           |
+--------+------+-------------------------------------------------------+
|   1    |  u8  | Error type, fixed to 0x3                              |
+--------+------+-------------------------------------------------------+
|   2    | s32  | Error count                                           |
|        |      |   0: this rule will be ignored                        |
|        |      |   positive: the rule will always take effect          |
|        |      |   negative: the rule takes effect n times where -n is |
|        |      |             the value given. Ignored after n times    |
+--------+------+-------------------------------------------------------+
|   3    |  x8  | SCSI command opcode, 0xff for all commands            |
+--------+------+-------------------------------------------------------+
Examples:
    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "3 -10 0x12" > ${error}
will make the device return FAILED when aborting inquiry command 10 times.
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-8-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 5551ce928805ba790db0fa0a895e207d5d05717d) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:05 -04:00
Ewan D. Milne	04793daf04	scsi: scsi_debug: Set command result and sense data if error is injected JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline If a fail command error is injected, set the command's status and sense data then finish this SCSI command. Set SCSI command's status and sense data format:
+--------+------+-------------------------------------------------------+
| Column | Type | Description                                           |
+--------+------+-------------------------------------------------------+
|   1    |  u8  | Error type, fixed to 0x2                              |
+--------+------+-------------------------------------------------------+
|   2    | s32  | Error Count                                           |
|        |      |   0: the rule will be ignored                         |
|        |      |   positive: the rule will always take effect          |
|        |      |   negative: the rule takes effect n times where -n is |
|        |      |             the value given. Ignored after n times    |
+--------+------+-------------------------------------------------------+
|   3    |  x8  | SCSI command opcode, 0xff for all commands            |
+--------+------+-------------------------------------------------------+
|   4    |  x8  | Host byte in scsi_cmd::status                         |
|        |      |  [scsi_cmd::status has 32 bits holding these 3 bytes] |
+--------+------+-------------------------------------------------------+
|   5    |  x8  | Driver byte in scsi_cmd::status                       |
+--------+------+-------------------------------------------------------+
|   6    |  x8  | SCSI Status byte in scsi_cmd::status                  |
+--------+------+-------------------------------------------------------+
|   7    |  x8  | SCSI Sense Key in scsi_cmnd                           |
+--------+------+-------------------------------------------------------+
|   8    |  x8  | SCSI ASC in scsi_cmnd                                 |
+--------+------+-------------------------------------------------------+
|   9    |  x8  | SCSI ASCQ in scsi_cmnd                                |
+--------+------+-------------------------------------------------------+
Examples:
    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "2 -10 0x88 0 0 0x2 0x3 0x11 0x0" >${error}
will make the device's read command return with media error with additional sense of "Unrecovered read error" (UNC).
Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-7-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 33592274321eab2e35e2fdf01857c722431fcf8f) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:05 -04:00
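The nine-column layout in the 04793daf04 row is easy to get wrong by hand, so a small helper that assembles the rule string may be useful. This is a hedged sketch: `fail_command_rule` is a hypothetical userspace helper, not part of the kernel or any tool, and it only builds the text; writing it to the `error` file still requires root and a scsi_debug device.

```python
def fail_command_rule(count, opcode, host, driver, status, sense_key, asc, ascq):
    """Build an error type 2 rule line (hypothetical helper, not a kernel API)."""
    if not -2**31 <= count < 2**31:
        raise ValueError("error count must fit in s32")
    x8_columns = (opcode, host, driver, status, sense_key, asc, ascq)
    if any(not 0 <= value <= 0xFF for value in x8_columns):
        raise ValueError("x8 columns must be in the range 0x00..0xff")
    # Column 1 is the fixed error type 2; the x8 columns are written in hex.
    return " ".join(["2", str(count)] + [format(v, "#x") for v in x8_columns])
```

Calling `fail_command_rule(-10, 0x88, 0, 0, 0x2, 0x3, 0x11, 0x0)` yields the same rule as the example in the commit message, with every hexadecimal column carrying an explicit `0x` prefix.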
Ewan D. Milne	a4addb9783	scsi: scsi_debug: Return failed value if error is injected JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline If a fail queuecommand error is injected, return the failed value defined in the rule from queuecommand. Make queuecommand return format:
+--------+------+-------------------------------------------------------+
| Column | Type | Description                                           |
+--------+------+-------------------------------------------------------+
|   1    |  u8  | Error type, fixed to 0x1                              |
+--------+------+-------------------------------------------------------+
|   2    | s32  | Error count                                           |
|        |      |   0: this rule will be ignored                        |
|        |      |   positive: the rule will always take effect          |
|        |      |   negative: the rule takes effect n times where -n is |
|        |      |             the value given. Ignored after n times    |
+--------+------+-------------------------------------------------------+
|   3    |  x8  | SCSI command opcode, 0xff for all commands            |
+--------+------+-------------------------------------------------------+
|   4    | x32  | The queuecommand() return value we want               |
+--------+------+-------------------------------------------------------+
Examples:
    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "1 1 0x12 0x1055" > ${error}
will make each INQUIRY command sent to that device return 0x1055 (SCSI_MLQUEUE_HOST_BUSY).
Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-6-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 33bccf55c20b66dfca6644c8dc9b396cec82e85c) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:04 -04:00
Ewan D. Milne	565e0f670a	scsi: scsi_debug: Time out command if the error is injected JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline If a timeout error is injected, return 0 from scsi_debug_queuecommand to make the command time out. Time out SCSI command format:
+--------+------+-------------------------------------------------------+
| Column | Type | Description                                           |
+--------+------+-------------------------------------------------------+
|   1    |  u8  | Error type, fixed to 0x0                              |
+--------+------+-------------------------------------------------------+
|   2    | s32  | Error count                                           |
|        |      |   0: this rule will be ignored                        |
|        |      |   positive: the rule will always take effect          |
|        |      |   negative: the rule takes effect n times where -n is |
|        |      |             the value given. Ignored after n times    |
+--------+------+-------------------------------------------------------+
|   3    |  x8  | SCSI command opcode, 0xff for all commands            |
+--------+------+-------------------------------------------------------+
Examples:
    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "0 -10 0x12" > ${error}
will make the device's inquiry command time out 10 times.
    echo "0 1 0x12" > ${error}
will make the device's inquiry time out each time it is invoked on this device.
Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-5-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 32be8b6e22eb76a08c66414593ef02d5eb151be7) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:04 -04:00
Ewan D. Milne	6f71863199	scsi: scsi_debug: Define grammar to remove added error injection JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline The grammar to remove error injection is a line with fixed 3 columns separated by spaces. First column is fixed to "-". It tells this is a removal operation. Second column is the error code to match. Third column is the scsi command to match. For example the following command would remove timeout injection of inquiry command: echo "- 0 0x12" > /sys/kernel/debug/scsi_debug/0:0:0:1/error Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-4-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 962d77cd4c852e6a34ffd44fc5f32ba678e02633) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:04 -04:00
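The three-column removal grammar in the 6f71863199 row can be modeled in a few lines. The sketch below is a userspace illustration in Python, not the driver's parser: `apply_removal` and the `(error_code, count, opcode)` tuple layout are assumptions made for the example.

```python
def apply_removal(rules, line):
    """Return the rules left after applying one removal line.

    rules is a list of (error_code, count, opcode) tuples; this layout is
    an assumption made for illustration only.
    """
    fields = line.split()
    if len(fields) != 3 or fields[0] != "-":
        raise ValueError("expected: - <error code> <scsi command>")
    error_code = int(fields[1])
    opcode = int(fields[2], 16)  # parses "0x12" and "12" to the same value
    # Drop only the rule whose error code and SCSI command both match.
    return [r for r in rules if (r[0], r[2]) != (error_code, opcode)]
```

Applied to a list holding a timeout rule and a fail-command rule for INQUIRY, the line `- 0 0x12` removes only the timeout rule.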
Ewan D. Milne	98ddabc8cd	scsi: scsi_debug: Add interface to manage error injection for a single device JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline This new facility uses the debugfs pseudo file system which is typically mounted under the /sys/kernel/debug directory and requires root permissions to access. The interface file is found at /sys/kernel/debug/scsi_debug/<h:c:t:l>/error where <h:c:t:l> identifies the device (logical unit (LU)) to inject errors on. For the following description the ${error} environment variable is assumed to be set to /sys/kernel/debug/scsi_debug/1:0:0:0/error where 1:0:0:0 is a pseudo device (LU) owned by the scsi_debug driver. Rules are written to ${error} in the normal sysfs fashion (e.g. 'echo "0 -2 0x12" > ${error}'). More than one rule can be active on a device at a time and inactive rules (i.e. those whose error count is 0) remain in the rule listing. The existing rules can be read with 'cat ${error}' with one line of output for each rule. The interface format is line-by-line, each line is an error injection rule. Each rule contains integers separated by spaces, the first three columns correspond to "Error code", "Error count" and "SCSI command", other columns depend on Error code.
General rule format:
+--------+------+-------------------------------------------------------+
| Column | Type | Description                                           |
+--------+------+-------------------------------------------------------+
|   1    |  u8  | Error code                                            |
|        |      |   0: timeout SCSI command                             |
|        |      |   1: fail queuecommand, make queuecommand return      |
|        |      |      given value                                      |
|        |      |   2: fail command, finish command with SCSI status,   |
|        |      |      sense key and ASC/ASCQ values                    |
|        |      |   3: make abort commands for specific command fail    |
|        |      |   4: make reset lun for specific command fail         |
+--------+------+-------------------------------------------------------+
|   2    | s32  | Error count                                           |
|        |      |   0: this rule will be ignored                        |
|        |      |   positive: the rule will always take effect          |
|        |      |   negative: the rule takes effect n times where -n is |
|        |      |             the value given. Ignored after n times    |
+--------+------+-------------------------------------------------------+
|   3    |  x8  | SCSI command opcode, 0xff for all commands            |
+--------+------+-------------------------------------------------------+
|  ...   | xxx  | Error type specific fields                            |
+--------+------+-------------------------------------------------------+
Notes:
- When multiple error inject rules are added for the same SCSI command, the one with the smaller error code will take effect (and the others will be ignored).
- If the same error (i.e. same Error code and SCSI command) is added, the older one will be overwritten.
- Currently, the basic types are (u8/u16/u32/u64/s8/s16/s32/s64) and the hexadecimal types (x8/x16/x32/x64).
- Where a hexadecimal value is expected (e.g. Column 3: SCSI command opcode) the "0x" prefix is optional on the value (e.g. the INQUIRY opcode can be given as '0x12' or '12').
- When the Error count is negative, reading ${error} will show that value incrementing, stopping when it gets to 0.
Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-3-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit a9996d722b1197b625acfa350dd2849e35ad6092) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:04 -04:00
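The rule format described in the commit above can be illustrated with a small user-space model. This is only a sketch in Python, not the driver's code: the names `Rule`, `parse_rule`, `should_fire` and `pick_rule` are hypothetical, the error-type-specific columns are kept as raw strings because their layout is not given here, and the choice to skip inactive rules when picking the smallest error code is an assumption.

```python
from dataclasses import dataclass

# Error codes as listed in the rule-format table.
ERROR_CODES = {
    0: "timeout SCSI command",
    1: "fail queuecommand",
    2: "fail command",
    3: "make abort fail",
    4: "make reset lun fail",
}


@dataclass
class Rule:
    error_code: int      # column 1 (u8)
    error_count: int     # column 2 (s32)
    opcode: int          # column 3 (x8), 0xff matches all commands
    extra: tuple = ()    # error-type-specific columns, left unparsed

    def should_fire(self) -> bool:
        """Model the documented Error count semantics."""
        if self.error_count == 0:
            return False             # inactive rule, ignored
        if self.error_count > 0:
            return True              # always takes effect
        self.error_count += 1        # negative: counts up towards 0
        return True


def parse_rule(line: str) -> Rule:
    """Parse one line of the form 'code count opcode [extra ...]'."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("a rule needs at least three columns")
    code = int(fields[0])
    if code not in ERROR_CODES:
        raise ValueError(f"unknown error code {code}")
    count = int(fields[1])
    opcode = int(fields[2], 16)      # the '0x' prefix is optional
    return Rule(code, count, opcode, tuple(fields[3:]))


def pick_rule(rules, opcode):
    """Return the active matching rule with the smallest error code, if any.

    Assumption: inactive rules (count 0) do not take part in the choice.
    """
    matching = [r for r in rules
                if r.error_count != 0 and r.opcode in (opcode, 0xFF)]
    return min(matching, key=lambda r: r.error_code, default=None)
```

For example, `parse_rule("0 -2 0x12")` models the rule from the commit message: it fires for the next two INQUIRY commands, with the count reading -1 and then 0, after which it is ignored.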
Ewan D. Milne	a43e72c2c2	scsi: scsi_debug: Create scsi_debug directory in the debugfs filesystem JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Create directory scsi_debug in the root of the debugfs filesystem. Prepare to add an interface for managing error injection. Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-2-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 6e2d15f59b1cc6ed613b94e0969335a7868f04ca) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:03 -04:00
Ming Lei	d464850fc8	block: remove support for the host aware zone model JIRA: https://issues.redhat.com/browse/RHEL-25988 Conflicts: drop change on ublk, btrfs and f2fs, all are not supported on rhel9.5 commit 7437bb73f087e5f216f9c6603f5149d354e315af Author: Christoph Hellwig <hch@lst.de> Date: Sun Dec 17 17:53:57 2023 +0100 block: remove support for the host aware zone model When zones were first added to the SCSI and ATA specs, two different models were supported (in addition to the drive managed one that is invisible to the host): - host managed, where for non-conventional zones there is a strict requirement to write at the write pointer, or else an error is returned - host aware, where a write pointer is maintained if writes always happen at it, otherwise it is left in an under-defined state and the sequential write preferred zones behave like conventional zones (probably very badly performing ones, though) Not surprisingly this lukewarm model didn't prove to be very useful and was finally removed from the ZBC and SBC specs (NVMe never implemented it). Due to the easily disappearing write pointer, host software could never rely on the write pointer to actually be useful for, say, recovery. Fortunately only a few HDD prototypes shipped using this model, and they never made it to mass production. Drop the support before it is too late. Note that any such host aware prototype HDD can still be used with Linux as we'll now treat it as a conventional HDD. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2024-03-07 13:19:59 +08:00
Ming Lei	141753077d	scsi: scsi_debug: Remove dead code JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 23815df5af5790c6e99b6bb1ffd39d509d0a7bdb Author: Maurizio Lombardi <mlombard@redhat.com> Date: Wed Jun 28 17:06:38 2023 +0200 scsi: scsi_debug: Remove dead code The ramdisk rwlocks are not used anymore. Fixes: `87c715dcde` ("scsi: scsi_debug: Add per_host_store option") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20230628150638.53218-1-mlombard@redhat.com Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:27 +08:00
Ming Lei	a4a9bb24b5	scsi: scsi_debug: Abort commands from scsi_debug_device_reset() JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 0c028b6a115e7f18480ec3f98ba7bccf011646ea Author: John Garry <john.g.garry@oracle.com> Date: Sun Apr 16 17:56:54 2023 +0000 scsi: scsi_debug: Abort commands from scsi_debug_device_reset() Currently scsi_debug_device_reset() does not do much apart from setting the SDEBUG_UA_POR ("Power on, reset, or bus device reset") flag, which is eventually passed back to the SCSI midlayer later for a "unit attention" command. There is a report that blktest scsi/007 test fails due to commit 1107c7b24ee3 ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd"). The problem there is that there are dangling scsi_debug queued commands when we attempt to remove the driver. scsi/007 test triggers SCSI EH and attempts to abort a timed-out command. Function scsi_debug_device_reset() is called as part of the EH, but does not deal with outstanding erroneous command. Prior to the named commit, removing the driver caused all dangling queued commands to be stopped - this should have not been necessary. Fix by aborting outstanding commands on a scsi_device basis from scsi_debug_device_reset(). Fixes: 1107c7b24ee3 ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd") Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/oe-lkp/202304071111.e762fcbd-yujie.liu@intel.com Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230416175654.159163-1-john.g.garry@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:26 +08:00
Ming Lei	5ece060306	scsi: scsi_debug: Fix missing error code in scsi_debug_init() JIRA: https://issues.redhat.com/browse/RHEL-15276 commit b32283d75335d8263fc9f5ae16c8a196f1d8b5d5 Author: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Date: Thu Apr 6 00:46:07 2023 -0700 scsi: scsi_debug: Fix missing error code in scsi_debug_init() Smatch reports: drivers/scsi/scsi_debug.c:6996 scsi_debug_init() warn: missing error code 'ret' Although it is unlikely that KMEM_CACHE fails, if it does then ret might be zero. To fix this, explicitly set ret to "-ENOMEM" and then goto driver_unreg. Fixes: 1107c7b24ee3 ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20230406074607.3637097-1-harshit.m.mogalapalli@oracle.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:25 +08:00
Ming Lei	c0606cf316	scsi: scsi_debug: Drop sdebug_queue JIRA: https://issues.redhat.com/browse/RHEL-15276 commit f1437cd1e535c5d5cc9f6e5bfdfc9b1cd3141bc4 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:10 2023 +0000 scsi: scsi_debug: Drop sdebug_queue It's easy to get scsi_debug to error on throughput testing when we have multiple shosts: $ lsscsi [7:0:0:0] disk Linux scsi_debug 0191 [0:0:0:0] disk Linux scsi_debug 0191 $ fio --filename=/dev/sda --filename=/dev/sdb --direct=1 --rw=read --bs=4k --iodepth=256 --runtime=60 --numjobs=40 --time_based --name=jpg --eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error jpg: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256 ... fio-3.28 Starting 40 processes [ 27.521809] hrtimer: interrupt took 33067 ns [ 27.904660] sd 7:0:0:0: [sdb] tag#171 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s [ 27.904660] sd 0:0:0:0: [sda] tag#58 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s fio: io_u error [ 27.904667] sd 0:0:0:0: [sda] tag#58 CDB: Read(10) 28 00 00 00 27 00 00 01 18 00 on file /dev/sda[ 27.904670] sd 0:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s The issue is related to how the driver manages submit queues and tags. A single array of submit queues - sdebug_q_arr - with its own set of tags is shared among all shosts. As such, for occasions when we have more than one shost it is possible to overload the submit queues and run out of tags. The struct sdebug_queue is to manage tags and hold the associated queued command entry pointer (for that tag). Since the tagset iters are now used for functions like sdebug_blk_mq_poll(), there is no need to manage these queues. Indeed, blk-mq already provides what we need for managing tags and queues. Drop sdebug_queue and all its usage in the driver. 
Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-12-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:24 +08:00
Ming Lei	63a1904bf5	scsi: scsi_debug: Only allow sdebug_max_queue be modified when no shosts JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 57f7225a4fe25425c29402adad990c7409958c40 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:09 2023 +0000 scsi: scsi_debug: Only allow sdebug_max_queue be modified when no shosts The shost->can_queue value is initially used to set per-HW queue context tag depth in the block layer. This ensures that the shost is not sent more commands than it can deal with. However lowering sdebug_max_queue separately means that we can easily overload the shost, as in the following example: $ cat /sys/bus/pseudo/drivers/scsi_debug/max_queue 192 $ cat /sys/class/scsi_host/host0/can_queue 192 $ echo 100 > /sys/bus/pseudo/drivers/scsi_debug/max_queue $ cat /sys/class/scsi_host/host0/can_queue 192 $ fio --filename=/dev/sda --direct=1 --rw=read --bs=4k --iodepth=256 --runtime=1200 --numjobs=10 --time_based --group_reporting --name=iops-test-job --eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error iops-test-job: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256 ... fio-3.28 Starting 10 processes [ 111.269885] scsi_io_completion_action: 400 callbacks suppressed [ 111.269885] blk_print_req_error: 400 callbacks suppressed [ 111.269889] I/O error, dev sda, sector 440 op 0x0:(READ) flags 0x1200000 phys_seg 1 prio class 2 [ 111.269892] sd 0:0:0:0: [sda] tag#132 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s [ 111.269897] sd 0:0:0:0: [sda] tag#132 CDB: Read(10) 28 00 00 00 01 68 00 00 08 00 [ 111.277058] I/O error, dev sda, sector 360 op 0x0:(READ) flags 0x1200000 phys_seg 1 prio class 2 [...] Ensure that this cannot happen by allowing sdebug_max_queue to be modified only when we have no shosts. As such, any shost->can_queue value will match sdebug_max_queue, and sdebug_max_queue cannot be modified separately.
Since retired_max_queue is no longer set, remove support. Continue to apply the restriction that sdebug_max_queue cannot be modified when sdebug_host_max_queue is set. Adding support for that would mean extra code, and no one has complained about this restriction previously. A command like the following may be used to remove a shost: echo -1 > /sys/bus/pseudo/drivers/scsi_debug/add_host Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-11-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:23 +08:00
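The restriction introduced by the commit above can be pictured with a toy model. This is a sketch in Python under stated assumptions, not the driver's implementation: `PseudoDriver` and its methods are hypothetical names, each host simply records the `max_queue` value in force when it was added (standing in for shost->can_queue), and a rejected write just returns `False` rather than a specific kernel error code.

```python
class PseudoDriver:
    """Toy model: max_queue may only change while no hosts exist."""

    def __init__(self, max_queue: int = 192):
        self.max_queue = max_queue
        self.hosts = []          # one can_queue value per host

    def add_host(self) -> None:
        # A new host takes its can_queue from the current max_queue.
        self.hosts.append(self.max_queue)

    def remove_host(self) -> None:
        if self.hosts:
            self.hosts.pop()

    def set_max_queue(self, n: int) -> bool:
        # Reject the write while any host exists, so that every
        # host's can_queue always matches max_queue.
        if n <= 0 or self.hosts:
            return False
        self.max_queue = n
        return True
```

In this model, lowering `max_queue` to 100 fails while a host is present, and succeeds once the host has been removed; a host added afterwards then starts with a can_queue of 100.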
Ming Lei	bf9059a20a	scsi: scsi_debug: Use scsi_host_busy() in delay_store() and ndelay_store() JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 12f3eef016ea7a72c6e0d0fe6f66882086d9c4a9 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:08 2023 +0000 scsi: scsi_debug: Use scsi_host_busy() in delay_store() and ndelay_store() The functions to update the ndelay and delay values first check whether we have any in-flight IO for any host. They do this by checking if any tag is used in the global submit queues. We can achieve the same by setting the host as blocked and then ensuring that we have no in-flight commands with scsi_host_busy(). Note that scsi_host_busy() checks the SCMD_STATE_INFLIGHT flag, which is only set per command after we ensure that the host is not blocked, i.e. we see no more commands active after the check for scsi_host_busy() returns 0. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-10-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:22 +08:00
Ming Lei	41ae7d990a	scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in stop_all_queued() JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 9c559c9b4748fed11687694e65e5d6d1eb2919cd Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:07 2023 +0000 scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in stop_all_queued() Instead of iterating all deferred commands in the submission queue structures, use blk_mq_tagset_busy_iter(), which is a standard API for this. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-9-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:21 +08:00
Ming Lei	d56349d9cd	scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in sdebug_blk_mq_poll() JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 600d9ead3936b2f22e664c59345a2e006ff324c5 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:06 2023 +0000 scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in sdebug_blk_mq_poll() Instead of iterating all deferred commands in the submission queue structures, use blk_mq_tagset_busy_iter(), which is a standard API for this. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-8-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:20 +08:00
Ming Lei	2268a1499e	scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 1107c7b24ee3280abfc59f1b9186e285cabdd3ec Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:05 2023 +0000 scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd Eventually we will drop the sdebug_queue struct as it is not really required, so start with making the sdebug_queued_cmd dynamically allocated for the lifetime of the scsi_cmnd in the driver. As an interim measure, make sdebug_queued_cmd.sd_dp a pointer to struct sdebug_defer. Also keep a value of the index allocated in sdebug_queued_cmd.qc_arr in struct sdebug_queued_cmd. To deal with any races in accessing the scsi cmnd allocated struct sdebug_queued_cmd, add a spinlock for the scsi command in its priv area. Races may be between scheduling a command for completion, aborting a command, and the command actually completing and freeing the struct sdebug_queued_cmd. [mkp: typo fix] Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-7-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:19 +08:00
Ming Lei	3c16be9a04	scsi: scsi_debug: Use scsi_block_requests() to block queues JIRA: https://issues.redhat.com/browse/RHEL-15276 commit a0473bf31df5bf2da7ecb50023c129659ce0a835 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:04 2023 +0000 scsi: scsi_debug: Use scsi_block_requests() to block queues The feature to block queues is quite dubious, since it races with in-flight IO. Indeed, it seems unnecessary to block queues at any of the times we do so. Anyway, to keep the same behaviour, use the standard SCSI API to stop IO being sent - scsi_{un}block_requests(). Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-6-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:18 +08:00
Ming Lei	254b1ffe44	scsi: scsi_debug: Protect block_unblock_all_queues() with mutex JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 25b80b2c7582ea15ba90b8007f1e1f1b8fc762b9 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:03 2023 +0000 scsi: scsi_debug: Protect block_unblock_all_queues() with mutex There is no reason that calls to block_unblock_all_queues() from different context can't race with one another, so protect with the sdebug_host_list_mutex. There's no need for a more fine-grained per shost locking here (and we don't have a per-host lock anyway). Also simplify some touched code in sdebug_change_qdepth(). Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-5-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:17 +08:00
Ming Lei	8dfe78184a	scsi: scsi_debug: Change shost list lock to a mutex JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 0aaa3fad4fd931ba3accac3a1fcc7334ca780591 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:02 2023 +0000 scsi: scsi_debug: Change shost list lock to a mutex The shost list lock, sdebug_host_list_lock, is a spinlock. We would only lock in non-atomic context in this driver, so use a mutex instead, which is friendlier if we need to schedule when iterating. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-4-john.g.garry@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:16 +08:00
Ming Lei	180ed3a853	scsi: scsi_debug: Don't iter all shosts in clear_luns_changed_on_target() JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 00f9d622e8b237d8403569ee51f7bfb9bf89a2d5 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:01 2023 +0000 scsi: scsi_debug: Don't iter all shosts in clear_luns_changed_on_target() In clear_luns_changed_on_target(), we iter all devices for all shosts to conditionally clear the SDEBUG_UA_LUNS_CHANGED flag in the per-device uas_bm. One condition to see whether we clear the flag is to test whether the host for the device under consideration is the same as the matching device's (devip) host. This check will only ever pass for devices for the same shost, so only iter the devices for the matching device shost. We can now drop the spinlock'ing of the sdebug_host_list_lock in the same function. This will allow us to use a mutex instead of the spinlock for the global shost lock, as clear_luns_changed_on_target() could be called in non-blocking context, in scsi_debug_queuecommand() -> make_ua() -> clear_luns_changed_on_target() (which is why a spinlock was required). Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-3-john.g.garry@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:15 +08:00
Ming Lei	ad1b5a491b	scsi: scsi_debug: Fix check for sdev queue full JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 6500d2045d5247cfb2ac31cc1691d7191096389b Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 27 07:43:00 2023 +0000 scsi: scsi_debug: Fix check for sdev queue full There is a report that the blktests scsi/004 test for "TASK SET FULL" (TSF) now fails. The condition upon which we should issue this TSF is when the sdev queue is full. The check for a full queue has an off-by-1 error. Previously we would increment the number of requests in the queue after testing if the queue would be full, i.e. test if one less than full. Since we now use scsi_device_busy() to count the number of requests in the queue, this would already account for the current request, so fix the test for queue full accordingly. Fixes: 151f0ec9ddb5 ("scsi: scsi_debug: Drop sdebug_dev_info.num_in_q") Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202303201334.18b30edc-oliver.sang@intel.com Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-2-john.g.garry@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:14 +08:00
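The off-by-1 error described in the commit above can be made concrete with a toy model. This is a Python sketch, not the driver's C code: the three function names are hypothetical, and the exact comparisons are assumptions derived from the commit text (a counter that excludes the current request versus a busy count that already includes it).

```python
def tsf_old_counter(num_in_q: int, qdepth: int) -> bool:
    """Original check: num_in_q is read before the current request is
    counted, so 'one less than full' means this request fills the queue."""
    return num_in_q == qdepth - 1


def tsf_off_by_one(busy: int, qdepth: int) -> bool:
    """Same comparison applied to a busy count that already includes the
    current request: it now triggers one request too early."""
    return busy == qdepth - 1


def tsf_fixed(busy: int, qdepth: int) -> bool:
    """Corrected check for a busy count that includes the current request."""
    return busy == qdepth


def compare(others_in_flight: int, qdepth: int) -> tuple:
    """Evaluate all three checks for a request arriving while
    `others_in_flight` requests are already queued."""
    num_in_q = others_in_flight        # excludes the current request
    busy = others_in_flight + 1        # includes the current request
    return (tsf_old_counter(num_in_q, qdepth),
            tsf_off_by_one(busy, qdepth),
            tsf_fixed(busy, qdepth))
```

With a queue depth of 4 and three other requests in flight, the old counter check and the fixed check both report a full queue while the off-by-1 variant does not; with only two others in flight, the off-by-1 variant is the only one that reports full.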
Ming Lei	c631ad39ab	scsi: scsi_debug: Remove redundant driver match function JIRA: https://issues.redhat.com/browse/RHEL-15276 commit c45b3804292ba5f95b86a8866e2e2cac03fa0155 Author: Lizhe <sensor1010@163.com> Date: Sun Mar 19 12:27:32 2023 +0800 scsi: scsi_debug: Remove redundant driver match function If there is no driver match function, the driver core assumes that each candidate pair (driver, device) matches, see driver_match_device(). Drop the pseudo_lld bus match function that always returned 1. This results in the same behaviour as when there is no match function. [mkp+jgg: patch description] Signed-off-by: Lizhe <sensor1010@163.com> Link: https://lore.kernel.org/r/20230319042732.278691-1-sensor1010@163.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:13 +08:00
Ming Lei	aee2a44179	scsi: scsi_debug: Add poll mode deferred completions to statistics JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 548ebb335f743fa2647fe61bb1ad29d2c706afda Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 13 09:31:14 2023 +0000 scsi: scsi_debug: Add poll mode deferred completions to statistics Currently commands completed via poll mode are not included in the statistics gathering for deferred completions and missed CPUs. Poll mode completions should be treated the same as other deferred completion types, so add poll mode completions to the statistics. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-12-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:12 +08:00
Ming Lei	4bf6b7ff7a	scsi: scsi_debug: Get command abort feature working again JIRA: https://issues.redhat.com/browse/RHEL-15276 commit f037b5cb07138cd519f35fd08ebef2faf075959f Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 13 09:31:13 2023 +0000 scsi: scsi_debug: Get command abort feature working again The command abort feature allows us to test aborting a command which has timed-out. The idea is that for specific commands we just don't call scsi_done() and allow the request to timeout, which ensures SCSI EH kicks in and we try to abort the command. Since commit `4a0c6f432d` ("scsi: scsi_debug: Add new defer type for mq_poll") this does not seem to work. The issue is that we clear the sd_dp->aborted flag in schedule_resp() before the completion callback has run. When the completion callback actually runs, it calls scsi_done() as normal because sd_dp->aborted is unset. This is all very racy. Fix by not clearing sd_dp->aborted in schedule_resp(). Also move the call to blk_abort_request() from schedule_resp() to sdebug_q_cmd_complete(), which makes the code have a more logical sequence. I also note that this feature only works for commands which are classed as "SDEG_RES_IMMED_MASK", but is only practically triggered with prior RW commands. So for my experiment I need to run fio to trigger the error on the "nth" command (see inject_on_this_cmd()), and then run something like sg_sync to queue a command to actually trigger the abort. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-11-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:11 +08:00
Ming Lei	7495041ddd	scsi: scsi_debug: Drop sdebug_dev_info.num_in_q JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 151f0ec9ddb539c403a17c86da092e751736c121 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 13 09:31:12 2023 +0000 scsi: scsi_debug: Drop sdebug_dev_info.num_in_q In schedule_resp(), under certain conditions we check whether the per-device queue is full (num_in_q == queue depth - 1) and we may inject a "task set full" (TSF) error if it is. However how we read num_in_q is racy - many threads may see the same "queue is full" value (and also issue a TSF). There is per-queue locking in reading per-device num_in_q, but that would not help. Replace how we read num_in_q at this location with a call to scsi_device_busy(). Calling scsi_device_busy() is likewise racy (as is reading num_in_q), so nothing is lost or gained. Calling scsi_device_busy() is also slow as it needs to read all bits in the per-device budget bitmap, but we can live with that since we're just a simulator and it's only under certain configs that we would see this. Also move the "task set full" print earlier as it would only be called now under this condition. However, previously it may not have been called - like returning early - but keep it simple and always call it. At this point we can drop sdebug_dev_info.num_in_q - it is difficult to maintain properly and adds extra normal case command processing. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-10-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:10 +08:00
Ming Lei	0a80b26465	scsi: scsi_debug: Drop check for num_in_q exceeding queue depth JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 0befb8790969087946f5726d8d80b4f83053ea21 Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 13 09:31:11 2023 +0000 scsi: scsi_debug: Drop check for num_in_q exceeding queue depth The per-device num_in_q value cannot exceed the device queue depth, so drop the check. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-9-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:09 +08:00
Ming Lei	3b12a66050	scsi: scsi_debug: Drop scsi_debug_host_reset() device NULL pointer check JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 9c2303820bf033f798fe14a856d7df431640001b Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 13 09:31:10 2023 +0000 scsi: scsi_debug: Drop scsi_debug_host_reset() device NULL pointer check The check for device pointer for the SCSI command is unnecessary, so drop it. The only caller is scsi_try_host_reset() -> eh_host_reset_handler(), and there that pointer cannot be NULL. Indeed, there is already code later in the same function which does not check the device pointer for the SCSI command. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-8-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:07 +08:00
Ming Lei	2776121a0f	scsi: scsi_debug: Drop scsi_debug_bus_reset() NULL pointer checks JIRA: https://issues.redhat.com/browse/RHEL-15276 commit 519bfc14c156f31cc113709c71e7f66e0f6f228e Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 13 09:31:09 2023 +0000 scsi: scsi_debug: Drop scsi_debug_bus_reset() NULL pointer checks The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so drop them. Likewise, drop the NULL check for sdbg_host. The only caller is scsi_try_bus_reset() -> eh_bus_reset_handler(), and there those pointers cannot be NULL. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-7-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:07 +08:00
Ming Lei	0c64f61f5b	scsi: scsi_debug: Drop scsi_debug_target_reset() NULL pointer checks JIRA: https://issues.redhat.com/browse/RHEL-15276 commit a15df530a189fcc62003df7a7272b2918a9ef73a Author: John Garry <john.g.garry@oracle.com> Date: Mon Mar 13 09:31:08 2023 +0000 scsi: scsi_debug: Drop scsi_debug_target_reset() NULL pointer checks The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so drop them. Likewise, drop the NULL check for sdbg_host. The only caller is scsi_try_target_reset() -> eh_target_reset_handler(), and there those pointers cannot be NULL. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-6-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2023-11-02 08:30:06 +08:00

354 Commits