Centos-kernel-stream-9

Commit Graph

Author	SHA1	Message	Date
Ewan D. Milne	e125b74c02	scsi: scsi_error: Add kernel-doc for exported functions JIRA: https://issues.redhat.com/browse/RHEL-86156 Upstream Status: From upstream linux mainline Convert scsi_report_bus_reset() and scsi_report_device_reset() to kernel-doc since they are exported. This allows them to be part of the driver-api/scsi.rst docbook. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241212205217.597844-2-rdunlap@infradead.org CC: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 20b98768c2e743df7648476521818600ac4a86ef) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2025-04-13 14:10:18 -04:00
Ewan D. Milne	96d8b99128	scsi: sd: Handle read/write CDL timeout failures JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Commands using a duration limit descriptor that has limit policies set to a value other than 0x0 may be failed by the device if one of the limits are exceeded. For such commands, since the failure is the result of the user duration limit configuration and workload, the commands should not be retried and terminated immediately. Furthermore, to allow the user to differentiate these "soft" failures from hard errors due to hardware problem, a different error code than EIO should be returned. There are 2 cases to consider: (1) The failure is due to a limit policy failing the command with a check condition sense key, that is, any limit policy other than 0xD. For this case, scsi_check_sense() is modified to detect failures with the ABORTED COMMAND sense key and the COMMAND TIMEOUT BEFORE PROCESSING or COMMAND TIMEOUT DURING PROCESSING or COMMAND TIMEOUT DURING PROCESSING DUE TO ERROR RECOVERY additional sense code. For these failures, a SUCCESS disposition is returned so that scsi_finish_command() is called to terminate the command. (2) The failure is due to a limit policy set to 0xD, which result in the command being terminated with a GOOD status, COMPLETED sense key, and DATA CURRENTLY UNAVAILABLE additional sense code. To handle this case, the scsi_check_sense() is modified to return a SUCCESS disposition so that scsi_finish_command() is called to terminate the command. In addition, scsi_decide_disposition() has to be modified to see if a command being terminated with GOOD status has sense data. This is as defined in SCSI Primary Commands - 6 (SPC-6), so all according to spec, even if GOOD status commands were not checked before. If scsi_check_sense() detects sense data representing a duration limit, scsi_check_sense() will set the newly introduced SCSI ML byte SCSIML_STAT_DL_TIMEOUT. This SCSI ML byte is checked in scsi_noretry_cmd(), so that a command that failed because of a CDL timeout cannot be retried. The SCSI ML byte is also checked in scsi_result_to_blk_status() to complete the command request with the BLK_STS_DURATION_LIMIT status, which result in the user seeing ETIME errors for the failed commands. Co-developed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-12-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 390e2d1a587405a522dc6b433d45648f895a352c) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:20 -04:00
Ewan D. Milne	1d939e03cb	scsi: core: Kick the requeue list after inserting when flushing JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline When libata calls ata_link_abort() to abort all ata queued commands, it calls blk_abort_request() on the SCSI command representing each QC. This causes scsi_timeout() to be called, which calls scsi_eh_scmd_add() for each SCSI command. scsi_eh_scmd_add() sets the SCSI host to state recovery, and then adds the command to shost->eh_cmd_q. This will wake up the SCSI EH, and eventually the libata EH strategy handler will be called, which calls scsi_eh_flush_done_q() to either flush retry or flush finish each failed command. The commands that are flush retried by scsi_eh_flush_done_q() are done so using scsi_queue_insert(). Before commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"), __scsi_queue_insert() called blk_mq_requeue_request() with the second argument set to true, indicating that it should always kick/run the requeue list after inserting. After commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"), __scsi_queue_insert() does not kick/run the requeue list after inserting, if the current SCSI host state is recovery (which is the case in the libata example above). This optimization is probably fine in most cases, as I can only assume that most often someone will eventually kick/run the queues. However, that is not the case for scsi_eh_flush_done_q(), where we can see that the request gets inserted to the requeue list, but the queue is never started after the request has been inserted, leading to the block layer waiting for the completion of command that never gets to run. Since scsi_eh_flush_done_q() is called by SCSI EH context, the SCSI host state is most likely always in recovery when this function is called. Thus, let scsi_eh_flush_done_q() explicitly kick the requeue list after inserting a flush retry command, so that scsi_eh_flush_done_q() keeps the same behavior as before commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"). Simple reproducer for the libata example above: $ hdparm -Y /dev/sda $ echo 1 > /sys/class/scsi_device/0\:0\:0\:0/device/delete Fixes: 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary") Reported-by: Kevin Locke <kevin@kevinlocke.name> Closes: https://lore.kernel.org/linux-scsi/ZZw3Th70wUUvCiCY@kevinlocke.name/ Signed-off-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240111120533.3612509-1-cassel@kernel.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 6df0e077d76bd144c533b61d6182676aae6b0a85) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:08 -04:00
Ewan D. Milne	1a9369c1a6	scsi: core: Add a precondition check in scsi_eh_scmd_add() JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Calling scsi_eh_scmd_add() may cause the error handler never to be woken up because this may result in shost->host_failed to become larger than scsi_host_busy(shost). Hence complain if scsi_eh_scmd_add() is called after SCMD_STATE_INFLIGHT has been cleared. Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: John Garry <john.g.garry@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20231115193343.2262013-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 10b53db2db8dfda84b25833043f2b63123572af6) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:07 -04:00
Ewan D. Milne	bbb62e4e58	scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu() JIRA: https://issues.redhat.com/browse/RHEL-33543 Upstream Status: From upstream linux mainline Earlier commits in this series allow battery-powered systems to build their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option. This Kconfig option causes call_rcu() to delay its callbacks in order to batch them. This means that a given RCU grace period covers more callbacks, thus reducing the number of grace periods, in turn reducing the amount of energy consumed, which increases battery lifetime which can be a very good thing. This is not a subtle effect: In some important use cases, the battery lifetime is increased by more than 10%. This CONFIG_RCU_LAZY=y option is available only for CPUs that offload callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y. Delaying callbacks is normally not a problem because most callbacks do nothing but free memory. If the system is short on memory, a shrinker will kick all currently queued lazy callbacks out of their laziness, thus freeing their memory in short order. Similarly, the rcu_barrier() function, which blocks until all currently queued callbacks are invoked, will also kick lazy callbacks, thus enabling rcu_barrier() to complete in a timely manner. However, there are some cases where laziness is not a good option. For example, synchronize_rcu() invokes call_rcu(), and blocks until the newly queued callback is invoked. It would not be a good for synchronize_rcu() to block for ten seconds, even on an idle system. Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a given CPU kicks any lazy callbacks that might be already queued on that CPU. After all, if there is going to be a grace period, all callbacks might as well get full benefit from it. Yes, this could be done the other way around by creating a call_rcu_lazy(), but earlier experience with this approach and feedback at the 2022 Linux Plumbers Conference shifted the approach to call_rcu() being lazy with call_rcu_hurry() for the few places where laziness is inappropriate. And another call_rcu() instance that cannot be lazy is the one in the scsi_eh_scmd_add() function. Leaving this instance lazy results in unacceptably slow boot times. Therefore, make scsi_eh_scmd_add() use call_rcu_hurry() in order to revert to the old behavior. [ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ] Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: <linux-scsi@vger.kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> (cherry picked from commit 54d87b0a0c19bc3f740e4cd4b87ba14ce2e4ea73) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-06-03 13:47:02 -04:00
Ming Lei	9f83840c83	scsi: core: Move scsi_host_busy() out of host lock if it is for per-command JIRA: https://issues.redhat.com/browse/RHEL-23941 commit 4e6c9011990726f4d175e2cdfebe5b0b8cce4839 Author: Ming Lei <ming.lei@redhat.com> Date: Sat Feb 3 10:45:21 2024 +0800 scsi: core: Move scsi_host_busy() out of host lock if it is for per-command Commit 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") intended to fix a hard lockup issue triggered by EH. The core idea was to move scsi_host_busy() out of the host lock when processing individual commands for EH. However, a suggested style change inadvertently caused scsi_host_busy() to remain under the host lock. Fix this by calling scsi_host_busy() outside the lock. Fixes: 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") Cc: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240203024521.2006455-1-ming.lei@redhat.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2024-02-18 15:47:51 +08:00
Ming Lei	66e05829c1	scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler JIRA: https://issues.redhat.com/browse/RHEL-23941 commit 4373534a9850627a2695317944898eb1283a2db0 Author: Ming Lei <ming.lei@redhat.com> Date: Fri Jan 12 15:00:00 2024 +0800 scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock every time for deciding if error handler kthread needs to be waken up. This can be too heavy in case of recovery, such as: - N hardware queues - queue depth is M for each hardware queue - each scsi_host_busy() iterates over (N * M) tag/requests If recovery is triggered in case that all requests are in-flight, each scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called for the last in-flight request, scsi_host_busy() has been run for (N * M - 1) times, and request has been iterated for (NM - 1) (N * M) times. If both N and M are big enough, hard lockup can be triggered on acquiring host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169). Fix the issue by calling scsi_host_busy() outside the host lock. We don't need the host lock for getting busy count because host the lock never covers that. [mkp: Drop unnecessary 'busy' variables pointed out by Bart] Cc: Ewan Milne <emilne@redhat.com> Fixes: `6eb045e092` ("scsi: core: avoid host-wide host_busy counter for scsi_mq") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240112070000.4161982-1-ming.lei@redhat.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Tested-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2024-02-18 15:47:24 +08:00
Scott Weaver	fab724ae78	Merge: scsi: core: Always send batch on reset or error handling command MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3551 JIRA: https://issues.redhat.com/browse/RHEL-19730 scsi: core: Always send batch on reset or error handling command In commit `8930a6c207` ("scsi: core: add support for request batching") the block layer bd->last flag was mapped to SCMD_LAST and used as an indicator to send the batch for the drivers that implement this feature. However, the error handling code was not updated accordingly. scsi_send_eh_cmnd() is used to send error handling commands and request sense. The problem is that request sense comes as a single command that gets into the batch queue and times out. As a result the device goes offline after several failed resets. This was observed on virtio_scsi during a device resize operation. [ 496.316946] sd 0:0:4:0: [sdd] tag#117 scsi_eh_0: requesting sense [ 506.786356] sd 0:0:4:0: [sdd] tag#117 scsi_send_eh_cmnd timeleft: 0 [ 506.787981] sd 0:0:4:0: [sdd] tag#117 abort To fix this always set SCMD_LAST flag in scsi_send_eh_cmnd() and scsi_reset_ioctl(). Fixes: `8930a6c207` ("scsi: core: add support for request batching") Cc: <stable@vger.kernel.org> Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com> Link: https://lore.kernel.org/r/20231215121008.2881653-1-alexander.atanasov@virtuozzo.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 066c5b46b6eaf2f13f80c19500dbb3b84baabb33) Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Approved-by: Ming Lei <ming.lei@redhat.com> Approved-by: Chris Leech <cleech@redhat.com> Signed-off-by: Scott Weaver <scweaver@redhat.com>	2024-01-26 13:26:57 -05:00
Ewan D. Milne	5ef059b25e	scsi: core: Always send batch on reset or error handling command JIRA: https://issues.redhat.com/browse/RHEL-19730 Upstream Status: From upstream linux mainline In commit `8930a6c207` ("scsi: core: add support for request batching") the block layer bd->last flag was mapped to SCMD_LAST and used as an indicator to send the batch for the drivers that implement this feature. However, the error handling code was not updated accordingly. scsi_send_eh_cmnd() is used to send error handling commands and request sense. The problem is that request sense comes as a single command that gets into the batch queue and times out. As a result the device goes offline after several failed resets. This was observed on virtio_scsi during a device resize operation. [ 496.316946] sd 0:0:4:0: [sdd] tag#117 scsi_eh_0: requesting sense [ 506.786356] sd 0:0:4:0: [sdd] tag#117 scsi_send_eh_cmnd timeleft: 0 [ 506.787981] sd 0:0:4:0: [sdd] tag#117 abort To fix this always set SCMD_LAST flag in scsi_send_eh_cmnd() and scsi_reset_ioctl(). Fixes: `8930a6c207` ("scsi: core: add support for request batching") Cc: <stable@vger.kernel.org> Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com> Link: https://lore.kernel.org/r/20231215121008.2881653-1-alexander.atanasov@virtuozzo.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 066c5b46b6eaf2f13f80c19500dbb3b84baabb33) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2024-01-04 17:08:04 -05:00
Tomas Henzl	2bac835539	scsi: core: Allow libata to complete successful commands via EH JIRA: https://issues.redhat.com/browse/RHEL-10941 In SCSI, we get the sense data as part of the completion, for ATA however, we need to fetch the sense data as an extra step. For an aborted ATA command the sense data is fetched via libata's ->eh_strategy_handler(). For Command Duration Limits policy 0xD: The device shall complete the command without error with the additional sense code set to DATA CURRENTLY UNAVAILABLE. In order to handle this policy in libata, we intend to send a successful command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the sense data for the good command. This is similar to how we handle an aborted ATA command, just that we need to read the Successful NCQ Commands log instead of the NCQ Command Error log. When we get a SATA completion with successful commands, ATA_SENSE will be set, indicating that some commands in the completion have sense data. The sense_valid bitmask in the Sense Data for Successful NCQ Commands log will inform exactly which commands that had sense data, which might be a subset of all the commands that was completed in the same completion. (Yet all will have ATA_SENSE set, since the status is per completion.) The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE" sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will not set the scmd->result to DID_TIME_OUT for these commands. However, the successful commands that did not have sense data, must not get their result marked as DID_TIME_OUT by SCSI EH. Add a new flag SCMD_FORCE_EH_SUCCESS, which tells SCSI EH to not mark a command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD. This will be used by libata in a subsequent commit. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-5-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 3d848ca1ebc8d8864f25bd461914c93eff82a2d2) Signed-off-by: Tomas Henzl <thenzl@redhat.com>	2023-11-07 01:01:19 +01:00
Ewan D. Milne	f579e72550	scsi: core: scsi_error: Do not queue pointless abort workqueue functions Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171093 Upstream Status: From upstream linux mainline If a host template doesn't implement the .eh_abort_handler() there is no point in queueing the abort workqueue function; all it does is invoking SCSI EH anyway. So return 'FAILED' from scsi_abort_command() if the .eh_abort_handler() is not implemented and save us from having to wait for the abort workqueue function to complete. Cc: Niklas Cassel <niklas.cassel@wdc.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> [niklas: moved the check to the top of scsi_abort_command()] Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20221206131346.2045375-1-niklas.cassel@wdc.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit d0b9025540ef57cc4464ab2fc64ed8ddc49b5658) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2023-04-12 13:20:44 -04:00
Ewan D. Milne	d9090f85f4	scsi: core: Increase scsi_device's iodone_cnt in scsi_timeout() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171093 Upstream Status: From upstream linux mainline If a SCSI command times out and is going to be aborted, we should increase the iodone_cnt of the related scsi_device. Otherwise the iodone_cnt would be smaller than iorequest_cnt. Increasing iodone_cnt in scsi_timeout() would not cause a double accounting issue. Brief analysis follows: - We add the iodone_cnt when BLK_EH_DONE is returned in scsi_timeout(). The related command's timeout event would not happen. - If the abort succeeds and the command is not retried, the command would be completed with scsi_finish_command() which would not increase iodone_cnt. - If the abort succeeds and the command is retried, it would be requeue. A scsi_dispatch_cmd() would be called and iorequest_cnt would be increased again. - If the abort fails, the error handler successfully recovers the device, and the command is not retried, the command would be completed with scsi_finish_command() which would not increase iodone_cnt. - If the abort fails, the error handler successfully recovers the device, and the command is retried, the iorequest_cnt would be increased again. Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221123122137.150776-2-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit ec9780e48c77f469c339b53940ef0c5eacc8b9d2) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2023-04-12 13:20:41 -04:00
Ewan D. Milne	427448c2f4	scsi: core: Change the return type of .eh_timed_out() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171093 Upstream Status: From upstream linux mainline Conflicts: This upstream change is incompatible with previous return values from the scsi_host_template.eh_timed_out() function, which is used by out-of-box drivers. For RHEL, change the ordinal values of SCSI_EH_xxx to be compatible. The new SCSI_EH_DONE value is currently only used by UFS. Commit `6600593cbd` ("block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE") made it impossible for .eh_timed_out() implementations to call scsi_done() without causing a crash. Restore support for SCSI timeout handlers to call scsi_done() as follows: * Change all .eh_timed_out() handlers as follows: - Change the return type into enum scsi_timeout_action. - Change BLK_EH_RESET_TIMER into SCSI_EH_RESET_TIMER. - Change BLK_EH_DONE into SCSI_EH_NOT_HANDLED. * In scsi_timeout(), convert the SCSI_EH_* values into BLK_EH_* values. Reviewed-by: Lee Duncan <lduncan@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-3-bvanassche@acm.org Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit dee7121e8c0a3ce41af2b02d516f54eaec32abcd) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2023-04-12 13:20:39 -04:00
Ewan D. Milne	4e4f37780a	scsi: core: Fix a race between scsi_done() and scsi_timeout() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171093 Upstream Status: From upstream linux mainline If there is a race between scsi_done() and scsi_timeout() and if scsi_timeout() loses the race, scsi_timeout() should not reset the request timer. Hence change the return value for this case from BLK_EH_RESET_TIMER into BLK_EH_DONE. Although the block layer holds a reference on a request (req->ref) while calling a timeout handler, restarting the timer (blk_add_timer()) while a request is being completed is racy. Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Hannes Reinecke <hare@suse.de> Reported-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: `15f73f5b3e` ("blk-mq: move failure injection out of blk_mq_complete_request") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 978b7922d3dca672b41bb4b8ce6c06ab77112741) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2023-04-12 13:20:39 -04:00
Ewan D. Milne	f2cd346628	scsi: core: Add I/O timeout count for SCSI device Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171093 Upstream Status: From upstream linux mainline Currently struct scsi_device maintains counters for requests, completions, and errors but is missing a counter for timeouts. For better tracking of timeouts, add a suitable counter. Link: https://lore.kernel.org/r/1663666339-17560-1-git-send-email-wubo40@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 48517eefb20ec2d6595ebd77ae11f34b3540cd78) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2023-04-11 12:49:52 -04:00
Ewan D. Milne	c5eed3c9b0	scsi: core: Convert scsi_decide_disposition() to use SCSIML_STAT Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171093 Upstream Status: From upstream linux mainline Don't use: - DID_TARGET_FAILURE - DID_NEXUS_FAILURE - DID_ALLOC_FAILURE - DID_MEDIUM_ERROR Instead use the SCSI midlayer internal values. Link: https://lore.kernel.org/r/20220812010027.8251-10-michael.christie@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 7dfaae6ac1b0267d5a064970ba794a916c33b823) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2023-04-11 12:49:51 -04:00
Ewan D. Milne	ee643e7d3e	scsi: core: Add error codes for internal SCSI midlayer use Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171093 Upstream Status: From upstream linux mainline If a driver returns: - DID_TARGET_FAILURE - DID_NEXUS_FAILURE - DID_ALLOC_FAILURE - DID_MEDIUM_ERROR we hit a couple bugs: 1. The SCSI error handler runs because scsi_decide_disposition() has no case statements for them and we return FAILED. 2. For SG IO the userspace app gets a success status instead of failed, because scsi_result_to_blk_status() clears those errors. This patch adds a new internal error code byte for use by the SCSI midlayer. This will be used instead of the above error codes, so we don't have to play that clearing the host code game in scsi_result_to_blk_status() and drivers cannot accidentally use them. A subsequent commit will then remove the internal users of the above codes and convert us to use the new ones. Link: https://lore.kernel.org/r/20220812010027.8251-9-michael.christie@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 36ebf1e2aa148bdcf03c413bddfc605c54b57669) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2023-04-11 12:49:51 -04:00
Frantisek Hrbata	d9819eb3e5	Merge: block: update with v6.1-rc2 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1517 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2131144 Signed-off-by: Ming Lei <ming.lei@redhat.com> Approved-by: Rafael Aquini <aquini@redhat.com> Approved-by: Nigel Croxon <ncroxon@redhat.com> Approved-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>	2022-11-03 13:30:02 -04:00
Ewan D. Milne	7cbabe2912	scsi: core: Shorten long warning messages Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2132461 Upstream Status: From upstream linux mainline sdev_printk() will only accept messages up to 128 bytes. Shorten strings exceeding 128 bytes avoid printing an incomplete sentence like: [ 475.156955] sd 9:0:0:0: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical Link: https://lore.kernel.org/r/20220630024516.1571209-1-lizhijian@fujitsu.com Suggested-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit a2417db3679cffa67fbdc6c175cf68ffc86b8ac3) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-10-25 13:58:25 -04:00
Ewan D. Milne	7f2683766f	scsi: core: Remove unreachable code warning Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2132461 Upstream Status: From upstream linux mainline The smatch tool reported the following warning: drivers/scsi/scsi_error.c:1988 scsi_decide_disposition() warn: ignoring unreachable code. Remove the "default:return FAILED;" instead of "return FAILED;" reported by smatch, because compilers can provide more useful diagnostics about switch/case statements that do not have a default statement, especially if the "switch" applies to a value with enumeration type. Link: https://lore.kernel.org/r/20220301080448.112813-1-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit e1b353e7a31dcaf47c234812c46a2db9cd5be584) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-10-24 16:12:57 -04:00
Ming Lei	26dca21eb1	block: change request end_io handler to pass back a return value Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2131144 Conflicts: drop change in drivers/ufs/core/ufshpb.c; drop change on nvme since rhel9 doesn't backport nvme uring pt command yet commit de671d6116b5210097cd6fbb877bac92536f265b Author: Jens Axboe <axboe@kernel.dk> Date: Wed Sep 21 15:19:54 2022 -0600 block: change request end_io handler to pass back a return value Everything is just converted to returning RQ_END_IO_NONE, and there should be no functional changes with this patch. In preparation for allowing the end_io handler to pass ownership back to the block layer, rather than retain ownership of the request. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-10-23 20:50:08 +08:00
Ming Lei	7ee4df9ee8	scsi/core: Change the return type of scsi_noretry_cmd() into bool Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2118511 commit 88b32c3cdf5fff7ed5bdaec7493428185cc65b6e Author: Bart Van Assche <bvanassche@acm.org> Date: Thu Jul 14 11:07:06 2022 -0700 scsi/core: Change the return type of scsi_noretry_cmd() into bool This patch prepares for introducing the new blk_opf_t type in the SCSI core. Since the value returned by scsi_noretry_cmd() is only used in boolean expressions, this patch does not change any functionality. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-41-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-10-12 09:20:21 +08:00
Ming Lei	681c4122e9	blk-mq: Drop blk_mq_ops.timeout 'reserved' arg Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2118511 commit 9bdb4833dd399cbff82cc20893f52bdec66a9eca Author: John Garry <john.garry@huawei.com> Date: Wed Jul 6 20:03:51 2022 +0800 blk-mq: Drop blk_mq_ops.timeout 'reserved' arg With new API blk_mq_is_reserved_rq() we can tell if a request is from the reserved pool, so stop passing 'reserved' arg. There is actually only a single user of that arg for all the callback implementations, which can use blk_mq_is_reserved_rq() instead. This will also allow us to stop passing the same 'reserved' around the blk-mq iter functions next. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/1657109034-206040-4-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-10-12 09:20:16 +08:00
Ming Lei	7c09a1cb5b	scsi: core: Remove reserved request time-out handling Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2118511 commit deef1be18e3fc62ddf04fb3e5e8ff6a301693dcc Author: John Garry <john.garry@huawei.com> Date: Wed Jul 6 20:03:49 2022 +0800 scsi: core: Remove reserved request time-out handling The SCSI core code does not currently support reserved commands. As such, requests which time-out would never be reserved, and scsi_timeout() 'reserved' arg should never be set. Remove handling for reserved requests, drop the wrapper scsi_timeout() as it now just calls scsi_times_out() always, and finally rename scsi_times_out() -> scsi_timeout() to match the blk_mq_ops method name. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/1657109034-206040-2-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-10-12 09:20:16 +08:00
Ewan D. Milne	08529f2309	scsi: restore setting of scmd->scsi_done() in EH and reset ioctl paths Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2120469 Upstream Status: RHEL-only Since not all SCSI drivers have been updated to call scsi_done() directly, restore the setting of scmd->scsi_done() in both the EH and reset ioctl command submission paths to avoid a null pointer dereference panic. For some reason, upstream commit bf23e619039d ("scsi: core: Use a structure member to track the SCSI command submitter") removed the setting of scmd->scsi_done() prior to all the drivers being updated. This should have instead been done in the later patch in the series 11b68e36b167 ("scsi: core: Call scsi_done directly") after all the driver changes. Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-09-14 14:22:54 -04:00
Patrick Talbert	61e00f3f38	Merge: Additional SCSI updates for 9.1 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/973 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094105 Upstream Status: From upstream linux mainline Tested: Tested by QE regression testing for error handling paths Signed-off-by: Ewan D. Milne <emilne@redhat.com> Additional updates to the SCSI beyond upstream 5.17 have been requested by kernel-rt: scsi: usb: storage: Complete the SCSI request directly scsi: usb: Call scsi_done() directly These changes are dependent on the scsi_done_direct() functionality, which in turn depend on some refactoring of the scsi_done() mechanism and require some earlier patches in order to apply cleanly: scsi: core: Add scsi_done_direct() for immediate completion scsi: core: Rename scsi_mq_done() into scsi_done() and export it scsi: core: Use a structure member to track the SCSI command submitter Note that the upstream series to refactor scsi_done() touched all drivers to have them invoke the now exported scsi_done() symbols for command completion, instead of calling direct through scsi_cmnd->scsi_done(). As it is undesirable to make these changes, particularly at this stage in the development cycle, due to the amount of testing required, the changes for this BZ will not include the removal of scsi_cmnd->scsi_done() and will allow drivers to use either the old indirect or the new direct mechanism, to provide more time to allow all the drivers to be updated. The usb changes will be incorporated into a subsequent usb sybsystem update. Approved-by: Maurizio Lombardi <mlombard@redhat.com> Approved-by: Tomas Henzl <thenzl@redhat.com> Approved-by: Chris Leech <cleech@redhat.com> Approved-by: Torez Smith <torez@redhat.com> Signed-off-by: Patrick Talbert <ptalbert@redhat.com>	2022-07-21 10:40:40 +02:00
Ming Lei	d64fe46972	blk-mq: remove the done argument to blk_execute_rq_nowait Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083917 Conflicts: drop changes on uhs and nvme io_uring passthrough commit e2e530867245d051dc7800b0d07193b3e581f5b9 Author: Christoph Hellwig <hch@lst.de> Date: Tue May 24 14:15:30 2022 +0200 blk-mq: remove the done argument to blk_execute_rq_nowait Let the caller set it together with the end_io_data instead of passing a pointless argument. Note the the target code did in fact already set it and then just overrode it again by calling blk_execute_rq_nowait. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220524121530.943123-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-06-22 08:58:09 +08:00
Ewan D. Milne	44b40ff5dc	scsi: core: Use a structure member to track the SCSI command submitter Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094105 Upstream Status: From upstream linux mainline Conflicts: KABI changes for addition to struct scsi_cmnd Conditional statements are faster than indirect calls. Use a structure member to track the SCSI command submitter such that later patches can call scsi_done(scmd) instead of scmd->scsi_done(scmd). The asymmetric behavior that scsi_send_eh_cmnd() sets the submission context to the SCSI error handler and that it does not restore the submission context to the SCSI core is retained. Link: https://lore.kernel.org/r/20211007202923.2174984-2-bvanassche@acm.org Cc: Hannes Reinecke <hare@suse.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit bf23e619039d360d503b7282d030daf2277a5d47) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-06-06 17:22:49 -04:00
Ewan D. Milne	cce2fcde29	scsi: core: sd: Add silence_suspend flag to suppress some PM messages Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2071832 Upstream Status: From upstream linux mainline Kernel messages produced during runtime PM can cause a never-ending cycle because user space utilities (e.g. journald or rsyslog) write the messages back to storage, causing runtime resume, more messages, and so on. Messages that tell of things that are expected to happen are arguably unnecessary, so add a flag to suppress them. This flag is used by the UFS driver. Link: https://lore.kernel.org/r/20220228113652.970857-2-adrian.hunter@intel.com Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit af4edb1d50c6d1044cb34bc43621411b7ba2cffe) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-04-27 18:55:08 -04:00
Ewan D. Milne	6cc9603e63	scsi: core: Use eh_timeout for START STOP UNIT Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2071832 Upstream Status: From upstream linux mainline In some scenarios START STOP UNIT may time out. The default recovery time of 30 seconds is relatively large. Modifying rq_timeout to adjust the START STOP UNIT timeout value will affect the regular I/O. Commit `9728c0814e` ("[SCSI] make scsi_eh_try_stu use block timeout") switched to rq_timeout for the START STOP UNIT command. However commit `0816c9251a` ("[SCSI] Allow error handling timeout to be specified") introduced an explicit eh_timeout parameter. It makes more sense to use this value as the timeout for START STOP UNIT. Link: https://lore.kernel.org/r/1636507412-21678-1-git-send-email-brookxu.cn@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Chunguang Xu <brookxu@tencent.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit adcc796b4f55c18ee5fca8190a592c84cf8682e0) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-04-27 18:55:03 -04:00
Ewan D. Milne	60d1a2a24f	scsi: core: Simplify control flow in scmd_eh_abort_handler() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2071832 Upstream Status: From upstream linux mainline Simplify the nested conditionals in the function by using a label for the error path. Introduce local "shost" to avoid repeated "sdev->shost" usage. Also remove scsi_eh_complete_abort() since there is now only one place it would be called. Link: https://lore.kernel.org/r/20211029194311.17504-3-emilne@redhat.com Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 54d816d3d36293728ffc8488fae14b002d4b4a64) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-04-27 18:55:02 -04:00
Ewan D. Milne	f4fbde3853	scsi: core: Avoid leaving shost->last_reset with stale value if EH does not run Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2071832 Upstream Status: From upstream linux mainline The changes to issue the abort from the scmd->abort_work instead of the EH thread introduced a problem if eh_deadline is used. If aborting the command(s) is successful, and there are never any scmds added to the shost->eh_cmd_q, there is no code path which will reset the ->last_reset value back to zero. The effect of this is that after a successful abort with no EH thread activity, a subsequent timeout, perhaps a long time later, might immediately be considered past a user-set eh_deadline time, and the host will be reset with no attempt at recovery. Fix this by resetting ->last_reset back to zero in scmd_eh_abort_handler() if it is determined that the EH thread will not run to do this. Thanks to Gopinath Marappan for investigating this problem. Link: https://lore.kernel.org/r/20211029194311.17504-2-emilne@redhat.com Fixes: `e494f6a728` ("[SCSI] improved eh timeout handler") Cc: stable@vger.kernel.org Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 5ae17501bc62a49b0b193dcce003f16375f16654) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-04-27 18:55:02 -04:00
Ewan D. Milne	a652e613d0	scsi: core: Use scsi_cmd_to_rq() instead of scsi_cmnd.request Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2071832 Upstream Status: From upstream linux mainline Conflicts: Merge differences due to out-of-order commits 21fd056f65f0 ("block: remove the ->rq_disk field in struct request") and `e1c83c8f82` ("scsi: core: scsi_logging: Fix a BUG") Prepare for removal of the request pointer by using scsi_cmd_to_rq() instead. Cast away constness where necessary when passing a SCSI command pointer to scsi_cmd_to_rq(). This patch does not change any functionality. Link: https://lore.kernel.org/r/20210809230355.8186-3-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit aa8e25e5006aac52c943c84e9056ab488630ee19) Signed-off-by: Ewan D. Milne <emilne@redhat.com>	2022-04-27 18:54:51 -04:00
Ming Lei	df922b5e32	block: remove the gendisk argument to blk_execute_rq Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066297 Conflicts: drop change on ufshpb since rhel9 doesn't support it; apply change on fs/nfsd/blocklayout.c commit b84ba30b6c7a75babdf73b83bc3c7b59b944501a Author: Christoph Hellwig <hch@lst.de> Date: Fri Nov 26 13:18:01 2021 +0100 block: remove the gendisk argument to blk_execute_rq Remove the gendisk aregument to blk_execute_rq and blk_execute_rq_nowait given that it is unused now. Also convert the boolean at_head parameter to actually use the bool type while touching the prototype. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20211126121802.2090656-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-04-11 11:44:25 +08:00
Ming Lei	91f6e9fde8	block: remove blk_{get,put}_request Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066297 Conflicts: drop change on scsi/ufs/ufshpb.c which isn't included in rhel9 commit 0bf6d96cb8294094ce1e44cbe8cf65b0899d0a3a Author: Christoph Hellwig <hch@lst.de> Date: Mon Oct 25 09:05:07 2021 +0200 block: remove blk_{get,put}_request These are now pointless wrappers around blk_mq_{alloc,free}_request, so remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20211025070517.1548584-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-04-11 11:44:15 +08:00
Ming Lei	5dd7fe91bc	scsi: add a scsi_alloc_request helper Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066297 commit 68ec3b819a5d600a4ede8b596761dccac9f39ebc Author: Christoph Hellwig <hch@lst.de> Date: Thu Oct 21 08:06:05 2021 +0200 scsi: add a scsi_alloc_request helper Add a new helper that calls blk_get_request and initializes the scsi_request to avoid the indirect call through ->.initialize_rq_fn. Note that this makes the pktcdvd driver depend on the SCSI core, but given that only SCSI devices support SCSI passthrough requests that is not a functional change. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20211021060607.264371-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com>	2022-04-11 11:44:14 +08:00
Linus Torvalds	a022f7d575	block-5.14-2021-07-08 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmDnGVYQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpv6UEAC78zkseI8TmKaowNfkz/+MkP9eSFb1pVn3 rxpbPOsZompHoZpeWt4oHL+3Rmm3a9iRo/APA2ELas4zvp+Q+6uG7eha2Dc4hUA9 YgeO4z9YfG8wQNZc3x7bncb6ZwqEE5nnbFe/m25SyrAZVLlZ7FKHxfoZDqjhlGFC eLNiYO6vdvwgCoBMcotyCDttrPfEu6947/5vB1zevv57twdQQaEWGUhvyx1XrlDX 0YD5fmdOjNU2isgxt4xo2Ur2zL6w254/hvj58sV3Z7JfkJpI9DCK+ztKEfzuyEhA WYz06rDAT1+1KuVLfowaZ+pYiPPOIsL0+QXI83r3nLaE7WGGlfS8Hmz//1FbziYs ZSZI826kEN+/lKeWTcKOOMhmkYyXEFFuQZS34eg9KI4xwML8v+ILlHmcp+tjebw9 vzNF6f7N2ki+jnyxxyNxeMHxeAMWsqnIRROOhZg6bbs6UVNpDy4qRzpQaDOaJsVe uSAQ6PTd/etR9KE+ClhLe6X7Rmp/lfZCPe64wqM/3k1qV2KWhE1fwCQO4c5o1MBN rpk3Ef5PZYP3aakCvZnfcjMWlpZNbq/xMc6vPc+yq32akq1t1KbODVBiR5odcH0C Gt5N11im50SO06haBt7EOe4JMQLbK5sxG15t4C6mNQZgPegGfaLlVkKpzIkOzUha OkRofKMcDA== =gHse -----END PGP SIGNATURE----- Merge tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block Pull more block updates from Jens Axboe: "A combination of changes that ended up depending on both the driver and core branch (and/or the IDE removal), and a few late arriving fixes. In detail: - Fix io ticks wrap-around issue (Chunguang) - nvme-tcp sock locking fix (Maurizio) - s390-dasd fixes (Kees, Christoph) - blk_execute_rq polling support (Keith) - blk-cgroup RCU iteration fix (Yu) - nbd backend ID addition (Prasanna) - Partition deletion fix (Yufen) - Use blk_mq_alloc_disk for mmc, mtip32xx, ubd (Christoph) - Removal of now dead block request types due to IDE removal (Christoph) - Loop probing and control device cleanups (Christoph) - Device uevent fix (Christoph) - Misc cleanups/fixes (Tetsuo, Christoph)" * tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block: (34 commits) blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs block: fix the problem of io_ticks becoming smaller nvme-tcp: can't set sk_user_data without write_lock loop: remove unused variable in loop_set_status() block: remove the bdgrab in blk_drop_partitions block: grab a device refcount in disk_uevent s390/dasd: Avoid field over-reading memcpy() dasd: unexport dasd_set_target_state block: check disk exist before trying to add partition ubd: remove dead code in ubd_setup_common nvme: use return value from blk_execute_rq() block: return errors from blk_execute_rq() nvme: use blk_execute_rq() for passthrough commands block: support polling through blk_execute_rq block: remove REQ_OP_SCSI_{IN,OUT} block: mark blk_mq_init_queue_data static loop: rewrite loop_exit using idr_for_each_entry loop: split loop_lookup loop: don't allow deleting an unspecified loop device loop: move loop_ctl_mutex locking into loop_add ...	2021-07-09 12:05:33 -07:00
Christoph Hellwig	da6269da4c	block: remove REQ_OP_SCSI_{IN,OUT} With the legacy IDE driver gone drivers now use either REQ_OP_DRV_* or REQ_OP_SCSI_*, so unify the two concepts of passthrough requests into a single one. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-06-30 15:34:19 -06:00
Hannes Reinecke	3d45cefc8e	scsi: core: Drop obsolete Linux-specific SCSI status codes Originally the SCSI subsystem has been using 'special' SCSI status codes, which were the SAM-specified ones but shifted by 1. As most drivers have now been modified to use the SAM-specified ones, having two nearly identical sets of definitions only causes confusion. The Linux-specifed SCSI status codes have been marked obsolete for several years so drop them and use the SAM-specified status codes throughout. Link: https://lore.kernel.org/r/20210427083046.31620-41-hare@suse.de Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-05-31 23:59:18 -04:00
Hannes Reinecke	54cf31d07a	scsi: core: Drop message byte helper The message byte is now unused, so we can drop the helper to set the message byte and the check for message bytes during error recovery. Link: https://lore.kernel.org/r/20210427083046.31620-38-hare@suse.de Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-05-31 22:48:24 -04:00
Hannes Reinecke	4bd51e54e1	scsi: core: Use DID_TIME_OUT instead of DRIVER_TIMEOUT Set DID_TIME_OUT instead of DRIVER_TIMEOUT when a command is finally marked as failed after error recovery. Link: https://lore.kernel.org/r/20210427083046.31620-12-hare@suse.de Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-05-31 22:48:22 -04:00
Hannes Reinecke	d0672a03e0	scsi: core: Introduce scsi_status_is_check_condition() Add a helper function scsi_status_is_check_condition() to encapsulate the frequent checks for SAM_STAT_CHECK_CONDITION. Link: https://lore.kernel.org/r/20210427083046.31620-9-hare@suse.de Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-05-31 22:48:21 -04:00
Bart Van Assche	b8e162f9e7	scsi: core: Introduce enum scsi_disposition Improve readability of the code in the SCSI core by introducing an enumeration type for the values used internally that decide how to continue processing a SCSI command. The eh_*_handler return values have not been changed because that would involve modifying all SCSI drivers. The output of the following command has been inspected to verify that no out-of-range values are assigned to a variable of type enum scsi_disposition: KCFLAGS=-Wassign-enum make CC=clang W=1 drivers/scsi/ Link: https://lore.kernel.org/r/20210415220826.29438-6-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	280e91b026	scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case The comment above scsi_send_eh_cmnd() says: "Returns SUCCESS or FAILED or NEEDS_RETRY". This patch makes all values returned by scsi_send_eh_cmnd() match the documentation of this function. This change does not affect the behavior of scsi_eh_tur() nor of scsi_eh_try_stu() nor of the scsi_request_sense() callers. See also commit `bbe9fb0d04` ("scsi: Avoid that .queuecommand() gets called for a blocked SCSI device"; v5.3). Link: https://lore.kernel.org/r/20210415220826.29438-5-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Linus Torvalds	bdb39c9509	SCSI misc on 20210219 This series consists of the usual driver updates (ufs, ibmvfc, qla2xxx, hisi_sas, pm80xx) plus the removal of the gdth driver (which is bound to cause conflicts with a trivial change somewhere). The only big major rework of note is the one from Hannes trying to clean up our result handling code in the drivers to make it consistent. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYDAdliYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishTblAQCk6wD8 fcb4TItSRp0DpRzs37zhppEbrBgveuAFHhr5swEA0gL2mHcq0vnyNBinCLnERrE7 TPYJqUKJNktnjVG7ZWc= =wW6p -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (ufs, ibmvfc, qla2xxx, hisi_sas, pm80xx) plus the removal of the gdth driver (which is bound to cause conflicts with a trivial change somewhere). The only big major rework of note is the one from Hannes trying to clean up our result handling code in the drivers to make it consistent" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (194 commits) scsi: MAINTAINERS: Adjust to reflect gdth scsi driver removal scsi: ufs: Give clk scaling min gear a value scsi: lpfc: Fix 'physical' typos scsi: megaraid_mbox: Fix spelling of 'allocated' scsi: qla2xxx: Simplify the calculation of variables scsi: message: fusion: Fix 'physical' typos scsi: target: core: Change ASCQ for residual write scsi: target: core: Signal WRITE residuals scsi: target: core: Set residuals for 4Kn devices scsi: hisi_sas: Add trace FIFO debugfs support scsi: hisi_sas: Flush workqueue in hisi_sas_v3_remove() scsi: hisi_sas: Enable debugfs support by default scsi: hisi_sas: Don't check .nr_hw_queues in hisi_sas_task_prep() scsi: hisi_sas: Remove deferred probe check in hisi_sas_v2_probe() scsi: lpfc: Add auto select on IRQ_POLL scsi: ncr53c8xx: Fix typos scsi: lpfc: Fix ancient double free scsi: qla2xxx: Fix some memory corruption scsi: qla2xxx: Remove redundant NULL check scsi: megaraid: Fix ifnullfree.cocci warnings ...	2021-02-22 10:24:58 -08:00
Guoqing Jiang	8eeed0b554	block: remove unnecessary argument from blk_execute_rq_nowait The 'q' is not used since commit `a1ce35fa49` ("block: remove dead elevator code"), also update the comment of the function. And more importantly it never really was needed to start with given that we can trivial derive it from struct request. Cc: target-devel@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: linux-ide@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: linux-nfs@vger.kernel.org Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-01-24 21:52:39 -07:00
Muneendra Kumar	60bee27ba2	scsi: core: No retries on abort success Add a new optional routine, eh_should_retry_cmd(), in scsi_host_template that allows the transport to decide if a cmd is retryable. Return true if the transport is in a state the cmd should be retried on. Update scmd_eh_abort_handler() and scsi_eh_flush_done_q() to both call scsi_eh_should_retry_cmd() to check whether the command needs to be retried. The above changes were based on a patch by Mike Christie. Link: https://lore.kernel.org/r/1609969748-17684-3-git-send-email-muneendra.kumar@broadcom.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-01-14 22:55:17 -05:00
Muneendra Kumar	962c8dcdd5	scsi: core: Add a new error code DID_TRANSPORT_MARGINAL in scsi.h Add code in scsi_result_to_blk_status to translate a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t i.e BLK_STS_TRANSPORT. Add DID_TRANSPORT_MARGINAL case to scsi_decide_disposition(). Link: https://lore.kernel.org/r/1609969748-17684-2-git-send-email-muneendra.kumar@broadcom.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-01-14 22:55:17 -05:00
Linus Torvalds	55e0500eb5	SCSI misc on 20201013 This series consists of the usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi, hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes. There are only three core changes: adding sense codes, cleaning up noretry and adding an option for limitless retries. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX4YulyYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaZDAQCT7rwG UEZYHgYkU9EX9ERVBQM0SW4mLrxf3g3P5ioJsAEAtkclCM4QsIOP+MIPjIa0EyUY khu0kcrmeFR2YwA8zhw= =4w4S -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi, hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes. There are only three core changes: adding sense codes, cleaning up noretry and adding an option for limitless retries" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits) scsi: hisi_sas: Recover PHY state according to the status before reset scsi: hisi_sas: Filter out new PHY up events during suspend scsi: hisi_sas: Add device link between SCSI devices and hisi_hba scsi: hisi_sas: Add check for methods _PS0 and _PR0 scsi: hisi_sas: Add controller runtime PM support for v3 hw scsi: hisi_sas: Switch to new framework to support suspend and resume scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq() scsi: qedf: Remove redundant assignment to variable 'rc' scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store() scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed scsi: sun_esp: Use module_platform_driver to simplify the code scsi: sun3x_esp: Use module_platform_driver to simplify the code scsi: sni_53c710: Use module_platform_driver to simplify the code scsi: qlogicpti: Use module_platform_driver to simplify the code scsi: mac_esp: Use module_platform_driver to simplify the code scsi: jazz_esp: Use module_platform_driver to simplify the code scsi: mvumi: Fix error return in mvumi_io_attach() scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req() scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs() ...	2020-10-14 15:15:35 -07:00
Mike Christie	2a242d59d6	scsi: core: Add limitless cmd retry support Add infinite retry support to SCSI midlayer by combining common checks for retries into some helper functions, and then checking for the -1/SCSI_CMD_RETRIES_NO_LIMIT. Link: https://lore.kernel.org/r/1601566554-26752-2-git-send-email-michael.christie@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-10-02 18:53:06 -04:00

1 2 3 4 5 ...

314 Commits