linux-kernelorg-stable/drivers
Maarten Lankhorst a91c809659 devcoredump: Fix circular locking dependency with devcd->mutex.
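The original code causes a circular locking dependency, reported by lockdep: dev_coredump_put() flushes devcd->del_wk while holding devcd->mutex, the work's device_del() drains active sysfs writers, and devcd_data_write() takes devcd->mutex.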

======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 Tainted: G S   U
------------------------------------------------------
xe_fault_inject/5091 is trying to acquire lock:
ffff888156815688 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}, at: __flush_work+0x25d/0x660

but task is already holding lock:
ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:
-> #2 (&devcd->mutex){+.+.}-{3:3}:
       mutex_lock_nested+0x4e/0xc0
       devcd_data_write+0x27/0x90
       sysfs_kf_bin_write+0x80/0xf0
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (kn->active#236){++++}-{0:0}:
       kernfs_drain+0x1e2/0x200
       __kernfs_remove+0xae/0x400
       kernfs_remove_by_name_ns+0x5d/0xc0
       remove_files+0x54/0x70
       sysfs_remove_group+0x3d/0xa0
       sysfs_remove_groups+0x2e/0x60
       device_remove_attrs+0xc7/0x100
       device_del+0x15d/0x3b0
       devcd_del+0x19/0x30
       process_one_work+0x22b/0x6f0
       worker_thread+0x1e8/0x3d0
       kthread+0x11c/0x250
       ret_from_fork+0x26c/0x2e0
       ret_from_fork_asm+0x1a/0x30
-> #0 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}:
       __lock_acquire+0x1661/0x2860
       lock_acquire+0xc4/0x2f0
       __flush_work+0x27a/0x660
       flush_delayed_work+0x5d/0xa0
       dev_coredump_put+0x63/0xa0
       xe_driver_devcoredump_fini+0x12/0x20 [xe]
       devm_action_release+0x12/0x30
       release_nodes+0x3a/0x120
       devres_release_all+0x8a/0xd0
       device_unbind_cleanup+0x12/0x80
       device_release_driver_internal+0x23a/0x280
       device_driver_detach+0x14/0x20
       unbind_store+0xaf/0xc0
       drv_attr_store+0x21/0x50
       sysfs_kf_write+0x4a/0x80
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
other info that might help us debug this:
Chain exists of: (work_completion)(&(&devcd->del_wk)->work) --> kn->active#236 --> &devcd->mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&devcd->mutex);
                               lock(kn->active#236);
                               lock(&devcd->mutex);
  lock((work_completion)(&(&devcd->del_wk)->work));
 *** DEADLOCK ***
5 locks held by xe_fault_inject/5091:
 #0: ffff8881129f9488 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x72/0xf0
 #1: ffff88810c755078 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x123/0x220
 #2: ffff8881054811a0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x55/0x280
 #3: ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0
 #4: ffffffff8359e020 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x660
stack backtrace:
CPU: 14 UID: 0 PID: 5091 Comm: xe_fault_inject Tainted: G S   U              6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 PREEMPT_{RT,(lazy)}
Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.10 12/13/2021
Call Trace:
 <TASK>
 dump_stack_lvl+0x91/0xf0
 dump_stack+0x10/0x20
 print_circular_bug+0x285/0x360
 check_noncircular+0x135/0x150
 ? register_lock_class+0x48/0x4a0
 __lock_acquire+0x1661/0x2860
 lock_acquire+0xc4/0x2f0
 ? __flush_work+0x25d/0x660
 ? mark_held_locks+0x46/0x90
 ? __flush_work+0x25d/0x660
 __flush_work+0x27a/0x660
 ? __flush_work+0x25d/0x660
 ? trace_hardirqs_on+0x1e/0xd0
 ? __pfx_wq_barrier_func+0x10/0x10
 flush_delayed_work+0x5d/0xa0
 dev_coredump_put+0x63/0xa0
 xe_driver_devcoredump_fini+0x12/0x20 [xe]
 devm_action_release+0x12/0x30
 release_nodes+0x3a/0x120
 devres_release_all+0x8a/0xd0
 device_unbind_cleanup+0x12/0x80
 device_release_driver_internal+0x23a/0x280
 ? bus_find_device+0xa8/0xe0
 device_driver_detach+0x14/0x20
 unbind_store+0xaf/0xc0
 drv_attr_store+0x21/0x50
 sysfs_kf_write+0x4a/0x80
 kernfs_fop_write_iter+0x169/0x220
 vfs_write+0x293/0x560
 ksys_write+0x72/0xf0
 __x64_sys_write+0x19/0x30
 x64_sys_call+0x2bf/0x2660
 do_syscall_64+0x93/0xb60
 ? __f_unlock_pos+0x15/0x20
 ? __x64_sys_getdents64+0x9b/0x130
 ? __pfx_filldir64+0x10/0x10
 ? do_syscall_64+0x1a2/0xb60
 ? clear_bhb_loop+0x30/0x80
 ? clear_bhb_loop+0x30/0x80
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x76e292edd574
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
RSP: 002b:00007fffe247a828 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000076e292edd574
RDX: 000000000000000c RSI: 00006267f6306063 RDI: 000000000000000b
RBP: 000000000000000c R08: 000076e292fc4b20 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 00006267f6306063
R13: 000000000000000b R14: 00006267e6859c00 R15: 000076e29322a000
 </TASK>
xe 0000:03:00.0: [drm] Xe device coredump has been deleted.
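
The chain in the report above can be read as a directed graph of "held, then acquired" edges. The following is a minimal sketch of that idea, not lockdep's actual implementation: the function name find_cycle and the string labels are illustrative assumptions, with the labels copied from the splat. It checks whether the new acquisition in dev_coredump_put() (flushing del_wk while holding devcd->mutex) closes a cycle with the dependencies already recorded.

```python
from collections import defaultdict


def find_cycle(edges, new_edge):
    """Return the lock cycle that new_edge would close, or None.

    edges    -- iterable of (held, acquired) pairs already recorded
    new_edge -- the (held, wanted) pair about to be added
    This is a simplified model for illustration, not lockdep's algorithm.
    """
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    held, wanted = new_edge
    # Adding held -> wanted creates a cycle exactly when `held` is
    # already reachable from `wanted` through the recorded edges.
    stack = [(wanted, [wanted])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == held:
            return [held] + path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph[node]:
            stack.append((nxt, path + [nxt]))
    return None


# Labels taken from the report; the graph itself is a hand-built model.
MUTEX = "&devcd->mutex"
WORK = "(work_completion)(del_wk)"
KN_ACTIVE = "kn->active"

existing = [
    (WORK, KN_ACTIVE),   # #1: devcd_del() -> device_del() -> kernfs_drain()
    (KN_ACTIVE, MUTEX),  # #2: sysfs write -> devcd_data_write() takes the mutex
]
# #0: dev_coredump_put() holds the mutex and calls flush_delayed_work().
new_edge = (MUTEX, WORK)

cycle = find_cycle(existing, new_edge)
if cycle:
    print(" -> ".join(cycle))
```

With the two recorded edges, the new edge closes the loop mutex, work completion, kn->active, mutex, which matches the "Chain exists of" line in the report. Without the recorded edges, the same acquisition is reported as safe.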

Fixes: 01daccf748 ("devcoredump : Serialize devcd_del work")
Cc: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250723142416.1020423-1-dev@lankhorst.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-17 09:47:40 +02:00
accel
accessibility
acpi More ACPI support updates for 6.18-rc1 2025-10-07 09:45:07 -07:00
amba
android
ata
atm
auxdisplay
base devcoredump: Fix circular locking dependency with devcd->mutex. 2025-10-17 09:47:40 +02:00
bcma
block block-6.18-20251009 2025-10-10 10:37:13 -07:00
bluetooth
bus Char/Misc/IIO/Binder changes for 6.18-rc1 2025-10-04 16:26:32 -07:00
cache
cdrom
cdx Char/Misc/IIO/Binder changes for 6.18-rc1 2025-10-04 16:26:32 -07:00
char tpm: Prevent local DOS via tpm/tpm0/ppi/*operations 2025-10-10 08:21:45 +03:00
clk There's a bunch of patches here across drivers/clk/ to migrate drivers to use 2025-10-07 09:28:37 -07:00
clocksource hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
comedi
connector
counter
cpufreq
cpuidle
crypto This push contains the following changes: 2025-10-08 09:38:31 -07:00
cxl
dax
dca
devfreq
dibs
dio
dma dmaengine updates for v6.18 2025-10-06 10:37:06 -07:00
dma-buf
dpll
edac
eisa
extcon
firewire
firmware EFI updates for v6.18 2025-10-05 12:08:14 -07:00
fpga
fsi
fwctl
gnss
gpio gpio: wcd934x: mark the GPIO controller as sleeping 2025-10-10 09:37:19 +02:00
gpu drm next fixes for 6.18-rc1 2025-10-10 14:02:14 -07:00
greybus
hid hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
hsi
hte
hv
hwmon
hwspinlock
hwtracing Char/Misc/IIO/Binder changes for 6.18-rc1 2025-10-04 16:26:32 -07:00
i2c Revert "i2c: boardinfo: Annotate code used in init phase only" 2025-10-11 23:57:33 +02:00
i3c
idle
iio
infiniband
input Input updates for v6.18-rc0 2025-10-08 09:44:38 -07:00
interconnect
iommu
ipack
irqchip irqchip/sifive-plic: Avoid interrupt ID 0 handling during suspend/resume 2025-10-07 10:23:22 +02:00
isdn
leds
macintosh
mailbox qcom: add Glymur CPUCP mailbox binding 2025-10-08 11:44:21 -07:00
mcb
md
media USB/Thunderbolt changes for 6.18-rc1 2025-10-04 16:07:08 -07:00
memory
memstick
message
mfd
misc - Remove a bunch of asm implementing condition flags testing in KVM's 2025-10-11 11:19:16 -07:00
mmc
most
mtd MTD core: 2025-10-04 15:50:37 -07:00
mux
net Including fixes from netfilter. 2025-10-09 11:13:08 -07:00
nfc
ntb
nubus
nvdimm libnvdimm for 6.18 2025-10-06 11:17:18 -07:00
nvme
nvmem Char/Misc fixes for 6.18-rc1 2025-10-07 12:13:26 -07:00
of Devicetree fixes for v6.18: 2025-10-10 13:05:40 -07:00
opp
parisc
parport
pci pci-v6.18-fixes-1 2025-10-08 18:51:00 -07:00
pcmcia
peci
perf arm64 fixes for -rc1 2025-10-07 08:59:25 -07:00
phy phy-for-6.18 2025-10-06 10:34:22 -07:00
pinctrl pci-v6.18-changes 2025-10-06 10:41:03 -07:00
platform
pmdomain
pnp
power
powercap
pps
ps3
ptp
pwm
rapidio
ras
regulator
remoteproc remoteproc updates for v6.18 2025-10-04 15:45:17 -07:00
reset
rpmsg
rtc RTC for 6.18 2025-10-11 11:56:47 -07:00
s390 more s390 updates for 6.18 merge window 2025-10-09 10:51:43 -07:00
sbus
scsi SCSI misc on 20251011 2025-10-11 11:49:00 -07:00
sh
siox
slimbus
soc - switch longson32 platform to DT and use MIPS_GENERIC framework 2025-10-05 10:09:55 -07:00
soundwire soundwire updates for 6.18 2025-10-06 10:32:22 -07:00
spi
spmi
ssb
staging Staging driver fixes for 6.18-rc1 2025-10-07 11:41:06 -07:00
target SCSI misc on 20251011 2025-10-11 11:49:00 -07:00
tc
tee
thermal
thunderbolt
tty TTY driver fix for 6.18-rc1 2025-10-07 11:36:01 -07:00
ufs SCSI misc on 20251011 2025-10-11 11:49:00 -07:00
uio hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
usb USB/Thunderbolt changes for 6.18-rc1 2025-10-04 16:07:08 -07:00
vdpa
vfio vfio: Dump migration features under debugfs 2025-10-06 11:22:48 -06:00
vhost
video fbdev fixes & enhancements for 6.18-rc1: 2025-10-10 09:36:23 -07:00
virt
virtio
w1
watchdog linux-watchdog 6.18-rc1 tag 2025-10-06 11:00:30 -07:00
xen
zorro
Kconfig
Makefile hyperv-next for v6.18 2025-10-07 08:40:15 -07:00