Commit Graph

241 Commits

Author SHA1 Message Date
Michal Schmidt 416f1aa3ca driver core: mark async_driver as a const *
JIRA: https://issues.redhat.com/browse/RHEL-59894

commit c6c631d2b72b9390587cd1ee5b7905f8ea5bb1ea
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Jun 11 15:01:09 2024 +0200

    driver core: mark async_driver as a const *

    Within struct device_private, mark the async_driver * as const as it is
    never modified.  This requires some internal-to-the-driver-core
    functions to also have their parameters marked as constant, and there is
    one place where we cast _back_ from the const pointer to a real one, as
    the driver core still wants to modify the structure in a number of
    remaining places.

    Cc: Rafael J. Wysocki <rafael@kernel.org>
    Link: https://lore.kernel.org/r/20240611130103.3262749-12-gregkh@linuxfoundation.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
2024-10-04 14:45:41 +02:00
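
Illustration of the pattern described above: a minimal userspace sketch, not the driver core code. The struct layouts, the bound_count field and both helper names are assumptions made up for this example; only the idea (store the pointer as const, cast back in one explicit place) comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real structs live in the driver core. */
struct device_driver {
	const char *name;
	int bound_count;	/* illustrative mutable field */
};

struct device_private {
	/* Stored as const: nothing modifies the driver through this field. */
	const struct device_driver *async_driver;
};

/* Read-only consumer: uses the const pointer as is. */
static const char *async_driver_name(const struct device_private *p)
{
	return p->async_driver ? p->async_driver->name : NULL;
}

/*
 * The one place that still needs a writable driver casts the const away
 * explicitly, so the exception is visible at the call site.
 */
static struct device_driver *async_driver_writable(const struct device_private *p)
{
	return (struct device_driver *)p->async_driver;
}
```

Keeping the cast in a single helper makes it easy to find and remove once the remaining writers are gone.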
Michal Schmidt 30fc797fc5 driver core: make driver_detach() take a const *
JIRA: https://issues.redhat.com/browse/RHEL-59894

commit f6e98ef5f78a106821d451f9783dd96ba8551cb3
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Jun 11 15:01:08 2024 +0200

    driver core: make driver_detach() take a const *

    driver_detach() does not modify the driver itself, so make the pointer
    constant.  In doing so, the function driver_allows_async_probing() also
    needs to be changed so that the pointer type passes through to that
    function properly.

    Cc: Rafael J. Wysocki <rafael@kernel.org>
    Link: https://lore.kernel.org/r/20240611130103.3262749-11-gregkh@linuxfoundation.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
2024-10-04 14:45:41 +02:00
Michal Schmidt ad98007225 driver core: make device_release_driver_internal() take a const *
JIRA: https://issues.redhat.com/browse/RHEL-59894

commit 33ebea9bc0a36f62590d37d0a3c859759181573e
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Jun 11 15:01:07 2024 +0200

    driver core: make device_release_driver_internal() take a const *

    Change device_release_driver_internal() to take a const struct
    device_driver * as it is not modifying it at all.

    Cc: Rafael J. Wysocki <rafael@kernel.org>
    Link: https://lore.kernel.org/r/20240611130103.3262749-10-gregkh@linuxfoundation.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
2024-10-04 14:45:40 +02:00
Mark Langsdorf 850660f745 driver core: Call dma_cleanup() on the test_remove path
JIRA: https://issues.redhat.com/browse/RHEL-26183

commit f429378a9bf84d79a7e2cae05d2e3384cf7d68ba
Author: Jason Gunthorpe <jgg@nvidia.com>
Date: Sat, 05 Aug 2023 08:31:41 +0000

When test_remove is enabled, really_probe() does not properly pair
dma_configure() with dma_cleanup(); it ends up calling dma_configure()
twice. This corrupts the owner_cnt and renders the group unusable with
VFIO/etc.

Add the missing cleanup before going back to re_probe.

Fixes: 25f3bcfc54bc ("driver core: Add dma_cleanup callback in bus_type")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Closes: https://lore.kernel.org/all/6472f254-c3c4-8610-4a37-8d9dfdd54ce8@huawei.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v2-4deed94e283e+40948-really_probe_dma_cleanup_jgg@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2024-05-31 15:37:42 -04:00
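
Illustration of the imbalance described above: a minimal userspace model, not the kernel code. struct fake_dev, really_probe_model() and the cleanup_before_reprobe switch are assumptions for this sketch; owner_cnt only stands in for the ownership count mentioned in the commit message.

```c
#include <stdbool.h>

/* Hypothetical model: owner_cnt stands in for the group ownership count. */
struct fake_dev {
	int owner_cnt;
};

static void dma_configure(struct fake_dev *dev) { dev->owner_cnt++; }
static void dma_cleanup(struct fake_dev *dev)   { dev->owner_cnt--; }

/*
 * Model of a probe that, when test_remove is set, probes once, removes,
 * and then goes back to re-probe. Returns the owner count after probing.
 */
static int really_probe_model(struct fake_dev *dev, bool test_remove,
			      bool cleanup_before_reprobe)
{
re_probe:
	dma_configure(dev);
	/* ... driver probe would run here ... */
	if (test_remove) {
		test_remove = false;
		/* ... driver remove would run here ... */
		if (cleanup_before_reprobe)
			dma_cleanup(dev);	/* the pairing the fix adds */
		goto re_probe;
	}
	return dev->owner_cnt;
}
```

Without the cleanup the model ends with a count of 2 for a single bound driver; with it the count stays at 1.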
Mark Langsdorf 9a26baf3e9 driver core: Don't require dynamic_debug for initcall_debug probe timing
JIRA: https://issues.redhat.com/browse/RHEL-1023

commit e2f06aa885081e1391916367f53bad984714b4db
Author: Stephen Boyd <swboyd@chromium.org>
Date: Thu, 20 Apr 2023 14:17:47 +0000

Don't require the use of dynamic debug (or modification of the kernel to
add a #define DEBUG to the top of this file) to get the printk message
about driver probe timing. This printk is only emitted when
initcall_debug is enabled on the kernel commandline, and it isn't
immediately obvious that you have to do something else to debug boot
timing issues related to driver probe. Add a comment too so it doesn't
get converted back to pr_debug().

Fixes: eb7fbc9fb1 ("driver core: Add missing '\n' in log messages")
Cc: stable <stable@kernel.org>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Brian Norris <briannorris@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-11-01 11:12:34 -05:00
Mark Langsdorf e2656f523b driver core: Make state_synced device attribute writeable
JIRA: https://issues.redhat.com/browse/RHEL-1023

commit f8fb576658a3e19796e2e1a12a5ec8f44dac02b6
Author: Saravana Kannan <saravanak@google.com>
Date: Fri, 10 Mar 2023 09:06:22 +0000

If the file is written to and sync_state() hasn't been called for the
device yet, then call sync_state() for the device independent of the
state of its consumers.

This is useful for supplier devices that have one or more consumers that
don't have a driver but the consumers are in a state that doesn't use the
resources supplied by the supplier device.

This gives finer grained control than using the
fw_devlink.sync_state=timeout kernel commandline parameter.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-11-01 11:12:28 -05:00
Mark Langsdorf d1e7bc56d7 driver core: Add fw_devlink.sync_state command line param
JIRA: https://issues.redhat.com/browse/RHEL-1023

commit ffbe08a8e86d03513dc45b5389fab7f3477433b6
Author: Saravana Kannan <saravanak@google.com>
Date: Fri, 10 Mar 2023 09:06:21 +0000

When all devices that could probe have finished probing (based on
deferred_probe_timeout configuration or late_initcall() when
!CONFIG_MODULES), this parameter controls what to do with devices that
haven't yet received their sync_state() calls.

fw_devlink.sync_state=strict is the default and the driver core will
continue waiting on all consumers of a device to probe successfully
before sync_state() is called for the device. This is the default
behavior since calling sync_state() on a device when all its consumers
haven't probed could make some systems unusable/unstable. When this
option is selected, we also print the list of devices that haven't had
sync_state() called on them by the time all devices that could probe have
finished probing.

fw_devlink.sync_state=timeout will cause the driver core to give up
waiting on consumers and call sync_state() on any devices that haven't
yet received their sync_state() calls. This option is provided for
systems that won't become unusable/unstable as they might be able to
save power (depends on state of hardware before kernel starts) if all
devices get their sync_state().

Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-11-01 11:12:28 -05:00
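
Illustration of the two policies described above: a minimal userspace sketch, not the driver core implementation. The enum, parse_sync_state() and should_force_sync_state() are hypothetical names; only the values "strict" and "timeout" and their meaning come from the commit message.

```c
#include <string.h>

enum sync_state_policy {
	SYNC_STATE_STRICT,	/* keep waiting for all consumers (default) */
	SYNC_STATE_TIMEOUT,	/* give up waiting and call sync_state() */
};

/*
 * Hypothetical parser: returns 0 and stores the policy on success,
 * -1 for an unrecognized value (the policy is left untouched).
 */
static int parse_sync_state(const char *arg, enum sync_state_policy *policy)
{
	if (strcmp(arg, "strict") == 0) {
		*policy = SYNC_STATE_STRICT;
		return 0;
	}
	if (strcmp(arg, "timeout") == 0) {
		*policy = SYNC_STATE_TIMEOUT;
		return 0;
	}
	return -1;
}

/* Decide whether a device still waiting on consumers gets sync_state() now. */
static int should_force_sync_state(enum sync_state_policy policy,
				   int probing_finished)
{
	return probing_finished && policy == SYNC_STATE_TIMEOUT;
}
```

In this model strict never forces the call, which matches the description of the default as the safe choice.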
Ming Lei 7645d00cd6 driver core: return bool from driver_probe_done
JIRA: https://issues.redhat.com/browse/RHEL-1516
Conflicts: keep "extern" since we don't backport 8a2b9c84c708
	("driver core: driver.h: remove extern from function prototypes")

commit aa5f6ed8c21ec1aa5fd688118d8d5cd87c5ffc1d
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed May 31 14:55:12 2023 +0200

    driver core: return bool from driver_probe_done

    bool is the most sensible return value for a yes/no return.  Also
    add __init as this function is only called from the early boot code.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Link: https://lore.kernel.org/r/20230531125535.676098-2-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2023-09-18 15:59:32 +08:00
Mark Langsdorf 2b5958df68 drivers: base: dd: fix memory leak with using debugfs_lookup()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit 36c893d3a759ae7c91ee7d4871ebfc7504f08c40
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Thu, 2 Feb 2023 15:16:21 +0100

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:33:12 -04:00
Mark Langsdorf d473080b02 driver core: bus: move bus notifier logic into bus.c
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit ed9f918174cb35ba51d2fc86a613305dd8bc4cfe
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Wed, 11 Jan 2023 10:23:31 +0100

The logic to touch the bus notifier was open-coded in numerous places
in the driver core.  Clean that up by creating a local bus_notify()
function and have everyone call this function instead, making the
reading of the caller code simpler and easier to maintain over time.

Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230111092331.3946745-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:33:11 -04:00
Mark Langsdorf ca35715206 driver core: Make driver_deferred_probe_timeout a static variable
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit 504fa212d7030fb1c042290dc2eb92b21515573a
Author: Javier Martinez Canillas <javierm@redhat.com>
Date: Wed, 28 Dec 2022 00:21:52 +0100

It is not used outside of its compilation unit, so there's no need to
export this variable.

Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20221227232152.3094584-1-javierm@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:33:10 -04:00
Mark Langsdorf 1e01f0eaa5 Revert "driver core: Set default deferred_probe_timeout back to 0."
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit f516d01b9df2782b9399c44fa1d21c3d09211f8a
Author: Saravana Kannan <saravanak@google.com>
Date:   Wed Jun 1 00:07:02 2022 -0700

This reverts commit 11f7e7ef553b6b93ac1aa74a3c2011b9cc8aeb61.

Let's take another shot at getting deferred_probe_timeout=10 to work.

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-7-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:33:10 -04:00
Mark Langsdorf 7d3ed0cb7c driver core: Fix bus_type.match() error handling in __driver_attach()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit 27c0d217340e47ec995557f61423ef415afba987
Author: "Isaac J. Manjarres" <isaacmanjarres@google.com>
Date: Tue, 20 Sep 2022 17:14:13 -0700

When a driver registers with a bus, it will attempt to match with every
device on the bus through the __driver_attach() function. Currently, if
the bus_type.match() function encounters an error that is not
-EPROBE_DEFER, __driver_attach() will return a negative error code, which
causes the driver registration logic to stop trying to match with the
remaining devices on the bus.

This behavior is not correct; a failure while matching a driver to a
device does not mean that the driver won't be able to match and bind
with other devices on the bus. Update the logic in __driver_attach()
to reflect this.

Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:20:29 -04:00
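
Illustration of the corrected loop behavior: a minimal userspace model, not the kernel function. driver_attach_model() and its array of precomputed match results are assumptions for this sketch; the value 517 for EPROBE_DEFER is the kernel-internal errno, defined locally because userspace errno.h does not provide it.

```c
#include <errno.h>
#include <stddef.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal value, not in userspace errno.h */
#endif

/*
 * match_results[i] models bus_type.match() for device i:
 * > 0 match, 0 no match, < 0 error.
 * Returns how many devices the driver ends up bound to and stores the
 * number of deferred devices in *deferred.
 */
static size_t driver_attach_model(const int *match_results, size_t ndevs,
				  size_t *deferred)
{
	size_t bound = 0;

	*deferred = 0;
	for (size_t i = 0; i < ndevs; i++) {
		int ret = match_results[i];

		if (ret == 0)
			continue;		/* no match: try next device */
		if (ret == -EPROBE_DEFER) {
			(*deferred)++;		/* retry later, do not bind now */
			continue;
		}
		if (ret < 0)
			continue;		/* other error: skip this device only */
		bound++;			/* matched: bind */
	}
	return bound;
}
```

The point of the model is that no branch returns early, so one failing device cannot hide the devices that come after it on the bus.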
Mark Langsdorf 13b8ae3c1f driver core: mark driver_allows_async_probing static
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit 189a87f8ef8ceed16b2a230dc0ce65117068ac30
Author: Christoph Hellwig <hch@lst.de>
Date: Sun, 30 Oct 2022 10:22:55 +0100

driver_allows_async_probing is only used in drivers/base/dd.c, so mark
it static and remove the declaration in drivers/base/base.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221030092255.872280-1-hch@lst.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:20:28 -04:00
Mark Langsdorf 94f0a3ef34 driver_core: move from strlcpy with unused retval to strscpy
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit 07b7b883be5ba0b4bd9ebf8d72c236ef36ae2676
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Thu, 18 Aug 2022 22:59:56 +0200

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818205956.6528-1-wsa+renesas@sang-engineering.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:20:26 -04:00
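
Illustration of why the conversion is 1:1 when the return value is unused: a minimal userspace sketch of strscpy-style semantics, not the kernel implementation. strscpy_sketch() is a hypothetical name; the behavior modeled here (always NUL-terminate, return the copied length or -E2BIG on truncation) is the documented contract of the kernel's strscpy().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Copy src into dst (size bytes), always NUL-terminating when size > 0.
 * Returns the number of characters copied, or -E2BIG if src was truncated.
 */
static ssize_t strscpy_sketch(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	/* src[i] is the first character that was not copied. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

The destination contents are the same truncated, terminated string that strlcpy would produce; only the return value differs, which is why callers that ignore it can be converted mechanically.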
Mark Langsdorf 3416f877ed driver core: Set default deferred_probe_timeout back to 0.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 9be4cbd09da820a20d400670a45fc1571f6a13b8
Author: Saravana Kannan <saravanak@google.com>
Date:   Fri Jun 3 13:31:38 2022 +0200

Since we had to effectively revert
commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") in an earlier patch, a non-zero
deferred_probe_timeout will break NFS rootfs mounting [1] again. So, set
the default back to zero until we can fix that.

[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/

Fixes: 2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration")
Cc: Mark Brown <broonie@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 15:20:18 -04:00
Mark Langsdorf f949c14906 driver core: Don't probe devices after bus_type.match() probe deferral
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542
Author: "Isaac J. Manjarres" <isaacmanjarres@google.com>
Date: Wed, 17 Aug 2022 11:40:26 -0700

Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.

If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.

If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.

Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 15:18:46 -04:00
Mark Langsdorf 282eff4b80 driver core: fix potential deadlock in __driver_attach
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 70fe758352cafdee72a7b13bf9db065f9613ced8
Author: Zhang Wensheng <zhangwensheng5@huawei.com>
Date: Wed, 22 Jun 2022 15:43:27 +0800

The __driver_attach function also has an A-A deadlock problem, like the
one fixed by commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").

The call stack is similar to the one in commit b232b02bf3c2 ("driver core:
fix deadlock in __device_attach"), as listed below:
    In the __driver_attach function, the lock holding logic is as follows:
    ...
    __driver_attach
    if (driver_allows_async_probing(drv))
      device_lock(dev)      // get lock dev
        async_schedule_dev(__driver_attach_async_helper, dev); // func
          async_schedule_node
            async_schedule_node_domain(func)
              entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
              /* on allocation failure or work limit, func is executed
                 synchronously, but __driver_attach_async_helper will
                 take lock dev as well, which leads to an A-A deadlock.  */
              if (!entry || atomic_read(&entry_count) > MAX_WORK) {
                func;
              } else
                queue_work_node(node, system_unbound_wq, &entry->work)
      device_unlock(dev)

    As shown above, when async probing is allowed but async work cannot
    be queued because of memory exhaustion or the work limit, func is
    executed synchronously instead. This leads to an A-A deadlock because
    __driver_attach_async_helper takes lock dev as well.

Reproduce:
It can be reproduced by forcing the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) to be true, as
shown below:

[  370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  370.787154] task:swapper/0       state:D stack:    0 pid:    1 ppid: 0 flags:0x00004000
[  370.788865] Call Trace:
[  370.789374]  <TASK>
[  370.789841]  __schedule+0x482/0x1050
[  370.790613]  schedule+0x92/0x1a0
[  370.791290]  schedule_preempt_disabled+0x2c/0x50
[  370.792256]  __mutex_lock.isra.0+0x757/0xec0
[  370.793158]  __mutex_lock_slowpath+0x1f/0x30
[  370.794079]  mutex_lock+0x50/0x60
[  370.794795]  __device_driver_lock+0x2f/0x70
[  370.795677]  ? driver_probe_device+0xd0/0xd0
[  370.796576]  __driver_attach_async_helper+0x1d/0xd0
[  370.797318]  ? driver_probe_device+0xd0/0xd0
[  370.797957]  async_schedule_node_domain+0xa5/0xc0
[  370.798652]  async_schedule_node+0x19/0x30
[  370.799243]  __driver_attach+0x246/0x290
[  370.799828]  ? driver_allows_async_probing+0xa0/0xa0
[  370.800548]  bus_for_each_dev+0x9d/0x130
[  370.801132]  driver_attach+0x22/0x30
[  370.801666]  bus_add_driver+0x290/0x340
[  370.802246]  driver_register+0x88/0x140
[  370.802817]  ? virtio_scsi_init+0x116/0x116
[  370.803425]  scsi_register_driver+0x1a/0x30
[  370.804057]  init_sd+0x184/0x226
[  370.804533]  do_one_initcall+0x71/0x3a0
[  370.805107]  kernel_init_freeable+0x39a/0x43a
[  370.805759]  ? rest_init+0x150/0x150
[  370.806283]  kernel_init+0x26/0x230
[  370.806799]  ret_from_fork+0x1f/0x30

To fix the deadlock, move the async_schedule_dev outside device_lock.
As we can see, in async_schedule_node_domain the workqueue passed to
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. This does not change the code logic and will not lead to
deadlock.

Fixes: ef0ff68351 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:14:28 -04:00
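
Illustration of the A-A deadlock and the fix: a minimal single-threaded userspace model, not the kernel code. struct model_dev, the deadlocked flag (which records a second lock attempt instead of hanging) and all function names ending in _model, _before_fix or _after_fix are assumptions for this sketch.

```c
#include <stdbool.h>

struct model_dev {
	bool locked;		/* models a non-recursive device lock */
	bool deadlocked;	/* set when the lock is taken twice (A-A) */
	bool probed;
};

static void device_lock(struct model_dev *dev)
{
	if (dev->locked)
		dev->deadlocked = true;	/* a real mutex would hang here */
	dev->locked = true;
}

static void device_unlock(struct model_dev *dev) { dev->locked = false; }

/* The async helper takes the device lock itself. */
static void attach_async_helper(struct model_dev *dev)
{
	device_lock(dev);
	if (!dev->deadlocked)
		dev->probed = true;
	device_unlock(dev);
}

/* Models the fallback: when work cannot be queued, func runs synchronously. */
static void async_schedule_model(void (*func)(struct model_dev *),
				 struct model_dev *dev, bool can_queue)
{
	if (!can_queue)
		func(dev);	/* runs in the caller's context */
	/* else: queued to a worker that runs later (not modeled here) */
}

static void attach_before_fix(struct model_dev *dev, bool can_queue)
{
	device_lock(dev);
	async_schedule_model(attach_async_helper, dev, can_queue);
	device_unlock(dev);
}

static void attach_after_fix(struct model_dev *dev, bool can_queue)
{
	device_lock(dev);
	/* ... checks that need the lock happen here ... */
	device_unlock(dev);
	/* Scheduling happens only after the lock is dropped. */
	async_schedule_model(attach_async_helper, dev, can_queue);
}
```

With can_queue set to false, attach_before_fix() records the double lock, while attach_after_fix() lets the helper take the lock and probe normally.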
Mark Langsdorf 99235a0050 driver core: Add wait_for_init_devices_probe helper function
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 2f8c3ae8288e4a4018330ed5c4e758b878d9c555
Author: Saravana Kannan <saravanak@google.com>
Date: Wed, 1 Jun 2022 00:07:00 -0700

Some devices might need to be probed and bound successfully before the
kernel boot sequence can finish and move on to init/userspace. For
example, a network interface might need to be bound to be able to mount
a NFS rootfs.

With fw_devlink=on by default, some of these devices might be blocked
from probing because they are waiting on an optional supplier that
doesn't have a driver. While fw_devlink will eventually identify such
devices and unblock the probing automatically, it might be too late by
the time it unblocks the probing of devices. For example, the IPv4
autoconfig might time out before fw_devlink unblocks probing of the
network interface.

This function is available to temporarily try and probe all devices that
have a driver even if some of their suppliers haven't been added or
don't have drivers.

The drivers can then decide which of the suppliers are optional vs
mandatory and probe the device if possible. By the time this function
returns, all such "best effort" probes are guaranteed to be completed.
If a device successfully probes in this mode, we delete all fw_devlink
discovered dependencies of that device where the supplier hasn't yet
probed successfully because they have to be optional dependencies.

This also means that some devices that aren't needed for init and could
have waited for their optional supplier to probe (when the supplier's
module is loaded later on) would end up probing prematurely with limited
functionality.  So call this function only when boot would fail without
it.

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:14:24 -04:00
Mark Langsdorf 6871937dde driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 5ee76c256e928455212ab759c51d198fedbe7523
Author: Saravana Kannan <saravanak@google.com>
Date: Fri, 3 Jun 2022 13:31:37 +0200

Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1].  This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.

Commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.

However, if wait_for_device_probe() is called from the kernel_init()
context:

- Before deferred_probe_initcall() [2], it causes the boot process to
  hang due to a deadlock.

- After deferred_probe_initcall() [3], it blocks kernel_init() from
  continuing till deferred_probe_timeout expires and defeats the point of
  deferred_probe_timeout, which is to wait for userspace to load
  modules.

Neither of these is good. So revert the changes to
wait_for_device_probe().

[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/

Fixes: 35a672363a ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz <jstultz@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:14:24 -04:00
Mark Langsdorf 0bf9fda14e driver core: fix deadlock in __device_attach
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit b232b02bf3c205b13a26dcec08e53baddd8e59ed
Author: Zhang Wensheng <zhangwensheng5@huawei.com>
Date: Wed, 18 May 2022 15:45:16 +0800

In the __device_attach function, the lock holding logic is as follows:
...
__device_attach
device_lock(dev)      // get lock dev
  async_schedule_dev(__device_attach_async_helper, dev); // func
    async_schedule_node
      async_schedule_node_domain(func)
        entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
	/* on allocation failure or work limit, func is executed
	   synchronously, but __device_attach_async_helper will
	   take lock dev as well, which leads to an A-A deadlock.  */
	if (!entry || atomic_read(&entry_count) > MAX_WORK) {
	  func;
	} else
	  queue_work_node(node, system_unbound_wq, &entry->work)
  device_unlock(dev)

As shown above, when async probing is allowed but async work cannot
be queued because of memory exhaustion or the work limit, func is
executed synchronously instead. This leads to an A-A deadlock because
__device_attach_async_helper takes lock dev as well.

To fix the deadlock, move the async_schedule_dev outside device_lock.
As we can see, in async_schedule_node_domain the workqueue passed to
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. This does not change the code logic and will not lead to
deadlock.

Fixes: 765230b5f0 ("driver-core: add asynchronous probing support for drivers")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:14:17 -04:00
Mark Langsdorf 794d54faaa driver core: Extend deferred probe timeout on driver registration
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 2b28a1a84a0eb3412bad1a2d5cce2bb4addec626
Author: Saravana Kannan <saravanak@google.com>
Date: Fri, 29 Apr 2022 15:09:32 -0700

The deferred probe timer that's used for this currently starts at
late_initcall and runs for driver_deferred_probe_timeout seconds. The
assumption being that all available drivers would be loaded and
registered before the timer expires. This means, the
driver_deferred_probe_timeout has to be pretty large for it to cover the
worst case. But if we set the default value for it to cover the worst
case, it would significantly slow down the average case. For this
reason, the default value is set to 0.

Also, with CONFIG_MODULES=y and the current default values of
driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
drivers will cause their consumer devices to always defer their probes.
This is because device links created by fw_devlink defer the probe even
before the consumer driver's probe() is called.

Instead of a fixed timeout, if we extend an unexpired deferred probe
timer on every successful driver registration, with the expectation more
modules would be loaded in the near future, then the default value of
driver_deferred_probe_timeout only needs to be as long as the worst case
time difference between two consecutive module loads.

So let's implement that and set the default value to 10 seconds when
CONFIG_MODULES=y.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220429220933.1350374-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:14:17 -04:00
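
Illustration of the "extend on registration" idea: a minimal userspace model using plain integer seconds, not the kernel timer API. struct probe_timer and the three functions are hypothetical names for this sketch.

```c
/* Hypothetical model of the deferred probe timer, in whole seconds. */
struct probe_timer {
	long expires;	/* absolute deadline */
	int fired;	/* non-zero once the deadline has passed */
};

static void timer_start(struct probe_timer *t, long now, long timeout)
{
	t->expires = now + timeout;
	t->fired = 0;
}

/* Mark the timer as fired once its deadline has passed. */
static void timer_tick(struct probe_timer *t, long now)
{
	if (!t->fired && now >= t->expires)
		t->fired = 1;
}

/* On each successful driver registration, push out an unexpired timer. */
static void on_driver_registered(struct probe_timer *t, long now, long timeout)
{
	timer_tick(t, now);
	if (!t->fired)
		t->expires = now + timeout;
}
```

With a 10 second timeout, a registration at t=8 moves the deadline from 10 to 18, so the timeout only needs to cover the gap between two consecutive module loads; a registration after the timer has fired changes nothing.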
Mark Langsdorf 42eea5bb31 driver core: Add "*" wildcard support to driver_async_probe cmdline param
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit f79f662e4cd56f6fd3436dd68057d2ec1f7d2025
Author: Saravana Kannan <saravanak@google.com>
Date: Tue, 3 May 2022 17:53:43 -0700

There's currently no way to use driver_async_probe kernel cmdline param
to enable default async probe for all drivers.  So, add support for "*"
to match with all driver names.  When "*" is used, all other drivers
listed in driver_async_probe are drivers that will NOT match the "*".

For example:
* driver_async_probe=drvA,drvB,drvC
  drvA, drvB and drvC do asynchronous probing.

* driver_async_probe=*
  All drivers do asynchronous probing except those that have set
  PROBE_FORCE_SYNCHRONOUS flag.

* driver_async_probe=*,drvA,drvB,drvC
  All drivers do asynchronous probing except drvA, drvB, drvC and those
  that have set PROBE_FORCE_SYNCHRONOUS flag.
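
The three cases above can be sketched in userspace C as follows. Both helper names are hypothetical, and the PROBE_FORCE_SYNCHRONOUS override is deliberately not modelled; the sketch only shows how a "*" token could invert the meaning of the listed names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if `name` is a whole comma-separated token of `list`. */
static bool list_has_token(const char *list, const char *name)
{
	size_t len = strlen(name);

	assert(len > 0);
	while (*list) {
		size_t tok = strcspn(list, ",");

		if (tok == len && strncmp(list, name, len) == 0)
			return true;
		list += tok;
		if (*list == ',')
			list++;
	}
	return false;
}

/*
 * Hypothetical helper: does the driver_async_probe value request async
 * probing for `drv`? With "*" present, listed names become exceptions.
 */
static bool cmdline_wants_async(const char *param, const char *drv)
{
	bool wildcard = list_has_token(param, "*");
	bool listed = list_has_token(param, drv);

	return wildcard != listed;
}
```

Comparing the two booleans for inequality covers both modes: without "*", only listed drivers are async; with "*", every driver except the listed ones is.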

Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Feng Tang <feng.tang@intel.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220504005344.117803-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:14:16 -04:00
Mark Langsdorf f4ea408b94 driver core: Prevent overriding async driver of a device before it probe
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 84e7c6786aad1dffa04f5729270f8fcd7281fe4b
Author: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Date: Wed, 16 Mar 2022 15:43:28 +0800

When there are 2 matched drivers for a device using the
async probe mechanism, dev->p->async_driver might
be overridden by the last attached driver.
So just skip the later one if the previously matched driver
has not been handled by the async thread yet.

Below is my use case, which has this problem.

Make both the mmcblk and mmc_test drivers allow async probe;
dev->p->async_driver will then be overridden by the later driver,
mmc_test, which binds to the device and claims it for testing.
When that happens, mmcblk will never probe again.
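
A small userspace model of the "skip the later one" rule described above. The structs are hypothetical stand-ins that keep only the async_driver slot, and the function names are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the text. */
struct driver { const char *name; };
struct device { const struct driver *async_driver; };

/*
 * Queue `drv` for an asynchronous probe of `dev`. Returns false, leaving the
 * pending driver untouched, if an earlier match is still waiting.
 */
static bool queue_async_probe(struct device *dev, const struct driver *drv)
{
	assert(drv != NULL);
	if (dev->async_driver)
		return false;
	dev->async_driver = drv;
	return true;
}

/* Models the async thread picking up the pending driver and clearing the slot. */
static const struct driver *take_async_driver(struct device *dev)
{
	const struct driver *drv = dev->async_driver;

	dev->async_driver = NULL;
	return drv;
}
```

In this model, queueing mmcblk and then mmc_test leaves mmcblk in the slot, so the async thread still probes the driver that matched first.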

Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Link: https://lore.kernel.org/r/20220316074328.1801-1-mark-pk.tsai@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 10:32:56 -04:00
Mark Langsdorf 37752e64f2 Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 4c32174a24759d5ac6dc42b508fcec2afb8b9602
Author: Bagas Sanjaya <bagasdotme@gmail.com>
Date: Sat, 16 Apr 2022 14:11:38 +0700

Sphinx reported build warnings mentioning drivers/base/dd.c:

</path/to/linux>/Documentation/driver-api/infrastructure:35:
./drivers/base/dd.c:280: WARNING: Unexpected indentation.
</path/to/linux>/Documentation/driver-api/infrastructure:35:
./drivers/base/dd.c:281: WARNING: Block quote ends without a blank line;
unexpected unindent.

The warnings above are due to a syntax error in the "Return" section of
driver_deferred_probe_check_state(), which messed up the desired line breaks.

Fix the issue by using ReST lists syntax.

Fixes: c8c43cee29 ("driver core: Fix driver_deferred_probe_check_state() logic")
Cc: linux-pm@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220416071137.19512-1-bagasdotme@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 10:32:56 -04:00
Alex Williamson 753a9bf8b3 driver core: Add dma_cleanup callback in bus_type
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2124620

commit 25f3bcfc54bcf7b0e45d140ec8bfbbf10ba11869
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date:   Mon Apr 18 08:49:51 2022 +0800

    driver core: Add dma_cleanup callback in bus_type

    The bus_type structure defines dma_configure() callback for bus drivers
    to configure DMA on the devices. This adds the paired dma_cleanup()
    callback and calls it during driver unbinding so that bus drivers can do
    some cleanup work.

    One use case for these paired DMA callbacks is for the bus driver to check
    for DMA ownership conflicts during driver binding, where multiple devices
    belonging to the same IOMMU group (the minimum granularity of isolation and
    protection) may be assigned to kernel drivers or user space respectively.

    Without this change, for example, the vfio driver has to listen to a bus
    BOUND_DRIVER event and then BUG_ON() in case of DMA ownership conflict.
    This leads to a bad user experience since a careless driver binding operation
    may crash the system if the admin overlooks the group restriction. Aside
    from bad design, this leads to a security problem as a root user, even with
    lockdown=integrity, can force the kernel to BUG.

    With this change, the bus driver could check and set the DMA ownership in
    driver binding process and fail on ownership conflicts. The DMA ownership
    should be released during driver unbinding.

    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Link: https://lore.kernel.org/r/20220418005000.897664-3-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-14 12:35:44 -06:00
Mark Langsdorf 7f793178e6 drivers/base/dd.c : Remove the initial value of the global variable
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit 901581389eade09af969c1a4183e17ec663131d0
Author: lizhe <sensor1010@163.com>
Date: Wed, 9 Mar 2022 05:54:18 -0800

The global variable driver_deferred_probe_enable has a default value of
false and does not need to be initialized to false.

Signed-off-by: lizhe <sensor1010@163.com>
Link: https://lore.kernel.org/r/20220309135418.31101-1-sensor1010@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:17 -05:00
Mark Langsdorf 388781daea driver core: dd: fix return value of __setup handler
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit f2aad54703dbe630f9d8b235eb58e8c8cc78f37d
Author: Randy Dunlap <rdunlap@infradead.org>
Date: Mon, 28 Feb 2022 20:18:29 -0800

When "driver_async_probe=nulltty" is used on the kernel boot command line,
it causes an Unknown parameter message and the string is added to init's
environment strings, polluting them.

  Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc6
  driver_async_probe=nulltty", will be passed to user space.

 Run /sbin/init as init process
   with arguments:
     /sbin/init
   with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc6
     driver_async_probe=nulltty

Change the return value of the __setup function to 1 to indicate
that the __setup option has been handled.
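
To illustrate why the return value matters, here is a userspace sketch of a boot-parameter dispatcher. The table type, the dispatcher and the buffer are hypothetical; only the convention "return 1 when the option has been handled" comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical handler type: returns 1 if the option was consumed, else 0. */
typedef int (*setup_fn)(const char *value);

struct setup_entry {
	const char *prefix;  /* e.g. "driver_async_probe=" */
	setup_fn fn;
};

static char async_names[64];

/* Handler in the fixed style: remember the value and report it as handled. */
static int save_async_options(const char *value)
{
	strncpy(async_names, value, sizeof(async_names) - 1);
	async_names[sizeof(async_names) - 1] = '\0';
	return 1;
}

/*
 * Returns true if a handler consumed `arg`. In this model, an argument that
 * no handler claims is the kind that would be passed on to init.
 */
static bool dispatch_param(const struct setup_entry *tbl, size_t n,
			   const char *arg)
{
	assert(arg != NULL);
	for (size_t i = 0; i < n; i++) {
		size_t len = strlen(tbl[i].prefix);

		if (strncmp(arg, tbl[i].prefix, len) == 0 &&
		    tbl[i].fn(arg + len))
			return true;
	}
	return false;
}
```

If save_async_options() returned 0 instead, dispatch_param() would report "driver_async_probe=nulltty" as unclaimed even though the value had been stored, which mirrors the symptom in the report.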

Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Fixes: 1ea61b68d0 ("async: Add cmdline option to specify drivers to be async probed")
Cc: Feng Tang <feng.tang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Reviewed-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220301041829.15137-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:17 -05:00
Mark Langsdorf e5bea3e514 driver core: Refactor sysfs and drv/bus remove hooks
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit 4b775aaf1ea9997f5eb1a792f357a7b81a1fc632
Author: Rob Herring <robh@kernel.org>
Date: Wed, 23 Feb 2022 16:52:57 -0600

There are 3 copies of the same device sysfs cleanup and drv/bus remove()
hooks used for probe failure, testing re-probing, and device unbinding.

Let's refactor the code to its own function.

Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220223225257.1681968-3-robh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:17 -05:00
Mark Langsdorf 35dbc7fedc driver core: Refactor multiple copies of device cleanup
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit 9ad307213fa4081f4bc2f2daa31d4f2d35d7a213
Author: Rob Herring <robh@kernel.org>
Date: Wed, 23 Feb 2022 16:52:56 -0600

There are 3 copies of the same device cleanup code used for probe failure,
testing re-probing, and device unbinding. Changes to this code often miss
at least one of the copies of the code. See commits d0243bbd5d ("drivers
core: Free dma_range_map when driver probe failed") and d8f7a5484f21
("driver core: Free DMA range map when device is released") for example.

Let's refactor the code to its own function.

Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220223225257.1681968-2-robh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:17 -05:00
Mark Langsdorf 657ef489c4 driver core: cleanup double words comments
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit b4ae8c2fb673d2fc60cb8fe645dba4f4db8b0dab
Author: Tom Rix <trix@redhat.com>
Date: Sat, 12 Feb 2022 06:32:33 -0800

Remove the second 'are' and 'the'.

Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220212143233.2648872-1-trix@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:16 -05:00
Mark Langsdorf 9abcdf3aa1 driver core: Free DMA range map when device is released
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit d8f7a5484f2188e9af2d9e4e587587d724501b12
Author: Mårten Lindahl <marten.lindahl@axis.com>
Date: Wed, 16 Feb 2022 10:41:28 +0100

When unbinding/binding a driver with DMA mapped memory, the DMA map is
not freed before the driver is reloaded. This leads to a memory leak
when the DMA map is overwritten when reprobing the driver.

This can be reproduced with a platform driver having a dma-range:

dummy {
	...
	#address-cells = <0x2>;
	#size-cells = <0x2>;
	ranges;
	dma-ranges = <...>;
	...
};

and then unbinding/binding it:

~# echo soc:dummy >/sys/bus/platform/drivers/<driver>/unbind

DMA map object 0xffffff800b0ae540 still being held by &pdev->dev

~# echo soc:dummy >/sys/bus/platform/drivers/<driver>/bind
~# echo scan > /sys/kernel/debug/kmemleak
~# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffffff800b0ae540 (size 64):
  comm "sh", pid 833, jiffies 4295174550 (age 2535.352s)
  hex dump (first 32 bytes):
    00 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 80 00 00 00 00 00 00 00 80 00 00 00 00  ................
  backtrace:
    [<ffffffefd1694708>] create_object.isra.0+0x108/0x344
    [<ffffffefd1d1a850>] kmemleak_alloc+0x8c/0xd0
    [<ffffffefd167e2d0>] __kmalloc+0x440/0x6f0
    [<ffffffefd1a960a4>] of_dma_get_range+0x124/0x220
    [<ffffffefd1a8ce90>] of_dma_configure_id+0x40/0x2d0
    [<ffffffefd198b68c>] platform_dma_configure+0x5c/0xa4
    [<ffffffefd198846c>] really_probe+0x8c/0x514
    [<ffffffefd1988990>] __driver_probe_device+0x9c/0x19c
    [<ffffffefd1988cd8>] device_driver_attach+0x54/0xbc
    [<ffffffefd1986634>] bind_store+0xc4/0x120
    [<ffffffefd19856e0>] drv_attr_store+0x30/0x44
    [<ffffffefd173c9b0>] sysfs_kf_write+0x50/0x60
    [<ffffffefd173c1c4>] kernfs_fop_write_iter+0x124/0x1b4
    [<ffffffefd16a013c>] new_sync_write+0xdc/0x160
    [<ffffffefd16a256c>] vfs_write+0x23c/0x2a0
    [<ffffffefd16a2758>] ksys_write+0x64/0xec

To prevent this we should free the dma_range_map when the device is
released.

Fixes: e0d072782c ("dma-mapping: introduce DMA range map, supplanting dma_pfn_offset")
Cc: stable <stable@vger.kernel.org>
Suggested-by: Rob Herring <robh@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Link: https://lore.kernel.org/r/20220216094128.4025861-1-marten.lindahl@axis.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:16 -05:00
Mark Langsdorf ab6d934b4b driver core: Make bus notifiers in right order in really_probe()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit 00eb74ea2c14418042347eaa34c6b73ac6ec1e76
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date: Fri, 31 Dec 2021 11:39:01 +0800

If a driver cannot be bound to a device, the correct bus notifier order
should be:

 - BUS_NOTIFY_BIND_DRIVER: driver is about to be bound
 - BUS_NOTIFY_DRIVER_NOT_BOUND: driver failed to be bound

or no notifier if the failure happens before the actual binding.

The really_probe() notifies a BUS_NOTIFY_DRIVER_NOT_BOUND event without
a BUS_NOTIFY_BIND_DRIVER if .dma_configure() returns failure. This
change makes the notifiers in order.
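
The intended ordering can be sketched with a small event log. Everything here is a hypothetical userspace model: the enum values merely mirror the notifier names in the text, and dma_err / probe_err stand in for the results of .dma_configure() and the driver's probe().

```c
#include <assert.h>
#include <stddef.h>

enum bus_event { EV_BIND_DRIVER, EV_DRIVER_NOT_BOUND, EV_BOUND_DRIVER };

#define MAX_EVENTS 4

struct event_log {
	enum bus_event ev[MAX_EVENTS];
	size_t n;
};

static void notify(struct event_log *log, enum bus_event e)
{
	assert(log->n < MAX_EVENTS);
	log->ev[log->n++] = e;
}

/* Hypothetical model of the probe path and the events it emits. */
static int model_probe(struct event_log *log, int dma_err, int probe_err)
{
	/* Failure before the binding starts: no notifier at all. */
	if (dma_err)
		return dma_err;

	notify(log, EV_BIND_DRIVER);
	if (probe_err) {
		notify(log, EV_DRIVER_NOT_BOUND);
		return probe_err;
	}
	notify(log, EV_BOUND_DRIVER);
	return 0;
}
```

A listener in this model never sees EV_DRIVER_NOT_BOUND without a preceding EV_BIND_DRIVER, which is the invariant the commit restores.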

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211231033901.2168664-3-baolu.lu@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:16 -05:00
Mark Langsdorf a13ec9ee2c driver core: Move driver_sysfs_remove() after driver_sysfs_add()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067284

commit 885e50253bfd6750327a265405461496d6af1639
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date: Fri, 31 Dec 2021 11:39:00 +0800

The driver_sysfs_remove() should be called after driver_sysfs_add() in
really_probe(). The out-of-order driver_sysfs_remove() tries to remove
some nonexistent nodes under the device and driver sysfs nodes. This is
allowed, hence this change doesn't fix any problem, just a cleanup.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211231033901.2168664-2-baolu.lu@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:04:15 -05:00
Mark Langsdorf 264db420f3 driver core: Fix error return code in really_probe()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067252

commit f04948dea236b000da09c466a7ec931ecd8d7867
Author: Zhen Lei <thunder.leizhen@huawei.com>
Date: Wed, 7 Jul 2021 15:43:01 +0800

In the case of error handling, the error code returned by the subfunction
should be propagated instead of 0.

Fixes: 1901fb2604 ("Driver core: fix "driver" symlink timing")
Fixes: 23b6904442 ("driver core: add dev_groups to all drivers")
Fixes: 8fd456ec0c ("driver core: Add state_synced sysfs file for devices that support it")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210707074301.2722-1-thunder.leizhen@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:01:47 -05:00
Filip Schauer 4d1014c181 drivers core: Fix oops when driver probe fails
dma_range_map is freed too early, which might cause an oops when
a driver probe fails.
 Call trace:
  is_free_buddy_page+0xe4/0x1d4
  __free_pages+0x2c/0x88
  dma_free_contiguous+0x64/0x80
  dma_direct_free+0x38/0xb4
  dma_free_attrs+0x88/0xa0
  dmam_release+0x28/0x34
  release_nodes+0x78/0x8c
  devres_release_all+0xa8/0x110
  really_probe+0x118/0x2d0
  __driver_probe_device+0xc8/0xe0
  driver_probe_device+0x54/0xec
  __driver_attach+0xe0/0xf0
  bus_for_each_dev+0x7c/0xc8
  driver_attach+0x30/0x3c
  bus_add_driver+0x17c/0x1c4
  driver_register+0xc0/0xf8
  __platform_driver_register+0x34/0x40
  ...

This issue is introduced by commit d0243bbd5d ("drivers core:
Free dma_range_map when driver probe failed"). It frees
dma_range_map before the call to devres_release_all, which is too
early. The solution is to free dma_range_map only after
devres_release_all.

Fixes: d0243bbd5d ("drivers core: Free dma_range_map when driver probe failed")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Filip Schauer <filip@mg6.at>
Link: https://lore.kernel.org/r/20210727112311.GA7645@DESKTOP-E8BN1B0.localdomain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-27 14:44:43 +02:00
Jason Gunthorpe 0d9f837c69 driver core: Export device_driver_attach()
This is intended as a replacement API for device_bind_driver(). It has at
least the following benefits:

- Internal locking. Few of the users of device_bind_driver() follow the
  locking rules

- Calls device driver probe() internally. Notably this means that devm
  support for probe works correctly as probe() error will call
  devres_release_all()

- struct device_driver -> dev_groups is supported

- Simplified calling convention, no need to manually call probe().

The general usage is for situations that already know what driver to bind
and need to ensure the bind is synchronized with other logic. Call
device_driver_attach() after device_add().

If probe() returns a failure then this will be preserved up through to the
error return of device_driver_attach().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210617142218.1877096-6-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-21 15:29:24 -06:00
Christoph Hellwig 45ddcb4294 driver core: Don't return EPROBE_DEFER to userspace during sysfs bind
EPROBE_DEFER is an internal kernel error code and it should not be leaked
to userspace via the bind_store() sysfs. Userspace doesn't have this
constant and cannot understand it.

Further, it doesn't really make sense to have userspace trigger a deferred
probe via bind_store(), which could eventually succeed, while
simultaneously returning an error back.

Resolve this by splitting driver_probe_device so that the version used
by the sysfs binding turns EPROBE_DEFER into -EAGAIN, while the one
used for internal binding keeps the error code and calls
driver_deferred_probe_add where needed.  This also makes it possible to nicely split
out the defer_all_probes / probe_count checks so that they actually allow
for full device_{block,unblock}_probing protection while not bothering
the sysfs bind case.
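
A hedged sketch of the split described above, as a userspace model. The function and struct names are assumptions made for illustration; EPROBE_DEFER is defined locally because userspace <errno.h> does not provide it.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal error code; not defined by userspace <errno.h>. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical device model: only tracks whether a retry was queued. */
struct model_dev { int deferred; };

/* sysfs path: never leak the internal code to userspace. */
static int sysfs_bind_result(int probe_ret)
{
	assert(probe_ret <= 0);
	return probe_ret == -EPROBE_DEFER ? -EAGAIN : probe_ret;
}

/* Internal path: keep the code and queue the device for a later retry. */
static int internal_bind_result(struct model_dev *dev, int probe_ret)
{
	assert(probe_ret <= 0);
	if (probe_ret == -EPROBE_DEFER)
		dev->deferred = 1;  /* models driver_deferred_probe_add() */
	return probe_ret;
}
```

Other error codes pass through both paths unchanged; only the deferral case is treated differently.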

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210617142218.1877096-5-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-21 15:29:24 -06:00
Christoph Hellwig ef6dcbdd8e driver core: Flow the return code from ->probe() through to sysfs bind
Currently really_probe() returns 1 on success and 0 if the probe() call
fails. This return code arrangement is designed to be useful for
__device_attach_driver() which is walking the device list and trying every
driver. 0 means to keep trying.

However, it is not useful for the other places that call through to
really_probe() that do actually want to see the probe() return code.

For instance bind_store() would be better to return the actual error code
from the driver's probe method, not discarding it and returning -ENODEV.

Reorganize things so that really_probe() returns the error code from
->probe as a (inverted) positive number, and 0 for successful attach.

With this, __device_attach_driver can ignore the (positive) probe errors,
return 1 to exit the loop for a successful binding and pass on the
other negative errors, while device_driver_attach simply inverts the
positive errors and returns all errors to the sysfs code.
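
The return-code convention above can be summarised in a few lines of C. All three functions are hypothetical models named after the roles in the text; they take plain integers rather than devices and drivers.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model of really_probe()'s result: 0 on success, a positive
 * number for an error returned by ->probe (sign flipped), and a negative
 * number for any other failure. Both inputs are 0 or negative errno values.
 */
static int model_really_probe(int other_err, int probe_err)
{
	assert(other_err <= 0 && probe_err <= 0);
	if (other_err)
		return other_err;
	return -probe_err;
}

/* Loop-callback style: 1 stops the walk (bound), 0 means try the next driver. */
static int model_attach_driver(int ret)
{
	if (ret == 0)
		return 1;
	if (ret > 0)
		return 0;
	return ret;
}

/* sysfs style: flip positive probe errors back to negative errno values. */
static int model_device_driver_attach(int ret)
{
	return ret > 0 ? -ret : ret;
}
```

The sign of the value is what lets the list-walking caller ignore probe() failures while the sysfs caller still reports them.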

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20210617142218.1877096-4-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-21 15:29:24 -06:00
Christoph Hellwig e1499647c6 driver core: Better distinguish probe errors in really_probe
really_probe tries to special case errors from ->probe, but due to all
other initialization added to the function over time now a lot of
internal errors hit that code path as well.  Untangle that by adding
a new probe_err local variable and apply the special casing only to
that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Link: https://lore.kernel.org/r/20210617142218.1877096-3-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-21 15:29:24 -06:00
Jason Gunthorpe 204db60c83 driver core: Pull required checks into driver_probe_device()
Checking if the dev is dead or if the dev is already bound is a required
precondition to invoking driver_probe_device(). All the call chains
leading here duplicate these checks.

Add it directly to driver_probe_device() so the precondition is clear and
remove the checks from device_driver_attach() and
__driver_attach_async_helper().

The other call chain going through __device_attach_driver() does have
these same checks but they are inlined into logic higher up the call stack
and can't be removed.

The sysfs uAPI call chain starting at bind_store() is a bit confused
because it reads dev->driver unlocked and returns -ENODEV if it is !NULL,
otherwise it reads it again under lock and returns 0 if it is !NULL. Fix
this to always return -EBUSY and always read dev->driver under its lock.

Done in preparation for the next patches which will add additional
callers to driver_probe_device() and will need these checks as well.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
[hch: drop the extra checks in device_driver_attach and bind_store]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20210617142218.1877096-2-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-21 15:29:24 -06:00
Greg Kroah-Hartman a00fcbc115 Linux 5.12-rc7

Merge tag 'v5.12-rc7' into driver-core-next

We need the driver core fix in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-14 19:53:39 +02:00
Ahmad Fatoum 72a91f192d driver core: add helper for deferred probe reason setting
We now have three places within the same file doing the same operation
of freeing this pointer and setting it anew. A helper makes this
arguably easier to read, so add one.
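
The "free this pointer and set it anew" operation could look roughly like the userspace sketch below. The struct and both function names are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-device private data. */
struct model_dev_private {
	char *deferred_probe_reason;  /* heap-allocated, or NULL */
};

/*
 * Free the old reason and install `reason` (NULL clears it). Ownership of
 * `reason` passes to the struct.
 */
static void set_deferred_probe_reason(struct model_dev_private *p,
				      char *reason)
{
	assert(p != NULL);
	free(p->deferred_probe_reason);
	p->deferred_probe_reason = reason;
}

/* Convenience for the sketch: duplicate a string onto the heap. */
static char *dup_reason(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy)
		memcpy(copy, s, n);
	return copy;
}
```

Passing NULL clears a stale reason, and passing a new heap string replaces it without leaking the old one, so every call site shares one free-and-assign path.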

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20210323153714.25120-2-a.fatoum@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-05 13:17:01 +02:00
Saravana Kannan d46f3e3ed5 driver core: Improve fw_devlink & deferred_probe_timeout interaction
deferred_probe_timeout kernel commandline parameter allows probing of
consumer devices if the supplier devices don't have any drivers.

fw_devlink=on will indefinitely block probe() calls on a device if all
its suppliers haven't probed successfully. This completely skips calls
to driver_deferred_probe_check_state() since that's only called when a
.probe() function calls framework APIs. So fw_devlink=on breaks
deferred_probe_timeout.

deferred_probe_timeout in its current state also ignores a lot of
information that's now available to the kernel. It assumes all suppliers
that haven't probed when the timer expires (or when initcalls are done
on a static kernel) will never probe and fails any calls to acquire
resources from these unprobed suppliers.

However, this assumption by deferred_probe_timeout isn't true under many
conditions. For example:
- If the consumer happens to be before the supplier in the deferred
  probe list.
- If the supplier itself is waiting on its supplier to probe.

This patch fixes both these issues by relaxing device links between
devices only if the supplier doesn't have any driver that could match
with (NOT bound to) the supplier device. This way, we only fail attempts
to acquire resources from suppliers that truly don't have any driver vs
suppliers that just happen to not have probed yet.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210402040342.2944858-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-05 09:17:56 +02:00
Saravana Kannan eed6e41813 driver core: Fix locking bug in deferred_probe_timeout_work_func()
list_for_each_entry_safe() is only useful if we are deleting nodes in a
linked list within the loop. It doesn't protect against other threads
adding/deleting nodes to the list in parallel. We need to grab
deferred_probe_mutex when traversing the deferred_probe_pending_list.

Cc: stable@vger.kernel.org
Fixes: 25b4e70dcc ("driver core: allow stopping deferred probe after init")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210402040342.2944858-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-05 09:14:18 +02:00
Greg Kroah-Hartman b20e829390 Merge 5.12-rc6 into driver-core-next
We need the driver core fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-05 08:51:37 +02:00
Jia-Ju Bai d225ef6fda base: dd: fix error return code of driver_sysfs_add()
When device_create_file() fails and returns a non-zero value,
no error return code of driver_sysfs_add() is assigned.
To fix this bug, ret is assigned with the return value of
device_create_file(), and then ret is checked.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Link: https://lore.kernel.org/r/20210324023405.12465-1-baijiaju1990@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-28 14:55:39 +02:00
Yogesh Lal e611f8cd87 driver core: Use unbound workqueue for deferred probes
Deferred probe usually runs only on pinned kworkers, which might take
a longer time if a device contains multiple sub-devices. One such case
is the sound card on mobile devices, where we have a good number of
mixers and controls per mixer.

We observed boot up improvement - deferred probes take ~600ms when bound
to little core kworker and ~200ms when deferred probe is queued on
unbound wq. This is due to scheduler moving the worker running deferred
probe work to big CPUs. Without this change, we see the worker is running
on LITTLE CPU due to affinity.

Since the kworker runs deferred probes of several devices, locality may
not be important. Also, the init thread executing driver initcalls can
potentially migrate, as it has its CPU affinity set to all CPUs. In addition
to this, async probes use an unbound workqueue. So, using an unbound wq for
deferred probes looks to be similar to these w.r.t. scheduling behavior.

Signed-off-by: Yogesh Lal <ylal@codeaurora.org>
Link: https://lore.kernel.org/r/1616583698-6398-1-git-send-email-ylal@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-28 14:55:39 +02:00
Ahmad Fatoum f0acf637d6 driver core: clear deferred probe reason on probe retry
When retrying a deferred probe, any old defer reason string should be
discarded. Otherwise, if the probe is deferred again at a different spot,
but without setting a message, the now incorrect probe reason will remain.

This was observed with the i.MX I2C driver, which ultimately failed
to probe due to lack of the GPIO driver. The probe defer for GPIO
doesn't record a message, but a previous probe defer to clock_get did.
This had the effect that /sys/kernel/debug/devices_deferred listed
a misleading probe deferral reason.

Cc: stable <stable@vger.kernel.org>
Fixes: d090b70ede ("driver core: add deferring probe reason to devices_deferred property")
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20210319110459.19966-1-a.fatoum@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 15:13:43 +01:00
Saravana Kannan b6f617df4f driver core: Update device link status properly for device_bind_driver()
Device link status was not getting updated correctly when
device_bind_driver() is called on a device. This causes a warning[1].
Fix this by updating device links that can be updated and dropping
device links that can't be updated to a sensible state.

[1] - https://lore.kernel.org/lkml/56f7d032-ba5a-a8c7-23de-2969d98c527e@nvidia.com/

Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210302211133.2244281-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 14:58:11 +01:00