linux-kernelorg-stable/include/drm/drm_device.h

372 lines
9.3 KiB
C
Raw Normal View History

#ifndef _DRM_DEVICE_H_
#define _DRM_DEVICE_H_
#include <linux/list.h>
#include <linux/kref.h>
#include <linux/mutex.h>
#include <linux/idr.h>
#include <drm/drm_mode_config.h>
struct drm_driver;
struct drm_minor;
struct drm_master;
struct drm_vblank_crtc;
struct drm_vma_offset_manager;
struct drm_vram_mm;
struct drm_fb_helper;
struct inode;
struct pci_dev;
struct pci_controller;
drm: Introduce device wedged event Introduce device wedged event, which notifies userspace of 'wedged' (hanged/unusable) state of the DRM device through a uevent. This is useful especially in cases where the device is no longer operating as expected and has become unrecoverable from driver context. Purpose of this implementation is to provide drivers a generic way to recover the device with the help of userspace intervention without taking any drastic measures (like resetting or re-enumerating the full bus, on which the underlying physical device is sitting) in the driver. A 'wedged' device is basically a device that is declared dead by the driver after exhausting all possible attempts to recover it from driver context. The uevent is the notification that is sent to userspace along with a hint about what could possibly be attempted to recover the device from userspace and bring it back to usable state. Different drivers may have different ideas of a 'wedged' device depending on hardware implementation of the underlying physical device, and hence the vendor agnostic nature of the event. It is up to the drivers to decide when they see the need for device recovery and how they want to recover from the available methods. Driver prerequisites -------------------- The driver, before opting for recovery, needs to make sure that the 'wedged' device doesn't harm the system as a whole by taking care of the prerequisites. Necessary actions must include disabling DMA to system memory as well as any communication channels with other devices. Further, the driver must ensure that all dma_fences are signalled and any device state that the core kernel might depend on is cleaned up. All existing mmaps should be invalidated and page faults should be redirected to a dummy page. Once the event is sent, the device must be kept in 'wedged' state until the recovery is performed. New accesses to the device (IOCTLs) should be rejected, preferably with an error code that resembles the type of failure the device has encountered. This will signify the reason for wedging, which can be reported to the application if needed. Recovery -------- Current implementation defines three recovery methods, out of which, drivers can use any one, multiple or none. Method(s) of choice will be sent in the uevent environment as ``WEDGED=<method1>[,..,<methodN>]`` in order of less to more side-effects. If driver is unsure about recovery or method is unknown (like soft/hard system reboot, firmware flashing, physical device replacement or any other procedure which can't be attempted on the fly), ``WEDGED=unknown`` will be sent instead. Userspace consumers can parse this event and attempt recovery as per the following expectations. =============== ======================================== Recovery method Consumer expectations =============== ======================================== none optional telemetry collection rebind unbind + bind driver bus-reset unbind + bus reset/re-enumeration + bind unknown consumer policy =============== ======================================== The only exception to this is ``WEDGED=none``, which signifies that the device was temporarily 'wedged' at some point but was recovered from driver context using device specific methods like reset. No explicit recovery is expected from the consumer in this case, but it can still take additional steps like gathering telemetry information (devcoredump, syslog). This is useful because the first hang is usually the most critical one which can result in consequential hangs or complete wedging. Consumer prerequisites ---------------------- It is the responsibility of the consumer to make sure that the device or its resources are not in use by any process before attempting recovery. With IOCTLs erroring out, all device memory should be unmapped and file descriptors should be closed to prevent leaks or undefined behaviour. The idea here is to clear the device of all user context beforehand and set the stage for a clean recovery. Example ------- Udev rule:: SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]", RUN+="/path/to/rebind.sh $env{DEVPATH}" Recovery script:: #!/bin/sh DEVPATH=$(readlink -f /sys/$1/device) DEVICE=$(basename $DEVPATH) DRIVER=$(readlink -f $DEVPATH/driver) echo -n $DEVICE > $DRIVER/unbind echo -n $DEVICE > $DRIVER/bind Customization ------------- Although basic recovery is possible with a simple script, consumers can define custom policies around recovery. For example, if the driver supports multiple recovery methods, consumers can opt for the suitable one depending on scenarios like repeat offences or vendor specific failures. Consumers can also choose to have the device available for debugging or telemetry collection and base their recovery decision on the findings. This is useful especially when the driver is unsure about recovery or method is unknown. v4: s/drm_dev_wedged/drm_dev_wedged_event Use drm_info() (Jani) Kernel doc adjustment (Aravind) v5: Send recovery method with uevent (Lina) v6: Access wedge_recovery_opts[] using helper function (Jani) Use snprintf() (Jani) v7: Convert recovery helpers into regular functions (Andy, Jani) Aesthetic adjustments (Andy) Handle invalid recovery method v8: Allow sending multiple methods with uevent (Lucas, Michal) static_assert() globally (Andy) v9: Provide 'none' method for device reset (Christian) Provide recovery opts using switch cases v11: Log device reset (André) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250204070528.1919158-2-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-04 07:05:24 +00:00
/*
* Recovery methods for wedged device in order of less to more side-effects.
* To be used with drm_dev_wedged_event() as recovery @method. Callers can
* use any one, multiple (or'd) or none depending on their needs.
*/
#define DRM_WEDGE_RECOVERY_NONE BIT(0) /* optional telemetry collection */
#define DRM_WEDGE_RECOVERY_REBIND BIT(1) /* unbind + bind driver */
#define DRM_WEDGE_RECOVERY_BUS_RESET BIT(2) /* unbind + reset bus device + bind */
/**
* enum switch_power_state - power state of drm device
*/
enum switch_power_state {
/** @DRM_SWITCH_POWER_ON: Power state is ON */
DRM_SWITCH_POWER_ON = 0,
/** @DRM_SWITCH_POWER_OFF: Power state is OFF */
DRM_SWITCH_POWER_OFF = 1,
/** @DRM_SWITCH_POWER_CHANGING: Power state is changing */
DRM_SWITCH_POWER_CHANGING = 2,
/** @DRM_SWITCH_POWER_DYNAMIC_OFF: Suspended */
DRM_SWITCH_POWER_DYNAMIC_OFF = 3,
};
/**
* struct drm_device - DRM device structure
*
* This structure represent a complete card that
* may contain multiple heads.
*/
struct drm_device {
/** @if_version: Highest interface version set */
int if_version;
/** @ref: Object ref-count */
struct kref ref;
/** @dev: Device structure of bus-device */
struct device *dev;
/**
* @dma_dev:
*
* Device for DMA operations. Only required if the device @dev
* cannot perform DMA by itself. Should be NULL otherwise. Call
* drm_dev_dma_dev() to get the DMA device instead of using this
* field directly. Call drm_dev_set_dma_dev() to set this field.
*
* DRM devices are sometimes bound to virtual devices that cannot
* perform DMA by themselves. Drivers should set this field to the
* respective DMA controller.
*
* Devices on USB and other peripheral busses also cannot perform
* DMA by themselves. The @dma_dev field should point the bus
* controller that does DMA on behalve of such a device. Required
* for importing buffers via dma-buf.
*
* If set, the DRM core automatically releases the reference on the
* device.
*/
struct device *dma_dev;
drm: add managed resources tied to drm_device We have lots of these. And the cleanup code tends to be of dubious quality. The biggest wrong pattern is that developers use devm_, which ties the release action to the underlying struct device, whereas all the userspace visible stuff attached to a drm_device can long outlive that one (e.g. after a hotunplug while userspace has open files and mmap'ed buffers). Give people what they want, but with more correctness. Mostly copied from devres.c, with types adjusted to fit drm_device and a few simplifications - I didn't (yet) copy over everything. Since the types don't match code sharing looked like a hopeless endeavour. For now it's only super simplified, no groups, you can't remove actions (but kfree exists, we'll need that soon). Plus all specific to drm_device ofc, including the logging. Which I didn't bother to make compile-time optional, since none of the other drm logging is compile time optional either. One tricky bit here is the chicken&egg between allocating your drm_device structure and initiliazing it with drm_dev_init. For perfect onion unwinding we'd need to have the action to kfree the allocation registered before drm_dev_init registers any of its own release handlers. But drm_dev_init doesn't know where exactly the drm_device is emebedded into the overall structure, and by the time it returns it'll all be too late. And forcing drivers to be able clean up everything except the one kzalloc is silly. Work around this by having a very special final_kfree pointer. This also avoids troubles with the list head possibly disappearing from underneath us when we release all resources attached to the drm_device. v2: Do all the kerneldoc at the end, to avoid lots of fairly pointless shuffling while getting everything into shape. v3: Add static to add/del_dr (Neil) Move typo fix to the right patch (Neil) v4: Enforce contract for drmm_add_final_kfree: Use ksize() to check that the drm_device is indeed contained somewhere in the final kfree(). Because we need that or the entire managed release logic blows up in a pile of use-after-frees. Motivated by a discussion with Laurent. v5: Review from Laurent: - %zu instead of casting size_t - header guards - sorting of includes - guarding of data assignment if we didn't allocate it for a NULL pointer - delete spurious newline - cast void* data parameter correctly in ->release call, no idea how this even worked before v6: Review from Sam - Add the kerneldoc for the managed sub-struct back in, even if it doesn't show up in the generated html somehow. - Explain why __always_inline. - Fix bisectability around the final kfree() in drm_dev_relase(). This is just interim code which will disappear again. - Some whitespace polish. - Add debug output when drmm_add_action or drmm_kmalloc fail. v7: My bisectability fix wasn't up to par as noticed by smatch. v8: Remove unecessary {} around if else v9: Use kstrdup_const, which requires kfree_const and introducing a free_dr() helper (Thomas). v10: kfree_const goes boom on the plain "kmalloc" assignment, somehow we need to wrap that in kstrdup_const() too!! Also renumber revision log, I somehow reset it midway thruh. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Neil Armstrong <narmstrong@baylibre.com Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200324124540.3227396-1-daniel.vetter@ffwll.ch
2020-03-24 12:45:40 +00:00
/**
* @managed:
*
* Managed resources linked to the lifetime of this &drm_device as
* tracked by @ref.
*/
struct {
/** @managed.resources: managed resources list */
struct list_head resources;
/** @managed.final_kfree: pointer for final kfree() call */
void *final_kfree;
/** @managed.lock: protects @managed.resources */
spinlock_t lock;
} managed;
/** @driver: DRM driver managing the device */
const struct drm_driver *driver;
/**
* @dev_private:
*
* DRM driver private data. This is deprecated and should be left set to
* NULL.
*
* Instead of using this pointer it is recommended that drivers use
* devm_drm_dev_alloc() and embed struct &drm_device in their larger
* per-device structure.
*/
void *dev_private;
drm: document better that drivers shouldn't use drm_minor directly The documentation for struct drm_minor already states this, but that's not always that easy to find. Also due to historical reasons we still have the minor-centric interfaces (like drm_debugfs_create_files), but since this is now getting fixed we can put a few more pointers in place as to how this should be done ideally. Note that debugfs isn't there yet for all cases (debugfs files on kms objects like crtc/connector aren't supported, neither debugfs files with full fops), so the debugfs side of this is still rather aspirational and more for new users than converting everything existing. todo.rst covers the additional work needed already. Motivated by some discussion with Rodrigo on irc about how drm/xe should lay out its sysfs interfaces. v2: Make the debugfs situation clearer in the commit message, but don't elaborate more in the actual kerneldoc to avoid distracting from the main message around sysfs (Jani) Also fix some typos. Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Maxime Ripard <maxime@cerno.tech> Acked-by: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Wambui Karuga <wambui.karugax@gmail.com> Cc: Maíra Canal <mcanal@igalia.com> Cc: Maxime Ripard <maxime@cerno.tech> Cc: Melissa Wen <mwen@igalia.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230109164604.3860862-1-daniel.vetter@ffwll.ch
2023-01-09 16:46:04 +00:00
/**
* @primary:
*
* Primary node. Drivers should not interact with this
* directly. debugfs interfaces can be registered with
* drm_debugfs_add_file(), and sysfs should be directly added on the
* hardware (and not character device node) struct device @dev.
*/
struct drm_minor *primary;
drm: document better that drivers shouldn't use drm_minor directly The documentation for struct drm_minor already states this, but that's not always that easy to find. Also due to historical reasons we still have the minor-centric interfaces (like drm_debugfs_create_files), but since this is now getting fixed we can put a few more pointers in place as to how this should be done ideally. Note that debugfs isn't there yet for all cases (debugfs files on kms objects like crtc/connector aren't supported, neither debugfs files with full fops), so the debugfs side of this is still rather aspirational and more for new users than converting everything existing. todo.rst covers the additional work needed already. Motivated by some discussion with Rodrigo on irc about how drm/xe should lay out its sysfs interfaces. v2: Make the debugfs situation clearer in the commit message, but don't elaborate more in the actual kerneldoc to avoid distracting from the main message around sysfs (Jani) Also fix some typos. Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Maxime Ripard <maxime@cerno.tech> Acked-by: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Wambui Karuga <wambui.karugax@gmail.com> Cc: Maíra Canal <mcanal@igalia.com> Cc: Maxime Ripard <maxime@cerno.tech> Cc: Melissa Wen <mwen@igalia.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230109164604.3860862-1-daniel.vetter@ffwll.ch
2023-01-09 16:46:04 +00:00
/**
* @render:
*
* Render node. Drivers should not interact with this directly ever.
* Drivers should not expose any additional interfaces in debugfs or
* sysfs on this node.
*/
struct drm_minor *render;
accel: add dedicated minor for accelerator devices The accelerator devices are exposed to user-space using a dedicated major. In addition, they are represented in /dev with new, dedicated device char names: /dev/accel/accel*. This is done to make sure any user-space software that tries to open a graphic card won't open the accelerator device by mistake. The above implies that the minor numbering should be separated from the rest of the DRM devices. However, to avoid code duplication, we want the drm_minor structure to be able to represent the accelerator device. To achieve this, we add a new drm_minor* to drm_device that represents the accelerator device. This pointer is initialized for drivers that declare they handle compute accelerator, using a new driver feature flag called DRIVER_COMPUTE_ACCEL. It is important to note that this driver feature is mutually exclusive with DRIVER_RENDER. Devices that want to expose both graphics and compute device char files should be handled by two drivers that are connected using the auxiliary bus framework. In addition, we define a different IDR to handle the accelerators minors. This is done to make the minor's index be identical to the device index in /dev/. Any access to the IDR is done solely by functions in accel_drv.c, as the IDR is define as static. The DRM core functions call those functions in case they detect the minor's type is DRM_MINOR_ACCEL. We define a separate accel_open function (from drm_open) that the accel drivers should set as their open callback function. Both these functions eventually call the same drm_open_helper(), which had to be changed to be non-static so it can be called from accel_drv.c. accel_open() only partially duplicates drm_open as I removed some code from it that handles legacy devices. To help new drivers, I defined DEFINE_DRM_ACCEL_FOPS macro to easily set the required function operations pointers structure. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Tested-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Melissa Wen <mwen@igalia.com>
2022-10-31 13:33:06 +00:00
/** @accel: Compute Acceleration node */
struct drm_minor *accel;
/**
* @registered:
*
* Internally used by drm_dev_register() and drm_connector_register().
*/
bool registered;
/**
* @master:
*
* Currently active master for this device.
* Protected by &master_mutex
*/
struct drm_master *master;
/**
* @driver_features: per-device driver features
*
* Drivers can clear specific flags here to disallow
* certain features on a per-device basis while still
* sharing a single &struct drm_driver instance across
* all devices.
*/
u32 driver_features;
/**
* @unplugged:
*
* Flag to tell if the device has been unplugged.
* See drm_dev_enter() and drm_dev_is_unplugged().
*/
bool unplugged;
/** @anon_inode: inode for private address-space */
struct inode *anon_inode;
/** @unique: Unique name of the device */
char *unique;
/**
* @struct_mutex:
*
* Lock for others (not &drm_minor.master and &drm_file.is_master)
*
* TODO: This lock used to be the BKL of the DRM subsystem. Move the
* lock into i915, which is the only remaining user.
*/
struct mutex struct_mutex;
/**
* @master_mutex:
*
* Lock for &drm_minor.master and &drm_file.is_master
*/
struct mutex master_mutex;
/**
* @open_count:
*
* Usage counter for outstanding files open,
* protected by drm_global_mutex
*/
atomic_t open_count;
/** @filelist_mutex: Protects @filelist. */
struct mutex filelist_mutex;
/**
* @filelist:
*
* List of userspace clients, linked through &drm_file.lhead.
*/
struct list_head filelist;
/**
* @filelist_internal:
*
* List of open DRM files for in-kernel clients.
* Protected by &filelist_mutex.
*/
struct list_head filelist_internal;
/**
* @clientlist_mutex:
*
* Protects &clientlist access.
*/
struct mutex clientlist_mutex;
/**
* @clientlist:
*
* List of in-kernel clients. Protected by &clientlist_mutex.
*/
struct list_head clientlist;
/**
* @vblank_disable_immediate:
*
* If true, vblank interrupt will be disabled immediately when the
* refcount drops to zero, as opposed to via the vblank disable
* timer.
*
* This can be set to true it the hardware has a working vblank counter
* with high-precision timestamping (otherwise there are races) and the
* driver uses drm_crtc_vblank_on() and drm_crtc_vblank_off()
* appropriately. Also, see @max_vblank_count,
* &drm_crtc_funcs.get_vblank_counter and
* &drm_vblank_crtc_config.disable_immediate.
*/
bool vblank_disable_immediate;
/**
* @vblank:
*
* Array of vblank tracking structures, one per &struct drm_crtc. For
* historical reasons (vblank support predates kernel modesetting) this
* is free-standing and not part of &struct drm_crtc itself. It must be
* initialized explicitly by calling drm_vblank_init().
*/
struct drm_vblank_crtc *vblank;
/**
* @vblank_time_lock:
*
* Protects vblank count and time updates during vblank enable/disable
*/
spinlock_t vblank_time_lock;
/**
* @vbl_lock: Top-level vblank references lock, wraps the low-level
* @vblank_time_lock.
*/
spinlock_t vbl_lock;
/**
* @max_vblank_count:
*
* Maximum value of the vblank registers. This value +1 will result in a
* wrap-around of the vblank register. It is used by the vblank core to
* handle wrap-arounds.
*
* If set to zero the vblank core will try to guess the elapsed vblanks
* between times when the vblank interrupt is disabled through
* high-precision timestamps. That approach is suffering from small
* races and imprecision over longer time periods, hence exposing a
* hardware vblank counter is always recommended.
*
* This is the statically configured device wide maximum. The driver
* can instead choose to use a runtime configurable per-crtc value
* &drm_vblank_crtc.max_vblank_count, in which case @max_vblank_count
* must be left at zero. See drm_crtc_set_max_vblank_count() on how
* to use the per-crtc value.
*
* If non-zero, &drm_crtc_funcs.get_vblank_counter must be set.
*/
u32 max_vblank_count;
/** @vblank_event_list: List of vblank events */
struct list_head vblank_event_list;
/**
* @event_lock:
*
* Protects @vblank_event_list and event delivery in
* general. See drm_send_event() and drm_send_event_locked().
*/
spinlock_t event_lock;
/** @num_crtcs: Number of CRTCs on this device */
unsigned int num_crtcs;
/** @mode_config: Current mode config */
struct drm_mode_config mode_config;
/** @object_name_lock: GEM information */
struct mutex object_name_lock;
/** @object_name_idr: GEM information */
struct idr object_name_idr;
/** @vma_offset_manager: GEM information */
struct drm_vma_offset_manager *vma_offset_manager;
/** @vram_mm: VRAM MM memory manager */
struct drm_vram_mm *vram_mm;
/**
* @switch_power_state:
*
* Power state of the client.
* Used by drivers supporting the switcheroo driver.
* The state is maintained in the
* &vga_switcheroo_client_ops.set_gpu_state callback
*/
enum switch_power_state switch_power_state;
/**
* @fb_helper:
*
* Pointer to the fbdev emulation structure.
* Set by drm_fb_helper_init() and cleared by drm_fb_helper_fini().
*/
struct drm_fb_helper *fb_helper;
/**
* @debugfs_root:
*
* Root directory for debugfs files.
*/
struct dentry *debugfs_root;
};
void drm_dev_set_dma_dev(struct drm_device *dev, struct device *dma_dev);
/**
* drm_dev_dma_dev - returns the DMA device for a DRM device
* @dev: DRM device
*
* Returns the DMA device of the given DRM device. By default, this
* the DRM device's parent. See drm_dev_set_dma_dev().
*
* Returns:
* A DMA-capable device for the DRM device.
*/
static inline struct device *drm_dev_dma_dev(struct drm_device *dev)
{
if (dev->dma_dev)
return dev->dma_dev;
return dev->dev;
}
#endif