17 Commits

Author SHA1 Message Date
Ivan Vecera 72d53e7e70 hwmon: (mlxreg-fan) Return zero speed for broken fan
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2196494

commit a1ffd3c46267ee5c807acd780e15df9bb692223f
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Sun Feb 12 16:57:30 2023 +0200

    hwmon: (mlxreg-fan) Return zero speed for broken fan

    Currently, for a broken fan the driver returns a value calculated from
    the error code (0xFF) in the related fan speed register.
    Thus, for such a fan the user gets fan{n}_fault set to 1 and
    fan{n}_input with a misleading value.

    Add a check for fan fault prior to returning the speed value and return
    zero if a fault is detected.

    Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20230212145730.24247-1-vadimp@nvidia.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-05-18 12:20:52 +02:00
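
The behavior described in commit 72d53e7e70 can be sketched roughly as follows. This is a minimal standalone illustration, not the driver's code: the function names, the numerator constant and the shape of the speed formula are assumptions made for the example. Only the 0xFF fault code and the "return zero on fault" rule come from the commit message.

```c
#include <stdint.h>

/* Fault code named in the commit message. */
#define FAN_FAULT_CODE 0xFFu
/* Assumed numerator of the speed formula; illustrative only. */
#define FAN_RPM_NUMER  1500000000u

/* Nonzero when the tachometer register holds the fault code. */
static int fan_is_faulty(uint8_t regval)
{
	return regval == FAN_FAULT_CODE;
}

/*
 * Convert a raw tachometer reading into RPM.
 * A faulty fan reports 0 instead of a value derived from the fault code.
 */
static uint32_t fan_input_rpm(uint8_t regval, uint32_t divider, uint32_t samples)
{
	uint32_t denom;

	if (fan_is_faulty(regval))
		return 0;

	denom = ((uint32_t)regval + samples) * divider;
	if (denom == 0)
		return 0; /* guard against division by zero in this sketch */

	return (FAN_RPM_NUMER + denom / 2) / denom; /* rounded division */
}
```

Without the early return, a reading of 0xFF would flow into the formula and yield a plausible-looking but meaningless speed.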
Ivan Vecera 99497f140e hwmon: (mlxreg-fan) Use pwm attribute for setting fan speed low limit
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit da74944d3a469ffc0e8229520afbf41ad01219b6
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Wed Jan 26 16:18:25 2022 +0200

    hwmon: (mlxreg-fan) Use pwm attribute for setting fan speed low limit

    Recently the 'cur_state' user space 'sysfs' interface has been
    deprecated. This interface is used in Nvidia systems for setting the
    fan speed limit. Currently the fan speed limit is set from user space
    by setting the 'sysfs' 'cur_state' attribute to 'max_state + n', where
    'n' is the required limit, for example: 15 for a 50% speed limit, 20
    for full fan speed enforcement.
    The purpose of this feature is to provide the ability to limit fan
    speed according to system-wide considerations, such as the absence of
    replaceable units (PSU or line cards), high system ambient temperature,
    unreliable transceiver temperature sensing, or other factors which
    indirectly impact the system's airflow.

    The motivation is to support fan low limit feature through 'hwmon'
    interface.

    Use the 'hwmon' 'pwm' attribute for setting the low limit for fan speed
    when the 'thermal' subsystem is configured in the kernel. In this case,
    setting fan speed through 'hwmon' will never let the 'thermal'
    subsystem select a lower duty cycle than the duty cycle selected with
    the 'pwm' attribute.
    On the other hand, fan speed is updated in hardware through 'pwm' only
    when the requested fan speed is above the last speed set by the
    'thermal' subsystem; otherwise the requested fan speed is just stored
    with no PWM update.

    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20220126141825.13545-1-vadimp@nvidia.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:17 +01:00
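
The interaction between 'hwmon' and 'thermal' described in commit 99497f140e might look like the sketch below. The structure and function names are hypothetical and the real driver talks to hardware registers. Here the "hardware" is just a field, so that the two rules (thermal never goes below the stored limit, and a pwm write touches hardware only when it is above the last thermal setting) can be seen in isolation.

```c
#include <stdint.h>

/* Hypothetical state for one PWM channel; the field names are illustrative. */
struct pwm_channel {
	uint8_t hw_duty;      /* duty cycle currently programmed in hardware */
	uint8_t thermal_duty; /* last duty cycle requested by 'thermal' */
	uint8_t low_limit;    /* low limit stored through the 'pwm' attribute */
};

/* 'thermal' request: never program a duty cycle below the stored low limit. */
static void thermal_set_duty(struct pwm_channel *ch, uint8_t duty)
{
	ch->thermal_duty = duty;
	ch->hw_duty = duty > ch->low_limit ? duty : ch->low_limit;
}

/*
 * 'hwmon' pwm write: always store the limit, but update hardware only
 * when the request is above the last speed set by 'thermal'.
 */
static void hwmon_write_pwm(struct pwm_channel *ch, uint8_t duty)
{
	ch->low_limit = duty;
	if (duty > ch->thermal_duty)
		ch->hw_duty = duty;
}
```

In this sketch a lowered limit takes effect on the next thermal update rather than immediately, which is one reading of "just stored with no PWM update".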
Ivan Vecera cfbbf38704 hwmon: (mlxreg-fan) Support distinctive names per different cooling devices
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit b2be2422c0c98b00f21c4331e2d9342bd58bfdba
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Sun Sep 26 08:35:41 2021 +0300

    hwmon: (mlxreg-fan) Support distinctive names per different cooling devices

    Provide different names for cooling device registration to allow
    binding each cooling device to the relevant thermal zone. Thus, a
    specific cooling device can be associated with the related thermal
    sensor by setting the thermal cooling device type, for example, to
    "mlxreg_fan2" and passing this type to
    thermal_zone_bind_cooling_device() through 'cdev->type'.

    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20210926053541.1806937-3-vadimp@nvidia.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:16 +01:00
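
As a rough illustration of the naming scheme in commit cfbbf38704, the sketch below builds a per-device type string and matches on it. The helper names are hypothetical, and numbering every device with the "mlxreg_fan%d" pattern is an assumption based only on the "mlxreg_fan2" example in the commit message; the driver may name the first device differently.

```c
#include <stdio.h>
#include <string.h>

/* Build the cooling device type name for a given (assumed 1-based) index. */
static int cooling_name(char *buf, size_t len, int index)
{
	return snprintf(buf, len, "mlxreg_fan%d", index);
}

/*
 * Decide whether a cooling device of the given type is the one a thermal
 * zone wants to bind, by comparing type strings.
 */
static int cdev_type_matches(const char *type, int index)
{
	char want[32];

	cooling_name(want, sizeof(want), index);
	return strcmp(type, want) == 0;
}
```

With a single shared name, a bind callback comparing type strings could not tell the devices apart, which is the problem the commit addresses.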
Ivan Vecera 10f4759406 hwmon: (mlxreg-fan) Modify PWM connectivity validation
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit b1c24237341f6fb910c6cb15489222a9f47258d6
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Sun Sep 26 08:35:40 2021 +0300

    hwmon: (mlxreg-fan) Modify PWM connectivity validation

    Validate PWM connectivity only for additional PWMs: "pwm1" is connected
    on all systems, while "pwm2" - "pwm4" are optional. Validate
    connectivity only for the optional attributes by reading the related
    "pwm{n}" registers. If "pwm{n}" is not connected, the register value is
    supposed to be 0xff.

    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20210926053541.1806937-2-vadimp@nvidia.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:16 +01:00
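
The validation rule from commit 10f4759406 can be expressed as a small predicate. This is a sketch with hypothetical names; the register read itself is left to the caller, and only the "pwm1 is always connected" and "0xff means not connected" rules are taken from the commit message.

```c
#include <stdint.h>

/* Register value the commit message gives for an unconnected pwm{n}. */
#define PWM_NOT_CONNECTED 0xFFu

/*
 * Decide whether pwm{n} (1-based) should be exposed.
 * pwm1 is taken as connected without consulting the register;
 * pwm2..pwm4 are checked against the value read by the caller.
 */
static int pwm_is_connected(int n, uint8_t regval)
{
	if (n == 1)
		return 1;
	return regval != PWM_NOT_CONNECTED;
}
```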
Ivan Vecera 7a07fe3958 hwmon: (mlxreg-fan) Fix out of bounds read on array fan->pwm
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit 000cc5bc49aafc83a131bab2becf9922f5eec658
Author: Colin Ian King <colin.king@canonical.com>
Date:   Mon Sep 20 19:09:21 2021 +0100

    hwmon: (mlxreg-fan) Fix out of bounds read on array fan->pwm

    Array fan->pwm[] is MLXREG_FAN_MAX_PWM elements in size, however the
    for-loop has an off-by-one error that lets index i go out of range,
    causing an out of bounds read on the array. Fix this by replacing
    the <= operator with < in the for-loop.

    Addresses-Coverity: ("Out-of-bounds read")
    Reported-by: Vadim Pasternak <vadimp@nvidia.com>
    Fixes: 35edbaab3bbf ("hwmon: (mlxreg-fan) Extend driver to support multiply cooling devices")
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Link: https://lore.kernel.org/r/20210920180921.16246-1-colin.king@canonical.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:16 +01:00
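
The off-by-one fixed in commit 7a07fe3958 follows a common pattern, shown here in a reduced form. The names are stand-ins rather than the driver's identifiers, and the array size of 4 is an assumption chosen to match the "up to four PWM controllers" mentioned elsewhere in this log.

```c
#define MAX_PWM 4 /* stand-in for MLXREG_FAN_MAX_PWM; the value is assumed */

struct pwm_slot {
	int connected;
};

/* Count connected entries; valid indexes are 0 .. MAX_PWM - 1. */
static int count_connected(const struct pwm_slot pwm[MAX_PWM])
{
	int i, n = 0;

	/* 'i <= MAX_PWM' would read pwm[MAX_PWM], one element past the end. */
	for (i = 0; i < MAX_PWM; i++)
		if (pwm[i].connected)
			n++;

	return n;
}
```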
Ivan Vecera 496ec4a0b0 hwmon: (mlxreg-fan) Extend driver to support multiply cooling devices
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit d7efb2ebc7b3c952104b9ebfbf88da97ea99a0a0
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Fri Sep 17 00:31:28 2021 +0300

    hwmon: (mlxreg-fan) Extend driver to support multiply cooling devices

    Add support for additional cooling devices in order to support
    systems which can be equipped with up to four PWM controllers.

    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:16 +01:00
Ivan Vecera 9f0c392982 hwmon: (mlxreg-fan) Extend driver to support multiply PWM
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit 150f1e0c6fa886e18e35594ae2ba5c81b5df1898
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Thu Sep 16 22:47:18 2021 +0300

    hwmon: (mlxreg-fan) Extend driver to support multiply PWM

    Add additional PWM attributes in order to support systems which can
    be equipped with up to four PWM controllers. System capability for
    additional PWM support is validated by reading the relevant
    registers.

    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20210916194719.871413-3-vadimp@nvidia.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:16 +01:00
Ivan Vecera dcbc1cc928 hwmon: (mlxreg-fan) Extend the maximum number of tachometers
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit bc8de07e8812548cc19161af3a2b83849ff03045
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Thu Sep 16 22:47:17 2021 +0300

    hwmon: (mlxreg-fan) Extend the maximum number of tachometers

    Extend the maximum number of supported tachometers from 12 to 14 in
    order to support new systems equipped with more fans.

    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20210916194719.871413-2-vadimp@nvidia.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:16 +01:00
Ivan Vecera 2a085fa8bb hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2140704

commit e6fab7af6ba1bc77c78713a83876f60ca7a4a064
Author: Vadim Pasternak <vadimp@nvidia.com>
Date:   Thu Sep 16 21:31:51 2021 +0300

    hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs

    Fan speed minimum can be enforced from sysfs. For example, setting
    current fan speed to 20 is used to enforce fan speed to be at 100%
    speed, 19 to be not below 90% speed, and so on. This feature provides
    the ability to limit fan speed according to system-wide
    considerations, such as the absence of some replaceable units or high
    system ambient temperature.

    A request to change the fan minimum speed is a configuration request
    and can be set only through the 'sysfs' write procedure. In this
    situation the value of the argument 'state' is above the nominal fan
    speed maximum.

    Return a non-zero code in this case to avoid the
    thermal_cooling_device_stats_update() call, because in this case the
    statistics update violates the thermal statistics table range.
    The issue is observed when the kernel is configured with the option
    CONFIG_THERMAL_STATISTICS.

    Here is the trace from KASAN:
    [  159.506659] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x7d/0xb0
    [  159.516016] Read of size 4 at addr ffff888116163840 by task hw-management.s/7444
    [  159.545625] Call Trace:
    [  159.548366]  dump_stack+0x92/0xc1
    [  159.552084]  ? thermal_cooling_device_stats_update+0x7d/0xb0
    [  159.635869]  thermal_zone_device_update+0x345/0x780
    [  159.688711]  thermal_zone_device_set_mode+0x7d/0xc0
    [  159.694174]  mlxsw_thermal_modules_init+0x48f/0x590 [mlxsw_core]
    [  159.700972]  ? mlxsw_thermal_set_cur_state+0x5a0/0x5a0 [mlxsw_core]
    [  159.731827]  mlxsw_thermal_init+0x763/0x880 [mlxsw_core]
    [  160.070233] RIP: 0033:0x7fd995909970
    [  160.074239] Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ..
    [  160.095242] RSP: 002b:00007fff54f5d938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    [  160.103722] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fd995909970
    [  160.111710] RDX: 0000000000000013 RSI: 0000000001906008 RDI: 0000000000000001
    [  160.119699] RBP: 0000000001906008 R08: 00007fd995bc9760 R09: 00007fd996210700
    [  160.127687] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000013
    [  160.135673] R13: 0000000000000001 R14: 00007fd995bc8600 R15: 0000000000000013
    [  160.143671]
    [  160.145338] Allocated by task 2924:
    [  160.149242]  kasan_save_stack+0x19/0x40
    [  160.153541]  __kasan_kmalloc+0x7f/0xa0
    [  160.157743]  __kmalloc+0x1a2/0x2b0
    [  160.161552]  thermal_cooling_device_setup_sysfs+0xf9/0x1a0
    [  160.167687]  __thermal_cooling_device_register+0x1b5/0x500
    [  160.173833]  devm_thermal_of_cooling_device_register+0x60/0xa0
    [  160.180356]  mlxreg_fan_probe+0x474/0x5e0 [mlxreg_fan]
    [  160.248140]
    [  160.249807] The buggy address belongs to the object at ffff888116163400
    [  160.249807]  which belongs to the cache kmalloc-1k of size 1024
    [  160.263814] The buggy address is located 64 bytes to the right of
    [  160.263814]  1024-byte region [ffff888116163400, ffff888116163800)
    [  160.277536] The buggy address belongs to the page:
    [  160.282898] page:0000000012275840 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888116167000 pfn:0x116160
    [  160.294872] head:0000000012275840 order:3 compound_mapcount:0 compound_pincount:0
    [  160.303251] flags: 0x200000000010200(slab|head|node=0|zone=2)
    [  160.309694] raw: 0200000000010200 ffffea00046f7208 ffffea0004928208 ffff88810004dbc0
    [  160.318367] raw: ffff888116167000 00000000000a0006 00000001ffffffff 0000000000000000
    [  160.327033] page dumped because: kasan: bad access detected
    [  160.333270]
    [  160.334937] Memory state around the buggy address:
    [  160.356469] >ffff888116163800: fc ..

    Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
    Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20210916183151.869427-1-vadimp@nvidia.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-11-13 19:04:16 +01:00
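
The idea behind commit 2a085fa8bb can be sketched as below. All names are hypothetical, and the nominal maximum of 10 is an assumption inferred from the "20 means 100%, 19 means 90%" example. The second function is a stand-in for a sysfs store path that updates statistics only when the setter returns zero; it is not the thermal core's code.

```c
#define MAX_STATE 10 /* assumed nominal maximum cooling state */

struct fan_cdev {
	unsigned long cur_state; /* state applied to the fan */
	unsigned long min_state; /* enforced minimum */
};

/*
 * Returns 0 for a regular state change and 1 for a configuration request
 * (state above MAX_STATE). Returns -1 for values outside both ranges.
 */
static int fan_set_cur_state(struct fan_cdev *fan, unsigned long state)
{
	int config = 0;

	if (state > 2 * MAX_STATE)
		return -1;

	if (state > MAX_STATE) {
		/* e.g. 20 -> minimum 10 (100%), 19 -> minimum 9 (90%) */
		fan->min_state = state - MAX_STATE;
		state = fan->cur_state; /* re-apply the current state below */
		config = 1;
	}

	if (state < fan->min_state)
		state = fan->min_state;
	fan->cur_state = state;

	return config;
}

/* Statistics table sized for the nominal states only. */
static unsigned int state_hits[MAX_STATE + 1];

/* Stand-in for the sysfs store path: count only regular state changes. */
static int sysfs_store_state(struct fan_cdev *fan, unsigned long state)
{
	int ret = fan_set_cur_state(fan, state);

	if (ret == 0)
		state_hits[state]++; /* safe: state <= MAX_STATE here */

	return ret;
}
```

If the setter returned 0 for a value such as 19, the caller would index the table at 19 and read or write past its end, which is the kind of access the KASAN trace in the commit message reports.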
Vadim Pasternak f7bf7eb2d7 hwmon: (mlxreg-fan) Add support for fan drawers capability and present registers
Add support for the fan drawer capability and present registers in order
to set the mapping between fan drawers and tachometers. Some systems
are equipped with fan drawers with one tachometer inside, others with
fan drawers with several tachometers inside. Using the present register
along with the tachometer-to-drawer mapping allows skipping reads of
missing tachometers and exposing their input as zero, instead of
exposing the fault code returned by hardware.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210322172237.2213584-1-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-04-20 06:50:14 -07:00
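
One way to picture the tachometer-to-drawer mapping from commit f7bf7eb2d7 is the sketch below. It assumes tachometers are split evenly across drawers and that a set bit in the present value means "drawer inserted"; both are assumptions for illustration, as are all names. The real register layout and polarity are not given in the commit message.

```c
#include <stdint.h>

/* Hypothetical layout: tachometers are split evenly across drawers. */
struct fan_layout {
	unsigned int tachos_per_drawer; /* must be nonzero; 1 or several */
	uint8_t drawer_present;         /* bit n set: drawer n is inserted */
};

/* Map a 0-based tachometer index to its 0-based drawer index. */
static unsigned int tacho_to_drawer(const struct fan_layout *l, unsigned int tacho)
{
	return tacho / l->tachos_per_drawer;
}

/*
 * Report the tachometer input. For an absent drawer the hardware value is
 * ignored and 0 is returned instead of the fault code. This sketch only
 * handles up to eight drawers (one bit each in drawer_present).
 */
static uint32_t tacho_input(const struct fan_layout *l, unsigned int tacho,
			    uint32_t hw_value)
{
	unsigned int drawer = tacho_to_drawer(l, tacho);

	if (drawer >= 8 || !(l->drawer_present & (1u << drawer)))
		return 0;

	return hw_value;
}
```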
Linus Torvalds a455eda33f Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal soc updates from Eduardo Valentin:

 - thermal core has a new devm_* API for registering cooling devices. I
   took the entire series, that is why you see changes on drivers/hwmon
   in this pull (Guenter Roeck)

 - rockchip thermal driver gains support to PX30 SoC (Elaine Zhang)

 - the generic-adc thermal driver now considers the lookup table DT
   property as optional (Jean-Francois Dagenais)

 - Refactoring of tsens thermal driver (Amit Kucheria)

 - Cleanups on cpu cooling driver (Daniel Lezcano)

 - broadcom thermal driver dropped support to ACPI (Srinath Mannam)

 - tegra thermal driver gains support to OC hw throttle and GPU throttle
   (Wei Ni)

 - Fixes in several thermal drivers.

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: (59 commits)
  hwmon: (pwm-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (npcm750-pwm-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (mlxreg-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (gpio-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (aspeed-pwm-tacho) Use devm_thermal_of_cooling_device_register
  thermal: rcar_gen3_thermal: Fix to show correct trip points number
  thermal: rcar_thermal: update calculation formula for R-Car Gen3 SoCs
  thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power
  thermal: rockchip: Support the PX30 SoC in thermal driver
  dt-bindings: rockchip-thermal: Support the PX30 SoC compatible
  thermal: rockchip: fix up the tsadc pinctrl setting error
  thermal: broadcom: Remove ACPI support
  thermal: Fix build error of missing devm_ioremap_resource on UM
  thermal/drivers/cpu_cooling: Remove pointless field
  thermal/drivers/cpu_cooling: Add Software Package Data Exchange (SPDX)
  thermal/drivers/cpu_cooling: Fixup the header and copyright
  thermal/drivers/cpu_cooling: Remove pointless test in power2state()
  thermal: rcar_gen3_thermal: disable interrupt in .remove
  thermal: rcar_gen3_thermal: fix interrupt type
  thermal: Introduce devm_thermal_of_cooling_device_register
  ...
2019-05-16 07:56:57 -07:00
Guenter Roeck 9ebe010e56 hwmon: (mlxreg-fan) Use devm_thermal_of_cooling_device_register
Call devm_thermal_of_cooling_device_register() to register the cooling
device. Also introduce struct device *dev = &pdev->dev; to make the code
easier to read.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-14 07:00:46 -07:00
Guenter Roeck 725dcf082c hwmon: (mlxreg-fan) Use HWMON_CHANNEL_INFO macro
The HWMON_CHANNEL_INFO macro simplifies the code, reduces the likelihood
of errors, and makes the code easier to read.

The conversion was done automatically with coccinelle. The semantic patch
used to make this change is as follows.

@r@
initializer list elements;
identifier i;
@@

-u32 i[] = {
-  elements,
-  0
-};

@s@
identifier r.i,j,ty;
@@

-struct hwmon_channel_info j = {
-       .type = ty,
-       .config = i,
-};

@script:ocaml t@
ty << s.ty;
elements << r.elements;
shorter;
elems;
@@

shorter :=
   make_ident (List.hd(List.rev (Str.split (Str.regexp "_") ty)));
elems :=
   make_ident
    (String.concat ","
     (List.map (fun x -> Printf.sprintf "\n\t\t\t   %s" x)
       (Str.split (Str.regexp " , ") elements)))

@@
identifier s.j,t.shorter;
identifier t.elems;
@@

- &j
+ HWMON_CHANNEL_INFO(shorter,elems)

This patch does not introduce functional changes. Many thanks to
Julia Lawall for providing the semantic patch.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-04-15 17:19:53 -07:00
Vadim Pasternak b429ebc86f hwmon: (mlxreg-fan) Add support for fan capability registers
Add support for fan capability registers in order to distinguish between
systems which have minor fan configuration differences. This
reduces the amount of code used to describe such systems.
The capability registers provide system specific information about the
number of physically connected tachometers and the system specific fan
speed scale parameter.
For example, one system can be equipped with twelve fan tachometers,
while another has eight or six. Or one system should use the default
fan speed divider value, while another has a scale parameter defined in
hardware, which should be used for the divider setting.
Reading this information from the capability registers allows using the
same fan structure for systems with such differences.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-04-15 17:19:53 -07:00
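
A possible reading of commit b429ebc86f is sketched below: one capability value yields the tachometer count, another optionally overrides the divider. The bit-per-tachometer encoding, the "zero means no hardware scale" convention, the default divider value and every name here are assumptions made for the example, not details from the source.

```c
#include <stdint.h>

#define DEFAULT_DIVIDER 1132u /* hypothetical default divider */

/* Count set bits: each bit is assumed to mark one connected tachometer. */
static unsigned int count_tachos(uint16_t cap)
{
	unsigned int n = 0;

	while (cap) {
		n += cap & 1u;
		cap >>= 1;
	}

	return n;
}

/* Use the hardware scale when one is provided (nonzero), else the default. */
static uint32_t pick_divider(uint8_t hw_scale)
{
	return hw_scale ? hw_scale : DEFAULT_DIVIDER;
}
```

With this approach a twelve-fan system and a six-fan system can share one description, differing only in what their capability registers report.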
Vadim Pasternak 3f9ffa5c3a hwmon: (mlxreg-fan) Modify macros for tachometer fault status reading
Modify the macros for tachometer fault status reading to make them
simpler and clearer.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-12-16 15:13:16 -08:00
Vadim Pasternak 243cfe3fb8 hwmon: (mlxreg-fan) Fix macros for tacho fault reading
Fix macros for tachometer fault reading.
This fix is relevant for three Mellanox systems, MQMB7, MSN37 and MSN34,
which are about to be released to customers.
At the moment, none of them is at customer sites.

Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-11-16 08:10:23 -08:00
Vadim Pasternak 65afb4c8e7 hwmon: (mlxreg-fan) Add support for Mellanox FAN driver
The driver obtains the PWM and tachometer register locations according
to the system configuration and creates FAN/PWM hwmon objects and a
cooling device. PWM and tachometers are controlled through the on-board
programmable device, which exports its register map. This device could be
attached to any bus type for which register mapping is supported. A single
instance is created with one PWM control, up to 12 tachometers and one
cooling device. There can be as many instances as the programmable device
supports.

Currently the driver is activated from the Mellanox platform driver:
drivers/platform/x86/mlx-platform.c.
For future ARM based systems it could be activated from the ARM
platform module.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-07-08 20:08:13 -07:00