145 Commits

Author SHA1 Message Date
Kamal Heib 270702234e RDMA/nldev: Set error code in rdma_nl_notify_event
JIRA: https://issues.redhat.com/browse/RHEL-77880

commit 13a6691910cc23ea9ba4066e098603088673d5b0
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date:   Tue Dec 10 09:33:10 2024 +0200

    RDMA/nldev: Set error code in rdma_nl_notify_event

    In case of error set the error code before the goto.

    Fixes: 6ff57a2ea7c2 ("RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event")
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Closes: https://lore.kernel.org/linux-rdma/a84a2fc3-33b6-46da-a1bd-3343fa07eaf9@stanley.mountain/
    Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
    Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
    Link: https://patch.msgid.link/13eb25961923f5de9eb9ecbbc94e26113d6049ef.1733815944.git.leonro@nvidia.com
    Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2025-02-06 09:33:43 -05:00
Kamal Heib 8c65118739 RDMA/nldev: Add IB device and net device rename events
JIRA: https://issues.redhat.com/browse/RHEL-77880

commit 7566752e4d7d7fc0186531aa800068a7243f95c1
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date:   Thu Oct 31 11:31:14 2024 +0200

    RDMA/nldev: Add IB device and net device rename events

    Implement event sending for IB device rename and IB device
    port associated netdevice rename.

    In iproute2, rdma monitor displays the IB device name, port
    and the netdevice name when displaying event info. Since
    users can modify these names, we track and notify on renaming
    events.

    Note: In order to receive netdevice rename events, drivers
    must use the ib_device_set_netdev() API when attaching net
    devices to IB devices.

    $ rdma monitor
    $ rmmod mlx5_ib
    [UNREGISTER]    dev 1  rocep8s0f1
    [UNREGISTER]    dev 0  rocep8s0f0

    $ modprobe mlx5_ib
    [REGISTER]      dev 2  mlx5_0
    [NETDEV_ATTACH] dev 2  mlx5_0 port 1 netdev 4 eth2
    [REGISTER]      dev 3  mlx5_1
    [NETDEV_ATTACH] dev 3  mlx5_1 port 1 netdev 5 eth3
    [RENAME]        dev 2  rocep8s0f0
    [RENAME]        dev 3  rocep8s0f1

    $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
    [UNREGISTER]    dev 2  rocep8s0f0
    [REGISTER]      dev 4  mlx5_0
    [NETDEV_ATTACH] dev 4  mlx5_0 port 30 netdev 4 eth2
    [RENAME]        dev 4  rdmap8s0f0

    $ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
    [NETDEV_ATTACH] dev 4  rdmap8s0f0 port 2 netdev 7 eth4
    [NETDEV_ATTACH] dev 4  rdmap8s0f0 port 3 netdev 8 eth5
    [NETDEV_ATTACH] dev 4  rdmap8s0f0 port 4 netdev 9 eth6
    [NETDEV_ATTACH] dev 4  rdmap8s0f0 port 5 netdev 10 eth7
    [REGISTER]      dev 5  mlx5_0
    [NETDEV_ATTACH] dev 5  mlx5_0 port 1 netdev 11 eth8
    [REGISTER]      dev 6  mlx5_1
    [NETDEV_ATTACH] dev 6  mlx5_1 port 1 netdev 12 eth9
    [RENAME]        dev 5  rocep8s0f0v0
    [RENAME]        dev 6  rocep8s0f0v1
    [REGISTER]      dev 7  mlx5_0
    [NETDEV_ATTACH] dev 7  mlx5_0 port 1 netdev 13 eth10
    [RENAME]        dev 7  rocep8s0f0v2
    [REGISTER]      dev 8  mlx5_0
    [NETDEV_ATTACH] dev 8  mlx5_0 port 1 netdev 14 eth11
    [RENAME]        dev 8  rocep8s0f0v3

    $ ip link set eth2 name myeth2
    [NETDEV_RENAME]  netdev 4 myeth2

    $ ip link set eth1 name myeth1

    ** no events received, because eth1 is not attached to
       an IB device **

    Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
    Link: https://patch.msgid.link/093c978ef2766fd3ab4ff8798eeb68f2f11582f6.1730367038.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2025-02-04 14:40:52 -05:00
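
The event lines in the transcript above follow a regular shape, so a monitoring script can split them into fields. The following is a minimal sketch, assuming the text layout is exactly what the commit message shows; the function name `parse_monitor_line` and the dictionary keys are hypothetical and not part of iproute2.

```python
import re

# Line shapes taken from the commit message above, e.g.
#   [NETDEV_ATTACH] dev 2  mlx5_0 port 1 netdev 4 eth2
#   [RENAME]        dev 2  rocep8s0f0
#   [NETDEV_RENAME]  netdev 4 myeth2
EVENT_RE = re.compile(
    r"^\[(?P<event>[A-Z_]+)\]"
    r"(?:\s+dev\s+(?P<dev_index>\d+)\s+(?P<dev_name>\S+))?"
    r"(?:\s+port\s+(?P<port>\d+))?"
    r"(?:\s+netdev\s+(?P<netdev_index>\d+)\s+(?P<netdev_name>\S+))?"
    r"\s*$"
)


def parse_monitor_line(line):
    """Return a dict of the fields present in one event line, or None."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None  # shell prompts, blank lines, anything else
    fields = {k: v for k, v in match.groupdict().items() if v is not None}
    # Indexes and port numbers are numeric; names stay as strings.
    for key in ("dev_index", "port", "netdev_index"):
        if key in fields:
            fields[key] = int(fields[key])
    return fields
```

For example, `parse_monitor_line("[NETDEV_RENAME]  netdev 4 myeth2")` yields the event name plus the netdev index and its new name, with no device fields, which matches the fact that a netdev rename line carries no IB device name.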
Kamal Heib 6bdff5cdaf RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event
JIRA: https://issues.redhat.com/browse/RHEL-56245

commit 6ff57a2ea7c2911f80457a5a3a5b4370756ad475
Author: Qianqiang Liu <qianqiang.liu@163.com>
Date:   Fri Sep 27 22:06:13 2024 +0800

    RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event

    nlmsg_put() may return a NULL pointer assigned to nlh, which will later
    be dereferenced in nlmsg_end().

    Fixes: 9cbed5aab5ae ("RDMA/nldev: Add support for RDMA monitoring")
    Link: https://patch.msgid.link/r/Zva71Yf3F94uxi5A@iZbp1asjb3cy8ks0srf007Z
    Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-27 19:32:23 -04:00
Kamal Heib 48cc917be4 RDMA/nldev: Add missing break in rdma_nl_notify_err_msg()
JIRA: https://issues.redhat.com/browse/RHEL-56245

commit 7acad3c442df6d5158c5b732a7a0ccf3a01d9b30
Author: Nathan Chancellor <nathan@kernel.org>
Date:   Mon Sep 16 06:24:34 2024 -0700

    RDMA/nldev: Add missing break in rdma_nl_notify_err_msg()

    Clang warns (or errors with CONFIG_WERROR=y):

      drivers/infiniband/core/nldev.c:2795:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
       2795 |         default:
            |         ^

    Clang is a little more pedantic than GCC, which does not warn when
    falling through to a case that is just break or return. Clang's version
    is more in line with the kernel's own stance in deprecated.rst, which
    states that all switch/case blocks must end in either break,
    fallthrough, continue, goto, or return. Add the missing break to silence
    the warning.

    Fixes: 9cbed5aab5ae ("RDMA/nldev: Add support for RDMA monitoring")
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Link: https://patch.msgid.link/20240916-rdma-fix-clang-fallthrough-nl_notify_err_msg-v1-1-89de6a7423f1@kernel.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-27 19:32:22 -04:00
Kamal Heib c442c52ec4 RDMA/nldev: Expose whether RDMA monitoring is supported
JIRA: https://issues.redhat.com/browse/RHEL-56245

commit 12fb1153c53bf9b53e299c9775b84fa7838640f7
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date:   Mon Sep 9 20:30:25 2024 +0300

    RDMA/nldev: Expose whether RDMA monitoring is supported

    Extend the "rdma sys" command to display whether RDMA
    monitoring is supported.

    RDMA monitoring is not supported in mlx4 because it does
    not use the ib_device_set_netdev() API, which sends the
    RDMA events.

    Example output for kernel where monitoring is supported:
    $ rdma sys show
    netns shared privileged-qkey off monitor on copy-on-fork on

    Example output for kernel where monitoring is not supported:
    $ rdma sys show
    netns shared privileged-qkey off monitor off copy-on-fork on

    Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
    Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
    Link: https://patch.msgid.link/20240909173025.30422-8-michaelgur@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-27 19:32:22 -04:00
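
A script that wants to know whether monitoring is available could read the `rdma sys show` line as alternating key/value tokens. This is a small sketch, assuming the one-line output format shown in the commit message; the helper names are hypothetical.

```python
def parse_sys_show(line):
    """Split 'key value key value ...' output into a dict."""
    tokens = line.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating key/value tokens")
    return dict(zip(tokens[0::2], tokens[1::2]))


def monitoring_supported(line):
    """True only if a 'monitor' key is present and set to 'on'."""
    # Output from a kernel without this change has no 'monitor' key at
    # all; this sketch treats that case as "not supported".
    return parse_sys_show(line).get("monitor") == "on"
```

With the two example lines from the commit message, `monitoring_supported` returns `True` for the first and `False` for the second.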
Kamal Heib 57ac2a3723 RDMA/nldev: Add support for RDMA monitoring
JIRA: https://issues.redhat.com/browse/RHEL-56245

commit 9cbed5aab5aeea420d0aa945733bf608449d44fb
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date:   Mon Sep 9 20:30:24 2024 +0300

    RDMA/nldev: Add support for RDMA monitoring

    Introduce a new netlink command to allow rdma event monitoring.
    The rdma events supported now are IB device
    registration/unregistration and net device attachment/detachment.

    Example output of rdma monitor and the commands which trigger
    the events:

    $ rdma monitor
    $ rmmod mlx5_ib
    [UNREGISTER]    dev 1 rocep8s0f1
    [UNREGISTER]    dev 0 rocep8s0f0

    $ modprobe mlx5_ib
    [REGISTER]      dev 2 mlx5_0
    [NETDEV_ATTACH] dev 2 mlx5_0 port 1 netdev 4 eth2
    [REGISTER]      dev 3 mlx5_1
    [NETDEV_ATTACH] dev 3 mlx5_1 port 1 netdev 5 eth3

    $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
    [UNREGISTER]    dev 2 rocep8s0f0
    [REGISTER]      dev 4 mlx5_0
    [NETDEV_ATTACH] dev 4 mlx5_0 port 30 netdev 4 eth2

    $ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
    [NETDEV_ATTACH] dev 4 rdmap8s0f0 port 2 netdev 7 eth4
    [NETDEV_ATTACH] dev 4 rdmap8s0f0 port 3 netdev 8 eth5
    [NETDEV_ATTACH] dev 4 rdmap8s0f0 port 4 netdev 9 eth6
    [NETDEV_ATTACH] dev 4 rdmap8s0f0 port 5 netdev 10 eth7
    [REGISTER]      dev 5 mlx5_0
    [NETDEV_ATTACH] dev 5 mlx5_0 port 1 netdev 11 eth8
    [REGISTER]      dev 6 mlx5_0
    [NETDEV_ATTACH] dev 6 mlx5_0 port 1 netdev 12 eth9
    [REGISTER]      dev 7 mlx5_0
    [NETDEV_ATTACH] dev 7 mlx5_0 port 1 netdev 13 eth10
    [REGISTER]      dev 8 mlx5_0
    [NETDEV_ATTACH] dev 8 mlx5_0 port 1 netdev 14 eth11

    $ echo 0 > /sys/class/net/eth2/device/sriov_numvfs
    [UNREGISTER]    dev 5 rocep8s0f0v0
    [UNREGISTER]    dev 6 rocep8s0f0v1
    [UNREGISTER]    dev 7 rocep8s0f0v2
    [UNREGISTER]    dev 8 rocep8s0f0v3
    [NETDEV_DETACH] dev 4 rdmap8s0f0 port 2
    [NETDEV_DETACH] dev 4 rdmap8s0f0 port 3
    [NETDEV_DETACH] dev 4 rdmap8s0f0 port 4
    [NETDEV_DETACH] dev 4 rdmap8s0f0 port 5

    Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
    Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
    Link: https://patch.msgid.link/20240909173025.30422-7-michaelgur@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-27 19:32:22 -04:00
Kamal Heib 1817ac987d RDMA/nldev: Enhance netlink message parsing and validation
JIRA: https://issues.redhat.com/browse/RHEL-56245

commit df6d27a30970158466b632c82da09a9b24c30f4b
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date:   Tue Jul 30 12:17:25 2024 +0300

    RDMA/nldev: Enhance netlink message parsing and validation

    Use strict parsing validation for set commands, and liberal
    validation for get commands. Additionally, remove all usage of
    nlmsg_parse_deprecated().

    Strict parsing validation fails when encountering unrecognized
    attributes in the Netlink message, while liberal parsing
    validation ignores them.

    In 57d7a8fd904c ("rdma: Add an option to display driver-specific QPs in the rdma tool")
    in iproute2, the attribute RDMA_NLDEV_ATTR_DRIVER_DETAILS
    was added. This causes backwards compatibility issues when using
    the rdma tool with the new attribute and an older kernel which does
    not recognize this attribute.
    In this case, the command "rdma stat show mr" would fail, because the
    new rdma tool would fill the netlink message with the new attribute and
    the older kernel would fail as it used strict parsing and did not
    recognize the new attribute.

    In general, strict validation is appropriate for set commands as they
    modify the system, while liberal validation is suitable for get
    commands which only query system information.

    Replace all uses of nlmsg_parse_deprecated() with __nlmsg_parse(),
    using the NL_VALIDATE_LIBERAL flag.
    The nlmsg_parse_deprecated() function internally calls
    __nlmsg_parse() with the NL_VALIDATE_LIBERAL flag, but its name
    is confusing.

    Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
    Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
    Link: https://lore.kernel.org/r/f633a979a49db090d05c24a3ba83d30727bb777b.1722331020.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-27 19:32:21 -04:00
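
The difference between strict and liberal validation described above can be modeled outside the kernel. The sketch below is a toy model, not the kernel's netlink parser: `parse_attrs` and the `KNOWN_ATTRS` set are hypothetical, and attributes are represented as a plain dictionary rather than TLVs.

```python
# Attributes an "older kernel" in this toy model knows about.
KNOWN_ATTRS = {"RDMA_NLDEV_ATTR_DEV_INDEX", "RDMA_NLDEV_ATTR_PORT_INDEX"}


def parse_attrs(attrs, known, strict):
    """Return the recognized attributes of a message.

    strict=True  : fail on any unrecognized attribute (set commands).
    strict=False : silently ignore unrecognized attributes (get commands).
    """
    parsed = {}
    for name, value in attrs.items():
        if name not in known:
            if strict:
                raise ValueError(f"unrecognized attribute: {name}")
            continue  # liberal: skip what we do not understand
        parsed[name] = value
    return parsed


# A newer rdma tool adds an attribute the older kernel does not know.
message = {
    "RDMA_NLDEV_ATTR_DEV_INDEX": 0,
    "RDMA_NLDEV_ATTR_DRIVER_DETAILS": 1,
}
```

Under this model, `parse_attrs(message, KNOWN_ATTRS, strict=False)` returns only the device index, while `strict=True` raises, which mirrors the failure of "rdma stat show mr" described in the commit message.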
Kamal Heib e60ee32bfc RDMA/core: Introduce "name_assign_type" for an IB device
JIRA: https://issues.redhat.com/browse/RHEL-56247
Conflicts:
Drop the mlx5 hunks for now due to multiple missing changes in mlx5.

commit af48f95492dc1af36d9636a750ec492035c0ed7d
Author: Mark Zhang <markzhang@nvidia.com>
Date:   Mon Jul 1 15:40:48 2024 +0300

    RDMA/core: Introduce "name_assign_type" for an IB device

    The name_assign_type indicates how the name is provided. Currently
    these types are supported:
    - RDMA_NAME_ASSIGN_TYPE_UNKNOWN: Unknown or not set;
    - RDMA_NAME_ASSIGN_TYPE_USER: Name is provided by the user; The
      user-created sub device, rxe and siw devices have this type.

    When filling nl device info, it is set in the new attribute
    RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE. User-space tools like udev
    "rdma_rename" could check this attribute to determine if this
    device needs to be renamed or not.

    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Link: https://lore.kernel.org/r/522591bef9a369cc8e5dcb77787e017bffee37fe.1719837610.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:54 -04:00
Kamal Heib d8bb2a6ab8 RDMA/nldev: Add support to dump device type and parent device if exists
JIRA: https://issues.redhat.com/browse/RHEL-56247

commit 294424839b5ec2ecd17f4c8409796846b2b8dd31
Author: Mark Zhang <markzhang@nvidia.com>
Date:   Sun Jun 16 19:08:41 2024 +0300

    RDMA/nldev: Add support to dump device type and parent device if exists

    If a device has a specific type or a parent device, dump them as well.

    Example:
    $ rdma dev show smi1
    3: smi1: node_type ca fw 20.38.1002 node_guid 9803:9b03:009f:d5ef sys_image_guid 9803:9b03:009f:d5ee type smi parent ibp8s0f1

    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Link: https://lore.kernel.org/r/4c022e3e34b5de1254a3b367d502a362cdd0c53a.1718553901.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:54 -04:00
Kamal Heib ffd729fa4b RDMA/nldev: Add support to add/delete a sub IB device through netlink
JIRA: https://issues.redhat.com/browse/RHEL-56247

commit 060c642b2ab8b40b39f9db99c1d14c7d19ba507f
Author: Mark Zhang <markzhang@nvidia.com>
Date:   Sun Jun 16 19:08:40 2024 +0300

    RDMA/nldev: Add support to add/delete a sub IB device through netlink

    Add new netlink commands and attributes to support adding and deleting
    a sub IB device with admin privilege.

    Examples:
    $ rdma dev add smi1 type SMI parent ibp8s0f1
    $ rdma dev del smi1

    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Link: https://lore.kernel.org/r/77cbf1b36359642be8a8d8c5c2f4e585b544282f.1718553901.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:54 -04:00
Kamal Heib 5b3d6c6fa6 RDMA/core: Add an option to display driver-specific QPs in the rdmatool
JIRA: https://issues.redhat.com/browse/RHEL-56247

commit e18fa0bbcedf82aaa1db27079ef6a43e11367592
Author: Chiara Meiohas <cmeiohas@nvidia.com>
Date:   Tue Apr 16 15:03:50 2024 +0300

    RDMA/core: Add an option to display driver-specific QPs in the rdmatool

    Utilize the -dd flag (driver-specific details) in the rdmatool
    to view driver-specific QPs which are not exposed yet.

    Add the netlink attribute to mark request to convey driver details and
    use it to return QP subtype as a string.

    $ rdma resource show qp link ibp8s0f1
    link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
    link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
    link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]

    $ rdma resource show qp link ibp8s0f1 -dd
    link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
    link ibp8s0f1/1 lqpn 465 type DRIVER subtype REG_UMR state RTS sq-psn 0 comm [mlx5_ib]
    link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
    link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]

    $ rdma resource show
    0: ibp8s0f0: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2
    1: ibp8s0f1: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2

    $ rdma resource show -dd
    0: ibp8s0f0: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2
    1: ibp8s0f1: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2

    Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
    Link: https://lore.kernel.org/r/2607bb3ddec3cae3443c2ea19e9f700825d20a98.1713268997.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:53 -04:00
Kamal Heib f9df2f2088 RDMA/core: Remove NULL check before dev_{put, hold}
JIRA: https://issues.redhat.com/browse/RHEL-56247

commit 7a1c2abf9a2be7d969b25e8d65567933335ca88e
Author: Yang Li <yang.lee@linux.alibaba.com>
Date:   Tue Oct 24 08:38:15 2023 +0800

    RDMA/core: Remove NULL check before dev_{put, hold}

    dev_{put, hold} call netdev_{put, hold}, which already check for NULL,
    so there is no need to check before calling dev_{put, hold}.
    Remove the check to silence the warning:

    ./drivers/infiniband/core/nldev.c:375:2-9: WARNING: NULL check before dev_{put, hold} functions is not needed.

    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7047
    Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
    Link: https://lore.kernel.org/r/20231024003815.89742-1-yang.lee@linux.alibaba.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:52 -04:00
Kamal Heib 476512d41d RDMA/core: Add support to set privileged QKEY parameter
JIRA: https://issues.redhat.com/browse/RHEL-56247

commit 465d6b42f1a3b855c06da1d4d3b09907d261af69
Author: Patrisious Haddad <phaddad@nvidia.com>
Date:   Mon Oct 9 13:43:58 2023 +0300

    RDMA/core: Add support to set privileged QKEY parameter

    Add netlink command that enables/disables privileged QKEY by default.

    It is disabled by default, since according to IB spec only privileged
    users are allowed to use privileged QKEY.
    According to the IB specification rel-1.6, section 3.5.3:
    "QKEYs with the most significant bit set are considered controlled
    QKEYs, and a HCA does not allow a consumer to arbitrarily specify a
    controlled QKEY."

    Using rdma tool,
    $rdma system set privileged-qkey on

    When enabled non-privileged users would be able to use
    controlled QKEYs which are considered privileged.

    Using rdma tool,
    $rdma system set privileged-qkey off

    When disabled only privileged users would be able to use
    controlled QKEYs.

    You can also use the command below to check the parameter state:
    $rdma system show
    netns shared privileged-qkey off copy-on-fork on

    Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
    Link: https://lore.kernel.org/r/90398be70a9d23d2aa9d0f9fd11d2c264c1be534.1696848201.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:52 -04:00
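
The policy described above can be written as a small predicate. This is an illustrative sketch only: it assumes a 32-bit QKEY whose most significant bit marks it as controlled, as in the quoted specification text, and the function names are hypothetical. How the kernel actually enforces the policy is not shown here.

```python
CONTROLLED_QKEY_BIT = 0x80000000  # most significant bit of a 32-bit QKEY


def is_controlled_qkey(qkey):
    """A QKEY with the most significant bit set is a controlled QKEY."""
    return bool(qkey & CONTROLLED_QKEY_BIT)


def may_use_qkey(qkey, user_is_privileged, privileged_qkey_enabled):
    """Model of the policy from the commit message.

    privileged_qkey_enabled corresponds to 'privileged-qkey on':
    non-privileged users may then use controlled QKEYs as well.
    """
    if not is_controlled_qkey(qkey):
        return True  # ordinary QKEYs are never restricted
    return user_is_privileged or privileged_qkey_enabled
```

With the default setting (`privileged-qkey off`), `may_use_qkey(0x80010000, False, False)` is `False`, while the same call with the parameter enabled is `True`.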
Kamal Heib 622b4ac375 RDMA/core: Add support to dump SRQ resource in RAW format
JIRA: https://issues.redhat.com/browse/RHEL-56247

commit aebf8145e11a29a77dac15ee041a190676fac05f
Author: wenglianfa <wenglianfa@huawei.com>
Date:   Mon Sep 18 21:11:09 2023 +0800

    RDMA/core: Add support to dump SRQ resource in RAW format

    Add support to dump SRQ resource in raw format. It enables drivers to
    return the entire device specific SRQ context without setting each
    field separately.

    Example:
    $ rdma res show srq -r
    dev hns3 149000...

    $ rdma res show srq -j -r
    [{"ifindex":0,"ifname":"hns3","data":[149,0,0,...]}]

    Signed-off-by: wenglianfa <wenglianfa@huawei.com>
    Link: https://lore.kernel.org/r/20230918131110.3987498-3-huangjunxian6@hisilicon.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:51 -04:00
Kamal Heib 8d75c9c8c5 RDMA/core: Add dedicated SRQ resource tracker function
JIRA: https://issues.redhat.com/browse/RHEL-56247

commit 0e32d7d43b0b2d870b45cf4dff8188203800aa91
Author: wenglianfa <wenglianfa@huawei.com>
Date:   Mon Sep 18 21:11:08 2023 +0800

    RDMA/core: Add dedicated SRQ resource tracker function

    Add a dedicated callback function for SRQ resource tracker.

    Signed-off-by: wenglianfa <wenglianfa@huawei.com>
    Link: https://lore.kernel.org/r/20230918131110.3987498-2-huangjunxian6@hisilicon.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2024-10-07 11:55:51 -04:00
Kamal Heib b543a19b1b RDMA/core: Require admin capabilities to set system parameters
JIRA: https://issues.redhat.com/browse/RHEL-1030

commit c38d23a54445f9a8aa6831fafc9af0496ba02f9e
Author: Leon Romanovsky <leon@kernel.org>
Date:   Wed Oct 4 21:17:49 2023 +0300

    RDMA/core: Require admin capabilities to set system parameters

    Like any other set command, require admin permissions to do it.

    Cc: stable@vger.kernel.org
    Fixes: 2b34c55802 ("RDMA/core: Add command to set ib_core device net namspace sharing mode")
    Link: https://lore.kernel.org/r/75d329fdd7381b52cbdf87910bef16c9965abb1f.1696443438.git.leon@kernel.org
    Reviewed-by: Parav Pandit <parav@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2023-11-20 15:13:16 -05:00
Kamal Heib 8d606597a7 RDMA/nldev: Fix failure to send large messages
Bugzilla: https://bugzilla.redhat.com/2168936

commit fc8f93ad3e5485d45c992233c96acd902992dfc4
Author: Mark Zhang <markzhang@nvidia.com>
Date:   Mon Nov 28 13:52:46 2022 +0200

    RDMA/nldev: Fix failure to send large messages

    Return "-EMSGSIZE" instead of "-EINVAL" when filling a QP entry, so that
    new SKBs will be allocated if there's not enough room in current SKB.

    Fixes: 65959522f8 ("RDMA: Add support to dump resource tracker in RAW format")
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
    Link: https://lore.kernel.org/r/b5e9c62f6b8369acab5648b661bf539cbceeffdc.1669636336.git.leonro@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2023-03-31 14:16:05 -04:00
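
Why the choice of error code matters can be seen in a toy model of a dump loop. The sketch below is not kernel code: `Buffer`, `fill_entry`, and `dump` are hypothetical stand-ins for the SKB and the fill/dump callbacks, and entry sizes are simply string lengths.

```python
import errno


class Buffer:
    """Stand-in for an SKB with a fixed amount of room."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.entries = []


def fill_entry(buf, entry, full_errno):
    """Append entry, or return -full_errno if it does not fit."""
    if buf.used + len(entry) > buf.capacity:
        return -full_errno
    buf.entries.append(entry)
    buf.used += len(entry)
    return 0


def dump(entries, capacity, full_errno=errno.EMSGSIZE):
    """Dump all entries, starting a new buffer only on -EMSGSIZE."""
    buffers = [Buffer(capacity)]
    for entry in entries:
        ret = fill_entry(buffers[-1], entry, full_errno)
        if ret == -errno.EMSGSIZE:
            # "Not enough room" is recoverable: continue in a new buffer.
            buffers.append(Buffer(capacity))
            ret = fill_entry(buffers[-1], entry, full_errno)
        if ret < 0:
            return ret, buffers  # any other error aborts the dump
    return 0, buffers
```

In this model, `dump(["aaaa", "bbbb", "cccc"], capacity=8)` succeeds with two buffers, whereas passing `full_errno=errno.EINVAL` makes the same dump abort at the third entry, because the loop cannot tell a full buffer from a genuinely invalid request.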
Kamal Heib ddf5434261 RDMA/nldev: Add NULL check to silence false warnings
Bugzilla: https://bugzilla.redhat.com/2168936

commit 67e6272d53386f9708f91c4d0015c4a1c470eef5
Author: Or Har-Toov <ohartoov@nvidia.com>
Date:   Mon Nov 28 13:52:45 2022 +0200

    RDMA/nldev: Add NULL check to silence false warnings

    Using nlmsg_put causes static analysis tools to report many
    false positives about not checking the return value of nlmsg_put.

    In all uses in nldev.c, payload parameter is 0 so NULL will never
    be returned. So let's add useless checks to silence the warnings.

    Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
    Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
    Link: https://lore.kernel.org/r/bd924da89d5b4f5291a4a01d9b5ae47c0a9b6a3f.1669636336.git.leonro@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2023-03-31 14:16:05 -04:00
Kamal Heib 5344cfacbd RDMA/nldev: Add checks for nla_nest_start() in fill_stat_counter_qps()
Bugzilla: https://bugzilla.redhat.com/2168936

commit ea5ef136e215fdef35f14010bc51fcd6686e6922
Author: Yuan Can <yuancan@huawei.com>
Date:   Sat Nov 26 04:34:10 2022 +0000

    RDMA/nldev: Add checks for nla_nest_start() in fill_stat_counter_qps()

    As the nla_nest_start() may fail with NULL returned, the return value needs
    to be checked.

    Fixes: c4ffee7c9b ("RDMA/netlink: Implement counter dumpit calback")
    Signed-off-by: Yuan Can <yuancan@huawei.com>
    Link: https://lore.kernel.org/r/20221126043410.85632-1-yuancan@huawei.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2023-03-31 14:16:05 -04:00
Kamal Heib 0d3c4bfed6 RDMA/nldev: Return "-EAGAIN" if the cm_id isn't from expected port
Bugzilla: https://bugzilla.redhat.com/2168936

commit ecacb3751f254572af0009b9501e2cdc83a30b6a
Author: Mark Zhang <markzhang@nvidia.com>
Date:   Mon Nov 7 10:51:36 2022 +0200

    RDMA/nldev: Return "-EAGAIN" if the cm_id isn't from expected port

    When filling a cm_id entry, return "-EAGAIN" instead of 0 if the cm_id
    doesn't have the same port as requested, otherwise an incomplete entry
    may be returned, which causes "rdma res show cm_id" to return an error.

    For example on a machine with two rdma devices with "rping -C 1 -v -s"
    running background, the "rdma" command fails:
      $ rdma -V
      rdma utility, iproute2-5.19.0
      $ rdma res show cm_id
      link mlx5_0/- cm-idn 0 state LISTEN ps TCP pid 28056 comm rping src-addr 0.0.0.0:7174
      error: Protocol not available

    While with this fix it succeeds:
      $ rdma res show cm_id
      link mlx5_0/- cm-idn 0 state LISTEN ps TCP pid 26395 comm rping src-addr 0.0.0.0:7174
      link mlx5_1/- cm-idn 0 state LISTEN ps TCP pid 26395 comm rping src-addr 0.0.0.0:7174

    Fixes: 00313983cd ("RDMA/nldev: provide detailed CM_ID information")
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Link: https://lore.kernel.org/r/a08e898cdac5e28428eb749a99d9d981571b8ea7.1667810736.git.leonro@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2023-03-31 14:16:05 -04:00
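
The effect of returning "-EAGAIN" instead of 0 can be illustrated with a simplified dump loop that treats "-EAGAIN" as "skip this entry". This is a sketch under assumptions: the dictionary layout and the names `fill_cm_id_entry` and `dump_cm_ids` are hypothetical, and the real resource-tracking code is more involved.

```python
import errno


def fill_cm_id_entry(cm_id, requested_port):
    """Return (0, entry), or (-EAGAIN, None) if the port does not match."""
    if requested_port is not None and cm_id["port"] != requested_port:
        # Signal "nothing to report for this cm_id" instead of returning
        # success with a half-filled entry.
        return -errno.EAGAIN, None
    return 0, {"cm_idn": cm_id["cm_idn"], "port": cm_id["port"]}


def dump_cm_ids(cm_ids, requested_port=None):
    """Collect complete entries, skipping those that report -EAGAIN."""
    entries = []
    for cm_id in cm_ids:
        ret, entry = fill_cm_id_entry(cm_id, requested_port)
        if ret == -errno.EAGAIN:
            continue  # skip, keep dumping the remaining cm_ids
        if ret < 0:
            return ret, entries
        entries.append(entry)
    return 0, entries
```

The caller then only ever sees complete entries, which is the property the fix restores for "rdma res show cm_id".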
Kamal Heib 4b227b2692 RDMA/core: Fix null-ptr-deref in ib_core_cleanup()
Bugzilla: https://bugzilla.redhat.com/2120668

commit 07c0d131cc0fe1f3981a42958fc52d573d303d89
Author: Chen Zhongjin <chenzhongjin@huawei.com>
Date:   Tue Oct 25 10:41:46 2022 +0800

    RDMA/core: Fix null-ptr-deref in ib_core_cleanup()

    KASAN reported a null-ptr-deref error:

      KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
      CPU: 1 PID: 379
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      RIP: 0010:destroy_workqueue+0x2f/0x740
      RSP: 0018:ffff888016137df8 EFLAGS: 00000202
      ...
      Call Trace:
       ib_core_cleanup+0xa/0xa1 [ib_core]
       __do_sys_delete_module.constprop.0+0x34f/0x5b0
       do_syscall_64+0x3a/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7fa1a0d221b7
      ...

    It is because the failure of roce_gid_mgmt_init() is ignored:

     ib_core_init()
       roce_gid_mgmt_init()
         gid_cache_wq = alloc_ordered_workqueue # fail
     ...
     ib_core_cleanup()
       roce_gid_mgmt_cleanup()
         destroy_workqueue(gid_cache_wq)
         # destroy an unallocated wq

    Fix this by catching the failure of roce_gid_mgmt_init() in ib_core_init().

    Fixes: 03db3a2d81 ("IB/core: Add RoCE GID table management")
    Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
    Link: https://lore.kernel.org/r/20221025024146.109137-1-chenzhongjin@huawei.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-11-29 11:40:49 -05:00
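
The general pattern behind this fix, checking every init step and unwinding only the steps that succeeded, can be sketched independently of the kernel. The code below is a generic illustration, not ib_core code; `run_init` and the step tuples are hypothetical.

```python
def run_init(steps):
    """Run (name, init, cleanup) steps in order.

    If an init step fails, clean up the steps that already succeeded,
    in reverse order, and re-raise. The failed step's own cleanup is
    never called, so nothing unallocated gets destroyed.
    """
    done = []
    for name, init, cleanup in steps:
        try:
            init()
        except Exception:
            for _, undo in reversed(done):
                undo()
            raise
        done.append((name, cleanup))
    return [name for name, _ in done]
```

Ignoring a failed step, as in the bug described above, corresponds to appending the step to `done` even though its `init` raised, so that a later cleanup runs against a resource that was never created.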
Kamal Heib 5ca13355ba RDMA: Split kernel-only global device caps from uverbs device caps
Bugzilla: https://bugzilla.redhat.com/2120662

commit e945c653c8e972d1b81a88e474d79f801b60213a
Author: Jason Gunthorpe <jgg@ziepe.ca>
Date:   Mon Apr 4 12:26:42 2022 -0300

    RDMA: Split kernel-only global device caps from uverbs device caps

    Split out flags from ib_device::device_cap_flags that are only used
    internally to the kernel into kernel_cap_flags that is not part of the
    uapi. This limits the device_cap_flags to being the same bitmap that will
    be copied to userspace.

    This cleanly splits out the uverbs flags from the kernel flags to avoid
    confusion in the flags bitmap.

    Add some short comments describing what each of the kernel flags is
    connected to. Remove unused kernel flags.

    Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-10-06 09:14:07 -04:00
Kamal Heib ca42b91d9a RDMA/nldev: Prevent underflow in nldev_stat_set_counter_dynamic_doit()
Bugzilla: http://bugzilla.redhat.com/2056772

commit 87e0eacb176f9500c2063d140c0a1d7fa51ab8a5
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Wed Mar 16 11:39:48 2022 +0300

    RDMA/nldev: Prevent underflow in nldev_stat_set_counter_dynamic_doit()

    This code checks "index" for an upper bound but it does not check for
    negatives.  Change the type to unsigned to prevent underflows.

    Fixes: 3c3c1f141639 ("RDMA/nldev: Allow optional-counter status configuration through RDMA netlink")
    Link: https://lore.kernel.org/r/20220316083948.GC30941@kili
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-05-10 11:45:11 +03:00
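
The underflow can be reproduced with plain integers. The sketch below models a 32-bit unsigned conversion with a mask, since Python integers are unbounded; the function names and the counter count are hypothetical.

```python
def to_u32(value):
    """Reinterpret an integer as a 32-bit unsigned value."""
    return value & 0xFFFFFFFF


def index_ok_signed(index, num_counters):
    """Upper-bound check only, as with a signed index: -1 slips through."""
    return index < num_counters


def index_ok_unsigned(index, num_counters):
    """Same check after treating the index as unsigned."""
    # A negative value becomes a very large positive one and is rejected
    # by the existing upper-bound test, so no separate check is needed.
    return to_u32(index) < num_counters
```

For an index of `-1` and 16 counters, `index_ok_signed` returns `True` while `index_ok_unsigned` returns `False`, because `to_u32(-1)` is `4294967295`.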
Kamal Heib a51886bfeb RDMA/nldev: Check stat attribute before accessing it
Bugzilla: http://bugzilla.redhat.com/2056770

commit d821f7c13ca03318ad1bdc64ce64afb43080a07a
Author: Leon Romanovsky <leon@kernel.org>
Date:   Wed Nov 17 14:27:04 2021 +0200

    RDMA/nldev: Check stat attribute before accessing it

    Accessing a non-existent netlink attribute causes the following
    kernel panic. Fix it by checking existence before trying to read it.

      general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 0 PID: 6744 Comm: syz-executor.0 Not tainted 5.15.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:nla_get_u32 include/net/netlink.h:1554 [inline]
      RIP: 0010:nldev_stat_set_mode_doit drivers/infiniband/core/nldev.c:1909 [inline]
      RIP: 0010:nldev_stat_set_doit+0x578/0x10d0 drivers/infiniband/core/nldev.c:2040
      Code: fa 4c 8b a4 24 f8 02 00 00 48 b8 00 00 00 00 00 fc ff df c7 84 24 80 00 00 00 00 00 00 00 49 8d 7c 24 04 48 89
      fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 02
      RSP: 0018:ffffc90004acf2e8 EFLAGS: 00010247
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90002b94000
      RDX: 0000000000000000 RSI: ffffffff8684c5ff RDI: 0000000000000004
      RBP: ffff88807cda4000 R08: 0000000000000000 R09: ffff888023fb8027
      R10: ffffffff8684c5d7 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000001 R14: ffff888041024280 R15: ffff888031ade780
      FS:  00007eff9dddd700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2ef24000 CR3: 0000000036902000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195
       rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
       rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x86d/0xda0 net/netlink/af_netlink.c:1916
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:724
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

    Fixes: 822cf785ac6d ("RDMA/nldev: Split nldev_stat_set_mode_doit out of nldev_stat_set_doit")
    Link: https://lore.kernel.org/r/b21967c366f076ff1988862f9c8a1aa0244c599f.1637151999.git.leonro@nvidia.com
    Reported-by: syzbot+9111d2255a9710e87562@syzkaller.appspotmail.com
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-03-23 19:59:52 -04:00
Kamal Heib d2cb43899f RDMA/nldev: Allow optional-counter status configuration through RDMA netlink
Bugzilla: http://bugzilla.redhat.com/2056770

commit 3c3c1f1416392382faa0238e76a70d7810aab2ef
Author: Aharon Landau <aharonl@nvidia.com>
Date:   Fri Oct 8 15:24:35 2021 +0300

    RDMA/nldev: Allow optional-counter status configuration through RDMA netlink

    Allow users to enable and disable optional counters through RDMA
    netlink. This is limited to users with the ADMIN capability.

    Examples:
    1. Enable optional counters cc_rx_ce_pkts and cc_rx_cnp_pkts (and
       disable all others):
    $ sudo rdma statistic set link rocep8s0f0/1 optional-counters \
        cc_rx_ce_pkts,cc_rx_cnp_pkts

    2. Remove all optional counters:
    $ sudo rdma statistic unset link rocep8s0f0/1 optional-counters
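
The semantics of the two examples above can be modeled roughly as follows: a "set" enables exactly the listed optional counters and disables all others, and an "unset" disables them all. This is a minimal Python sketch, not the kernel implementation. The function name `apply_optional_counters` is hypothetical, and rejecting unknown counter names is an assumption.

```python
def apply_optional_counters(supported, requested):
    """Return {name: enabled} after a 'set'; an empty request models 'unset'."""
    # Assumption: a name that is not a supported optional counter is an error.
    unknown = set(requested) - set(supported)
    if unknown:
        raise ValueError(f"unsupported optional counters: {sorted(unknown)}")
    # Listed counters are enabled, every other optional counter is disabled.
    return {name: name in requested for name in supported}


supported = ["cc_rx_ce_pkts", "cc_rx_cnp_pkts", "cc_tx_cnp_pkts"]
after_set = apply_optional_counters(supported, ["cc_rx_ce_pkts", "cc_rx_cnp_pkts"])
after_unset = apply_optional_counters(supported, [])
```

Here `after_set` marks only `cc_tx_cnp_pkts` as disabled, while `after_unset` marks all three as disabled.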

    Link: https://lore.kernel.org/r/20211008122439.166063-10-markzhang@nvidia.com
    Signed-off-by: Aharon Landau <aharonl@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-03-23 19:59:10 -04:00
Kamal Heib 56b22c0b21 RDMA/nldev: Split nldev_stat_set_mode_doit out of nldev_stat_set_doit
Bugzilla: http://bugzilla.redhat.com/2056770

commit 822cf785ac6d9120386f59964d6d029f3f04a8e3
Author: Aharon Landau <aharonl@nvidia.com>
Date:   Fri Oct 8 15:24:34 2021 +0300

    RDMA/nldev: Split nldev_stat_set_mode_doit out of nldev_stat_set_doit

    In order to allow expansion of the set command with more set options, take
    the set mode out of the main set function.

    Link: https://lore.kernel.org/r/20211008122439.166063-9-markzhang@nvidia.com
    Signed-off-by: Aharon Landau <aharonl@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-03-23 19:59:10 -04:00
Kamal Heib 142ed1dc6f RDMA/nldev: Add support to get status of all counters
Bugzilla: http://bugzilla.redhat.com/2056770

commit 7301d0a9834c7f1f0c91c1f0a46c7b191b1fd0da
Author: Aharon Landau <aharonl@nvidia.com>
Date:   Fri Oct 8 15:24:33 2021 +0300

    RDMA/nldev: Add support to get status of all counters

    This patch adds the ability to get the name, index and status of all
    counters for each link through RDMA netlink. User space can use this
    to get the current optional-counter mode.

    Examples:
    $ rdma statistic mode
    link rocep8s0f0/1 optional-counters cc_rx_ce_pkts

    $ rdma statistic mode supported
    link rocep8s0f0/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts
    link rocep8s0f1/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts

    Link: https://lore.kernel.org/r/20211008122439.166063-8-markzhang@nvidia.com
    Signed-off-by: Aharon Landau <aharonl@nvidia.com>
    Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-03-23 19:59:10 -04:00
Kamal Heib e15d888971 RDMA/counter: Add an is_disabled field in struct rdma_hw_stats
Bugzilla: http://bugzilla.redhat.com/2056770

commit 0dc89684605e8b60af645988d3e0d80a57b6e937
Author: Aharon Landau <aharonl@nvidia.com>
Date:   Fri Oct 8 15:24:31 2021 +0300

    RDMA/counter: Add an is_disabled field in struct rdma_hw_stats

    Add a bitmap to the rdma_hw_stats structure, in which each bit indicates
    whether the corresponding counter is currently disabled. By default,
    hwcounters are enabled.
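
The bitmap idea can be sketched as below: one bit per counter, where a set bit means "disabled", so an all-zero bitmap gives the enabled-by-default behavior. This is an illustrative Python model and not the kernel structure. The class `HwStats` and its methods are hypothetical, and the counter names are borrowed from the examples elsewhere in this log.

```python
class HwStats:
    """Toy model of a counter set with a per-counter 'disabled' bitmap."""

    def __init__(self, names):
        self.names = list(names)
        self.is_disabled = 0  # bit i set => counter i disabled; 0 => all enabled

    def _check(self, index):
        if not 0 <= index < len(self.names):
            raise IndexError(f"no counter at index {index}")

    def disable(self, index):
        self._check(index)
        self.is_disabled |= 1 << index

    def enable(self, index):
        self._check(index)
        self.is_disabled &= ~(1 << index)

    def enabled(self):
        # A counter is reported only if its bit is clear.
        return [name for i, name in enumerate(self.names)
                if not (self.is_disabled >> i) & 1]


stats = HwStats(["cc_rx_ce_pkts", "cc_rx_cnp_pkts", "cc_tx_cnp_pkts"])
stats.disable(1)
```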

    Link: https://lore.kernel.org/r/20211008122439.166063-6-markzhang@nvidia.com
    Signed-off-by: Aharon Landau <aharonl@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-03-23 19:59:10 -04:00
Kamal Heib 80da9a9c75 RDMA/counter: Add a descriptor in struct rdma_hw_stats
Bugzilla: http://bugzilla.redhat.com/2056770

commit 13f30b0fa0a9fa4f713edbb262f2e451886ce242
Author: Aharon Landau <aharonl@nvidia.com>
Date:   Fri Oct 8 15:24:29 2021 +0300

    RDMA/counter: Add a descriptor in struct rdma_hw_stats

    Add a counter statistic descriptor structure in rdma_hw_stats. In addition
    to the counter name, more meta-information will be added.  This code
    extension is needed for optional-counter support in the following patches.

    Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com
    Signed-off-by: Aharon Landau <aharonl@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Signed-off-by: Kamal Heib <kheib@redhat.com>
2022-03-23 19:59:10 -04:00
Jason Gunthorpe d8a5883814 RDMA/core: Replace the ib_port_data hw_stats pointers with a ib_port pointer
It is much saner to store a pointer to the kobject structure that contains
the canonical stats pointer than to copy the stats pointers into a public
structure.

Future patches will require the sysfs pointer for other purposes.

Link: https://lore.kernel.org/r/f90551dfd296cde1cb507bbef27cca9891d19871.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Jason Gunthorpe 4b5f4d3fb4 RDMA: Split the alloc_hw_stats() ops to port and device variants
This is being used to implement both the port and device global stats,
which is causing some confusion in the drivers. For instance EFA and i40iw
both seem to be misusing the device stats.

Split it into two ops so drivers that don't support one or the other can
leave the op NULL'd, making the calling code a little simpler to
understand.

Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Gal Pressman 6cc9e215eb RDMA/nldev: Add copy-on-fork attribute to get sys command
The new attribute indicates that the kernel copies DMA pages on fork,
hence libibverbs' fork support through madvise and MADV_DONTFORK is not
needed.

The introduced attribute is always reported as supported since the kernel
has the patch that added the copy-on-fork behavior. This allows the
userspace library to identify older vs newer kernel versions.  Extra care
should be taken when backporting this patch as it relies on the fact that
the copy-on-fork patch is merged, hence no check for support is added.

Don't backport this patch unless you also have the following series:
commit 70e806e4e6 ("mm: Do early cow for pinned pages during fork() for
ptes") and commit 4eae4efa2c ("hugetlb: do early cow when page pinned on
src mm").

Fixes: 70e806e4e6 ("mm: Do early cow for pinned pages during fork() for ptes")
Fixes: 4eae4efa2c ("hugetlb: do early cow when page pinned on src mm")
Link: https://lore.kernel.org/r/20210418121025.66849-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-27 15:33:20 -03:00
Neta Ostrovsky c6c11ad3ab RDMA/nldev: Add QP numbers to SRQ information
Add QP numbers that are associated with the SRQ to the SRQ information.
The QPs are displayed in a range form.

Sample output:

$ rdma res show srq
dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib]
dev ibp8s0f0 srqn 4 type BASIC lqpn 125-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 5 type BASIC lqpn 141-156 pdn 10 pid 3584 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 6 type BASIC lqpn 157-172 pdn 11 pid 3590 comm ibv_srq_pingpon
dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib]
dev ibp8s0f1 srqn 1 type BASIC lqpn 329-344 pdn 4 pid 3586 comm ibv_srq_pingpon

$ rdma res show srq lqpn 126-141
dev ibp8s0f0 srqn 4 type BASIC lqpn 126-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 5 type BASIC lqpn 141 pdn 10 pid 3584 comm ibv_srq_pingpon

$ rdma res show srq lqpn 127
dev ibp8s0f0 srqn 4 type BASIC lqpn 127 pdn 9 pid 3581 comm ibv_srq_pingpon
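
The "range form" seen in the lqpn field can be produced by collapsing consecutive QP numbers into inclusive ranges. The following is a minimal Python sketch of that formatting only; `format_ranges` is a hypothetical helper and says nothing about how the kernel or iproute2 actually builds the string.

```python
def format_ranges(numbers):
    """Collapse integers into a comma-separated list of inclusive ranges."""
    ranges = []
    for n in sorted(set(numbers)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1][1] = n        # extend the current run
        else:
            ranges.append([n, n])    # start a new run
    return ",".join(str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in ranges)


# QP numbers 125..128 and 130..140, as in the srqn 4 line above.
qpns = list(range(125, 129)) + list(range(130, 141))
print(format_ranges(qpns))  # 125-128,130-140
```

A single QP number is printed on its own, so `format_ranges([127])` gives `127`.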

Link: https://lore.kernel.org/r/79a4bd4caec2248fd9583cccc26786af8e4414fc.1618753110.git.leonro@nvidia.com
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22 10:30:27 -03:00
Neta Ostrovsky 391c6bd5ac RDMA/nldev: Return SRQ information
Extend RDMA nldev to return SRQ information, such as the SRQ number, SRQ type,
PD number, CQ number and the ID of the process that created the SRQ.

Sample output:

$ rdma res show srq
dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib]
dev ibp8s0f0 srqn 4 type BASIC pdn 9 pid 3581 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 5 type BASIC pdn 10 pid 3584 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 6 type BASIC pdn 11 pid 3590 comm ibv_srq_pingpon
dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib]
dev ibp8s0f1 srqn 1 type BASIC pdn 4 pid 3586 comm ibv_srq_pingpon

Link: https://lore.kernel.org/r/322f9210b95812799190dd4a0fb92f3a3bba0333.1618753110.git.leonro@nvidia.com
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22 10:30:27 -03:00
Neta Ostrovsky 12ce208f40 RDMA/nldev: Return context information
Extend RDMA nldev to return context information, such as the ctx number and
the ID of the process that created the context. This is helpful for
finding orphaned contexts that were not closed for some reason.

Sample output:

$ rdma res show ctx
dev ibp8s0f0 ctxn 0 pid 980 comm ibv_rc_pingpong
dev ibp8s0f0 ctxn 1 pid 981 comm ibv_rc_pingpong
dev ibp8s0f0 ctxn 2 pid 992 comm ibv_rc_pingpong
dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong
dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong

$ rdma res show ctx dev ibp8s0f1
dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong
dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong

Link: https://lore.kernel.org/r/5c956acfeac4e9d532988575f3da7d64cb449374.1618753110.git.leonro@nvidia.com
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22 10:30:27 -03:00
Mark Bloch 1fb7f8973f RDMA: Support more than 255 rdma ports
Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs.  HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.
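
The reporting rule described above can be illustrated with a small model: the core keeps the port count as a u32 (with U32_MAX reserved), while the u8-based uverbs view saturates at 255. This is a hedged Python sketch; the function name `uverbs_reported_ports` and the exact validity check are assumptions made for illustration.

```python
U32_MAX = 0xFFFFFFFF     # reserved value, not a valid port count
UVERBS_MAX_PORTS = 255   # largest value a u8 port field can carry


def uverbs_reported_ports(core_port_count):
    """Return the port count that a u8-limited interface would report."""
    if not 0 <= core_port_count < U32_MAX:
        raise ValueError("port count must fit in a u32 and not equal U32_MAX")
    return min(core_port_count, UVERBS_MAX_PORTS)


small = uverbs_reported_ports(2)
large = uverbs_reported_ports(4000)
```

A device with 2 ports is reported unchanged, while one with 4000 ports is reported as having 255.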

The verbs interface is not changed yet because the IBTA spec limits the
port size to u8 in too many places, and all applications that rely on
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that solely use the vendor channel.
Once the limitation is lifted, mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device, and
it exposes itself as a RAW Ethernet only device, CM/MAD/IPoIB and other
ULPs aren't affected by this change and their sysfs/interfaces that are
exposed to userspace can remain unchanged.

While here, clean up some alignment issues and remove unneeded sanity
checks (mainly in rdmavt).

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 09:31:21 -03:00
Patrisious Haddad 33eb12f296 RDMA/nldev: Return an error message on failure to turn auto mode
The bounded counter can't be reconfigured to be in auto mode. A user who
attempts to do so gets an error, but without any hint as to why. Update
the nldev interface to return an error message through the extack mechanism.

Link: https://lore.kernel.org/r/20201230130240.180737-1-leon@kernel.org
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18 16:02:14 -04:00
Francis Laniel 872f690341 treewide: rename nla_strlcpy to nla_strscpy.
Calls to nla_strlcpy are now replaced by calls to nla_strscpy which is the new
name of this function.

Signed-off-by: Francis Laniel <laniel_francis@privacyrequired.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 08:08:54 -08:00
Mark Zhang 1d70ad0f85 RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP
When dumping QPs bound to a counter, raw QPs should be dumped without
requiring the CAP_NET_RAW privilege. This is consistent with what "rdma res
show qp" does.

Fixes: c4ffee7c9b ("RDMA/netlink: Implement counter dumpit calback")
Link: https://lore.kernel.org/r/20200727095828.496195-1-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 15:51:19 -03:00
Mark Zhang 7c97f3aded RDMA/counter: Add PID category support in auto mode
With the "PID" category, QPs that have the same PID are bound to the same
counter; if this category is not set, QPs that have different PIDs can be
bound to the same counter.

This is implemented for 2 reasons:
1. The counter is a limited resource, while there may be dozens of
   applications, each of which creates several types of QPs, so there
   may not be enough counters.
2. The system administrator needs all QPs of the same type, created by
   all applications, bound to one counter.

The counter name and PID only make sense when the "PID" category is
configured.

This category can also be used in combination with others, e.g. QP type.
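
The effect of the category on auto-mode binding can be sketched by grouping QPs on a key built from the configured categories: QPs with an equal key share a counter. This is a simplified Python model under stated assumptions. The function `bind_qps`, the dictionary layout of a QP and the sample values are all hypothetical.

```python
from collections import defaultdict


def bind_qps(qps, categories):
    """Group QP numbers by the configured categories; one group per counter."""
    counters = defaultdict(list)
    for qp in qps:
        key = tuple(qp[c] for c in categories)  # QPs with the same key share a counter
        counters[key].append(qp["qpn"])
    return dict(counters)


qps = [
    {"qpn": 10, "type": "RC", "pid": 100},
    {"qpn": 11, "type": "RC", "pid": 100},
    {"qpn": 12, "type": "RC", "pid": 200},
    {"qpn": 13, "type": "UD", "pid": 200},
]
by_type = bind_qps(qps, ["type"])             # PID category not set
by_type_pid = bind_qps(qps, ["type", "pid"])  # PID category combined with QP type
```

Without the PID category, the three RC QPs share one counter even though they belong to two processes; with it, the RC QPs of PID 100 and PID 200 land on separate counters.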

Link: https://lore.kernel.org/r/20200702082933.424537-2-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10 16:50:53 -03:00
Maor Gottlieb 65959522f8 RDMA: Add support to dump resource tracker in RAW format
Add support for getting a resource dump in raw format. It enables drivers to
return the entire device-specific QP/CQ/MR context without the driver
having to set each field separately.

The raw query returns only the device-specific data; general data is still
returned by using the existing queries.

Example:

$ rdma res show mr dev mlx5_1 mrn 2 -r -j
[{"ifindex":7,"ifname":"mlx5_1",
"data":[0,4,255,254,0,0,0,0,0,0,0,0,16,28,0,216,...]}]

Link: https://lore.kernel.org/r/20200623113043.1228482-9-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-24 08:52:29 -03:00
Maor Gottlieb 211cd9459f RDMA: Add dedicated CM_ID resource tracker function
In order to avoid double multiplexing of the resource when it is a cm id,
add a dedicated callback function. In addition remove fill_res_entry which
is not used anymore.

Link: https://lore.kernel.org/r/20200623113043.1228482-8-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-23 11:46:27 -03:00
Maor Gottlieb 5cc34116cc RDMA: Add dedicated QP resource tracker function
In order to avoid double multiplexing of the resource when it is a QP, add
a dedicated callback function.

Link: https://lore.kernel.org/r/20200623113043.1228482-7-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-23 11:46:27 -03:00
Maor Gottlieb 9e2a187a93 RDMA: Add a dedicated CQ resource tracker function
In order to avoid double multiplexing of the resource when it is a CQ, add
a dedicated callback function.

Link: https://lore.kernel.org/r/20200623113043.1228482-6-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-23 11:46:27 -03:00
Maor Gottlieb f443452900 RDMA: Add dedicated MR resource tracker function
In order to avoid double multiplexing of the resource when it is a MR, add
a dedicated callback function.

Link: https://lore.kernel.org/r/20200623113043.1228482-5-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-23 11:46:27 -03:00
Maor Gottlieb 24fd6d6f85 RDMA/core: Don't call fill_res_entry for PD
None of the drivers implement it, remove it.

Link: https://lore.kernel.org/r/20200623113043.1228482-4-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-23 11:46:26 -03:00
Maor Gottlieb 50bbe3d34f RDMA/core: Fix double put of resource
Do not decrease the reference count of resource tracker object twice in
the error flow of res_get_common_doit.

Fixes: c5dfe0ea6f ("RDMA/nldev: Add resource tracker doit callback")
Link: https://lore.kernel.org/r/20200507062942.98305-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-12 11:47:48 -03:00
Jason Gunthorpe 7aefa6237c RDMA/nl: Do not permit empty devices names during RDMA_NLDEV_CMD_NEWLINK/SET
Empty device names cannot be added to sysfs; attempting to do so crashes with:

  kobject: (00000000f9de3792): attempted to be registered with empty name!
  WARNING: CPU: 1 PID: 10856 at lib/kobject.c:234 kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 10856 Comm: syz-executor459 Not tainted 5.6.0-rc3-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x197/0x210 lib/dump_stack.c:118
   panic+0x2e3/0x75c kernel/panic.c:221
   __warn.cold+0x2f/0x3e kernel/panic.c:582
   report_bug+0x289/0x300 lib/bug.c:195
   fixup_bug arch/x86/kernel/traps.c:174 [inline]
   fixup_bug arch/x86/kernel/traps.c:169 [inline]
   do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
   do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
   invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
  RIP: 0010:kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234
  Code: 7a ca ca f9 e9 f0 f8 ff ff 4c 89 f7 e8 cd ca ca f9 e9 95 f9 ff ff e8 13 25 8c f9 4c 89 e6 48 c7 c7 a0 08 1a 89 e8 a3 76 5c f9 <0f> 0b 41 bd ea ff ff ff e9 52 ff ff ff e8 f2 24 8c f9 0f 0b e8 eb
  RSP: 0018:ffffc90002006eb0 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffffff815eae46 RDI: fffff52000400dc8
  RBP: ffffc90002006f08 R08: ffff8880972ac500 R09: ffffed1015d26659
  R10: ffffed1015d26658 R11: ffff8880ae9332c7 R12: ffff888093034668
  R13: 0000000000000000 R14: ffff8880a69d7600 R15: 0000000000000001
   kobject_add_varg lib/kobject.c:390 [inline]
   kobject_add+0x150/0x1c0 lib/kobject.c:442
   device_add+0x3be/0x1d00 drivers/base/core.c:2412
   ib_register_device drivers/infiniband/core/device.c:1371 [inline]
   ib_register_device+0x93e/0xe40 drivers/infiniband/core/device.c:1343
   rxe_register_device+0x52e/0x655 drivers/infiniband/sw/rxe/rxe_verbs.c:1231
   rxe_add+0x122b/0x1661 drivers/infiniband/sw/rxe/rxe.c:302
   rxe_net_add+0x91/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:539
   rxe_newlink+0x39/0x90 drivers/infiniband/sw/rxe/rxe.c:318
   nldev_newlink+0x28a/0x430 drivers/infiniband/core/nldev.c:1538
   rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline]
   rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
   rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259
   netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
   netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329
   netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918
   sock_sendmsg_nosec net/socket.c:652 [inline]
   sock_sendmsg+0xd7/0x130 net/socket.c:672
   ____sys_sendmsg+0x753/0x880 net/socket.c:2343
   ___sys_sendmsg+0x100/0x170 net/socket.c:2397
   __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
   __do_sys_sendmsg net/socket.c:2439 [inline]
   __se_sys_sendmsg net/socket.c:2437 [inline]
   __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
   do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Prevent empty names when checking the name provided from userspace during
newlink and rename.
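
The check described above amounts to rejecting a zero-length name before it reaches device registration. A minimal Python model of that validation is shown below; it is not the kernel code, `check_device_name` is a hypothetical helper, and returning a negative EINVAL is an assumption about the error convention.

```python
import errno


def check_device_name(name):
    """Return 0 if the name may be used, or -EINVAL if it is empty."""
    if len(name) == 0:
        return -errno.EINVAL  # assumption: an empty name is rejected as invalid
    return 0


empty_result = check_device_name("")
valid_result = check_device_name("rxe0")
```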

Fixes: 3856ec4b93 ("RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support")
Fixes: 05d940d3a3 ("RDMA/nldev: Allow IB device rename through RDMA netlink")
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20200309191648.GA30852@ziepe.ca
Reported-and-tested-by: syzbot+da615ac67d4dbea32cbc@syzkaller.appspotmail.com
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-13 10:23:38 -03:00
Mark Zhang 78f34a16c2 RDMA/nldev: Fix crash when set a QP to a new counter but QPN is missing
This fixes the kernel crash when a RDMA_NLDEV_CMD_STAT_SET command is
received, but the QP number parameter is not available.

  iwpm_register_pid: Unable to send a nlmsg (client = 2)
  infiniband syz1: RDMA CMA: cma_listen_on_dev, error -98
  general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
  CPU: 0 PID: 9754 Comm: syz-executor069 Not tainted 5.6.0-rc2-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:nla_get_u32 include/net/netlink.h:1474 [inline]
  RIP: 0010:nldev_stat_set_doit+0x63c/0xb70 drivers/infiniband/core/nldev.c:1760
  Code: fc 01 0f 84 58 03 00 00 e8 41 83 bf fb 4c 8b a3 58 fd ff ff 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 6d
  RSP: 0018:ffffc900068bf350 EFLAGS: 00010247
  RAX: dffffc0000000000 RBX: ffffc900068bf728 RCX: ffffffff85b60470
  RDX: 0000000000000000 RSI: ffffffff85b6047f RDI: 0000000000000004
  RBP: ffffc900068bf750 R08: ffff88808c3ee140 R09: ffff8880a25e6010
  R10: ffffed10144bcddc R11: ffff8880a25e6ee3 R12: 0000000000000000
  R13: ffff88809acb0000 R14: ffff888092a42c80 R15: 000000009ef2e29a
  FS:  0000000001ff0880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f4733e34000 CR3: 00000000a9b27000 CR4: 00000000001406f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
    rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline]
    rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
    rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259
    netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
    netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329
    netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg+0xd7/0x130 net/socket.c:672
    ____sys_sendmsg+0x753/0x880 net/socket.c:2343
    ___sys_sendmsg+0x100/0x170 net/socket.c:2397
    __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
    __do_sys_sendmsg net/socket.c:2439 [inline]
    __se_sys_sendmsg net/socket.c:2437 [inline]
    __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
    do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x4403d9
  Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
  RSP: 002b:00007ffc0efbc5c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
  RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004403d9
  RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000004
  RBP: 00000000006ca018 R08: 0000000000000008 R09: 00000000004002c8
  R10: 000000000000004a R11: 0000000000000246 R12: 0000000000401c60
  R13: 0000000000401cf0 R14: 0000000000000000 R15: 0000000000000000
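
The trace above shows a read of an attribute that was never supplied. The general pattern for avoiding this is to test for the attribute's presence before reading its value, which can be modeled as follows. This is an illustrative Python sketch rather than the actual patch: the parsed attribute table is modeled as a dictionary, `read_qpn` is a hypothetical helper, and the -EINVAL return is an assumption.

```python
import errno

LQPN = "RDMA_NLDEV_ATTR_RES_LQPN"  # attribute carrying the QP number


def read_qpn(tb):
    """Return (0, qpn) if the QP number attribute is present, else (-EINVAL, None)."""
    attr = tb.get(LQPN)
    if attr is None:
        # Missing attribute: fail the request instead of dereferencing nothing.
        return -errno.EINVAL, None
    return 0, int(attr)


missing = read_qpn({})
present = read_qpn({LQPN: 178})
```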

Fixes: b389327df9 ("RDMA/nldev: Allow counter manual mode configration through RDMA netlink")
Link: https://lore.kernel.org/r/20200227125111.99142-1-leon@kernel.org
Reported-by: syzbot+bd4af81bc51ee0283445@syzkaller.appspotmail.com
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-04 14:17:10 -04:00
Jason Gunthorpe 5bd48c18c8 RDMA/core: Do not erase the type of ib_cq.uobject
This is a struct ib_ucq_object pointer. Instead of using container_of()
all over the place, just store it with its actual type.

Link: https://lore.kernel.org/r/1578504126-9400-7-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13 16:20:15 -04:00