15 Commits

Author SHA1 Message Date
Ivan Vecera 427128699e net: free altname using an RCU callback
JIRA: https://issues.redhat.com/browse/RHEL-62123

commit 723de3ebef03bc14bd72531f00f9094337654009
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Jan 26 12:14:49 2024 -0800

    net: free altname using an RCU callback

    We had to add another synchronize_rcu() in a recent fix.
    Bite the bullet and add an rcu_head to netdev_name_node,
    so that it can be freed from an RCU callback.

    Note that name_node does not hold any reference on dev
    to which it points, but there must be a synchronize_rcu()
    on device removal path, so we should be fine.

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-10-24 16:14:43 +02:00
Ivan Vecera 3676002068 net: fix removing a namespace with conflicting altnames
JIRA: https://issues.redhat.com/browse/RHEL-62123

commit d09486a04f5da0a812c26217213b89a3b1acf836
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Thu Jan 18 16:58:59 2024 -0800

    net: fix removing a namespace with conflicting altnames

    Mark reports a BUG() when a net namespace is removed.

        kernel BUG at net/core/dev.c:11520!

    Physical interfaces moved outside of init_net get "refunded"
    to init_net when that namespace disappears. The main interface
    name may get overwritten in the process if it would have
    conflicted. We need to also discard all conflicting altnames.
    Recent fixes addressed ensuring that altnames get moved
    with the main interface, which surfaced this problem.

    Reported-by: Марк Коренберг <socketpair@gmail.com>
    Link: https://lore.kernel.org/all/CAEmTpZFZ4Sv3KwqFOY2WKDHeZYdi0O7N5H1nTvcGp=SAEavtDg@mail.gmail.com/
    Fixes: 7663d522099e ("net: check for altname conflicts when changing netdev's netns")
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-10-24 16:14:42 +02:00
Ivan Vecera f6ec4f3e1b net: check for altname conflicts when changing netdev's netns
JIRA: https://issues.redhat.com/browse/RHEL-62123

commit 7663d522099ecc464512164e660bc771b2ff7b64
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue Oct 17 18:38:14 2023 -0700

    net: check for altname conflicts when changing netdev's netns

    It's currently possible to create an altname conflicting
    with an altname or real name of another device by creating
    it in another netns and moving it over:

     [ ~]$ ip link add dev eth0 type dummy

     [ ~]$ ip netns add test
     [ ~]$ ip -netns test link add dev ethX netns test type dummy
     [ ~]$ ip -netns test link property add dev ethX altname eth0
     [ ~]$ ip -netns test link set dev ethX netns 1

     [ ~]$ ip link
     ...
     3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
         link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff
     ...
     5: ethX: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
         link/ether 26:b7:28:78:38:0f brd ff:ff:ff:ff:ff:ff
         altname eth0

    Create a macro for walking the altnames. This hopefully makes
    it clearer that the list we walk contains only altnames,
    which is otherwise not entirely intuitive.

    Fixes: 36fbf1e52b ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-10-24 15:55:36 +02:00
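
The check described in the commit above can be sketched in plain userspace C. This is only an illustration of the idea, not the kernel code: `struct dev_sketch`, `for_each_altname()`, `name_in_use()` and `move_would_conflict()` are hypothetical names, the fixed-size `altnames` array is an assumption made to keep the example short, and treating a primary-name clash as a conflict is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALTNAMES 4

/* Hypothetical stand-in for a net device: one primary name plus altnames. */
struct dev_sketch {
	const char *name;
	const char *altnames[MAX_ALTNAMES];	/* filled from slot 0, rest NULL */
};

/* Walk only the altnames; the primary name is not part of this list. */
#define for_each_altname(dev, i) \
	for ((i) = 0; (i) < MAX_ALTNAMES && (dev)->altnames[(i)]; (i)++)

/* Is 'name' already a primary name or an altname in the namespace? */
static bool name_in_use(const struct dev_sketch *devs, size_t ndevs,
			const char *name)
{
	size_t d;
	int i;

	for (d = 0; d < ndevs; d++) {
		if (strcmp(devs[d].name, name) == 0)
			return true;
		for_each_altname(&devs[d], i)
			if (strcmp(devs[d].altnames[i], name) == 0)
				return true;
	}
	return false;
}

/* Would moving 'moving' into the target namespace create a name clash? */
static bool move_would_conflict(const struct dev_sketch *target, size_t ndevs,
				const struct dev_sketch *moving)
{
	int i;

	if (name_in_use(target, ndevs, moving->name))
		return true;
	for_each_altname(moving, i)
		if (name_in_use(target, ndevs, moving->altnames[i]))
			return true;
	return false;
}
```

With a target namespace that already contains `eth0`, a device `ethX` carrying the altname `eth0` is reported as conflicting, which mirrors the reproducer in the commit message.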
Rado Vrbovsky 3438e40aac Merge: net: Provide SMP threads for backlog NAPI
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4253

JIRA: https://issues.redhat.com/browse/RHEL-9145

Signed-off-by: Wander Lairson Costa <wander@redhat.com>

Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Approved-by: Eder Zulian <ezulian@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-10-19 08:04:53 +00:00
Ivan Vecera ca118e46cc net-sysfs: use dev_addr_sem to remove races in address_show()
JIRA: https://issues.redhat.com/browse/RHEL-59100

commit c7d52737e7ebd31cc5fef46380d94b58becf9479
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Feb 13 06:32:38 2024 +0000

    net-sysfs: use dev_addr_sem to remove races in address_show()

    Holding dev_base_lock does not prevent readers from seeing garbage.

    Use dev_addr_sem instead.

    v4: place dev_addr_sem extern in net/core/dev.h (Jakub Kicinski)
     Link: https://lore.kernel.org/netdev/20240212175845.10f6680a@kernel.org/

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-09-17 12:17:14 +02:00
Ivan Vecera 4863bafaf6 net: core: synchronize link-watch when carrier is queried
JIRA: https://issues.redhat.com/browse/RHEL-59100

commit facd15dfd69122042502d99ab8c9f888b48ee994
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Mon Dec 4 21:47:07 2023 +0100

    net: core: synchronize link-watch when carrier is queried

    There are multiple ways to query for the carrier state: through
    rtnetlink, sysfs, and (possibly) ethtool. Synchronize linkwatch
    work before these operations so that we don't have a situation
    where userspace queries the carrier state between the driver's
    carrier off->on transition and linkwatch running and expects it
    to work, when really (at least) TX cannot work until linkwatch
    has run.

    I previously posted a longer explanation of how this applies to
    wireless [1] but with this wireless can simply query the state
    before sending data, to ensure the kernel is ready for it.

    [1] https://lore.kernel.org/all/346b21d87c69f817ea3c37caceb34f1f56255884.camel@sipsolutions.net/

    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Link: https://lore.kernel.org/r/20231204214706.303c62768415.I1caedccae72ee5a45c9085c5eb49c145ce1c0dd5@changeid
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-09-17 12:17:10 +02:00
Wander Lairson Costa 0a68918319 net: add skb_defer_max sysctl
JIRA: https://issues.redhat.com/browse/RHEL-9145

commit 39564c3fdc6684c6726b63e131d2a9f3809811cb
Author: Eric Dumazet <edumazet@google.com>
Date:   Sun May 15 21:24:55 2022 -0700

    net: add skb_defer_max sysctl

    commit 68822bdf76f1 ("net: generalize skb freeing
    deferral to per-cpu lists") added another per-cpu
    cache of skbs. It was expected to be small,
    and an IPI was forced whenever the list reached 128
    skbs.

    We might need to be able to control more precisely
    queue capacity and added latency.

    An IPI is generated whenever queue reaches half capacity.

    Default value of the new limit is 64.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
2024-09-16 16:04:28 -03:00
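
The queueing rule in the commit above (bounded queue, one IPI at half capacity) can be modeled in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel implementation: `struct defer_queue` and `defer_enqueue()` are hypothetical names, the `kicks` counter stands in for sending an IPI, and locking is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the per-CPU deferred-free list. */
struct defer_queue {
	unsigned int count;	/* buffers currently queued */
	unsigned int max;	/* capacity; the role skb_defer_max plays */
	unsigned int kicks;	/* times an IPI would have been sent */
};

/*
 * Try to queue one buffer. Returns false when the queue is already at
 * capacity, in which case the caller has to free the buffer directly.
 */
static bool defer_enqueue(struct defer_queue *q)
{
	if (q->count >= q->max)
		return false;

	/* Signal the owning CPU when the queue already holds half its capacity. */
	if (q->count == q->max / 2)
		q->kicks++;

	q->count++;
	return true;
}
```

Starting from an empty queue with `max` set to the default of 64, the first 64 calls succeed, exactly one kick is recorded, and the 65th call is refused.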
Ivan Vecera daa10dac98 netdev-genl: Add netlink framework functions for napi
JIRA: https://issues.redhat.com/browse/RHEL-30139

Conflicts:
- context conflict due to missing 9a675ba55a96 ("net, bpf: Add
  a warning if NAPI cb missed xdp_do_flush().")

commit 27f91aaf49b3a50e5a02ad5fa27b7c453d029a72
Author: Amritha Nambiar <amritha.nambiar@intel.com>
Date:   Fri Dec 1 15:28:56 2023 -0800

    netdev-genl: Add netlink framework functions for napi

    Implement the netdev netlink framework functions for
    napi support. The netdev structure tracks all the napi
    instances and napi fields. The napi instances and associated
    parameters can be retrieved this way.

    Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
    Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
    Link: https://lore.kernel.org/r/170147333637.5260.14807433239805550815.stgit@anambiarhost.jf.intel.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-06-05 17:57:52 +02:00
Ivan Vecera e2cfbf12c4 net: add new helper unregister_netdevice_many_notify
JIRA: https://issues.redhat.com/browse/RHEL-30344

commit 77f4aa9a2a1766a0b9343fd812b71f18d05178da
Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Fri Oct 28 04:42:22 2022 -0400

    net: add new helper unregister_netdevice_many_notify

    Add new helper unregister_netdevice_many_notify(), pass netlink message
    header and portid, which could be used to notify userspace when flag
    NLM_F_ECHO is set.

    Make unregister_netdevice_many() a wrapper around the new function
    unregister_netdevice_many_notify().

    Suggested-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Reviewed-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-04-02 11:15:37 +02:00
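
The wrapper relationship described in the commit above can be shown with a small userspace sketch. The types here are hypothetical stand-ins for the kernel ones, the parameter order is an assumption based on the commit text (list head, portid, netlink header), and the `last_*` and `calls` variables exist only so the behavior can be observed; the real function unregisters devices rather than recording arguments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types. */
struct list_head { struct list_head *next, *prev; };
struct nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

/* Recorded only so the sketch can be inspected from a test. */
static uint32_t last_portid;
static const struct nlmsghdr *last_nlh;
static unsigned int calls;

/* New helper: carries the requester's portid and netlink header along. */
void unregister_netdevice_many_notify(struct list_head *head,
				      uint32_t portid,
				      const struct nlmsghdr *nlh)
{
	(void)head;		/* device teardown is omitted in this sketch */
	last_portid = portid;
	last_nlh = nlh;
	calls++;
}

/* Old entry point: no requester to echo to, so pass 0 and NULL. */
void unregister_netdevice_many(struct list_head *head)
{
	unregister_netdevice_many_notify(head, 0, NULL);
}
```

Existing callers keep calling `unregister_netdevice_many()` unchanged, while callers that handle a netlink request can pass the header through so that NLM_F_ECHO can be honored.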
Ivan Vecera e3ad0f633f rtnetlink: pass netlink message header and portid to rtnl_configure_link()
JIRA: https://issues.redhat.com/browse/RHEL-30344

commit 1d997f1013079c05b642c739901e3584a3ae558d
Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Fri Oct 28 04:42:21 2022 -0400

    rtnetlink: pass netlink message header and portid to rtnl_configure_link()

    This patch passes the netlink message header and portid to
    rtnl_configure_link(). All the functions in this call chain need the
    new parameters so that the last call, rtnl_notify(), can notify
    userspace about the new link info if the NLM_F_ECHO flag is set.

    - rtnl_configure_link()
      - __dev_notify_flags()
        - rtmsg_ifinfo()
          - rtmsg_ifinfo_event()
            - rtmsg_ifinfo_build_skb()
            - rtmsg_ifinfo_send()
              - rtnl_notify()

    Also move __dev_notify_flags() declaration to net/core/dev.h, as Jakub
    suggested.

    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Reviewed-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-04-02 11:15:37 +02:00
Ivan Vecera 8a9913f68e net: ensure net_todo_list is processed quickly
JIRA: https://issues.redhat.com/browse/RHEL-30344

Conflicts:
- we have already backported 6264f58ca0e54 ("net: extract a few
  internals from netdevice.h") so the net_todo_list has to be placed in
  net/core/dev.h instead of include/linux/netdevice.h

commit 0b5c21bbc01e92745ca1ca4f6fd87d878fa3ea5e
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Mon Apr 4 11:38:47 2022 +0200

    net: ensure net_todo_list is processed quickly

    In [1], Will raised a potential issue that the cfg80211 code,
    which does (from a locking perspective)

      rtnl_lock()
      wiphy_lock()
      rtnl_unlock()

    might be susceptible to ABBA deadlocks, because rtnl_unlock()
    calls netdev_run_todo(), which might end up calling rtnl_lock()
    again, which could then deadlock (see the comment in the code
    added here for the scenario).

    Some back and forth and thinking ensued, but clearly this can't
    happen if the net_todo_list is empty at the rtnl_unlock() here.
    Clearly, the code here cannot actually put an entry on it, and
    all other users of rtnl_unlock() will empty it since that will
    always go through netdev_run_todo(), emptying the list.

    So the only other way to get there would be to add to the list
    and then unlock the RTNL without going through rtnl_unlock(),
    which is only possible through __rtnl_unlock(). However, this
    isn't exported and not used in many places, and none of them
    seem to be able to unregister before using it.

    Therefore, add a WARN_ON() in the code to ensure this invariant
    won't be broken, so that the cfg80211 (or any similar) code
    stays safe.

    [1] https://lore.kernel.org/r/Yjzpo3TfZxtKPMAG@google.com

    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Link: https://lore.kernel.org/r/20220404113847.0ee02e4a70da.Ic73d206e217db20fd22dcec14fe5442ca732804b@changeid
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-03-26 15:41:49 +01:00
Xin Long 2db946b2f7 net: add gso_ipv4_max_size and gro_ipv4_max_size per device
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2185290
Tested: compile only

Conflicts:
  - move netif_set_gro_max_size() from include/linux/netdevice.h to
    net/core/dev.h, then make the change, as commit 744d49daf8bd was
    backported earlier than eac1b93c14d6, so netif_set_gro_max_size()
    missed the opportunity to be moved to net/core/dev.h.

  - different context in net/core/dev.h, rps_cpumask_housekeeping()
    is added due to 370ca718fd5e already in RHEL-9.

commit 9eefedd58ae1daece2ba907849a44db2941fb4b0
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sat Jan 28 10:58:38 2023 -0500

    net: add gso_ipv4_max_size and gro_ipv4_max_size per device

    This patch introduces gso_ipv4_max_size and gro_ipv4_max_size
    per device and adds netlink attributes for them, so that IPV4
    BIG TCP can be guarded by a separate tunable in the next patch.

    To not break the old application using "gso/gro_max_size" for
    IPv4 GSO packets, this patch updates "gso/gro_ipv4_max_size"
    in netif_set_gso/gro_max_size() if the new size isn't greater
    than GSO_LEGACY_MAX_SIZE, so that nothing will change even if
    userspace doesn't realize the new netlink attributes.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Xin Long <lxin@redhat.com>
2023-05-02 10:36:11 -04:00
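
The compatibility rule in the commit above can be sketched as follows. This is a userspace illustration only: `struct net_device_sketch` and `set_gso_max_size()` are hypothetical names, the value 65536 for `GSO_LEGACY_MAX_SIZE` is an assumption, and the lockless-access annotations a kernel version would need are omitted. The GRO side would follow the same pattern.

```c
/* Assumed value of the legacy limit; treat it as illustrative. */
#define GSO_LEGACY_MAX_SIZE 65536u

/* Hypothetical stand-in holding just the two fields under discussion. */
struct net_device_sketch {
	unsigned int gso_max_size;
	unsigned int gso_ipv4_max_size;
};

/*
 * Setting the generic limit also updates the IPv4 limit, but only while
 * the new size stays within the legacy bound. Larger values leave the
 * IPv4 limit alone, so it needs its own tunable.
 */
static void set_gso_max_size(struct net_device_sketch *dev, unsigned int size)
{
	dev->gso_max_size = size;
	if (size <= GSO_LEGACY_MAX_SIZE)
		dev->gso_ipv4_max_size = size;
}
```

An application that only knows the old attribute and lowers the limit to 32768 therefore still shrinks IPv4 GSO packets, whereas raising the generic limit above the legacy bound does not change the IPv4 value.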
Paolo Abeni 6ac38a092f net-sysctl: factor-out rpm mask manipulation helpers
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168875
Tested: LNST, Tier1

Upstream commit:
commit 370ca718fd5e1fd45ccfdf7a9d76d010f561e607
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Tue Feb 7 19:44:56 2023 +0100

    net-sysctl: factor-out rpm mask manipulation helpers

    Will simplify the following patch. No functional change
    intended.

    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-03 14:15:15 +01:00
Ivan Vecera f05cf731a8 net: move netif_set_gso_max helpers
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2128180

Conflicts:
- adjusted due to missing eac1b93c14d6 ("gro: add ability to control
  gro max packet size")

commit 744d49daf8bd3b17b345c836f2e6f97d49fa6ae8
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Thu May 5 19:51:34 2022 -0700

    net: move netif_set_gso_max helpers

    These are now internal to the core, no need to expose them.

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-10-18 10:27:21 +02:00
Ivan Vecera 5a0eef8003 net: extract a few internals from netdevice.h
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2128180

Conflicts:
- slightly modified due to missing 0b5c21bbc01e ("net: ensure
  net_todo_list is processed quickly") and d07b26f5bbea ("dev_addr:
  add a modification check")

commit 6264f58ca0e54e41d63c2d00334a48bac28fbf30
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed Apr 6 14:37:54 2022 -0700

    net: extract a few internals from netdevice.h

    There's a number of functions and static variables used
    under net/core/ but not from the outside. We currently
    dump most of them into netdevice.h. That's bad for many
    reasons:
     - netdevice.h is very cluttered, hard to figure out
       what the APIs are;
     - netdevice.h is very long;
     - we have to touch netdevice.h more which causes expensive
       incremental builds.

    Create a header under net/core/ and move some declarations.

    The new header is also a bit of a catch-all but that's
    fine; if we create more specific headers, people will
    likely over-think where their declarations fit best
    and end up putting them in netdevice.h again.

    More work should be done on splitting netdevice.h into more
    targeted headers, but that'd be more time consuming so small
    steps.

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-10-18 10:27:16 +02:00