Commit Graph

326 Commits

Author SHA1 Message Date
CKI Backport Bot e56d4cdd9a net: bridge: mcast: wait for previous gc cycles when removing port
JIRA: https://issues.redhat.com/browse/RHEL-56229
CVE: CVE-2024-44934

commit 92c4ee25208d0f35dafc3213cdf355fbe449e078
Author: Nikolay Aleksandrov <razor@blackwall.org>
Date:   Fri Aug 2 11:07:30 2024 +0300

    net: bridge: mcast: wait for previous gc cycles when removing port

    syzbot hit a use-after-free[1] which is caused because the bridge doesn't
    make sure that all previous garbage has been collected when removing a
    port. What happens is:
          CPU 1                   CPU 2
     start gc cycle           remove port
                             acquire gc lock first
     wait for lock
                             call br_multicast_gc() directly
     acquire lock now but    free port
     the port can be freed
     while grp timers still
     running

    Make sure all previous gc cycles have finished by using flush_work before
    freeing the port.

    [1]
      BUG: KASAN: slab-use-after-free in br_multicast_port_group_expired+0x4c0/0x550 net/bridge/br_multicast.c:861
      Read of size 8 at addr ffff888071d6d000 by task syz.5.1232/9699

      CPU: 1 PID: 9699 Comm: syz.5.1232 Not tainted 6.10.0-rc5-syzkaller-00021-g24ca36a562d6 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114
       print_address_description mm/kasan/report.c:377 [inline]
       print_report+0xc3/0x620 mm/kasan/report.c:488
       kasan_report+0xd9/0x110 mm/kasan/report.c:601
       br_multicast_port_group_expired+0x4c0/0x550 net/bridge/br_multicast.c:861
       call_timer_fn+0x1a3/0x610 kernel/time/timer.c:1792
       expire_timers kernel/time/timer.c:1843 [inline]
       __run_timers+0x74b/0xaf0 kernel/time/timer.c:2417
       __run_timer_base kernel/time/timer.c:2428 [inline]
       __run_timer_base kernel/time/timer.c:2421 [inline]
       run_timer_base+0x111/0x190 kernel/time/timer.c:2437

    Reported-by: syzbot+263426984509be19c9a0@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=263426984509be19c9a0
    Fixes: e12cec65b5 ("net: bridge: mcast: destroy all entries via gc")
    Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
    Link: https://patch.msgid.link/20240802080730.3206303-1-razor@blackwall.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-08-27 12:15:34 +00:00
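
The race described in the "wait for previous gc cycles when removing port" entry above can be pictured with a toy Python model. This is only an illustrative sketch under stated assumptions, not kernel code: `Port`, `remove_port`, and the use of a thread `join()` as a stand-in for `flush_work()` are all hypothetical names and simplifications chosen for the example.

```python
import threading


class Port:
    """Hypothetical stand-in for a bridge port; only tracks whether it was freed."""

    def __init__(self):
        self.freed = False


def remove_port(port, flush=True):
    """Model port removal racing with a gc cycle that has already started.

    Returns the value of port.freed that the background gc cycle observed
    when it finally got to run.
    """
    gc_lock = threading.Lock()
    started = threading.Event()
    seen = []

    def gc_cycle():
        started.set()
        with gc_lock:  # "wait for lock" in the commit's diagram
            seen.append(port.freed)  # the cycle still touches port state

    gc_lock.acquire()  # the remover acquires the gc lock first
    worker = threading.Thread(target=gc_cycle)
    worker.start()
    started.wait()  # the earlier gc cycle is now in flight
    # ... the remover would run its own garbage collection here ...
    gc_lock.release()
    if flush:
        worker.join()  # analogue of flush_work(): wait for the earlier cycle
    port.freed = True  # only now is the port released
    worker.join()
    return seen[0]
```

With `flush=True` the background cycle always observes a live port. With `flush=False` the result depends on thread scheduling, which is the kind of window the commit message says the fix closes.
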
Ivan Vecera 90d64eeb87 bridge: mcast: fix disabled snooping after long uptime
JIRA: https://issues.redhat.com/browse/RHEL-36219

commit f5c3eb4b7251baba5cd72c9e93920e710ac8194a
Author: Linus Lüssing <linus.luessing@c0d3.blue>
Date:   Sat Jan 27 18:50:32 2024 +0100

    bridge: mcast: fix disabled snooping after long uptime

    The original idea of the delay_time check was to not apply multicast
    snooping too early when an MLD querier appears. And to instead wait at
    least for MLD reports to arrive before switching from flooding to group
    based, MLD snooped forwarding, to avoid temporary packet loss.

    However in a batman-adv mesh network it was noticed that after 248 days of
    uptime 32bit MIPS based devices would start to signal that they had
    stopped applying multicast snooping due to missing queriers - even though
    they were the elected querier and still sending MLD queries themselves.

    While time_is_before_jiffies() generally is safe against jiffies
    wrap-arounds, like the code comments in jiffies.h explain, it won't
    be able to track a difference larger than ULONG_MAX/2. With a 32bit
    large jiffies and one jiffies tick every 10ms (CONFIG_HZ=100) on these MIPS
    devices running OpenWrt this would result in a difference larger than
    ULONG_MAX/2 after 248 (= 2^32/100/60/60/24/2) days and
    time_is_before_jiffies() would then start to return false instead of
    true. Leading to multicast snooping not being applied to multicast
    packets anymore.

    Fix this issue by using a proper timer_list object which won't have this
    ULONG_MAX/2 difference limitation.

    Fixes: b00589af3b ("bridge: disable snooping if there is no querier")
    Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Link: https://lore.kernel.org/r/20240127175033.9640-1-linus.luessing@c0d3.blue
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-05-17 13:49:30 +02:00
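
The 248-day figure in the "fix disabled snooping after long uptime" entry above can be checked with a small Python model of a 32-bit tick counter. This is a sketch, not the kernel macros themselves: the helper names mirror the ones quoted in the commit message, the comparison follows the usual signed-difference idiom, and the explicit `jiffies` parameter is an assumption made so the example is self-contained.

```python
WORD_BITS = 32               # width of the tick counter on the 32-bit devices
MASK = (1 << WORD_BITS) - 1
HZ = 100                     # CONFIG_HZ=100, one tick every 10 ms


def to_signed(value):
    """Reinterpret a 32-bit unsigned value as signed, like a (long) cast."""
    value &= MASK
    return value - (1 << WORD_BITS) if value >> (WORD_BITS - 1) else value


def time_after(a, b):
    """True if a is after b, judged by the sign of the wrapped difference."""
    return to_signed(b - a) < 0


def time_is_before_jiffies(stamp, jiffies):
    """True if stamp lies in the past relative to the current tick count."""
    return time_after(jiffies, stamp)


def days_until_failure(hz=HZ):
    """Whole days until the difference exceeds half the counter range."""
    return (1 << (WORD_BITS - 1)) // hz // 86400
```

A stamp taken shortly before a counter wrap still compares correctly, but once more than 2^31 ticks separate the stamp from the current count, the check flips to false; at 100 ticks per second that takes `days_until_failure()` whole days, matching the 248 in the message.
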
Ivan Vecera 743df48756 bridge: mcast: Rename MDB entry get function
JIRA: https://issues.redhat.com/browse/RHEL-36219

commit 6d0259dd6c533e4ccc41b40075c1bdfd0f1efbd7
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Wed Oct 25 15:30:11 2023 +0300

    bridge: mcast: Rename MDB entry get function

    The current name is going to conflict with the upcoming net device
    operation for the MDB get operation.

    Rename the function to br_mdb_entry_skb_get(). No functional changes
    intended.

    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-05-17 13:48:13 +02:00
Jan Stancek 23fb4fe1a8 Merge: CNB: net: adopt u64_stats_t & remove obsolete u64_stats_fetch_*_irq() functions
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2471

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2193170
Tested: Only build

Commits:
```
09cca53c1656 ("vlan: adopt u64_stats_t")
5665f48ef309 ("ipvlan: adopt u64_stats_t")
3a960ca7f6e5 ("sit: use dev_sw_netstats_rx_add()")
afd2051b1840 ("ip6_tunnel: use dev_sw_netstats_rx_add()")
eeb15885ca30 ("wireguard: receive: use dev_sw_netstats_rx_add()")
9962acefbcb9 ("net: adopt u64_stats_t in struct pcpu_sw_netstats")
c6cce71e7468 ("drop_monitor: adopt u64_stats_t")
9ec321aba2ea ("team: adopt u64_stats_t")
8469b645c9a1 ("net: hns3: split function hns3_nic_get_stats64()")
97c4090badca ("bpf: Remove the obsolte u64_stats_fetch_*_irq() users.")
93cc2559d3fd ("spi: Remove the obsolte u64_stats_fetch_*_irq() users.")
a21ee5b2fcb8 ("net: ifb: support ethtools stats")
068c38ad88cc ("net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).")
d120d1a63b2c ("net: Remove the obsolte u64_stats_fetch_*_irq() users (net).")
dec5efcffad4 ("u64_stat: Remove the obsolete fetch_irq() variants.")
```

Signed-off-by: Ivan Vecera <ivecera@redhat.com>

Approved-by: José Ignacio Tornos Martínez <jtornosm@redhat.com>
Approved-by: Eric Chanudet <echanude@redhat.com>
Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Michal Schmidt <mschmidt@redhat.com>
Approved-by: Sabrina Dubroca <sdubroca@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-06-13 14:02:50 +02:00
Ivan Vecera 1cb324e3cc net: Remove the obsolte u64_stats_fetch_*_irq() users (net).
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2193170

Conflicts:
* net/netfilter/ipvs/ip_vs_ctl.c
  - the change was already applied by RHEL commit 914c1e31d9 ("ipvs:
    use u64_stats_t for the per-cpu counters")
* net/core/devlink.c
  - hunk was applied in different file (net/devlink/leftover.c)

commit d120d1a63b2c484d6175873d8ee736a633f74b70
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Oct 26 15:22:15 2022 +0200

    net: Remove the obsolte u64_stats_fetch_*_irq() users (net).

    Now that the 32bit UP oddity is gone and 32bit uses always a sequence
    count, there is no need for the fetch_irq() variants anymore.

    Convert to the regular interface.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-06-08 13:38:11 +02:00
Íñigo Huguet 67a7e6210b net: bridge: Add netlink knobs for number / maximum MDB entries
Bugzilla: https://bugzilla.redhat.com/2184372

commit a1aee20d5db29dc73331067b6a338eb650f0b5f1
Author: Petr Machata <petrm@nvidia.com>
Date:   Thu Feb 2 18:59:26 2023 +0100

    net: bridge: Add netlink knobs for number / maximum MDB entries
    
    The previous patch added accounting for number of MDB entries per port and
    per port-VLAN, and the logic to verify that these values stay within
    configured bounds. However it didn't provide means to actually configure
    those bounds or read the occupancy. This patch does that.
    
    Two new netlink attributes are added for the MDB occupancy:
    IFLA_BRPORT_MCAST_N_GROUPS for the per-port occupancy and
    BRIDGE_VLANDB_ENTRY_MCAST_N_GROUPS for the per-port-VLAN occupancy.
    And another two for the maximum number of MDB entries:
    IFLA_BRPORT_MCAST_MAX_GROUPS for the per-port maximum, and
    BRIDGE_VLANDB_ENTRY_MCAST_MAX_GROUPS for the per-port-VLAN one.
    
    Note that the two new IFLA_BRPORT_ attributes prompt bumping of
    RTNL_SLAVE_MAX_TYPE to size the slave attribute tables large enough.
    
    The new attributes are used like this:
    
     # ip link add name br up type bridge vlan_filtering 1 mcast_snooping 1 \
                                          mcast_vlan_snooping 1 mcast_querier 1
     # ip link set dev v1 master br
     # bridge vlan add dev v1 vid 2
    
     # bridge vlan set dev v1 vid 1 mcast_max_groups 1
     # bridge mdb add dev br port v1 grp 230.1.2.3 temp vid 1
     # bridge mdb add dev br port v1 grp 230.1.2.4 temp vid 1
     Error: bridge: Port-VLAN is already in 1 groups, and mcast_max_groups=1.
    
     # bridge link set dev v1 mcast_max_groups 1
     # bridge mdb add dev br port v1 grp 230.1.2.3 temp vid 2
     Error: bridge: Port is already in 1 groups, and mcast_max_groups=1.
    
     # bridge -d link show
     5: v1@v2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br [...]
         [...] mcast_n_groups 1 mcast_max_groups 1
    
     # bridge -d vlan show
     port              vlan-id
     br                1 PVID Egress Untagged
                         state forwarding mcast_router 1
     v1                1 PVID Egress Untagged
                         [...] mcast_n_groups 1 mcast_max_groups 1
                       2
                         [...] mcast_n_groups 0 mcast_max_groups 0
    
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-17 12:54:07 +02:00
Íñigo Huguet da251cef95 net: bridge: Maintain number of MDB entries in net_bridge_mcast_port
Bugzilla: https://bugzilla.redhat.com/2184372

commit b57e8d870d522d905720052e6fd9c3bc9bc5f6fb
Author: Petr Machata <petrm@nvidia.com>
Date:   Thu Feb 2 18:59:25 2023 +0100

    net: bridge: Maintain number of MDB entries in net_bridge_mcast_port
    
    The MDB maintained by the bridge is limited. When the bridge is configured
    for IGMP / MLD snooping, a buggy or malicious client can easily exhaust its
    capacity. In SW datapath, the capacity is configurable through the
    IFLA_BR_MCAST_HASH_MAX parameter, but ultimately is finite. Obviously a
    similar limit exists in the HW datapath for purposes of offloading.
    
    In order to prevent the issue of unilateral exhaustion of MDB resources,
    introduce two parameters in each of two contexts:
    
    - Per-port and per-port-VLAN number of MDB entries that the port
      is a member of.
    
    - Per-port and (when BROPT_MCAST_VLAN_SNOOPING_ENABLED is enabled)
      per-port-VLAN maximum permitted number of MDB entries, or 0 for
      no limit.
    
    The per-port multicast context is used for tracking of MDB entries for the
    port as a whole. This is available for all bridges.
    
    The per-port-VLAN multicast context is then only available on
    VLAN-filtering bridges on VLANs that have multicast snooping on.
    
    With these changes in place, it will be possible to configure MDB limit for
    bridge as a whole, or any one port as a whole, or any single port-VLAN.
    
    Note that unlike the global limit, exhaustion of the per-port and
    per-port-VLAN maximums does not cause disablement of multicast snooping.
    It is also permitted to configure the local limit larger than hash_max,
    even though that is not useful.
    
    In this patch, introduce only the accounting for number of entries, and the
    max field itself, but not the means to toggle the max. The next patch
    introduces the netlink APIs to toggle and read the values.
    
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-17 12:54:07 +02:00
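
The accounting described in the "Maintain number of MDB entries in net_bridge_mcast_port" entry above can be sketched as a pair of counters per context. The Python below is a simplified model, not the bridge implementation: `McastContext` and `add_entry` are hypothetical names, and the error text is modeled on the messages shown in the netlink knobs entry.

```python
class McastContext:
    """Hypothetical per-port or per-port-VLAN multicast context."""

    def __init__(self, name, max_groups=0):
        self.name = name
        self.n_groups = 0             # current number of MDB entries
        self.max_groups = max_groups  # 0 means "no limit"

    def can_add(self):
        return self.max_groups == 0 or self.n_groups < self.max_groups


def add_entry(contexts):
    """Account one new MDB entry in every given context, or in none.

    Returns None on success, or an error string naming the exhausted context.
    """
    for ctx in contexts:
        if not ctx.can_add():
            return (f"{ctx.name} is already in {ctx.n_groups} groups, "
                    f"and mcast_max_groups={ctx.max_groups}.")
    for ctx in contexts:
        ctx.n_groups += 1
    return None
```

Checking every context before incrementing any of them keeps the counters consistent when one limit is hit, and treating 0 as "no limit" follows the commit message.
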
Íñigo Huguet fa7bd1d6b9 net: bridge: Change a cleanup in br_multicast_new_port_group() to goto
Bugzilla: https://bugzilla.redhat.com/2184372

commit eceb30854f6b7d354ae52551b11aef2e2fa3e82e
Author: Petr Machata <petrm@nvidia.com>
Date:   Thu Feb 2 18:59:23 2023 +0100

    net: bridge: Change a cleanup in br_multicast_new_port_group() to goto
    
    This function is getting more to clean up in the following patches.
    Structuring the cleanups in one labeled block will allow reusing the same
    cleanup from several places.
    
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-17 12:54:07 +02:00
Íñigo Huguet e2c8560e46 net: bridge: Add br_multicast_del_port_group()
Bugzilla: https://bugzilla.redhat.com/2184372

commit 976b3858dd14914c5a9254535ad7440c99467944
Author: Petr Machata <petrm@nvidia.com>
Date:   Thu Feb 2 18:59:22 2023 +0100

    net: bridge: Add br_multicast_del_port_group()
    
    Since cleaning up the effects of br_multicast_new_port_group() just
    consists of delisting and freeing the memory, the function
    br_mdb_add_group_star_g() inlines the corresponding code. In the following
    patches, number of per-port and per-port-VLAN MDB entries is going to be
    maintained, and that counter will have to be updated. Because that logic
    is going to be hidden in the br_multicast module, introduce a new hook
    intended to again remove a newly-created group.
    
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-17 12:54:07 +02:00
Íñigo Huguet 70bdf0834a net: bridge: Move extack-setting to br_multicast_new_port_group()
Bugzilla: https://bugzilla.redhat.com/2184372

commit 1c85b80b20a13d07ec3a7d746ad52b7972c8c730
Author: Petr Machata <petrm@nvidia.com>
Date:   Thu Feb 2 18:59:21 2023 +0100

    net: bridge: Move extack-setting to br_multicast_new_port_group()
    
    Now that br_multicast_new_port_group() takes an extack argument, move
    setting the extack there. The downside is that the error messages end
    up being less specific (the function cannot distinguish between (S,G)
    and (*,G) groups). However, the alternative is to check in the caller
    whether the callee set the extack, and if it didn't, set it. But that
    is only done when the callee is not exactly known. (E.g. in case of a
    notifier invocation.)
    
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-17 12:54:06 +02:00
Íñigo Huguet 95b6cfb7d0 net: bridge: Add extack to br_multicast_new_port_group()
Bugzilla: https://bugzilla.redhat.com/2184372

commit 60977a0c63373bfc596b562b1e34e64ede6ef492
Author: Petr Machata <petrm@nvidia.com>
Date:   Thu Feb 2 18:59:20 2023 +0100

    net: bridge: Add extack to br_multicast_new_port_group()
    
    Make it possible to set an extack in br_multicast_new_port_group().
    Eventually, this function will check for per-port and per-port-vlan
    MDB maximums, and will use the extack to communicate the reason for
    the bounce.
    
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-17 12:54:06 +02:00
Íñigo Huguet c442caf155 treewide: Convert del_timer*() to timer_shutdown*()
Bugzilla: https://bugzilla.redhat.com/2184372

Conflicts: bridge parts only

commit 292a089d78d3e2f7944e60bb897c977785a321e3
Author: Steven Rostedt (Google) <rostedt@goodmis.org>
Date:   Tue Dec 20 13:45:19 2022 -0500

    treewide: Convert del_timer*() to timer_shutdown*()
    
    Due to several bugs caused by timers being re-armed after they are
    shut down and just before they are freed, a new state of timers was added
    called "shutdown".  After a timer is set to this state, then it can no
    longer be re-armed.
    
    The following script was run to find all the trivial locations where
    del_timer() or del_timer_sync() is called in the same function that the
    object holding the timer is freed.  It also ignores any locations where
    the timer->function is modified between the del_timer*() and the free(),
    as that is not considered a "trivial" case.
    
    This was created by using a coccinelle script and the following
    commands:
    
        $ cat timer.cocci
        @@
        expression ptr, slab;
        identifier timer, rfield;
        @@
        (
        -       del_timer(&ptr->timer);
        +       timer_shutdown(&ptr->timer);
        |
        -       del_timer_sync(&ptr->timer);
        +       timer_shutdown_sync(&ptr->timer);
        )
          ... when strict
              when != ptr->timer
        (
                kfree_rcu(ptr, rfield);
        |
                kmem_cache_free(slab, ptr);
        |
                kfree(ptr);
        )
    
        $ spatch timer.cocci . > /tmp/t.patch
        $ patch -p1 < /tmp/t.patch
    
    Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
    Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
    Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-11 09:39:19 +02:00
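
The difference between deleting a timer and shutting it down, as described in the "Convert del_timer*() to timer_shutdown*()" entry above, can be illustrated with a toy state machine. This Python class is only a model of the idea; its method names are hypothetical and it is not the kernel timer API.

```python
class Timer:
    """Toy timer with a "shutdown" state (a model, not the kernel API)."""

    def __init__(self):
        self.expires = None   # None means the timer is not armed
        self.shutdown = False

    def arm(self, expires):
        """Arm or re-arm the timer; ignored once the timer is shut down.

        Returns True if the timer was armed.
        """
        if self.shutdown:
            return False
        self.expires = expires
        return True

    def delete(self):
        """Plain delete: disarm, but allow re-arming later."""
        self.expires = None

    def shut_down(self):
        """Shutdown: disarm and refuse any further re-arming."""
        self.expires = None
        self.shutdown = True
```

After `delete()`, a late `arm()` call succeeds and leaves a live timer pointing at an object that may be about to be freed; after `shut_down()`, the same call is a no-op.
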
Íñigo Huguet 0a9858b76b bridge: mcast: Add a flag for user installed source entries
Bugzilla: https://bugzilla.redhat.com/2184372

commit a01ecb1712ddbcd41360ad0c554b460adbac0528
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Sat Dec 10 16:56:26 2022 +0200

    bridge: mcast: Add a flag for user installed source entries
    
    There are a few places where the bridge driver differentiates between
    (S, G) entries installed by the kernel (in response to Membership
    Reports) and those installed by user space. One of them is when deleting
    an (S, G) entry corresponding to a source entry that is being deleted.
    
    While user space cannot currently add a source entry to a (*, G), it can
    add an (S, G) entry that later corresponds to a source entry created by
    the reception of a Membership Report. If this source entry is later
    deleted because its source timer expired or because the (*, G) entry is
    being deleted, the bridge driver will not delete the corresponding (S,
    G) entry if it was added by user space as permanent.
    
    This is going to be a problem when the ability to install a (*, G) with
    a source list is exposed to user space. In this case, when user space
    installs the (*, G) as permanent, then all the (S, G) entries
    corresponding to its source list will also be installed as permanent.
    When user space deletes the (*, G), all the source entries will be
    deleted and the expectation is that the corresponding (S, G) entries
    will be deleted as well.
    
    Solve this by introducing a new source entry flag denoting that the
    entry was installed by user space. When the entry is deleted, delete the
    corresponding (S, G) entry even if it was installed by user space as
    permanent, as the flag tells us that it was installed in response to the
    source entry being created.
    
    The flag will be set in a subsequent patch where source entries are
    created in response to user requests.
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-11 09:39:18 +02:00
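
The deletion rule described in the "Add a flag for user installed source entries" entry above reduces to a small predicate. The sketch below is an assumption-laden paraphrase of the commit message in Python; the function and parameter names are hypothetical and do not correspond to kernel identifiers.

```python
def should_delete_sg_entry(installed_by_user, permanent, src_user_added):
    """Decide whether an (S, G) entry is removed along with its source entry.

    installed_by_user: the (S, G) entry was added from user space
    permanent:         the (S, G) entry is marked permanent
    src_user_added:    the source entry carries the new "user installed" flag
    """
    protected = installed_by_user and permanent
    # A protected entry survives unless the source entry itself was
    # installed by user space, in which case it is removed as well.
    return (not protected) or src_user_added
```

Before the flag existed, the third argument was effectively always false, so permanent user-installed (S, G) entries were never removed together with their source entry.
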
Íñigo Huguet 9d2959c354 bridge: mcast: Expose __br_multicast_del_group_src()
Bugzilla: https://bugzilla.redhat.com/2184372

commit 083e353482b4c9b727846643ad6ca7b784dd486b
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Sat Dec 10 16:56:25 2022 +0200

    bridge: mcast: Expose __br_multicast_del_group_src()
    
    Expose __br_multicast_del_group_src() which is symmetric to
    br_multicast_new_group_src() and does not remove the installed {S, G}
    forwarding entry, unlike br_multicast_del_group_src().
    
    The function will be used in the error path when user space was able to
    add a new source entry, but failed to install a corresponding forwarding
    entry.
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-11 09:39:18 +02:00
Íñigo Huguet f74c439eb1 bridge: mcast: Expose br_multicast_new_group_src()
Bugzilla: https://bugzilla.redhat.com/2184372

commit fd0c696164cf13ae0128f14209e2dbfcd86584b8
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Sat Dec 10 16:56:24 2022 +0200

    bridge: mcast: Expose br_multicast_new_group_src()
    
    Currently, new group source entries are only created in response to
    received Membership Reports. Subsequent patches are going to allow user
    space to install (*, G) entries with a source list.
    
    As a preparatory step, expose br_multicast_new_group_src() so that it
    could later be invoked from the MDB code (i.e., br_mdb.c) that handles
    RTM_NEWMDB messages.
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-11 09:39:18 +02:00
Íñigo Huguet ad2a0e6722 bridge: mcast: Constify 'group' argument in br_multicast_new_port_group()
Bugzilla: https://bugzilla.redhat.com/2184372

commit f86c3e2c1b5ea5c959ef176541c2f831231fa631
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Tue Dec 6 12:58:09 2022 +0200

    bridge: mcast: Constify 'group' argument in br_multicast_new_port_group()
    
    The 'group' argument is not modified, so mark it as 'const'. It will
    allow us to constify arguments of the callers of this function in future
    patches.
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-11 09:39:17 +02:00
Íñigo Huguet 624f850796 bridge: mcast: Use spin_lock() instead of spin_lock_bh()
Bugzilla: https://bugzilla.redhat.com/2184372

commit 262985fad1bd819d1323c6dbd72a8d9ed1c6090c
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Tue Oct 18 09:40:00 2022 +0300

    bridge: mcast: Use spin_lock() instead of spin_lock_bh()
    
    IGMPv3 / MLDv2 Membership Reports are only processed from the data path
    with softIRQ disabled, so there is no need to call spin_lock_bh(). Use
    spin_lock() instead.
    
    This is consistent with how other IGMP / MLD packets are processed.
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-05-11 09:20:11 +02:00
Desnes Nunes 30247fa579 treewide: Convert del_timer*() to timer_shutdown*()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2190250
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=292a089d78d3e2f7944e60bb897c977785a321e3
Conflicts:
* Avoiding commit <d87d44f7ab35> ("ARM: omap1: move CF chipselect setup to
  board file") and commit <df99e7bbbec3> ("ARM: omap1: use
  pci_remap_iospace() for omap_cf") with their ARM series. Also, this
  considers the fixes on i40e_main.c that have been partially applied
  through RHEL commit <3731942e6257>.

commit 292a089d78d3e2f7944e60bb897c977785a321e3
Author: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Date: Tue, 20 Dec 2022 13:45:19 -0500

  Due to several bugs caused by timers being re-armed after they are
  shut down and just before they are freed, a new state of timers was added
  called "shutdown".  After a timer is set to this state, then it can no
  longer be re-armed.

  The following script was run to find all the trivial locations where
  del_timer() or del_timer_sync() is called in the same function that the
  object holding the timer is freed.  It also ignores any locations where
  the timer->function is modified between the del_timer*() and the free(),
  as that is not considered a "trivial" case.

  This was created by using a coccinelle script and the following
  commands:

      $ cat timer.cocci
      @@
      expression ptr, slab;
      identifier timer, rfield;
      @@
      (
      -       del_timer(&ptr->timer);
      +       timer_shutdown(&ptr->timer);
      |
      -       del_timer_sync(&ptr->timer);
      +       timer_shutdown_sync(&ptr->timer);
      )
        ... when strict
            when != ptr->timer
      (
              kfree_rcu(ptr, rfield);
      |
              kmem_cache_free(slab, ptr);
      |
              kfree(ptr);
      )

      $ spatch timer.cocci . > /tmp/t.patch
      $ patch -p1 < /tmp/t.patch

  Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
  Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
  Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
  Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
  Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Desnes Nunes <desnesn@redhat.com>
2023-05-08 15:02:53 -03:00
Ivan Vecera 2e7fec660b net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2081601

commit c832962ac972082b3a1f89775c9d4274c8cb5670
Author: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Date:   Tue Feb 15 18:53:03 2022 +0200

    net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled

    Whenever the bridge driver hits the maximum capacity of MDBs, it disables
    MC processing (by setting the corresponding bridge option), but never
    notifies switchdev about the change (the notifiers are called only when
    this option is set explicitly through the registered netlink interface).

    This could lead to a situation where software MDB processing gets disabled,
    but the event never gets offloaded to the underlying hardware.

    Fix this by adding a notify message in such case.

    Fixes: 147c1e9b90 ("switchdev: bridge: Offload multicast disabled")
    Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
    Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Link: https://lore.kernel.org/r/20220215165303.31908-1-oleksandr.mazur@plvision.eu
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-05-04 09:55:35 +02:00
Ivan Vecera d2141be260 net: bridge: mcast: add and enforce startup query interval minimum
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit f83a112bd91a494cdee671aec74e777470fb4a07
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Dec 27 19:21:16 2021 +0200

    net: bridge: mcast: add and enforce startup query interval minimum

    As reported[1], if the startup query interval is set too low in combination
    with a large number of startup queries, and we have multiple bridges or even
    a single bridge with multiple querier vlans configured, we can crash the
    machine. Add a 1 second minimum, enforced by overwriting the value if it is
    set lower (i.e. without returning an error) to avoid breaking user-space.
    If that happens, a log message is emitted to let the admin know that the
    startup interval has been set to the minimum. It doesn't make sense to make
    the startup interval lower than the normal query interval, so use the same
    value of 1 second. The issue has been present since these intervals could
    be user-controlled.

    [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/

    Fixes: d902eee43f ("bridge: Add multicast count/interval sysfs entries")
    Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:55 +01:00
Ivan Vecera f4d0eb3c8d net: bridge: mcast: add and enforce query interval minimum
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 99b40610956a8a8755653a67392e2a8b772453be
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Dec 27 19:21:15 2021 +0200

    net: bridge: mcast: add and enforce query interval minimum

    As reported[1], if the query interval is set too low and we have multiple
    bridges or even a single bridge with multiple querier vlans configured,
    we can crash the machine. Add a 1 second minimum, enforced by overwriting
    the value if it is set lower (i.e. without returning an error) to avoid
    breaking user-space. If that happens, a log message is emitted to let
    the administrator know that the interval has been set to the minimum.
    The issue has been present since these intervals could be user-controlled.

    [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/

    Fixes: d902eee43f ("bridge: Add multicast count/interval sysfs entries")
    Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:55 +01:00
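
Both interval-minimum entries above describe the same policy: a value below 1 second is not rejected but silently raised to the minimum, with a log message. A minimal Python sketch of that policy follows, assuming millisecond units and hypothetical names (`MIN_INTERVAL_MS`, `set_query_interval`) that do not come from the kernel source.

```python
import logging

MIN_INTERVAL_MS = 1000  # the 1 second minimum named in the commit messages

log = logging.getLogger("bridge-sketch")


def set_query_interval(requested_ms, what="query interval"):
    """Return the interval to store, raising it to the minimum if needed.

    No error is returned for a too-small value, so existing user-space
    callers keep working; the adjustment is only logged.
    """
    if requested_ms < MIN_INTERVAL_MS:
        log.info("%s too low (%d ms), using minimum of %d ms",
                 what, requested_ms, MIN_INTERVAL_MS)
        return MIN_INTERVAL_MS
    return requested_ms
```

Clamping instead of failing is the design choice the messages emphasize: a configuration tool that writes a very small interval still succeeds, but can no longer drive the query rate high enough to crash the machine.
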
Ivan Vecera 3f80abc59a net: bridge: mcast: Associate the seqcount with its protecting lock.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit f936bb42aeb94a069bec7c9e04100d199c372956
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Sep 28 16:10:49 2021 +0200

    net: bridge: mcast: Associate the seqcount with its protecting lock.

    The sequence count bridge_mcast_querier::seq is protected by
    net_bridge::multicast_lock but seqcount_init() does not associate the
    seqcount with the lock. This leads to a warning on PREEMPT_RT because
    preemption is still enabled.

    Let seqcount_init() associate the seqcount with the lock that protects the
    write section. Remove lockdep_assert_held_once() because lockdep already checks
    whether the associated lock is held.

    Fixes: 67b746f94ff39 ("net: bridge: mcast: make sure querier port/address updates are consistent")
    Reported-by: Mike Galbraith <efault@gmx.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Tested-by: Mike Galbraith <efault@gmx.de>
    Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Link: https://lore.kernel.org/r/20210928141049.593833-1-bigeasy@linutronix.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:53 +01:00
Ivan Vecera c9c281210c net: bridge: mcast: fix vlan port router deadlock
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit ddd0d5293810c1882e2a96f8cce1678823b1dd38
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Sep 3 12:34:15 2021 +0300

    net: bridge: mcast: fix vlan port router deadlock

    Before vlan/port mcast router support was added
    br_multicast_set_port_router was used only with bh already disabled due
    to the bridge port lock, but that is no longer the case and when it is
    called to configure a vlan/port mcast router we can deadlock with the
    timer, so always disable bh to make sure it can be called from contexts
    with both enabled and disabled bh.

    Fixes: 2796d846d74a ("net: bridge: vlan: convert mcast router global option to per-vlan entry")
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:53 +01:00
Ivan Vecera b2bb0ecf79 net: bridge: use mld2r_ngrec instead of icmpv6_dataun
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 6baeb3951c271cff30828c4763fa1362da56454a
Author: MichelleJin <shjy180909@gmail.com>
Date:   Sun Aug 29 04:32:29 2021 +0000

    net: bridge: use mld2r_ngrec instead of icmpv6_dataun

    br_ip6_multicast_mld2_report function uses icmp6h
    to parse mld2_report packet.

    mld2r_ngrec defines mld2r_hdr.icmp6_dataun.un_data16[1]
    in include/net/mld.h.

    So, it is more compact to use mld2r rather than icmp6h.

    By doing printk test, it is confirmed that
    icmp6h->icmp6_dataun.un_data16[1] and mld2r->mld2r_ngrec are
    indeed equivalent.

    Also, sizeof(*mld2r) and sizeof(*icmp6h) are equivalent, too.

    Signed-off-by: MichelleJin <shjy180909@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:51 +01:00
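The equivalence claimed in the commit above (b2bb0ecf79) can be checked outside the kernel with a small sketch. It assumes the usual 8-byte layout (type, code, checksum, then a 4-byte data area read as two 16-bit words); the header values are invented for illustration.

```python
import struct

# An 8-byte MLDv2 report header with illustrative values:
# type 143 (MLDv2 report), code 0, checksum 0, reserved 0, 2 records.
header = struct.pack("!BBHHH", 143, 0, 0, 0, 2)

# View 1: generic ICMPv6 header. The 4-byte data area is read as two
# 16-bit words; index 1 is the second word.
icmp_type, code, cksum = struct.unpack_from("!BBH", header, 0)
un_data16 = struct.unpack_from("!2H", header, 4)
ngrec_via_icmp6 = un_data16[1]

# View 2: MLDv2 report header, which names that same word directly.
_, _, _, _reserved, ngrec_via_mld2r = struct.unpack("!BBHHH", header)

# Both views cover the same 8 bytes.
size_icmp6_view = struct.calcsize("!BBH") + struct.calcsize("!2H")
size_mld2r_view = struct.calcsize("!BBHHH")
```

Both views read the record count from bytes 6 and 7 of the header, which is why switching from one accessor to the other does not change behaviour.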
Ivan Vecera f5b0bea501 net: bridge: vlan: convert mcast router global option to per-vlan entry
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 2796d846d74a18cc6563e96eff8bf28c5e06f912
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Aug 20 15:42:55 2021 +0300

    net: bridge: vlan: convert mcast router global option to per-vlan entry

    The per-vlan router option controls the port/vlan and host vlan entries'
    mcast router config. The global option controlled only the host vlan
    config, but that is unnecessary and inconsistent as it's not really a
    global vlan option, but rather a bridge option to control host router
    config, so convert BRIDGE_VLANDB_GOPTS_MCAST_ROUTER to
    BRIDGE_VLANDB_ENTRY_MCAST_ROUTER which can be used to control both host
    vlan and port vlan mcast router config.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:51 +01:00
Ivan Vecera 770192b6df net: bridge: mcast: br_multicast_set_port_router takes multicast context as argument
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit a53581d5559eaacaac1b4aed8e2f22c40efa5acc
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Aug 20 15:42:54 2021 +0300

    net: bridge: mcast: br_multicast_set_port_router takes multicast context as argument

    Change br_multicast_set_port_router to take port multicast context as
    its first argument so we can later use it to control port/vlan mcast
    router option.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:51 +01:00
Ivan Vecera 5147757439 net: bridge: mcast: toggle also host vlan state in br_multicast_toggle_vlan
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit affce9a774ca2514aaa5638fde92c57a476dfd79
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Aug 16 17:57:07 2021 +0300

    net: bridge: mcast: toggle also host vlan state in br_multicast_toggle_vlan

    When changing vlan mcast state by br_multicast_toggle_vlan it iterates
    over all ports and enables/disables the port mcast ctx based on the new
    state, but I forgot to update the host vlan (bridge master vlan entry)
    with the new state so it will be left out. Also that function is not
    used outside of br_multicast.c, so make it static.

    Fixes: f4b7002a7076 ("net: bridge: add vlan mcast snooping knob")
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:50 +01:00
Ivan Vecera f7440be964 net: bridge: mcast: use the correct vlan group helper
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 3f0d14efe2fa8656a1c46f1d13d42bb5bd88f32f
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Aug 16 17:57:06 2021 +0300

    net: bridge: mcast: use the correct vlan group helper

    When dereferencing the port vlan group we should use the rcu helper
    instead of the one relying on rtnl. In br_multicast_pg_to_port_ctx the
    entry cannot disappear as we hold the multicast lock and rcu as explained
    in the comment above it.
    For the same reason we're ok in br_multicast_start_querier.

     =============================
     WARNING: suspicious RCU usage
     5.14.0-rc5+ #429 Tainted: G        W
     -----------------------------
     net/bridge/br_private.h:1478 suspicious rcu_dereference_protected() usage!

     other info that might help us debug this:

     rcu_scheduler_active = 2, debug_locks = 1
     3 locks held by swapper/2/0:
      #0: ffff88822be85eb0 ((&p->timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x2da
      #1: ffff88810b32f260 (&br->multicast_lock){+.-.}-{3:3}, at: br_multicast_port_group_expired+0x28/0x13d [bridge]
      #2: ffffffff824f6c80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire.constprop.0+0x0/0x22 [bridge]

     stack backtrace:
     CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Tainted: G        W         5.14.0-rc5+ #429
     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014
     Call Trace:
      <IRQ>
      dump_stack_lvl+0x45/0x59
      nbp_vlan_group+0x3e/0x44 [bridge]
      br_multicast_pg_to_port_ctx+0xd6/0x10d [bridge]
      br_multicast_star_g_handle_mode+0xa1/0x2ce [bridge]
      ? netlink_broadcast+0xf/0x11
      ? nlmsg_notify+0x56/0x99
      ? br_mdb_notify+0x224/0x2e9 [bridge]
      ? br_multicast_del_pg+0x1dc/0x26d [bridge]
      br_multicast_del_pg+0x1dc/0x26d [bridge]
      br_multicast_port_group_expired+0xaa/0x13d [bridge]
      ? __grp_src_delete_marked.isra.0+0x35/0x35 [bridge]
      ? __grp_src_delete_marked.isra.0+0x35/0x35 [bridge]
      call_timer_fn+0x134/0x2da
      __run_timers+0x169/0x193
      run_timer_softirq+0x19/0x2d
      __do_softirq+0x1bc/0x42a
      __irq_exit_rcu+0x5c/0xb3
      irq_exit_rcu+0xa/0x12
      sysvec_apic_timer_interrupt+0x5e/0x75
      </IRQ>
      asm_sysvec_apic_timer_interrupt+0x12/0x20
     RIP: 0010:default_idle+0xc/0xd
     Code: e8 14 40 71 ff e8 10 b3 ff ff 4c 89 e2 48 89 ef 31 f6 5d 41 5c e9 a9 e8 c2 ff cc cc cc cc 0f 1f 44 00 00 e8 7f 55 65 ff fb f4 <c3> 0f 1f 44 00 00 55 65 48 8b 2c 25 40 6f 01 00 53 f0 80 4d 02 20
     RSP: 0018:ffff88810033bf00 EFLAGS: 00000206
     RAX: ffffffff819cf828 RBX: ffff888100328000 RCX: 0000000000000001
     RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff819cfa2d
     RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
     R10: ffff8881008302c0 R11: 00000000000006db R12: 0000000000000000
     R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
      ? __sched_text_end+0x4/0x4
      ? default_idle_call+0x15/0x7b
      default_idle_call+0x4d/0x7b
      do_idle+0x124/0x2a2
      cpu_startup_entry+0x1d/0x1f
      secondary_startup_64_no_verify+0xb0/0xbb

    Fixes: 74edfd483de8 ("net: bridge: multicast: add helper to get port mcast context from port group")
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:50 +01:00
Ivan Vecera 975f9c4890 net: bridge: mcast: account for ipv6 size when dumping querier state
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 175e66924719090f3f43884a419e7c32dabb800f
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Aug 16 13:11:34 2021 +0300

    net: bridge: mcast: account for ipv6 size when dumping querier state

    We need to account for the IPv6 attributes when dumping querier state.

    Fixes: 5e924fe6ccfd ("net: bridge: mcast: dump ipv6 querier state")
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:49 +01:00
Ivan Vecera b7f5b733bb net: bridge: mcast: drop sizeof for nest attribute's zero size
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit cdda378bd8d9076319e5713595b4944b32d95a40
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Aug 16 13:11:33 2021 +0300

    net: bridge: mcast: drop sizeof for nest attribute's zero size

    This was a dumb error I made: instead of writing nla_total_size(0)
    for a nest attribute, I wrote nla_total_size(sizeof(0)).

    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Fixes: 606433fe3e11 ("net: bridge: mcast: dump ipv4 querier state")
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:49 +01:00
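The size difference behind the commit above (b7f5b733bb) comes down to simple arithmetic. The sketch below re-implements the netlink attribute size calculation in Python, assuming a 4-byte attribute header, 4-byte alignment, and sizeof(int) == 4 for the literal 0; the Python helper names mirror the kernel ones only for readability.

```python
NLA_ALIGNTO = 4
NLA_HDRLEN = 4   # aligned size of the netlink attribute header
SIZEOF_INT = 4   # assumption: sizeof(0) == sizeof(int) == 4


def nla_align(length: int) -> int:
    """Round length up to the next multiple of NLA_ALIGNTO."""
    return (length + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def nla_total_size(payload: int) -> int:
    """Header plus payload, padded to the attribute alignment."""
    return nla_align(NLA_HDRLEN + payload)


intended = nla_total_size(0)           # nest attribute: header only
mistaken = nla_total_size(SIZEOF_INT)  # what sizeof(0) asked for
```

Under these assumptions the mistaken form reserves 8 bytes where 4 are needed, so the estimate was too large rather than too small.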
Ivan Vecera 78d3ffb48b net: bridge: mcast: don't dump querier state if snooping is disabled
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit f137b7d4ecf8fca0891f435a198b3c8beec8a9d2
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Aug 16 13:11:32 2021 +0300

    net: bridge: mcast: don't dump querier state if snooping is disabled

    A minor improvement to avoid dumping mcast ctx querier state if snooping
    is disabled for that context (either bridge or vlan).

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:47 +01:00
Ivan Vecera 68b76c2561 net: bridge: mcast: dump ipv6 querier state
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 85b4108211742c5dd4f9f56c1d0704b4e0d4c98e
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Aug 13 18:00:01 2021 +0300

    net: bridge: mcast: dump ipv6 querier state

    Add support for dumping global IPv6 querier state. We dump the state
    only if our own querier is enabled or there has been another external
    querier which has won the election. For the bridge global state we use
    a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside.
    The structure is:
      [IFLA_BR_MCAST_QUERIER_STATE]
       `[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier
       `[BRIDGE_QUERIER_IPV6_PORT]    - bridge port ifindex where the querier
                                        was seen (set only if external querier)
       `[BRIDGE_QUERIER_IPV6_OTHER_TIMER]   -  other querier timeout

    IPv4 and IPv6 attributes are embedded at the same level of
    IFLA_BR_MCAST_QUERIER_STATE. If we didn't dump anything we cancel the nest
    and return.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:47 +01:00
Ivan Vecera bd619e2028 net: bridge: mcast: dump ipv4 querier state
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit c7fa1d9b1fb179375e889ff076a1566ecc997bfc
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Aug 13 18:00:00 2021 +0300

    net: bridge: mcast: dump ipv4 querier state

    Add support for dumping global IPv4 querier state. We dump the state
    only if our own querier is enabled or there has been another external
    querier which has won the election. For the bridge global state we use
    a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside.
    The structure is:
     [IFLA_BR_MCAST_QUERIER_STATE]
      `[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier
      `[BRIDGE_QUERIER_IP_PORT]    - bridge port ifindex where the querier was
                                     seen (set only if external querier)
      `[BRIDGE_QUERIER_IP_OTHER_TIMER]   -  other querier timeout

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:47 +01:00
Ivan Vecera db0705df45 net: bridge: mcast: consolidate querier selection for ipv4 and ipv6
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit c3fb3698f935381161101d2479d66dd48c106183
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Aug 13 17:59:59 2021 +0300

    net: bridge: mcast: consolidate querier selection for ipv4 and ipv6

    We can consolidate both functions as they share almost the same logic.
    This is easier to maintain and we have a single querier update function.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:46 +01:00
Ivan Vecera d2269dee97 net: bridge: mcast: make sure querier port/address updates are consistent
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 67b746f94ff39d8b998c4ea9493c6ab2d6c225d4
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Aug 13 17:59:58 2021 +0300

    net: bridge: mcast: make sure querier port/address updates are consistent

    Use a sequence counter to make sure port/address updates can be read
    consistently without requiring the bridge multicast_lock. We need to
    zero out the port and address when the other querier has expired and
    we're about to select ourselves as querier. br_multicast_read_querier
    will be used later when dumping querier state. Updates are done only
    with the multicast spinlock and softirqs disabled, while reads are done
    from process context and from softirqs (due to notifications).

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:46 +01:00
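The sequence counter pattern used by the commit above (d2269dee97) can be illustrated with a single-threaded Python sketch. It is a model of the idea only: the class and method names are hypothetical, and real kernel seqcounts also rely on memory barriers and on the writer holding its lock, which are not modelled here.

```python
class SeqCount:
    """Counter that is odd while a write is in progress."""

    def __init__(self) -> None:
        self.sequence = 0

    def write_begin(self) -> None:
        self.sequence += 1  # becomes odd: update in progress

    def write_end(self) -> None:
        self.sequence += 1  # becomes even: update complete

    def read_begin(self) -> int:
        return self.sequence

    def read_retry(self, start: int) -> bool:
        # Retry if a write was in progress or completed since read_begin.
        return bool(start & 1) or self.sequence != start


class QuerierState:
    """Port and address that must always be read as a matching pair."""

    def __init__(self) -> None:
        self.seq = SeqCount()
        self.port_ifindex = 0
        self.addr = "0.0.0.0"

    def update(self, port_ifindex: int, addr: str) -> None:
        self.seq.write_begin()
        self.port_ifindex = port_ifindex
        self.addr = addr
        self.seq.write_end()

    def read(self):
        while True:
            start = self.seq.read_begin()
            snapshot = (self.port_ifindex, self.addr)
            if not self.seq.read_retry(start):
                return snapshot
```

A reader never takes the writer's lock; it only repeats the copy when the counter shows that an update overlapped with it.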
Ivan Vecera 6132f4586a net: bridge: mcast: record querier port device ifindex instead of pointer
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit bb18ef8e7e180d8590df2808ec4014af114756cb
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Fri Aug 13 17:59:57 2021 +0300

    net: bridge: mcast: record querier port device ifindex instead of pointer

    Currently when a querier port is detected its net_bridge_port pointer is
    recorded, but it's used only for comparisons, so it's fine to have a stale
    pointer. In order to dereference and use the port pointer, proper
    accounting of its usage must be implemented, adding unnecessary
    complexity. To solve the problem we can just store the netdevice ifindex
    instead of the port pointer and retrieve the bridge port. It is a best
    effort and the device needs to be validated that it is still part of that
    bridge before use, but that is a small price to pay for avoiding querier
    reference counting for each port/vlan.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:46 +01:00
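The "store an index, validate on use" approach from the commit above (6132f4586a) can be sketched as below. The `Device` type, the dictionary standing in for the device table, and the `querier_port` function are hypothetical names for this illustration; they are not kernel interfaces.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Device:
    ifindex: int
    name: str
    master: Optional[str]  # name of the bridge this device is a port of


def querier_port(devices: Dict[int, Device], bridge: str,
                 stored_ifindex: int) -> Optional[Device]:
    """Best-effort lookup of the recorded querier port.

    Returns None when the device no longer exists or is no longer a
    port of the given bridge, so a stale index is never followed.
    """
    dev = devices.get(stored_ifindex)
    if dev is None or dev.master != bridge:
        return None
    return dev
```

Because only an integer is kept, nothing has to hold a reference on the port; the cost is the extra membership check at lookup time.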
Ivan Vecera e184e38c82 net: bridge: vlan: add support for mcast router global option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit a97df080b6a86c105f98052ca3a9d66149b461b3
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Tue Aug 10 18:29:31 2021 +0300

    net: bridge: vlan: add support for mcast router global option

    Add support to change and retrieve global vlan multicast router state
    which is used for the bridge itself. We just need to pass multicast context
    to br_multicast_set_router instead of bridge device and the rest of the
    logic remains the same.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:45 +01:00
Ivan Vecera b5c3650521 net: bridge: vlan: add support for mcast querier global option
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 62938182c35906c0ed4beb7845b93b8ffb937597
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Tue Aug 10 18:29:30 2021 +0300

    net: bridge: vlan: add support for mcast querier global option

    Add support to change and retrieve global vlan multicast querier state.
    We just need to pass multicast context to br_multicast_set_querier
    instead of bridge device and the rest of the logic remains the same.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:43 +01:00
Ivan Vecera baf21ad05d net: bridge: mcast: querier and query state affect only current context type
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit cb486ce99576741a84c75623daeffb2f7758cbf9
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Tue Aug 10 18:29:29 2021 +0300

    net: bridge: mcast: querier and query state affect only current context type

    It is a minor optimization and better behaviour to make sure querier and
    query sending routines affect only the matching multicast context
    depending if vlan snooping is enabled (vlan ctx vs bridge ctx).
    It also avoids sending unnecessary extra query packets.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:42 +01:00
Ivan Vecera 762ea286a6 net: bridge: mcast: move querier state to the multicast context
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 4d5b4e84c72451face4d7817697684196cbee50d
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Tue Aug 10 18:29:28 2021 +0300

    net: bridge: mcast: move querier state to the multicast context

    We need to have the querier state per multicast context in order to have
    per-vlan control, so remove the internal option bit and move it to the
    multicast context. Also annotate the lockless reads of the new variable.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:42 +01:00
Ivan Vecera 74ecf85fd8 net: bridge: vlan: add support for mcast igmp/mld version global options
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit df271cd641f101decaa4f7c1dd5c62939900bd4c
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Tue Aug 10 18:29:19 2021 +0300

    net: bridge: vlan: add support for mcast igmp/mld version global options

    Add support to change and retrieve global vlan IGMP/MLD versions.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:38 +01:00
Ivan Vecera c93c243ddd net: bridge: multicast: add context support for host-joined groups
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 58d913a32664fae5ac2ccd9a9c23b8e7037df740
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Wed Jul 21 17:01:27 2021 +0300

    net: bridge: multicast: add context support for host-joined groups

    Adding bridge multicast context support for host-joined groups is easy
    because we only need the proper timer value. We pass the already chosen
    context and use its timer value.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:30 +01:00
Ivan Vecera a80bbb8b4e net: bridge: multicast: fix igmp/mld port context null pointer dereferences
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 54cb43199e14c1181ddcd4a3782f1f7eb56bdab8
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Wed Jul 21 13:06:24 2021 +0300

    net: bridge: multicast: fix igmp/mld port context null pointer dereferences

    With the recent change to use bridge/port multicast context pointers
    instead of bridge/port I missed converting two locations which pass the
    port pointer as-is, but with the new model we need to verify the port
    context is non-NULL first and retrieve the port from it. The first
    location is when doing querier selection when a query is received, the
    second location is when leaving a group. The port context will be null
    if the packets originated from the bridge device (i.e. from the host).
    The fix is simple: just check if the port context exists and retrieve
    the port pointer from it.

    Fixes: adc47037a7d5 ("net: bridge: multicast: use multicast contexts instead of bridge or port")
    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:29 +01:00
Ivan Vecera 83c1bf2586 net: bridge: vlan: add mcast snooping control
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 9dee572c384846f4ece029ab5688faed0682e48a
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Jul 19 20:06:37 2021 +0300

    net: bridge: vlan: add mcast snooping control

    Add a new global vlan option which controls whether multicast snooping
    is enabled or disabled for a single vlan. It controls the vlan private
    flag: BR_VLFLAG_GLOBAL_MCAST_ENABLED.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:26 +01:00
Ivan Vecera 047b3d9e2c net: bridge: multicast: include router port vlan id in notifications
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 1e9ca45662d6bb65fb60d3fbb7737b081d9cffc9
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Jul 19 20:06:33 2021 +0300

    net: bridge: multicast: include router port vlan id in notifications

    Use the port multicast context to check if the router port is a vlan and
    in case it is include its vlan id in the notification.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:25 +01:00
Ivan Vecera da39f622f1 net: bridge: multicast: add vlan querier and query support
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 615cc23e6283e143933ecf2bf3645fe0cea5ae6a
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Jul 19 20:06:32 2021 +0300

    net: bridge: multicast: add vlan querier and query support

    Add basic vlan context querier support, if the contexts passed to
    multicast_alloc_query are vlan then the query will be tagged. Also
    handle querier start/stop of vlan contexts.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:25 +01:00
Ivan Vecera 9d1d89d3c9 net: bridge: multicast: check if should use vlan mcast ctx
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 4cdd0d10f31da9fab65eb6411441ffb71a653997
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Jul 19 20:06:31 2021 +0300

    net: bridge: multicast: check if should use vlan mcast ctx

    Add helpers which check if the current bridge/port multicast context
    should be used (i.e. they're not disabled) and use them for Rx IGMP/MLD
    processing, timers and new group addition. It is important for vlans to
    disable processing of timer/packet after the multicast_lock is obtained
    if the vlan context doesn't have BR_VLFLAG_MCAST_ENABLED. There are two
    cases when that flag is missing:
     - if the vlan is getting destroyed it will be removed and timers will
       be stopped
     - if the vlan mcast snooping is being disabled

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:25 +01:00
Ivan Vecera cbf25f0134 net: bridge: multicast: use the port group to port context helper
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit eb1593a0b4c49443acbe2ebaa7a9947fa5471c01
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Jul 19 20:06:30 2021 +0300

    net: bridge: multicast: use the port group to port context helper

    We need to use the new port group to port context helper in places where
    we cannot pass down the proper context (i.e. functions that can be
    called by timers or outside the packet snooping paths).

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:24 +01:00
Ivan Vecera e88e92c72a net: bridge: multicast: add helper to get port mcast context from port group
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit 74edfd483de8010596d556a2339f9fb8a4ab6688
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Jul 19 20:06:29 2021 +0300

    net: bridge: multicast: add helper to get port mcast context from port group

    Add br_multicast_pg_to_port_ctx() which returns the proper port multicast
    context from either port or vlan based on bridge option and vlan flags.
    As the comment inside explains the locking is a bit tricky, we rely on
    the fact that BR_VLFLAG_MCAST_ENABLED requires multicast_lock to change
    and we also require it to be held to call that helper. If we find the
    vlan under rcu and it still has the flag then we can be sure it will be
    alive until we unlock multicast_lock which should be enough.
    Note that the context might change from vlan to bridge between different
    calls to this helper as the mcast vlan knob requires only rtnl so it should
    be used carefully and for read-only/check purposes.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:22 +01:00
Ivan Vecera 9d7e9a87d1 net: bridge: add vlan mcast snooping knob
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037335

commit f4b7002a7076f025dce59647a77c8251175d2b34
Author: Nikolay Aleksandrov <nikolay@nvidia.com>
Date:   Mon Jul 19 20:06:28 2021 +0300

    net: bridge: add vlan mcast snooping knob

    Add a global knob that controls if vlan multicast snooping is enabled.
    The proper contexts (vlan or bridge-wide) will be chosen based on the knob
    when processing packets and changing bridge device state. Note that
    vlans have their individual mcast snooping enabled by default, but this
    knob is needed to turn on bridge vlan snooping. It is disabled by
    default. To enable the knob vlan filtering must also be enabled; it
    doesn't make sense to have vlan mcast snooping without vlan filtering
    since that would lead to inconsistencies. Disabling vlan filtering will
    also automatically disable vlan mcast snooping.

    Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-01-05 16:14:22 +01:00
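The dependency between the two options described in the commit above (9d7e9a87d1) can be modelled with a short sketch. The class, its attribute names and the use of `ValueError` for the rejected case are assumptions for illustration; the commit message does not say how the kernel reports the refusal.

```python
class BridgeOptions:
    """Toy model of the two interdependent bridge options."""

    def __init__(self) -> None:
        self.vlan_filtering = False
        self.mcast_vlan_snooping = False  # disabled by default

    def set_mcast_vlan_snooping(self, on: bool) -> None:
        # The knob can only be turned on while vlan filtering is on.
        if on and not self.vlan_filtering:
            raise ValueError("vlan filtering must be enabled first")
        self.mcast_vlan_snooping = on

    def set_vlan_filtering(self, on: bool) -> None:
        self.vlan_filtering = on
        if not on:
            # Turning filtering off also turns vlan snooping off.
            self.mcast_vlan_snooping = False
```

Keeping the check in both setters means the pair can never end up with snooping on and filtering off, whichever option is changed last.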