linux-kernelorg-stable/include/net
Vladimir Oltean 8ca07176ab net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
Currently DSA has an issue with FDB entries pointing towards the bridge
in the presence of br_fdb_replay() being called at port join and leave
time.

In particular, each bridge port will ask for a replay for the FDB
entries pointing towards the bridge when it joins, and for another
replay when it leaves.

This means that for example, a bridge with 4 switch ports will notify
DSA 4 times of the bridge MAC address.

But if the MAC address of the bridge changes during the normal runtime
of the system, the bridge notifies switchdev [ once ] of the deletion of
the old MAC address as a local FDB towards the bridge, and of the
insertion [ again once ] of the new MAC address as a local FDB.

This is a problem, because DSA keeps the old MAC address as a host FDB
entry with refcount 4 (4 ports asked for it using br_fdb_replay). So the
old MAC address will not be deleted. Additionally, the new MAC address
will only be installed with refcount 1, and when the first switch port
leaves the bridge (leaving 3 others as still members), it will delete
with it the new MAC address of the bridge from the local FDB entries
kept by DSA (because the br_fdb_replay call on deletion will bring the
entry's refcount from 1 to 0).

So the problem, really, is that the number of br_fdb_replay() calls is
not matched with the refcount that a host FDB is offloaded to DSA during
normal runtime.

An elegant way to solve the problem would be to make the switchdev
notification emitted by br_fdb_change_mac_address() result in a host FDB
kept by DSA which has a refcount exactly equal to the number of ports
under that bridge. Then, no matter how many DSA ports join or leave that
bridge, the host FDB entry will always be deleted when there are exactly
zero remaining DSA switch ports members of the bridge.

To implement the proposed solution, we remember that the switchdev
objects and port attributes have some helpers provided by switchdev,
which can be optionally called by drivers:
switchdev_handle_port_obj_{add,del} and switchdev_handle_port_attr_set.
These helpers:
- fan out a switchdev object/attribute emitted for the bridge towards
  all the lower interfaces that pass the check_cb().
- fan out a switchdev object/attribute emitted for a bridge port that is
  a LAG towards all the lower interfaces that pass the check_cb().

In other words, this is the model we need for the FDB events too:
something that will keep an FDB entry emitted towards a physical port as
it is, but translate an FDB entry emitted towards the bridge into N FDB
entries, one per physical port.

Of course, there are many differences between fanning out a switchdev
object (VLAN) on 3 lower interfaces of a LAG and fanning out an FDB
entry on 3 lower interfaces of a LAG. Intuitively, an FDB entry towards
a LAG should be treated specially, because FDB entries are unicast, we
can't just install the same address towards 3 destinations. It is
imaginable that drivers might want to treat this case specifically, so
create some methods for this case and do not recurse into the LAG lower
ports, just the bridge ports.

DSA also listens for FDB entries on "foreign" interfaces, aka interfaces
bridged with us which are not part of our hardware domain: think an
Ethernet switch bridged with a Wi-Fi AP. For those addresses, DSA
installs host FDB entries. However, there we have the same problem
(those host FDB entries are installed with a refcount of only 1) and an
even bigger one which we did not have with FDB entries towards the
bridge:

br_fdb_replay() is currently not called for FDB entries on foreign
interfaces, just for the physical port and for the bridge itself.

So when DSA sniffs an address learned by the software bridge towards a
foreign interface like an e1000 port, and then that e1000 leaves the
bridge, DSA remains with the dangling host FDB address. That will be
fixed separately by replaying all FDB entries and not just the ones
towards the port and the bridge.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 07:04:27 -07:00
..
9p
bluetooth Bluetooth: Fix Set Extended (Scan Response) Data 2021-06-26 07:12:44 +02:00
caif net: remove the caif_hsi driver 2021-07-01 13:19:48 -07:00
iucv net/af_iucv: don't track individual TX skbs for TRANS_HIPER sockets 2021-01-28 20:36:21 -08:00
netfilter netfilter: conntrack: nf_ct_gre_keymap_flush() removal 2021-07-02 02:07:01 +02:00
netns net/tcp_fastopen: remove tcp_fastopen_ctx_lock 2021-07-20 12:07:07 +02:00
nfc NFC: nci: fix memory leak in nci_allocate_device 2021-05-17 13:56:29 -07:00
phonet
sctp sctp: move 198 addresses from unusable to private scope 2021-07-01 11:47:13 -07:00
tc_act net/sched: act_vlan: Fix modify to allow 0 2021-06-01 16:54:42 -07:00
6lowpan.h
Space.h
act_api.h net: sched: fix err handler in tcf_action_init() 2021-04-08 13:47:33 -07:00
addrconf.h net: bridge: mcast: fix broken length + header check for MRDv6 Adv. 2021-04-27 14:02:06 -07:00
af_ieee802154.h
af_rxrpc.h afs: Don't truncate iter during data fetch 2021-04-23 10:17:26 +01:00
af_unix.h af_unix: Implement unix_dgram_bpf_recvmsg() 2021-07-15 18:17:50 -07:00
af_vsock.h af_vsock: rest of SEQPACKET support 2021-06-11 13:32:46 -07:00
ah.h
arp.h
atmclip.h
ax25.h
ax88796.h
bareudp.h
bond_3ad.h
bond_alb.h
bond_options.h
bonding.h bonding: Add struct bond_ipesc to manage SA 2021-07-06 10:36:59 -07:00
bpf_sk_storage.h bpf: struct sock is declared twice in bpf_sk_storage header 2021-03-26 17:43:55 +01:00
busy_poll.h net: annotate data race around sk_ll_usec 2021-07-01 11:23:50 -07:00
calipso.h
cfg80211-wext.h
cfg80211.h Lots of changes: 2021-06-28 13:06:12 -07:00
cfg802154.h
checksum.h csum_and_copy_to_iter(): massage into form closer to csum_and_copy_from_iter() 2021-06-10 11:45:14 -04:00
cipso_ipv4.h
cls_cgroup.h
codel.h
codel_impl.h
codel_qdisc.h
compat.h
datalink.h
dcbevent.h
dcbnl.h
devlink.h net: core: devlink: add dropped stats traps field 2021-06-14 13:04:25 -07:00
dn.h
dn_dev.h
dn_fib.h
dn_neigh.h
dn_nsp.h
dn_route.h
dsa.h net: dsa: make tag_8021q operations part of the core 2021-07-20 06:36:42 -07:00
dsfield.h
dst.h net: Consolidate common blackhole dst ops 2021-03-10 12:24:18 -08:00
dst_cache.h
dst_metadata.h net: validate lwtstate->data before returning from skb_tunnel_info() 2021-07-09 13:55:53 -07:00
dst_ops.h
erspan.h
esp.h
espintcp.h
ethoc.h
failover.h
fib_notifier.h
fib_rules.h
firewire.h
flow.h flow: remove spi key from flowi struct 2021-04-19 12:25:11 +02:00
flow_dissector.h flow_dissector: constify raw input data argument 2021-03-14 14:46:32 -07:00
flow_offload.h flow_offload: action should not be NULL when it is referenced 2021-06-28 14:24:06 -07:00
fou.h
fq.h
fq_impl.h
garp.h
gen_stats.h
genetlink.h mptcp: avoid lock_fast usage in accept path 2021-02-12 16:31:46 -08:00
geneve.h
gre.h ip_gre: add csum offload support for gre header 2021-01-29 20:39:14 -08:00
gro.h gro: add combined call_gro_receive() + INDIRECT_CALL_INET() helper 2021-03-18 19:51:12 -07:00
gro_cells.h
gtp.h
gue.h
hwbm.h
icmp.h ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages 2021-06-28 14:29:45 -07:00
ieee80211_radiotap.h
ieee802154_netdev.h
if_inet6.h mld: add mc_lock for protecting per-interface mld data 2021-03-26 15:14:56 -07:00
ife.h
ila.h
inet6_connection_sock.h
inet6_hashtables.h
inet_common.h
inet_connection_sock.h tcp: change ICSK_CA_PRIV_SIZE definition 2021-06-29 11:54:36 -07:00
inet_ecn.h
inet_frag.h
inet_hashtables.h
inet_sock.h
inet_timewait_sock.h
inetpeer.h
ip.h net: lwtunnel: handle MTU calculation in forwading 2021-06-28 12:42:14 -07:00
ip6_checksum.h
ip6_fib.h IPv6: Add "offload failed" indication to routes 2021-02-08 16:47:03 -08:00
ip6_route.h net: ipv6: fix return value of ip6_skb_dst_mtu 2021-07-02 11:57:01 -07:00
ip6_tunnel.h
ip_fib.h ipv4: Add a sysctl to control multipath hash fields 2021-05-18 13:27:32 -07:00
ip_tunnels.h
ip_vs.h netfilter: move handlers to net/ip_vs.h 2021-02-04 18:37:57 -08:00
ipcomp.h
ipconfig.h
ipv6.h ipv6: Add a sysctl to control multipath hash fields 2021-05-18 13:27:32 -07:00
ipv6_frag.h
ipv6_stubs.h ipv6: add ipv6_dev_find to stubs 2021-03-30 13:29:39 -07:00
ipx.h
iw_handler.h
kcm.h
l3mdev.h
lag.h
lapb.h net: lapb: Make "lapb_t1timer_running" able to detect an already running timer 2021-03-23 14:14:50 -07:00
lib80211.h
llc.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
lwtunnel.h
mac80211.h mac80211: Switch to a virtual time-based airtime scheduler 2021-06-23 18:12:00 +02:00
mac802154.h
macsec.h net: macsec: fix the length used to copy the key for offloading 2021-06-24 12:41:12 -07:00
mip6.h
mld.h mld: add new workqueues for process mld events 2021-03-26 15:14:56 -07:00
mpls.h
mpls_iptunnel.h
mptcp.h mptcp: avoid processing packet if a subflow reset 2021-07-09 18:38:53 -07:00
mrp.h
ncsi.h
ndisc.h
neighbour.h
net_failover.h
net_namespace.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-06-18 19:47:02 -07:00
net_ratelimit.h
netevent.h
netlabel.h
netlink.h
netprio_cgroup.h
netrom.h
nexthop.h nexthop: Rename artifacts related to legacy multipath nexthop groups 2021-03-28 17:53:39 -07:00
nl802154.h
nsh.h
p8022.h
page_pool.h page_pool: Allow drivers to hint on SKB recycling 2021-06-07 14:11:47 -07:00
pie.h
ping.h
pkt_cls.h net: zero-initialize tc skb extension on allocation 2021-05-25 15:36:42 -07:00
pkt_sched.h net: sched: fix tx action rescheduling issue during deactivation 2021-05-14 15:05:46 -07:00
pptp.h
protocol.h net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
psample.h psample: Add additional metadata attributes 2021-03-14 15:00:43 -07:00
psnap.h
raw.h
rawv6.h
red.h sch_red: fix off-by-one checks in red_check_params() 2021-03-25 17:40:43 -07:00
regulatory.h
request_sock.h
rose.h
route.h
rpl.h
rsi_91x.h
rtnetlink.h rtnetlink: add alloc() method to rtnl_link_ops 2021-06-12 13:16:45 -07:00
rtnh.h
sch_generic.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-06-29 15:45:27 -07:00
scm.h
secure_seq.h
seg6.h
seg6_hmac.h
seg6_local.h
selftests.h net: selftest: fix build issue if INET is disabled 2021-04-28 14:06:45 -07:00
slhc_vj.h
smc.h
snmp.h
sock.h net: sock: extend SO_TIMESTAMPING for PHC binding 2021-07-01 13:08:18 -07:00
sock_reuseport.h tcp: Add reuseport_migrate_sock() to select a new listener. 2021-06-15 18:01:05 +02:00
stp.h
strparser.h
switchdev.h net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE 2021-07-20 07:04:27 -07:00
tcp.h tcp: consistently disable header prediction for mptcp 2021-07-01 13:22:40 -07:00
tcp_states.h
timewait_sock.h
tipc.h
tls.h net: sock: introduce sk_error_report 2021-06-29 11:28:21 -07:00
tls_toe.h
transp_v6.h
tso.h
tun_proto.h
udp.h skmsg: Pass psock pointer to ->psock_update_sk_prot() 2021-04-12 17:34:27 +02:00
udp_tunnel.h udp: call udp_encap_enable for v6 sockets when enabling encap 2021-02-04 18:37:14 -08:00
udplite.h
vsock_addr.h
vxlan.h
wext.h
x25.h
x25device.h
xdp.h bpf: Add function for XDP meta data length check 2021-07-07 19:51:12 -07:00
xdp_priv.h
xdp_sock.h xdp: Add proper __rcu annotations to redirect map entries 2021-06-24 19:41:15 +02:00
xdp_sock_drv.h
xfrm.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-06-29 15:45:27 -07:00
xsk_buff_pool.h xsk: Fix missing validation for skb and unaligned mode 2021-06-18 16:57:19 +02:00