JIRA: https://issues.redhat.com/browse/RHEL-57756
Upstream commit(s):
commit cd7209628cdb2a7edd7656c126d2455e7102e949
Author: Jakub Kicinski <kuba@kernel.org>
Date: Fri Mar 29 10:57:10 2024 -0700
genetlink: remove linux/genetlink.h
genetlink.h is a shell of what used to be a combined uAPI
and kernel header over a decade ago. It has fewer than
10 lines of code. Merge it into net/genetlink.h.
In some ways it'd be better to keep the combined header
under linux/ but it would make looking through git history
harder.
Acked-by: Sven Eckelmann <sven@narfation.org>
Link: https://lore.kernel.org/r/20240329175710.291749-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Petr Oros <poros@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-57756
Upstream commit(s):
commit f97c9b533a1dc60a77ff329e0117acc5ae17def5
Author: Jakub Kicinski <kuba@kernel.org>
Date: Fri Mar 29 10:57:09 2024 -0700
net: openvswitch: remove unnecessary linux/genetlink.h include
The only legit reason I could think of for net/genetlink.h
and linux/genetlink.h to be separate would be if one was
included by other headers and we wanted to keep it lightweight.
That is not the case, net/openvswitch/meter.h includes
linux/genetlink.h but for no apparent reason (for struct genl_family
perhaps? it's not necessary, types of externs do not need
to be known).
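The remark that the types of externs do not need to be known can be illustrated with a small standalone C sketch. The names `struct family` and `meter_family` are hypothetical stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <string.h>

/* Forward declaration only: the struct layout is not visible here. */
struct family;

/* Declaring an extern object of an incomplete type is valid C, so a header
 * that only declares this symbol does not need the full definition. */
extern struct family meter_family;

/* Taking the address works without knowing the type's size or members. */
static const struct family *meter_family_ptr(void)
{
	return &meter_family;
}

/* The full definition is only needed where the object is defined or its
 * members are accessed (hypothetical layout for illustration). */
struct family {
	const char *name;
	unsigned int version;
};

struct family meter_family = { .name = "meter", .version = 1 };

static int meter_family_is_named(const char *name)
{
	assert(name != NULL);
	return strcmp(meter_family_ptr()->name, name) == 0;
}
```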
Link: https://lore.kernel.org/r/20240329175710.291749-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Petr Oros <poros@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-57768
commit 938863727076f684abb39d1d0f9dce1924e9028e
Author: Boris Sukholitko <boris.sukholitko@broadcom.com>
Date: Thu Aug 22 13:35:08 2024 +0300
tc: adjust network header after 2nd vlan push
<tldr>
The skb network header of a single-tagged vlan packet continues to point to
the vlan payload (e.g. IP) after a second vlan tag is pushed by tc act_vlan.
This causes a problem in the dissector, which expects the network header of
a double-tagged packet to point to the inner vlan.
The fix is to adjust network header in tcf_act_vlan.c but requires
refactoring of skb_vlan_push function.
</tldr>
Consider the following shell script snippet configuring TC rules on the
veth interface:
ip link add veth0 type veth peer veth1
ip link set veth0 up
ip link set veth1 up
tc qdisc add dev veth0 clsact
tc filter add dev veth0 ingress pref 10 chain 0 flower \
num_of_vlans 2 cvlan_ethtype 0x800 action goto chain 5
tc filter add dev veth0 ingress pref 20 chain 0 flower \
num_of_vlans 1 action vlan push id 100 \
protocol 0x8100 action goto chain 5
tc filter add dev veth0 ingress pref 30 chain 5 flower \
num_of_vlans 2 cvlan_ethtype 0x800 action simple sdata "success"
Sending double-tagged vlan packet with the IP payload inside:
cat <<ENDS | text2pcap - - | tcpreplay -i veth1 -
0000 00 00 00 00 00 11 00 00 00 00 00 22 81 00 00 64 ..........."...d
0010 81 00 00 14 08 00 45 04 00 26 04 d2 00 00 7f 11 ......E..&......
0020 18 ef 0a 00 00 01 14 00 00 02 00 00 00 00 00 12 ................
0030 e1 c7 00 00 00 00 00 00 00 00 00 00 ............
ENDS
will match rule 10, goto rule 30 in chain 5 and correctly emit "success" to
the dmesg.
OTOH, sending single-tagged vlan packet:
cat <<ENDS | text2pcap - - | tcpreplay -i veth1 -
0000 00 00 00 00 00 11 00 00 00 00 00 22 81 00 00 14 ..........."....
0010 08 00 45 04 00 2a 04 d2 00 00 7f 11 18 eb 0a 00 ..E..*..........
0020 00 01 14 00 00 02 00 00 00 00 00 16 e1 bf 00 00 ................
0030 00 00 00 00 00 00 00 00 00 00 00 00 ............
ENDS
will match rule 20, will push the second vlan tag but will *not* match
rule 30. IOW, the match at rule 30 fails if the second vlan was freshly
pushed by the kernel.
Let's look at __skb_flow_dissect working on the double-tagged vlan packet.
Here is the relevant code from around net/core/flow_dissector.c:1277
copy-pasted here for convenience:
if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX &&
skb && skb_vlan_tag_present(skb)) {
proto = skb->protocol;
} else {
vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
data, hlen, &_vlan);
if (!vlan) {
fdret = FLOW_DISSECT_RET_OUT_BAD;
break;
}
proto = vlan->h_vlan_encapsulated_proto;
nhoff += sizeof(*vlan);
}
The "else" clause above gets the protocol of the encapsulated packet from
the skb data at the network header location. printk debugging has shown
that in the good double-tagged packet case proto is htons(ETH_P_IP), i.e.
htons(0x800), as expected. However, in the single-tagged packet case proto
is garbage, leading to the failure to match tc filter 30.
proto is being set from the skb header pointed by nhoff parameter which is
defined at the beginning of __skb_flow_dissect
(net/core/flow_dissector.c:1055 in the current version):
nhoff = skb_network_offset(skb);
Therefore the culprit seems to be that the skb network offset is different
between double-tagged packet received from the interface and single-tagged
packet having its vlan tag pushed by TC.
Let's look at the interesting points of the lifetime of the single/double
tagged packets as they traverse our packet flow.
Both of them will start at __netif_receive_skb_core where the first vlan
tag will be stripped:
if (eth_type_vlan(skb->protocol)) {
skb = skb_vlan_untag(skb);
if (unlikely(!skb))
goto out;
}
At this stage, in the double-tagged case skb->data points to the second vlan
tag, while in the single-tagged case skb->data points to the network (e.g.
IP) header.
Looking at TC vlan push action (net/sched/act_vlan.c) we have the following
code at tcf_vlan_act (interesting points are in square brackets):
if (skb_at_tc_ingress(skb))
[1] skb_push_rcsum(skb, skb->mac_len);
....
case TCA_VLAN_ACT_PUSH:
err = skb_vlan_push(skb, p->tcfv_push_proto, p->tcfv_push_vid |
(p->tcfv_push_prio << VLAN_PRIO_SHIFT),
0);
if (err)
goto drop;
break;
....
out:
if (skb_at_tc_ingress(skb))
[3] skb_pull_rcsum(skb, skb->mac_len);
And skb_vlan_push (net/core/skbuff.c:6204) function does:
err = __vlan_insert_tag(skb, skb->vlan_proto,
skb_vlan_tag_get(skb));
if (err)
return err;
skb->protocol = skb->vlan_proto;
[2] skb->mac_len += VLAN_HLEN;
in the case of pushing the second tag. Let's look at what happens with
skb->data of the single-tagged packet at each of the above points:
1. As a result of the skb_push_rcsum, skb->data is moved back to the start
of the packet.
2. First VLAN tag is moved from the skb into packet buffer, skb->mac_len is
incremented, skb->data still points to the start of the packet.
3. As a result of the skb_pull_rcsum, skb->data is moved forward by the
modified skb->mac_len, thus pointing to the network header again.
Then __skb_flow_dissect will get confused by having a double-tagged vlan
packet with skb->data at the network header.
The solution for the bug is to preserve "skb->data at second vlan header"
semantics in the skb_vlan_push function. We do this by manipulating
skb->network_header rather than skb->mac_len. skb_vlan_push callers are
updated to do skb_reset_mac_len.
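The offset bookkeeping described above can be followed in a userspace sketch. It is only a toy model under stated assumptions: `struct toy_skb` and all `toy_`/`push_` helpers are hypothetical, only the offsets discussed here are modelled, and the frame bytes are the start of the single-tagged packet after the first tag has been re-inserted into the buffer:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define TOY_ETH_HLEN  14
#define TOY_VLAN_HLEN 4

/* Hypothetical, heavily simplified stand-in for struct sk_buff: only the
 * offsets discussed above are modelled, all relative to the frame start. */
struct toy_skb {
	const uint8_t *head;   /* start of the frame */
	size_t len;            /* frame length */
	size_t data;           /* offset of skb->data */
	size_t network_header; /* offset of the network header */
	size_t mac_len;        /* skb->mac_len */
};

/* MACs, 0x8100, TCI 0x0014, 0x0800, then the first IP header bytes. */
static const uint8_t frame[] = {
	0, 0, 0, 0, 0, 0x11, 0, 0, 0, 0, 0, 0x22,
	0x81, 0x00, 0x00, 0x14, 0x08, 0x00,
	0x45, 0x04, 0x00, 0x2a,
};

/* State right after the tag insertion: data at the frame start, network
 * header still at the IP header, mac_len not yet adjusted. */
static struct toy_skb after_insert(void)
{
	struct toy_skb skb = {
		.head = frame, .len = sizeof(frame), .data = 0,
		.network_header = TOY_ETH_HLEN + TOY_VLAN_HLEN,
		.mac_len = TOY_ETH_HLEN,
	};
	return skb;
}

/* Old behaviour: grow mac_len, then pull by the grown mac_len. */
static struct toy_skb push_old(void)
{
	struct toy_skb skb = after_insert();

	skb.mac_len += TOY_VLAN_HLEN;
	skb.data += skb.mac_len;	/* models the pull at point [3] */
	return skb;
}

/* Fixed behaviour: move network_header back instead, pull by the original
 * mac_len, then recompute mac_len (mac header is at offset 0 here). */
static struct toy_skb push_fixed(void)
{
	struct toy_skb skb = after_insert();

	skb.network_header -= TOY_VLAN_HLEN;
	skb.data += skb.mac_len;
	skb.mac_len = skb.network_header;
	return skb;
}

/* Read the encapsulated protocol the way the dissector's "else" branch
 * does: a vlan header (TCI + ethertype) is expected at the network offset. */
static uint16_t dissect_encap_proto(const struct toy_skb *skb)
{
	size_t nhoff = skb->network_header;

	assert(nhoff + TOY_VLAN_HLEN <= skb->len);
	return (uint16_t)(skb->head[nhoff + 2] << 8 | skb->head[nhoff + 3]);
}
```

In this model the old variant leaves the network header at the IP header, so the "vlan header" read there yields bytes of the IP header rather than 0x0800, while the fixed variant lands on the inner vlan header.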
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-68063
Upstream Status: linux.git
commit 57fb67783c4011581882f32e656d738da1f82042
Author: Menglong Dong <menglong8.dong@gmail.com>
Date: Wed Aug 21 20:32:52 2024 +0800
net: ovs: fix ovs_drop_reasons error
There is something wrong with ovs_drop_reasons. ovs_drop_reasons[0] is
"OVS_DROP_LAST_ACTION", but OVS_DROP_LAST_ACTION == __OVS_DROP_REASON + 1,
which means that ovs_drop_reasons[1] should be "OVS_DROP_LAST_ACTION".
And as Adrian tested, without the patch, adding a flow to drop packets
results in:
drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:19:17 2024 859853461 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_ACTION_ERROR
With the patch, the same results in:
drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:16:13 2024 475856608 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_LAST_ACTION
Fix this by initializing ovs_drop_reasons with index.
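The off-by-one can be reproduced outside the kernel with a small sketch. Everything prefixed `TOY_`/`toy_` is hypothetical, including the base value; only the relationship "first real reason == base + 1" is taken from the description above:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical base value; the table is indexed by (reason - base). */
#define TOY_DROP_REASON_BASE 0x30000

enum toy_drop_reason {
	__TOY_DROP_REASON = TOY_DROP_REASON_BASE,
	TOY_DROP_LAST_ACTION,	/* == __TOY_DROP_REASON + 1 */
	TOY_DROP_ACTION_ERROR,
	TOY_DROP_MAX,
};

#define REASON_IDX(r) ((r) - __TOY_DROP_REASON)
#define NUM_REASONS   REASON_IDX(TOY_DROP_MAX)

/* Positional initialisation: the first string lands in slot 0, one slot
 * before the index that TOY_DROP_LAST_ACTION actually maps to. */
static const char *const reasons_positional[NUM_REASONS] = {
	"TOY_DROP_LAST_ACTION",
	"TOY_DROP_ACTION_ERROR",
};

/* Designated initialisers tie every string to its own index. */
static const char *const reasons_indexed[NUM_REASONS] = {
	[REASON_IDX(TOY_DROP_LAST_ACTION)]  = "TOY_DROP_LAST_ACTION",
	[REASON_IDX(TOY_DROP_ACTION_ERROR)] = "TOY_DROP_ACTION_ERROR",
};

static const char *lookup(const char *const *table, enum toy_drop_reason r)
{
	assert(REASON_IDX(r) < NUM_REASONS);
	return table[REASON_IDX(r)];
}
```

With the positional table, looking up the "last action" reason returns the "action error" string, which matches the mislabelled output quoted above.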
Fixes: 9d802da40b7c ("net: openvswitch: add last-action drop reason")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240821123252.186305-1-dongml2@chinatelecom.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-59091
commit 05c1280a2bcfca187fe7fa90bb240602cf54af0a
Author: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Thu Aug 29 14:33:38 2024 +0200
netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local
"Interface can't change network namespaces" is rather an attribute,
not a feature, and it can't be changed via Ethtool.
Make it a "cold" private flag instead of a netdev_feature and free
one more bit.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Conflicts:
drivers/net/amt.c
drivers/net/ethernet/adi/adin1110.c
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-59091
commit 00d066a4d4edbe559ba6c35153da71d4b2b8a383
Author: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Thu Aug 29 14:33:37 2024 +0200
netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature,
rather an attribute, very similar to IFF_NO_QUEUE (and hot).
Free one netdev_features_t bit and make it a "hot" private flag.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Conflicts:
drivers/net/macsec.c
drivers/net/veth.c
net/ipv6/ip6_tunnel.c
- Context.
drivers/net/amt.c
drivers/net/netkit.c
- Non-existent in RHEL 9.
drivers/net/ethernet/chelsio/cxgb/cxgb2.c
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
- Drivers disabled in RHEL 9. Skipped.
net/dsa/user.c
- This is slave.c in RHEL 9, but CONFIG_NET_DSA is disabled,
so skipped the hunk.
net/core/net-sysfs.c
- Code not present because of missing commit 74293ea1c4db
("net: sysfs: Do not create sysfs for non BQL device")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-44213
CVE: CVE-2024-38558
commit 7c988176b6c16c516474f6fceebe0f055af5eb56
Author: Ilya Maximets <i.maximets@ovn.org>
Date: Thu May 9 11:38:05 2024 +0200
net: openvswitch: fix overwriting ct original tuple for ICMPv6
OVS_PACKET_CMD_EXECUTE has 3 main attributes:
- OVS_PACKET_ATTR_KEY - Packet metadata in a netlink format.
- OVS_PACKET_ATTR_PACKET - Binary packet content.
- OVS_PACKET_ATTR_ACTIONS - Actions to execute on the packet.
OVS_PACKET_ATTR_KEY is parsed first to populate sw_flow_key structure
with the metadata like conntrack state, input port, recirculation id,
etc. Then the packet itself gets parsed to populate the rest of the
keys from the packet headers.
Whenever the packet parsing code starts parsing the ICMPv6 header, it
first zeroes out fields in the key corresponding to Neighbor Discovery
information even if it is not an ND packet.
It is an 'ipv6.nd' field. However, the 'ipv6' is a union that shares
the space between 'nd' and 'ct_orig' that holds the original tuple
conntrack metadata parsed from the OVS_PACKET_ATTR_KEY.
ND packets should not normally have conntrack state, so it's fine to
share the space, but normal ICMPv6 Echo packets or maybe other types of
ICMPv6 can have the state attached and it should not be overwritten.
The issue results in all but the last 4 bytes of the destination
address being wiped from the original conntrack tuple leading to
incorrect packet matching and potentially executing wrong actions
in case this packet recirculates within the datapath or goes back
to userspace.
ND fields should not be accessed in non-ND packets, so not clearing
them should be fine. Execute memset() only for actual ND packets to
avoid the issue.
Initializing the whole thing before parsing is needed because ND packet
may not contain all the options.
The issue only affects the OVS_PACKET_CMD_EXECUTE path and doesn't
affect packets entering OVS datapath from network interfaces, because
in this case CT metadata is populated from skb after the packet is
already parsed.
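A simplified model of the shared storage shows why exactly the last 4 bytes of the destination address survive. The struct below is a hypothetical reduction, not the kernel's sw_flow_key; it assumes an ND part of 16 + 6 + 6 bytes overlaying a 32-byte original tuple:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified model of the 'ipv6' part of the flow key: 'nd' and 'ct_orig'
 * share storage. Sizes are assumptions made for this sketch. */
struct toy_ipv6_key {
	union {
		struct {
			uint8_t src[16];
			uint8_t dst[16];
		} ct_orig;
		struct {
			uint8_t target[16];
			uint8_t sll[6];
			uint8_t tll[6];
		} nd;
	};
};

/* Fill the original-direction tuple as if parsed from the netlink key. */
static void toy_set_ct_orig(struct toy_ipv6_key *key)
{
	memset(key->ct_orig.src, 0xaa, sizeof(key->ct_orig.src));
	memset(key->ct_orig.dst, 0xbb, sizeof(key->ct_orig.dst));
}

/* Parse an ICMPv6 header. With 'fixed' unset the ND fields are cleared for
 * every ICMPv6 packet; with it set, only for actual ND packets. */
static void toy_parse_icmp6(struct toy_ipv6_key *key, int is_nd, int fixed)
{
	assert(key != NULL);
	if (is_nd || !fixed)
		memset(&key->nd, 0, sizeof(key->nd));
}

/* Count how many trailing bytes of ct_orig.dst are still intact. */
static int toy_dst_bytes_intact(const struct toy_ipv6_key *key)
{
	int n = 0;

	for (int i = 15; i >= 0 && key->ct_orig.dst[i] == 0xbb; i--)
		n++;
	return n;
}
```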
Fixes: 9dd7f8907c ("openvswitch: Add original direction conntrack tuple to sw_flow_key.")
Reported-by: Antonin Bas <antonin.bas@broadcom.com>
Closes: https://github.com/openvswitch/ovs-issues/issues/327
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20240509094228.1035477-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-37650
Upstream Status: commit 30a92c9e3d6b0
commit 30a92c9e3d6b073932762bef2ac66f4ee784c657
Author: Aaron Conole <aconole@redhat.com>
Date: Thu May 16 16:09:41 2024 -0400
openvswitch: Set the skbuff pkt_type for proper pmtud support.
Open vSwitch was originally intended to switch at layer 2, only dealing with
Ethernet frames. With the introduction of L3 tunnel support, it crossed
into the realm of needing to care a bit about some routing details when
making forwarding decisions. If an oversized packet would need to be
fragmented during this forwarding decision, there is a chance for pmtu
to get involved and generate a routing exception. This is gated by the
skbuff->pkt_type field.
When a flow is already loaded into the openvswitch module this field is
set up and transitioned properly as a packet moves from one port to
another. In the case that a packet execute is invoked after a flow is
newly installed this field is not properly initialized. This causes the
pmtud mechanism to omit sending the required exception messages across
the tunnel boundary and a second attempt needs to be made to make sure
that the routing exception is properly set up. To fix this, we set the
outgoing packet's pkt_type to PACKET_OUTGOING, since it can only get
to the openvswitch module via a port device or packet command.
Even for bridge ports as users, the pkt_type needs to be reset when
doing the transmit as the packet is truly outgoing and routing needs
to get involved post packet transformations, in the case of
VXLAN/GENEVE/udp-tunnel packets. In general, the pkt_type on output
gets ignored, since we go straight to the driver, but in the case of
tunnel ports they go through IP routing layer.
This issue is periodically encountered in complex setups, such as large
OpenShift deployments, where multiple sets of tunnel traversals occur.
A way to recreate this is with the ovn-heater project, which can set up
a networking environment that mimics such large deployments. We need
larger environments for this because we need to ensure that flow
misses occur. In these environments, without this patch, we can see:
./ovn_cluster.sh start
podman exec ovn-chassis-1 ip r a 170.168.0.5/32 dev eth1 mtu 1200
podman exec ovn-chassis-1 ip netns exec sw01p1 ip r flush cache
podman exec ovn-chassis-1 ip netns exec sw01p1 \
ping 21.0.0.3 -M do -s 1300 -c2
PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
From 21.0.0.3 icmp_seq=2 Frag needed and DF set (mtu = 1142)
--- 21.0.0.3 ping statistics ---
...
Using tcpdump, we can also see the expected ICMP FRAG_NEEDED message is not
sent into the server.
With this patch, setting the pkt_type, we see the following:
podman exec ovn-chassis-1 ip netns exec sw01p1 \
ping 21.0.0.3 -M do -s 1300 -c2
PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
From 21.0.0.3 icmp_seq=1 Frag needed and DF set (mtu = 1222)
ping: local error: message too long, mtu=1222
--- 21.0.0.3 ping statistics ---
...
In this case, the first ping request receives the FRAG_NEEDED message and
a local routing exception is created.
Tested-by: Jaime Caamano <jcaamano@redhat.com>
Reported-at: https://issues.redhat.com/browse/FDP-164
Fixes: 58264848a5 ("openvswitch: Add vxlan tunneling support.")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20240516200941.16152-1-aconole@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4578
JIRA: https://issues.redhat.com/browse/RHEL-36364
CVE: CVE-2024-27395
```
net: openvswitch: Fix Use-After-Free in ovs_ct_exit
Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
of ovs_ct_limit_exit, is not part of the RCU read critical section, it
is possible that the RCU grace period will pass during the traversal and
the key will be freed.
To prevent this, it should be changed to hlist_for_each_entry_safe.
Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 5ea7b72d4fac2fdbc0425cd8f2ea33abe95235b2)
```
Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Merged-by: Lucas Zampieri <lzampier@redhat.com>
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4658
JIRA: https://issues.redhat.com/browse/RHEL-31876
Upstream-Status: net-next.git
Tested: manual testing + OVS testsuite including psample-specific tests
from [1] + upstream kernel selftests tests including psample-specific
tests.
OpenvSwitch currently supports a feature called "per-flow sampling" by
which a controller such as OVN can configure certain flows that make the
matched packet get "sampled". The sample is sent via IPFIX alongside
OVN-generated metadata. This is very useful to enhance visibility on the
datapath. E.g., it can be used to know what NetworkPolicy impacted a certain
packet (and the packet header contents).
However, a big limitation makes this solution non-production ready:
samples have to go through ovs-vswitchd via upcall (userspace action) sharing
both netlink socket buffer and ovs-vswitchd thread time with actual packet
processing.
This series adds support for a new action called "psample" that, when used by
OVS, allows samples to go directly to some external observer through the
psample netlink multicast group fixing the current limitation and enabling
observability solutions to be built on top of OVS/OVN.
[1]
https://patchwork.ozlabs.org/project/openvswitch/cover/20240707200905.2719071-1-amorenoz@redhat.com/
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Merged-by: Lucas Zampieri <lzampier@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-31876
Upstream-Status: net-next.git
commit 71763d8a8203c28178d7be7f18af73d4dddb36ba
Author: Adrian Moreno <amorenoz@redhat.com>
Date: Thu Jul 4 10:56:57 2024 +0200
net: openvswitch: store sampling probability in cb.
When a packet sample is observed, the sampling rate that was used is
important to estimate the real frequency of such event.
Store the probability of the parent sample action in the skb's cb area
and use it in psample action to pass it down to psample module.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-7-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-31876
Upstream-status: net-next.git
commit aae0b82b46cb5004bdf82a000c004d69a0885c33
Author: Adrian Moreno <amorenoz@redhat.com>
Date: Thu Jul 4 10:56:56 2024 +0200
net: openvswitch: add psample action
Add support for a new action: psample.
This action accepts a u32 group id and a variable-length cookie and uses
the psample multicast group to make the packet available for
observability.
The maximum length of the user-defined cookie is set to 16, same as
tc_cookie, to discourage using cookies that will not be offloadable.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-6-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-36364
CVE: CVE-2024-27395
commit 5ea7b72d4fac2fdbc0425cd8f2ea33abe95235b2
Author: Hyunwoo Kim <v4bel@theori.io>
Date: Mon Apr 22 05:37:17 2024 -0400
net: openvswitch: Fix Use-After-Free in ovs_ct_exit
Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
of ovs_ct_limit_exit, is not part of the RCU read critical section, it
is possible that the RCU grace period will pass during the traversal and
the key will be freed.
To prevent this, it should be changed to hlist_for_each_entry_safe.
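The idea behind a "safe" traversal can be sketched in userspace. This is a toy singly linked list, not the kernel's hlist API; `free()` stands in for the deferred kfree_rcu(), and all `toy_` names are hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked node standing in for an hlist entry. */
struct toy_node {
	struct toy_node *next;
	int zone;
};

static struct toy_node *toy_push(struct toy_node *head, int zone)
{
	struct toy_node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->zone = zone;
	n->next = head;
	return n;
}

/* "Safe" traversal: the next pointer is saved before the current node is
 * released, so the loop never reads from memory that may already be freed.
 * Returns the number of nodes released. */
static int toy_free_all_safe(struct toy_node **head)
{
	struct toy_node *pos, *next;
	int freed = 0;

	for (pos = *head; pos != NULL; pos = next) {
		next = pos->next;	/* read before the node goes away */
		free(pos);
		freed++;
	}
	*head = NULL;
	return freed;
}
```

A loop that advanced with `pos = pos->next` after releasing `pos` would dereference freed memory, which is the pattern the fix avoids.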
Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-44560
Tested: compile only
commit a23ac973f67f37e77b3c634e8b1ad5b0164fcc1f
Author: Xin Long <lucien.xin@gmail.com>
Date: Wed Jun 19 18:08:56 2024 -0400
openvswitch: get related ct labels from its master if it is not confirmed
Ilya found a failure in running check-kernel tests with at_groups=144
(144: conntrack - FTP SNAT orig tuple) in OVS repo. After his further
investigation, the root cause is that the labels sent to userspace
for related ct are incorrect.
The labels for unconfirmed related ct should use its master's labels.
However, the changes made in commit 8c8b73320805 ("openvswitch: set
IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
led to getting labels from this related ct.
So fix it in ovs_ct_get_labels() by changing to copy labels from its
master ct if it is an unconfirmed related ct. Note that there is no
fix needed for ct->mark, as it was already copied from its master
ct for related ct in init_conntrack().
Fixes: 8c8b73320805 ("openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Tested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Xin Long <lxin@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-40130
Conflicts:
- hunk for non-existing net/ipv4/fou_bpf.c skipped
- conflict in ip_gre.c resolved in the same way as upstream merge
commit cf1ca1f66d30 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") did
- simple context conflict ip_tunnel.c due to missing commit
c4794d22251b9 ("ipv4: tunnels: use DEV_STATS_INC()")
- simple context conflict in ip6_gre.c and ip6_tunnel.c due to missing
commit 2fad1ba354d4a ("ipv6: tunnels: use DEV_STATS_INC()")
- simple conflict in nft_tunnel.c due to missing ffb3d9a30cc67 ("netfilter:
nf_tables: use correct integer types")
commit 5832c4a77d6931cebf9ba737129ae8f14b66ee1d
Author: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Wed Mar 27 16:23:53 2024 +0100
ip_tunnel: convert __be16 tunnel flags to bitmaps
Historically, tunnel flags like TUNNEL_CSUM or TUNNEL_ERSPAN_OPT
have been defined as __be16. Now all of those 16 bits are occupied
and there's no more free space for new flags.
It can't be simply switched to a bigger container with no
adjustments to the values, since it's an explicit Endian storage,
and on LE systems (__be16)0x0001 equals
(__be64)0x0001000000000000.
We could probably define new 64-bit flags depending on the
Endianness, i.e. (__be64)0x0001 on BE and (__be64)0x00010000... on
LE, but that would introduce an Endianness dependency and spawn a
ton of Sparse warnings. To mitigate them, all of those places which
were adjusted with this change would be touched anyway, so why not
define stuff properly if there's no choice.
Define IP_TUNNEL_*_BIT counterparts as a bit number instead of the
value already coded and a fistful of <16 <-> bitmap> converters and
helpers. The two flags which have a different bit position are
SIT_ISATAP_BIT and VTI_ISVTI_BIT, as they were defined not as
__cpu_to_be16(), but as (__force __be16), i.e. had different
positions on LE and BE. Now they both have strongly defined places.
Change all __be16 fields which were used to store those flags, to
IP_TUNNEL_DECLARE_FLAGS() -> DECLARE_BITMAP(__IP_TUNNEL_FLAG_NUM) ->
unsigned long[1] for now, and replace all TUNNEL_* occurrences to
their bitmap counterparts. Use the converters in the places which talk
to the userspace, hardware (NFP) or other hosts (GRE header). The rest
must explicitly use the new flags only. This must be done at once,
otherwise there will be too many conversions throughout the code in
the intermediate commits.
Finally, disable the old __be16 flags for use in the kernel code
(except for the two 'irregular' flags mentioned above), to prevent
any accidental (mis)use of them. For the userspace, nothing is
changed, only additions were made.
Most noticeable bloat-o-meter difference (.text):
vmlinux: 307/-1 (306)
gre.ko: 62/0 (62)
ip_gre.ko: 941/-217 (724) [*]
ip_tunnel.ko: 390/-900 (-510) [**]
ip_vti.ko: 138/0 (138)
ip6_gre.ko: 534/-18 (516) [*]
ip6_tunnel.ko: 118/-10 (108)
[*] gre_flags_to_tnl_flags() grew, but still is inlined
[**] ip_tunnel_find() got uninlined, hence such decrease
The average code size increase in non-extreme case is 100-200 bytes
per module, mostly due to sizeof(long) > sizeof(__be16), as
%__IP_TUNNEL_FLAG_NUM is less than %BITS_PER_LONG and the compilers
are able to expand the majority of bitmap_*() calls here into direct
operations on scalars.
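The move from big-endian flag values to bit numbers can be illustrated with a userspace sketch. The `toy_` names and the bit numbers are hypothetical and do not reproduce the kernel's converters; the sketch only assumes that a legacy flag is a single bit encoded as a big-endian 16-bit value:

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical bit numbers: a flag is identified by its position, not by
 * a big-endian encoded value. */
enum {
	TOY_TUNNEL_CSUM_BIT = 0,
	TOY_TUNNEL_KEY_BIT  = 2,
	TOY_TUNNEL_FLAG_NUM = 17,	/* room beyond the old 16 bits */
};

/* Old-style flag value: the bit encoded as a big-endian 16-bit quantity. */
static uint16_t toy_be16_flag(unsigned int bit)
{
	assert(bit < 16);
	return htons((uint16_t)(1u << bit));
}

/* Converter from the legacy form to a bitmap held in one unsigned long,
 * valid while TOY_TUNNEL_FLAG_NUM does not exceed the bits in a long. */
static unsigned long toy_flags_from_be16(uint16_t be16_flags)
{
	return (unsigned long)ntohs(be16_flags);
}

/* Converter back to the legacy form; flags above bit 15 cannot be
 * represented there and are dropped in this sketch. */
static uint16_t toy_flags_to_be16(unsigned long flags)
{
	return htons((uint16_t)(flags & 0xffffu));
}

static int toy_test_bit(unsigned long flags, unsigned int bit)
{
	assert(bit < TOY_TUNNEL_FLAG_NUM);
	return (int)((flags >> bit) & 1ul);
}
```

Because the bitmap is indexed by bit number, the same code works on both little- and big-endian hosts, and new flags can use positions 16 and above without touching the legacy encoding.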
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3929
# Merge Request Required Information
## Summary of Changes
JIRA: https://issues.redhat.com/browse/RHEL-23575
CVE: CVE-2024-1151
```
commit 6e2f90d31fe09f2b852de25125ca875aabd81367
Author: Aaron Conole <aconole@redhat.com>
Date: Fri Feb 09 21:54:38 2024 +0100
net: openvswitch: limit the number of recursions from action sets
The ovs module allows for some actions to recursively contain an action
list for complex scenarios, such as sampling, checking lengths, etc.
When these actions are copied into the internal flow table, they are
evaluated to validate that such actions make sense, and these calls
happen recursively.
The ovs-vswitchd userspace won't emit more than 16 recursion levels
deep. However, the module has no such limit and will happily accept
limits larger than 16 levels nested. Prevent this by tracking the
number of recursions happening and manually limiting it to 16 levels
nested.
The initial implementation of the sample action would track this depth
and prevent more than 3 levels of recursion, but this was removed to
support the clone use case, rather than limited at the current userspace
limit.
Fixes: 798c166173 ("openvswitch: Optimize sample action for the clone use cases")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240207132416.1488485-2-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
```
Signed-off-by: Aaron Conole <aconole@redhat.com>
Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Merged-by: Lucas Zampieri <lzampier@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-23575
CVE: CVE-2024-1151
commit 6e2f90d31fe09f2b852de25125ca875aabd81367
Author: Aaron Conole <aconole@redhat.com>
Date: Fri Feb 09 21:54:38 2024 +0100
net: openvswitch: limit the number of recursions from action sets
The ovs module allows for some actions to recursively contain an action
list for complex scenarios, such as sampling, checking lengths, etc.
When these actions are copied into the internal flow table, they are
evaluated to validate that such actions make sense, and these calls
happen recursively.
The ovs-vswitchd userspace won't emit more than 16 recursion levels
deep. However, the module has no such limit and will happily accept
limits larger than 16 levels nested. Prevent this by tracking the
number of recursions happening and manually limiting it to 16 levels
nested.
The initial implementation of the sample action would track this depth
and prevent more than 3 levels of recursion, but this was removed to
support the clone use case, rather than limited at the current userspace
limit.
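Depth tracking during recursive validation can be sketched as follows. The action structure and the `toy_` helpers are hypothetical; only the limit of 16 nested levels comes from the description above:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 16	/* limit named in the text above */

/* Hypothetical action: either a leaf or a container with one nested list,
 * chained to the next action in the same list. */
struct toy_action {
	const struct toy_action *nested; /* non-NULL for container actions */
	const struct toy_action *next;
};

/* Validate an action list, carrying the current nesting depth down the
 * recursion and refusing lists nested deeper than TOY_MAX_DEPTH. */
static int toy_validate(const struct toy_action *list, unsigned int depth)
{
	if (depth > TOY_MAX_DEPTH)
		return -EOVERFLOW;

	for (; list != NULL; list = list->next) {
		if (list->nested != NULL) {
			int err = toy_validate(list->nested, depth + 1);

			if (err)
				return err;
		}
	}
	return 0;
}

/* Build a chain of 'levels' nested containers in caller-provided storage
 * and validate it starting at depth 0. */
static int toy_validate_nested(struct toy_action *storage, size_t levels)
{
	assert(levels >= 1);
	for (size_t i = 0; i < levels; i++) {
		storage[i].next = NULL;
		storage[i].nested = (i + 1 < levels) ? &storage[i + 1] : NULL;
	}
	return toy_validate(&storage[0], 0);
}
```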
Fixes: 798c166173 ("openvswitch: Optimize sample action for the clone use cases")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240207132416.1488485-2-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Aaron Conole <aconole@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-30656
commit bffcc6882a1bb2be8c9420184966f4c2c822078e
Author: Jakub Kicinski <kuba@kernel.org>
Date: Mon Aug 14 14:47:16 2023 -0700
genetlink: remove userhdr from struct genl_info
Only three families use info->userhdr today and going forward
we discourage using fixed headers in new families.
So having the pointer to user header in struct genl_info
is an overkill. Compute the header pointer at runtime.
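Computing the user header pointer on demand can be modelled in userspace. The `toy_` types below are stand-ins written for this sketch; it assumes the family-specific header directly follows the aligned generic netlink header:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Userspace stand-in for the generic netlink header. */
struct toy_genlmsghdr {
	uint8_t  cmd;
	uint8_t  version;
	uint16_t reserved;
};

#define TOY_ALIGNTO ((size_t)4)
#define TOY_ALIGN(len) (((len) + TOY_ALIGNTO - 1) & ~(TOY_ALIGNTO - 1))
#define TOY_GENL_HDRLEN TOY_ALIGN(sizeof(struct toy_genlmsghdr))

/* Only the pointer to the generic netlink header is stored. */
struct toy_genl_info {
	struct toy_genlmsghdr *genlhdr;
};

/* The family-specific (user) header follows the generic netlink header,
 * so its address can be derived when needed instead of being cached. */
static void *toy_info_userhdr(const struct toy_genl_info *info)
{
	assert(info != NULL && info->genlhdr != NULL);
	return (uint8_t *)info->genlhdr + TOY_GENL_HDRLEN;
}
```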
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/20230814214723.2924989-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: net.git
commit 4539f91f2a801c0c028c252bffae56030cfb2cae
Author: Ilya Maximets <i.maximets@ovn.org>
Date: Wed Apr 3 22:38:01 2024 +0200
net: openvswitch: fix unwanted error log on timeout policy probing
On startup, ovs-vswitchd probes different datapath features including
support for timeout policies. While probing, it tries to execute
certain operations with OVS_PACKET_ATTR_PROBE or OVS_FLOW_ATTR_PROBE
attributes set. These attributes tell the openvswitch module to not
log any errors when they occur as it is expected that some of the
probes will fail.
For some reason, setting the timeout policy ignores the PROBE attribute
and logs a failure anyway. This is causing the following kernel log
on each re-start of ovs-vswitchd:
kernel: Failed to associated timeout policy `ovs_test_tp'
Fix that by using the same logging macro that all other messages are
using. The message will still be printed at info level when needed
and will be rate limited, but with a net rate limiter instead of
the generic printk one.
The nf_ct_set_timeout() itself will still print some info messages,
but at least this change makes logging in openvswitch module more
consistent.
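The idea of a logging helper that honours the probe setting can be sketched
in userspace as follows. This is only an illustration: demo_nlerr() and
demo_logged are hypothetical names, and the rate limiting mentioned above
is left out.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts the messages that were actually emitted. */
static unsigned int demo_logged;

/*
 * Emit an error message only when logging is requested. A caller
 * handling a request that carried a PROBE attribute passes log = false,
 * so expected probe failures stay silent.
 */
static void demo_nlerr(bool log, const char *msg)
{
	if (!log)
		return;
	demo_logged++;
	fprintf(stderr, "openvswitch: %s\n", msg);
}
```

Routing every message through one helper means a code path cannot forget
to check the probe setting, which is the inconsistency the fix removes.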
Fixes: 06bd2bdf19 ("openvswitch: Add timeout support to ct action")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20240403203803.2137962-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: linux.git
commit 7713ec844756a9883ba9a91381369256275de4fb
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Sat Oct 14 08:34:53 2023 +0200
net: openvswitch: Annotate struct mask_array with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
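As a rough illustration of the annotation (not the openvswitch structures
themselves), the sketch below marks a flexible array member in a made-up
struct int_array. COUNTED_BY and int_array_alloc() are hypothetical names,
and the macro expands to nothing on compilers that lack the attribute.

```c
#include <stdlib.h>

/* Use the counted_by attribute when the compiler provides it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical struct: count holds the number of elements in items[]. */
struct int_array {
	size_t count;
	int items[] COUNTED_BY(count);
};

/* Allocate an array with n zeroed elements, or return NULL on failure. */
static struct int_array *int_array_alloc(size_t n)
{
	struct int_array *a = calloc(1, sizeof(*a) + n * sizeof(a->items[0]));

	if (a)
		a->count = n;	/* set the counter before touching items[] */
	return a;
}
```

The counter is assigned right after allocation because a bounds checker
compares each items[] index against whatever count holds at that moment.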
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/ca5c8049f58bb933f231afd0816e30a5aaa0eddd.1697264974.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: linux.git
commit 16ae53d80c00445c903128f2a64af87b5a03d474
Author: Kees Cook <keescook@chromium.org>
Date: Fri Sep 22 10:28:54 2023 -0700
net: openvswitch: Annotate struct dp_meter with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct dp_meter.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: dev@openvswitch.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230922172858.3822653-12-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: linux.git
commit e7b34822fa4dcf6101deb3d51a77efd77533571d
Author: Kees Cook <keescook@chromium.org>
Date: Fri Sep 22 10:28:52 2023 -0700
net: openvswitch: Annotate struct dp_meter_instance with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct dp_meter_instance.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: dev@openvswitch.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230922172858.3822653-10-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-30344
commit f3a63cce1b4fbde7738395c5a2dea83f05de3407
Author: Hangbin Liu <liuhangbin@gmail.com>
Date: Fri Oct 28 04:42:24 2022 -0400
rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link
This patch uses the new helper unregister_netdevice_many_notify() in
rtnl_delete_link(), so that the kernel can reply with a unicast message when
userspace sets the NLM_F_ECHO flag to request the interface info.
At the same time, the parameters of rtnl_delete_link() need to be updated
since we need nlmsghdr and portid info.
Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-21360
Upstream Status: net.git commit 9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba
commit 9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba
Author: Vlad Buslov <vladbu@nvidia.com>
Date: Fri Nov 3 16:14:10 2023 +0100
net/sched: act_ct: Always fill offloading tuple iifidx
Referenced commit doesn't always set iifidx when offloading the flow to
hardware. Fix the following cases:
- nf_conn_act_ct_ext_fill() is called before the extension is created with
nf_conn_act_ct_ext_add() in tcf_ct_act(). This can cause rule offload with
unspecified iifidx when the connection is offloaded after only a single
original-direction packet has been processed by the tc data path. Always fill
the new nf_conn_act_ct_ext instance after creating it in
nf_conn_act_ct_ext_add().
- Offloading of unidirectional UDP NEW connections is now supported, but the ct
flow iifidx field is not updated when the connection is promoted to
bidirectional, which can result in the reply-direction iifidx being zero when
refreshing the connection. Fill in the extension and update flow iifidx
before calling flow_offload_refresh().
Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections")
Link: https://lore.kernel.org/r/20231103151410.764271-1-vladbu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2843
JIRA: https://issues.redhat.com/browse/RHEL-1848
Already in CS9
Omitted-fix: 327b18b7aaed ("mm/kfence: select random number before taking raw lock")
Omitted-fix: bfbfb6182ad1 ("nfsd_splice_actor(): handle compound pages")
Omitted-fix: ac8db824ead0 ("NFSD: Fix reads with a non-zero offset that don't end on a page boundary")
Omitted-fix: b3719108ae60 ("perf kmem: Support legacy tracepoints")
Omitted-fix: dce088ab0d51 ("perf kmem: Support field "node" in evsel__process_alloc_event() coping with recent tracepoint restructuring")
Omitted-fix: c18c20f16219 ("mm, slab: remove duplicate kernel-doc comment for ksize()")
Omitted-fix: cfccd2e63e7e ("mm, compaction: finish pageblocks on complete migration failure")
Omitted-fix: 6342140db660 ("selftests/timens: add a test for vfork+exit")
Omitted-fix: be6667b0db97 ("selftests/vm: dedup hugepage allocation logic")
Omitted-fix: 9d0d94684007 ("selftests/vm: add selftest to verify multi THP collapse")
Omitted-fix: 1370a21fe470 ("selftests/vm: add selftest to verify recollapse of THPs")
Omitted-fix: b25806dcd3d5 ("mm: memcontrol: deprecate swapaccounting=0 mode")
Omitted-fix: b94c4e949c36 ("mm: memcontrol: use do_memsw_account() in a few more places")
Omitted-fix: e55b9f96860f ("mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol")
Omitted-fix: 6f777dcef774 ("docs: kmsan: fix formatting of "Example report"")
Omitted fix: 26e1a0c3277d ("mm: use pmdp_get_lockless() without surplus barrier()")
Omitted-fix: 0cb8fd4d1416 ("mm/migrate: remove cruft from migration_entry_wait()s")
patches resulting in empty commits after conflict resolution
Omitted-fix: 4a7e922587d2 ("selftests: vm: add /dev/userfaultfd test cases to run_vmtests.sh")
patches that are functionally identical
Omitted-fix: 6f777dcef774 ("docs: kmsan: fix formatting of "Example report"")
Is identical to 436fa4a699bc ("docs: kmsan: fix formatting of "Example report"")
Defer to crypto group
Omitted-fix: f900fde28883 ("crypto: testmgr - fix RNG performance in fuzz tests")
Not including since we're specifically excluding the Maple Tree VMA Iterator
Omitted-fix: 524e00b36e8c ("mm: remove rb tree.")
'series' patches that won't be addressed by this MR
Omitted-fix: 9905eed48e82 ("Merge branch 'af_unix-OOB-fixes'")
Omitted-fix: 2e4b231ac125 ("scsi: NCR5380: Use sc_data_direction instead of rq_data_dir()")
Omitted-fix: 40e16ce7b6fa ("scsi: advansys: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 11bf4ec58073 ("scsi: aha1542: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 3ada9c791b1d ("scsi: dpt_i2o: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 240ec1197786 ("scsi: ips: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: ce425dd7dbc9 ("scsi: mvumi: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 2fd8f23aae36 ("scsi: myrb: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 43b2d1b14ed0 ("scsi: myrs: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 0f8f3ea84a89 ("scsi: ncr53c8xx: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 3f5e62c5e074 ("scsi: qla1280: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: ba4baf0951bb ("scsi: qlogicpti: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: ec808ef9b838 ("scsi: snic: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: bbfa8d7d1283 ("scsi: stex: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 6c5d5422c533 ("scsi: sun3_scsi: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 77ff7756c73e ("scsi: sym53c8xx: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 80ca10b6052d ("scsi: xen-scsifront: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 332f606b32b6 ("ovl: enable RCU'd ->get_acl()")
Omitted-fix: b3b6f5b92255 ("btrfs: handle idmaps in btrfs_new_inode()")
Omitted-fix: ca07274c3da9 ("btrfs: allow idmapped rename inode op")
Omitted-fix: c020d2eaf1a8 ("btrfs: allow idmapped getattr inode op")
Omitted-fix: 72105277dcfc ("btrfs: allow idmapped mknod inode op")
Omitted-fix: e93ca491d03f ("btrfs: allow idmapped create inode op")
Omitted-fix: b0b3e44d346c ("btrfs: allow idmapped mkdir inode op")
Omitted-fix: 5a0521086e5f ("btrfs: allow idmapped symlink inode op")
Omitted-fix: 98b6ab5fc098 ("btrfs: allow idmapped tmpfile inode op")
Omitted-fix: d4d094646142 ("btrfs: allow idmapped setattr inode op")
Omitted-fix: 3bc71ba02cf5 ("btrfs: allow idmapped permission inode op")
Omitted-fix: 5474bf400f16 ("btrfs: check whether fsgid/fsuid are mapped during subvolume creation")
Omitted-fix: 4d4340c912cc ("btrfs: allow idmapped SNAP_CREATE/SUBVOL_CREATE ioctls")
Omitted-fix: c4ed533bdc79 ("btrfs: allow idmapped SNAP_DESTROY ioctls")
Omitted-fix: aabb34e7a31c ("btrfs: relax restrictions for SNAP_DESTROY_V2 with subvolids")
Omitted-fix: e4fed17a32b6 ("btrfs: allow idmapped SET_RECEIVED_SUBVOL ioctls")
Omitted-fix: 39e1674ff035 ("btrfs: allow idmapped SUBVOL_SETFLAGS ioctl")
Omitted-fix: 6623d9a0b0ce ("btrfs: allow idmapped INO_LOOKUP_USER ioctl")
Omitted-fix: 4a8b34afa9c9 ("btrfs: handle ACLs on idmapped mounts")
Omitted-fix: 5b9b26f5d0b8 ("btrfs: allow idmapped mount")
Omitted-fix: 8cc5c54de44c ("docs: update mapping documentation")
Omitted-fix: 02e407991350 ("fs: remove unused low-level mapping helpers")
Omitted-fix: ce70fd9a551a ("scsi: core: Remove the cmd field from struct scsi_request")
Omitted-fix: 5b794f98074a ("scsi: core: Remove the sense and sense_len fields from struct scsi_request")
Omitted-fix: a9a4ea1166d6 ("scsi: core: Move the resid_len field from struct scsi_request to struct scsi_cmnd")
Omitted-fix: dbb4c84d87af ("scsi: core: Move the result field from struct scsi_request to struct scsi_cmnd")
Omitted-fix: 6aded12b10e0 ("scsi: core: Remove struct scsi_request")
Omitted-fix: 264403033105 ("scsi: core: Remove <scsi/scsi_request.h>")
Omitted-fix: cd4b46cdb491 ("scsi: 53c700: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 417c434aa1b4 ("docs/zh_CN: core-api: Update the translation of cachetlb.rst to 5.19-rc3")
Omitted-fix: 1ebfae49fd44 ("docs/zh_CN: core-api: Update the translation of cpu_hotplug.rst to 5.19-rc3")
Omitted-fix: 722ecdbce68a ("docs/zh_CN: core-api: Update the translation of irq/irq-domain.rst to 5.19-rc3")
Omitted-fix: b2fdf7f080b4 ("docs/zh_CN: core-api: Update the translation of kernel-api.rst to 5.19-rc3")
Omitted-fix: e86a0e297f0b ("docs/zh_CN: core-api: Update the translation of printk-format.rst to 5.19-rc3")
Omitted-fix: c290f175e73f ("docs/zh_CN: core-api: Update the translation of workqueue.rst to 5.19-rc3")
Omitted-fix: 4a6d00a43ef7 ("docs/zh_CN: core-api: Update the translation of xarray.rst to 5.19-rc3")
Omitted-fix: e8f60cd7db24 ("Merge tag 'perf-tools-fixes-for-v6.2-2-2023-01-11' of git://git.kernel.org/pub/scm/linux/ker…")
Omitted-fix: 3a761d72fa62 ("exportfs: support idmapped mounts")
Omitted-fix: 22f289ce1f8b ("ovl: use ovl_lookup_upper() wrapper")
Omitted-fix: 50db8d027355 ("ovl: handle idmappings for layer fileattrs")
Omitted-fix: c85bcc912f4f ("kselftests: memcg: update the oom group leaf events test")
Omitted-fix: be74553f250f ("kselftests: memcg: speed up the memory.high test")
Omitted-fix: 1bd1a4dd3e8c ("MAINTAINERS: add corresponding kselftests to cgroup entry")
Omitted-fix: cdc69458a5f3 ("cgroup: account for memory_recursiveprot in test_memcg_low()")
Omitted-fix: 72b1e03aa725 ("cgroup: account for memory_localevents in test_memcg_oom_group_leaf_events()")
Omitted-fix: 830316807e02 ("cgroup: remove racy check in test_memcg_sock()")
Omitted-fix: c1a31a2f7a9c ("cgroup: fix racy check in alloc_pagecache_max_30M() helper function")
Omitted-fix: c01d4d0a82b7 ("random: quiet urandom warning ratelimit suppression message")
Omitted-fix: 21873bd66b6e ("Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux")
Omitted-fix: ff3b72a5d614 ("selftests: memcg: fix compilation")
Omitted-fix: 1d09069f5313 ("selftests: memcg: expect no low events in unprotected sibling")
Omitted-fix: 63fbdd3c77ec ("net: use DEBUG_NET_WARN_ON_ONCE() in __release_sock()")
Omitted-fix: 76458faeb285 ("net: use DEBUG_NET_WARN_ON_ONCE() in dev_loopback_xmit()")
Omitted-fix: 3e7f2b8d3088 ("net: use WARN_ON_ONCE() in inet_sock_destruct()")
Omitted-fix: 7890e2f09d43 ("net: use DEBUG_NET_WARN_ON_ONCE() in skb_release_head_state()")
Omitted-fix: ee2640df2393 ("net: add debug checks in napi_consume_skb and __napi_alloc_skb()")
Omitted-fix: 39e0f991a62e ("random: mark bootloader randomness code as __init")
Omitted-fix: 6342140db660 ("selftests/timens: add a test for vfork+exit")
Omitted-fix: cf21b355ccb3 ("af_unix: Optimise hash table layout.")
Omitted-fix: c12db92d62bf ("ovl: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 73db6a063c78 ("ovl: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 1e8a9191ccc2 ("f2fs: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: a03a972b26da ("fuse: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 00d369bc2de5 ("fuse: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 276a3f7cf1d9 ("ksmbd: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 45c311501c77 ("fs: use mount types in iattr")
Omitted-fix: 1f36146a5a3d ("fs: introduce tiny iattr ownership update helpers")
Omitted-fix: 35faf3109a78 ("fs: port to iattr ownership update helpers")
Omitted-fix: 71e7b535b890 ("quota: port quota helpers mount ids")
Omitted-fix: b27c82e12965 ("attr: port attribute changes to new types")
Omitted-fix: cf21b355ccb3 ("af_unix: Optimise hash table layout.")
Omitted-fix: e95ab1d85289 ("selftests: net: af_unix: Test connect() with different netns.")
Omitted-fix: 169005eae2af ("docs/zh_CN: Update the translation of mm-api to 6.1-rc8")
Omitted-fix: 659797dc4d64 ("Docs/zh_CN: Update the translation of iio_configfs to 5.19-rc8")
Omitted-fix: 6a5057e9dc13 ("Docs/zh_CN: Update the translation of sparse to 5.19-rc8")
Omitted-fix: 63c1d2516b05 ("Docs/zh_CN: Update the translation of testing-overview to 5.19-rc8")
Omitted-fix: 83b41bb27b25 ("Docs/zh_CN: Update the translation of usage to 5.19-rc8")
Omitted-fix: c78478e164d4 ("Docs/zh_CN: Update the translation of pci-iov-howto to 5.19-rc8")
Omitted-fix: ce1120076c53 ("Docs/zh_CN: Update the translation of pci to 5.19-rc8")
Omitted-fix: 4116ff79749d ("Docs/zh_CN: Update the translation of sched-stats to 5.19-rc8")
Omitted-fix: 7f02464739da ("9p: convert to advancing variant of iov_iter_get_pages_alloc()")
Omitted-fix: 5b09c9fec086 ("do_proc_readlink(): constify path")
Omitted-fix: ea4af4aa03c3 ("nd_jump_link(): constify path")
Omitted-fix: 20f45ad50d65 ("spufs: constify path")
Omitted-fix: 88569546e8a1 ("ecryptfs: constify path")
Omitted-fix: 9204a97f7ae8 ("sched: Change wait_task_inactive()s match_state")
Omitted-fix: 04c6b79ae4f0 ("btrfs: convert __process_pages_contig() to use filemap_get_folios_contig()")
Omitted-fix: a75b81c3f63b ("btrfs: convert end_compressed_writeback() to use filemap_get_folios()")
Omitted-fix: 47d554199513 ("btrfs: convert process_page_range() to use filemap_get_folios_contig()")
Omitted-fix: 24a1efb4a912 ("nilfs2: convert nilfs_find_uncommited_extent() to use filemap_get_folios_contig()")
Omitted-fix: 7c18b64bba3b ("mips: ralink: mt7621: do not use kzalloc too early")
Omitted-fix: 7d37539037c2 ("fuse: implement ->tmpfile()")
Omitted-fix: f743f16c548b ("treewide: use get_random_{u8,u16}() when possible, part 2")
Omitted-fix: 6ab587e8e8b4 ("docs/zh_CN: Update the translation of delay-accounting to 6.1-rc8")
Omitted-fix: cf306a26cb3a ("docs/zh_CN: Update the translation of kernel-api to 6.1-rc8")
Omitted-fix: e07e9f22259e ("docs/zh_CN: Update the translation of testing-overview to 6.1-rc8")
Omitted-fix: ffdd9bd7a278 ("docs/zh_CN: Update the translation of reclaim to 6.1-rc8")
Omitted-fix: 9a833802a04d ("docs/zh_CN: Update the translation of start to 6.1-rc8")
Omitted-fix: 7cb52d4b3724 ("docs/zh_CN: Update the translation of usage to 6.1-rc8")
Omitted-fix: 03474d581df3 ("docs/zh_CN: Update the translation of msi-howto to 6.1-rc8")
Omitted-fix: 7df047be4363 ("docs/zh_CN: Update the translation of energy-model to 6.1-rc8")
Omitted-fix: e0068090095c ("docs/zh_CN: Update the translation of highmem to 6.1-rc8")
Omitted-fix: 0f3d70cb01da ("docs/zh_CN: Update the translation of ksm to 6.1-rc8")
Omitted-fix: 11018ef90ce7 ("s390/checksum: remove not needed uaccess.h include")
Omitted-fix: 2ea3498980f5 ("mm/damon/core: split out DAMOS-charged region skip logic into a new function")
Omitted-fix: e63a30c51f84 ("mm/damon/core: split damos application logic into a new function")
Omitted-fix: d1cbbf621fc2 ("mm/damon/core: split out scheme stat update logic into a new function")
Omitted-fix: 898810e5ca54 ("mm/damon/core: split out scheme quota adjustment logic into a new function")
Omitted-fix: 789a230613c8 ("mm/damon/sysfs: use damon_addr_range for region's start and end values")
Omitted-fix: 1f71981408ef ("mm/damon/sysfs: remove parameters of damon_sysfs_region_alloc()")
Omitted-fix: 39240595917e ("mm/damon/sysfs: move sysfs_lock to common module")
Omitted-fix: d332fe11debe ("mm/damon/sysfs: move unsigned long range directory to common module")
Omitted-fix: 4acd715ff57f ("mm/damon/sysfs: split out kdamond-independent schemes stats update logic into a new function")
Omitted-fix: c8e7b4d0ba34 ("mm/damon/sysfs: split out schemes directory implementation to separate file")
Omitted fix: dfe843dce775 ("s390/checksum: support GENERIC_CSUM, enable it for KASAN")
Omitted fix: e42ac7789df6 ("s390/checksum: always use cksm instruction")
Omitted fix: 1a167ddd3c56 ("x86: kmsan: pgtable: reduce vmalloc space")
Omitted fix: 7cf8f44a5a1c ("x86: fs: kmsan: disable CONFIG_DCACHE_WORD_ACCESS")
Omitted fix: 1468c6f4558b ("mm: fs: initialize fsdata passed to write_begin/write_end interface")
Omitted fix: 0aa8ea3c5d35 ("mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages")
Omitted fix: 42855f588e18 ("x86/purgatory: disable KMSAN instrumentation")
Omitted fix: 11385b261200 ("x86/uaccess: instrument copy_from_user_nmi()")
Omitted fix: f70da5ee8fe1 ("mm/damon: convert damon_pa_mark_accessed_or_deactivate() to use folios")
Omitted fix: 5a9e34747c9f ("mm/swap: convert deactivate_page() to folio_deactivate()")
Omitted fix: 0aa8ea3c5d35 ("mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages")
Omitted fix: de1f5055523e ("mm/mempolicy: convert queue_pages_pmd() to queue_folios_pmd()")
Omitted fix: 3dae02bbd07f ("mm/mempolicy: convert queue_pages_pte_range() to queue_folios_pte_range()")
Omitted fix: 0a2c1e818316 ("mm/mempolicy: convert queue_pages_hugetlb() to queue_folios_hugetlb()")
Omitted fix: d451b89dcd18 ("mm/mempolicy: convert queue_pages_required() to queue_folio_required()")
Omitted fix: 4a64981dfee9 ("mm/mempolicy: convert migrate_page_add() to migrate_folio_add()")
Omitted fix: 0aa8ea3c5d35 ("mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages")
Omitted fix: 46c475bd676b ("mm/pgtable: kmap_local_page() instead of kmap_atomic()")
Omitted fix: 0d940a9b270b ("mm/pgtable: allow pte_offset_map[_lock]() to fail")
Omitted fix: 65747aaf42b7 ("mm/filemap: allow pte_offset_map_lock() to fail")
Omitted fix: 45fe85e9811e ("mm/page_vma_mapped: delete bogosity in page_vma_mapped_walk()")
Omitted fix: 90f43b0a13cd ("mm/page_vma_mapped: reformat map_pte() with less indentation")
Omitted fix: 2798bbe75b9c ("mm/page_vma_mapped: pte_offset_map_nolock() not pte_lockptr()")
Omitted fix: 7780d04046a2 ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails")
Omitted fix: be872f83bf57 ("mm/pagewalk: walk_pte_range() allow for pte_offset_map()")
Omitted fix: e5ad581c7f1c ("mm/vmwgfx: simplify pmd & pud mapping dirty helpers")
Omitted fix: 0d1c81edc61e ("mm/vmalloc: vmalloc_to_page() use pte_offset_kernel()")
Omitted fix: 6ec1905f6ec7 ("mm/hmm: retry if pte_offset_map() fails")
Omitted fix: 2b683a4ff6ee ("mm/userfaultfd: retry if pte_offset_map() fails")
Omitted fix: 3622d3cde308 ("mm/userfaultfd: allow pte_offset_map_lock() to fail")
Omitted fix: 9f2bad096d2f ("mm/debug_vm_pgtable,page_table_check: warn pte map fails")
Omitted fix: 04dee9e85cf5 ("mm/various: give up if pte_offset_map[_lock]() fails")
Omitted fix: 670ddd8cdcbd ("mm/mprotect: delete pmd_none_or_clear_bad_unless_trans_huge()")
Omitted fix: a5be621ee292 ("mm/mremap: retry if either pte_offset_map_*lock() fails")
Omitted fix: 179d3e4f3bfa ("mm/madvise: clean up force_shm_swapin_readahead()")
Omitted fix: d850fa729873 ("mm/swapoff: allow pte_offset_map[_lock]() to fail")
Omitted fix: 52fc048320ad ("mm/mglru: allow pte_offset_map_nolock() to fail")
Omitted fix: 4b56069c95d6 ("mm/migrate_device: allow pte_offset_map_lock() to fail")
Omitted fix: 2378118bd9da ("mm/gup: remove FOLL_SPLIT_PMD use of pmd_trans_unstable()")
Omitted fix: c9c1ee20ee84 ("mm/huge_memory: split huge pmd under one pte_offset_map()")
Omitted fix: 895f5ee464cc ("mm/khugepaged: allow pte_offset_map[_lock]() to fail")
Omitted fix: 3db82b9374ca ("mm/memory: allow pte_offset_map[_lock]() to fail")
Omitted fix: c7ad08804fae ("mm/memory: handle_pte_fault() use pte_offset_map_nolock()")
Omitted fix: 20b18aada185 ("madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check")
Coming Soon:
Omitted-fix: 6f0df8e16eb5 ("memcontrol: ensure memcg acquired by id is properly set up")
Omitted-fix: ee40d543e97d ("mm/pagewalk: fix bootstopping regression from extra pte_unmap()")
Omitted-fix: ab048302026d ("ovl: fix failed copyup of fileattr on a symlink")
Omitted-fix: 92fe9dcbe4e1 ("hugetlbfs: clear resv_map pointer if mmap fails")
Omitted-fix: bf4916922c60 ("hugetlbfs: extend hugetlb_vma_lock to private VMAs")
Omitted-fix: 2820b0f09be9 ("hugetlbfs: close race between MADV_DONTNEED and page fault")
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=56452800
Tested: KT1+mm regression: https://beaker.engineering.redhat.com/jobs/8467307
Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Jan Stancek <jstancek@redhat.com>
Approved-by: Mika Penttilä <mpenttil@redhat.com>
Approved-by: Jerry Snitselaar <jsnitsel@redhat.com>
Approved-by: Alex Gladkov <agladkov@redhat.com>
Approved-by: Vladis Dronov <vdronov@redhat.com>
Approved-by: Dean Nelson <dnelson@redhat.com>
Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: Baoquan He <5820488-baoquan_he@users.noreply.gitlab.com>
Approved-by: Jiri Benc <jbenc@redhat.com>
Approved-by: John W. Linville <linville@redhat.com>
Signed-off-by: Scott Weaver <scweaver@redhat.com>
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3067
JIRA: https://issues.redhat.com/browse/RHEL-1773
Depends: https://issues.redhat.com/browse/RHEL-860
Depends: https://issues.redhat.com/browse/RHEL-3646
Update TC (net/sched) to the upstream v6.5
Omitted-fix: cad7526f33ce ("net: dsa: ocelot: unlock on error in vsc9959_qos_port_tas_set()")
Not needed, DSA as well as ocelot driver is not enabled/supported in RHEL
Commits:
```
1b808993e194 ("flow_dissector: fix false-positive __read_overflow2_field() warning")
f743f16c548b ("treewide: use get_random_{u8,u16}() when possible, part 2")
7e3cf0843fe5 ("treewide: use get_random_{u8,u16}() when possible, part 1")
8032bf1233a7 ("treewide: use get_random_u32_below() instead of deprecated function")
62423bd2d2e2 ("net: sched: remove qdisc_watchdog->last_expires")
c66b2111c9c9 ("selftests: tc-testing: add tests for action binding")
f5fca219ad45 ("net: do not use skb_mac_header() in qdisc_pkt_len_init()")
e495a9673caf ("sch_cake: do not use skb_mac_header() in cake_overhead()")
b3be94885af4 ("net/sched: remove two skb_mac_header() uses")
fcb3a4653bc5 ("net/sched: act_api: use the correct TCA_ACT attributes in dump")
4170f0ef582c ("fix typos in net/sched/")
8b0f256530d9 ("net/sched: sch_mqprio: use netlink payload helpers")
3dd0c16ec93e ("net/sched: mqprio: simplify handling of nlattr portion of TCA_OPTIONS")
57f21bf85400 ("net/sched: mqprio: add extack to mqprio_parse_nlattr()")
ab277d2084ba ("net/sched: mqprio: add an extack message to mqprio_parse_opt()")
c54876cd5961 ("net/sched: pass netlink extack to mqprio and taprio offload")
f62af20bed2d ("net/sched: mqprio: allow per-TC user input of FP adminStatus")
a721c3e54b80 ("net/sched: taprio: allow per-TC user input of FP adminStatus")
8c966a10eb84 ("flow_dissector: Address kdoc warnings")
54e906f1639e ("selftests: forwarding: sch_tbf_*: Add a pre-run hook")
2f0f9465ad9f ("net: sched: Print msecs when transmit queue time out")
5036034572b7 ("net/sched: act_pedit: use NLA_POLICY for parsing 'ex' keys")
0c83c5210e18 ("net/sched: act_pedit: use extack in 'ex' parsing errors")
e1201bc781c2 ("net/sched: act_pedit: check static offsets a priori")
577140180ba2 ("net/sched: act_pedit: remove extra check for key type")
e3c9673e2f6e ("net/sched: act_pedit: rate limit datapath messages")
807cfded92b0 ("net/sched: sch_htb: use extack on errors messages")
c69a9b023f65 ("net/sched: sch_qfq: use extack on errors messages")
25369891fcef ("net/sched: sch_qfq: refactor parsing of netlink parameters")
7eb060a51a3b ("selftests: tc-testing: add more tests for sch_qfq")
1b483d9f5805 ("net/sched: act_pedit: free pedit keys on bail from offset check")
526f28bd0fbd ("net/sched: act_mirred: Add carrier check")
12e7789ad5b4 ("sch_htb: Allow HTB priority parameter in offload mode")
c7cfbd115001 ("net/sched: sch_ingress: Only create under TC_H_INGRESS")
5eeebfe6c493 ("net/sched: sch_clsact: Only create under TC_H_CLSACT")
f85fa45d4a94 ("net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) Qdiscs")
9de95df5d15b ("net/sched: Prohibit regrafting ingress or clsact Qdiscs")
7b4858df3bf7 ("skbuff: bridge: Add layer 2 miss indication")
d5ccfd90df7f ("flow_dissector: Dissect layer 2 miss from tc skb extension")
1a432018c0cd ("net/sched: flower: Allow matching on layer 2 miss")
f4356947f029 ("flow_offload: Reject matching on layer 2 miss")
8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases")
dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")
2d800bc500fb ("net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum")
6c1adb650c8d ("net/sched: taprio: add netlink reporting for offload statistics counters")
a395b8d1c7c3 ("selftests/tc-testing: replace mq with invalid parent ID")
8cde87b007da ("net: sched: wrap tc_skip_wrapper with CONFIG_RETPOLINE")
cd2b8113c2e8 ("net/sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values")
d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
886bc7d6ed33 ("net: sched: move rtm_tca_policy declaration to include file")
682881ee45c8 ("net: sched: act_police: fix sparse errors in tcf_police_dump()")
6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
26e35370b976 ("net/sched: act_pedit: Use kmemdup() to replace kmalloc + memcpy")
2b84960fc5dd ("net/sched: taprio: report class offload stats per TXQ, not per TC")
d7ad70b5ef5a ("net: flow_dissector: add support for cfm packets")
7cfffd5fed3e ("net: flower: add support for matching cfm fields")
1668a55a73f5 ("selftests: net: add tc flower cfm test")
c29e012eae29 ("selftests: forwarding: Fix layer 2 miss test syntax")
aef6e908b542 ("selftests/tc-testing: Fix Error: Specified qdisc kind is unknown.")
b849c566ee9c ("selftests/tc-testing: Fix Error: failed to find target LOG")
b39d8c41c7a8 ("selftests/tc-testing: Fix SFB db test")
11b8b2e70a9b ("selftests/tc-testing: Remove configs that no longer exist")
41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
2d5f6a8d7aef ("net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs")
84ad0af0bccd ("net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting")
e16ad981e2a1 ("net: sched: Remove unused qdisc_l2t()")
ca4fa8743537 ("selftests: tc-testing: add one test for flushing explicitly created chain")
b4ee93380b3c ("net/sched: act_ipt: add sanity checks on table name and hook locations")
b2dc32dcba08 ("net/sched: act_ipt: add sanity checks on skb before calling target")
93d75d475c5d ("net/sched: act_ipt: zero skb->cb before calling target")
30c45b5361d3 ("net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX")
989b52cdc849 ("net: sched: Replace strlcpy with strscpy")
d3f87278bcb8 ("net/sched: flower: Ensure both minimum and maximum ports are specified")
150e33e62c1f ("net/sched: make psched_mtu() RTNL-less safe")
158810b261d0 ("net/sched: sch_qfq: reintroduce lmax bound check for MTU")
c5a06fdc618d ("selftests: tc-testing: add tests for qfq mtu sanity check")
3e337087c3b5 ("net/sched: sch_qfq: account for stab overhead in qfq_enqueue")
137f6219da59 ("selftests: tc-testing: add test for qfq with stab overhead")
d1cca974548d ("pie: fix kernel-doc notation warning")
b3d0e0489430 ("net: sched: cls_matchall: Undo tcf_bind_filter in case of failure after mall_set_parms")
9cb36faedeaf ("net: sched: cls_u32: Undo tcf_bind_filter if u32_replace_hw_knode")
e8d3d78c19be ("net: sched: cls_u32: Undo refcount decrement in case update failed")
26a22194927e ("net: sched: cls_bpf: Undo tcf_bind_filter in case of an error")
ac177a330077 ("net: sched: cls_flower: Undo tcf_bind_filter in case of an error")
fda05798c22a ("selftests: tc: set timeout to 15 minutes")
719b4774a8cb ("selftests: tc: add 'ct' action kconfig dep")
031c99e71fed ("selftests: tc: add ConnTrack procfs kconfig")
4914109a8e1e ("netfilter: allow exp not to be removed in nf_ct_find_expectation")
76622ced50a1 ("net: sched: set IPS_CONFIRMED in tmpl status only when commit is set in act_ct")
8c8b73320805 ("openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
9fe63d5f1da9 ("sch_htb: Allow HTB quantum parameter in offload mode")
6c58c8816abb ("net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64")
4d50e50045aa ("net: flower: fix stack-out-of-bounds in fl_set_key_cfm()")
e68409db9953 ("net: sched: cls_u32: Fix match key mis-addressing")
e739718444f7 ("net/sched: taprio: Limit TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME to INT_MAX.")
21a72166abb9 ("selftests: forwarding: tc_flower_l2_miss: Fix failing test with old libnet")
```
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Approved-by: José Ignacio Tornos Martínez <jtornosm@redhat.com>
Approved-by: Michal Schmidt <mschmidt@redhat.com>
Approved-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Scott Weaver <scweaver@redhat.com>
Conflicts:
drivers/gpu/drm/tests/drm_buddy_test.c
drivers/gpu/drm/tests/drm_mm_test.c - We already have
ce28ab1380e8 ("drm/tests: Add back seed value information")
so keep calls to kunit_info.
drop changes to drivers/misc/habanalabs/gaudi2/gaudi2.c
fs/ntfs3/fslog.c - files not in CS9
net/sunrpc/auth_gss/gss_krb5_wrap.c - We already have
7f675ca7757b ("SUNRPC: Improve Kerberos confounder generation")
so code to change is gone.
drivers/gpu/drm/i915/i915_gem_gtt.c
drivers/gpu/drm/i915/selftests/i915_selftest.c
drivers/gpu/drm/tests/drm_buddy_test.c
drivers/gpu/drm/tests/drm_mm_test.c
change added under
4cb818386e ("Merge DRM changes from upstream v6.0.8..v6.1")
JIRA: https://issues.redhat.com/browse/RHEL-1848
commit a251c17aa558d8e3128a528af5cf8b9d7caae4fd
Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date: Wed Oct 5 17:43:22 2022 +0200
treewide: use get_random_u32() when possible
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: net-next.git
commit df3bf90fef281c630ef06a3d03efb9fe56c8a0fb
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Sat Oct 14 08:34:52 2023 +0200
net: openvswitch: Use struct_size()
Use struct_size() instead of hand writing it.
This is less verbose and more robust.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/e5122b4ff878cbf3ed72653a395ad5c4da04dc1e.1697264974.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
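As a rough illustration of what "hand writing it" versus struct_size() means, here is a userspace sketch. It is not the kernel macro: demo_struct_size(), struct demo_table and demo_table_alloc() are hypothetical names invented for this note. The only behaviour it tries to mirror is computing "header plus n trailing elements" and saturating at SIZE_MAX on overflow, instead of silently wrapping as an open-coded sizeof(*t) + n * sizeof(elem) could.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure ending in a flexible array member. */
struct demo_table {
	unsigned int n_entries;
	uint64_t entries[];
};

/*
 * Simplified stand-in for struct_size(): header size plus n trailing
 * elements, saturating at SIZE_MAX when the arithmetic would overflow.
 */
static size_t demo_struct_size(size_t header, size_t elem, size_t n)
{
	if (elem != 0 && n > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * n;
}

/* Allocate a table with room for n entries; NULL on overflow or OOM. */
static struct demo_table *demo_table_alloc(size_t n)
{
	size_t bytes = demo_struct_size(sizeof(struct demo_table),
					sizeof(uint64_t), n);
	struct demo_table *t;

	if (bytes == SIZE_MAX)
		return NULL;

	t = calloc(1, bytes);
	if (t)
		t->n_entries = (unsigned int)n;
	return t;
}
```

A caller would use demo_table_alloc(4) and later free() the result. An absurd element count yields NULL rather than an undersized buffer.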
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: net-next.git
commit 06bc3668cc2a6db2831b9086f0e3c6ebda599dba
Author: Ilya Maximets <i.maximets@ovn.org>
Date: Thu Sep 21 21:42:35 2023 +0200
openvswitch: reduce stack usage in do_execute_actions
do_execute_actions() function can be called recursively multiple
times while executing actions that require pipeline forking or
recirculations. It may also be re-entered multiple times if the packet
leaves openvswitch module and re-enters it through a different port.
Currently, there is a 256-byte array allocated on stack in this
function that is supposed to hold NSH header. Compilers tend to
pre-allocate that space right at the beginning of the function:
a88: 48 81 ec b0 01 00 00 sub $0x1b0,%rsp
NSH is not a very common protocol, but the space is allocated on every
recursive call or re-entry multiplying the wasted stack space.
Move the stack allocation to push_nsh() function that is only used
if NSH actions are actually present. push_nsh() is also a simple
function without a possibility for re-entry, so the stack is returned
right away.
With this change the preallocated space is reduced by 256 B per call:
b18: 48 81 ec b0 00 00 00 sub $0xb0,%rsp
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
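The pattern described above can be sketched in plain C. This is an illustration only, not the openvswitch code: demo_push_nsh(), demo_execute() and demo_stack_saved() are hypothetical names, and the 256-byte size and the two `sub` immediates are taken from the commit message. The point is that the large buffer lives in a leaf helper, so the recursive caller's frame no longer carries it.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_NSH_HDR_MAX_LEN 256 /* buffer size quoted in the commit message */

/* Leaf helper: the large buffer occupies stack only while this runs. */
static size_t demo_push_nsh(const unsigned char *hdr, size_t len)
{
	unsigned char buffer[DEMO_NSH_HDR_MAX_LEN];

	if (len > sizeof(buffer))
		return 0;
	memcpy(buffer, hdr, len);
	/* A real implementation would now prepend the header to the packet. */
	return len;
}

/* Recursive caller: no large array in its own frame. */
static size_t demo_execute(unsigned int depth, const unsigned char *hdr,
			   size_t len)
{
	if (depth > 0)
		return demo_execute(depth - 1, hdr, len);
	return demo_push_nsh(hdr, len);
}

/* Stack bytes saved over `depth` nested calls: 0x1b0 - 0xb0 per frame. */
static size_t demo_stack_saved(size_t depth)
{
	return depth * (0x1b0 - 0xb0);
}
```

With the quoted frame sizes, each level of recursion or re-entry saves 0x100 (256) bytes, so four nested calls save 1 KiB.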
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: linux.git
commit a552bfa16bab4ce901ee721346a28c4e483f4066
Author: Jakub Kicinski <kuba@kernel.org>
Date: Mon Aug 14 13:38:40 2023 -0700
net: openvswitch: reject negative ifindex
Recent changes in net-next (commit 759ab1edb56c ("net: store netdevs
in an xarray")) refactored the handling of pre-assigned ifindexes
and let syzbot surface a latent problem in ovs. ovs does not validate
ifindex, making it possible to create netdev ports with negative
ifindex values. It's easy to repro with YNL:
$ ./cli.py --spec netlink/specs/ovs_datapath.yaml \
--do new \
--json '{"upcall-pid": 1, "name":"my-dp"}'
$ ./cli.py --spec netlink/specs/ovs_vport.yaml \
--do new \
--json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}'
$ ip link show
-65536: some-port0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 7a:48:21:ad:0b:fb brd ff:ff:ff:ff:ff:ff
...
Validate the inputs. Now the second command correctly returns:
$ ./cli.py --spec netlink/specs/ovs_vport.yaml \
--do new \
--json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}'
lib.ynl.NlError: Netlink error: Numerical result out of range
nl_len = 108 (92) nl_flags = 0x300 nl_type = 2
error: -34 extack: {'msg': 'integer out of range', 'unknown': [[type:4 len:36] b'\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x00\x03\x00\xff\xff\xff\x7f\x00\x00\x00\x00\x08\x00\x01\x00\x08\x00\x00\x00'], 'bad-attr': '.ifindex'}
Accept 0 since it used to be silently ignored.
Fixes: 54c4ef34c4b6 ("openvswitch: allow specifying ifindex of new interfaces")
Reported-by: syzbot+7456b5dcf65111553320@syzkaller.appspotmail.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/20230814203840.2908710-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
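To see why the ifindex value 4294901760 from the reproducer shows up as -65536 in `ip link show`, the following userspace sketch reinterprets the 32-bit payload as a signed integer and applies the range described above (0 accepted, negatives rejected). The helper names are hypothetical and this is not the kernel's validation code, which is done in netlink policy. It only mirrors the arithmetic and the ERANGE result seen in the YNL output.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the raw 32-bit attribute payload as a signed ifindex. */
static int32_t demo_ifindex_as_signed(uint32_t raw)
{
	int32_t v;

	memcpy(&v, &raw, sizeof(v)); /* avoids implementation-defined casts */
	return v;
}

/*
 * Accept 0..INT32_MAX, reject anything that is negative once interpreted
 * as signed. Returns 0 on success or -ERANGE, matching the error code
 * shown in the commit message.
 */
static int demo_check_ifindex(uint32_t raw)
{
	if (demo_ifindex_as_signed(raw) < 0)
		return -ERANGE;
	return 0;
}
```

Here 4294901760 is 0xffff0000, which reads as -65536 when treated as a signed 32-bit value, so demo_check_ifindex() rejects it while still accepting 0.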
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: linux.git
commit b50a8b0d57ab1ef11492171e98a030f48682eac3
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Sat May 6 18:04:16 2023 +0200
net: openvswitch: Use struct_size()
Use struct_size() instead of hand writing it.
This is less verbose and more informative.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/e7746fbbd62371d286081d5266e88bbe8d3fe9f0.1683388991.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-1773
commit 8c8b733208058702da451b7d60a12c0ff90b6879
Author: Xin Long <lucien.xin@gmail.com>
Date: Sun Jul 16 17:09:19 2023 -0400
openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack
Not setting IPS_CONFIRMED in tmpl allows the exp to stay in the hashtable
on lookup, which lets us simplify the exp processing code in openvswitch
conntrack considerably.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-12679
commit d457a0e329b0bfd3a1450e0b1a18cd2b47a25a08
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Jun 8 19:17:37 2023 +0000
net: move gso declarations and functions to their own files
Move declarations into include/net/gso.h and code into net/core/gso.c
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230608191738.3947077-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git
commit 43d95b30cf5793cdd3c7b1c1cd5fead9b469bd60
Author: Adrian Moreno <amorenoz@redhat.com>
Date: Fri Aug 11 16:12:52 2023 +0200
net: openvswitch: add misc error drop reasons
Use drop reasons from include/net/dropreason-core.h when a reasonable
candidate exists.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git
commit f329d1bc1a4580e0f8a402b14a6fd024ec8e5c7b
Author: Adrian Moreno <amorenoz@redhat.com>
Date: Fri Aug 11 16:12:51 2023 +0200
net: openvswitch: add meter drop reason
Using an independent drop reason makes it easy to distinguish between
QoS-triggered and flow-triggered drops.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git
commit e7bc7db9ba463e763ac6113279cade19da9cb939
Author: Eric Garver <eric@garver.life>
Date: Fri Aug 11 16:12:50 2023 +0200
net: openvswitch: add explicit drop action
From: Eric Garver <eric@garver.life>
This adds an explicit drop action. This is used by OVS to drop packets
for which it cannot determine what to do. An explicit action in the
kernel allows passing the reason _why_ the packet is being dropped or
zero to indicate no particular error happened (i.e: OVS intentionally
dropped the packet).
Since the error codes coming from userspace mean nothing for the kernel,
we squash all of them into only two drop reasons:
- OVS_DROP_EXPLICIT_WITH_ERROR to indicate a non-zero value was passed
- OVS_DROP_EXPLICIT to indicate a zero value was passed (no error)
e.g. trace all OVS dropped skbs
# perf trace -e skb:kfree_skb --filter="reason >= 0x30000"
[..]
106.023 ping/2465 skb:kfree_skb(skbaddr: 0xffffa0e8765f2000, \
location:0xffffffffc0d9b462, protocol: 2048, reason: 196611)
reason: 196611 --> 0x30003 (OVS_DROP_EXPLICIT)
Also, this patch allows ovs-dpctl.py to add explicit drop actions as:
"drop" -> implicit empty-action drop
"drop(0)" -> explicit non-error action drop
"drop(42)" -> explicit error action drop
Signed-off-by: Eric Garver <eric@garver.life>
Co-developed-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
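The reason value in the trace above can be decoded by hand. The sketch below assumes the layout implied by the `reason >= 0x30000` filter, namely a subsystem identifier in the upper 16 bits and a per-subsystem code in the lower 16 bits. All names are hypothetical and the enum values are placeholders rather than the kernel's definitions. It also shows the two-way squash of userspace error codes described in the message.

```c
#include <stdint.h>

#define DEMO_SUBSYS_SHIFT 16
#define DEMO_SUBSYS_MASK  0xffff0000u

/* Upper 16 bits: which subsystem owns the drop reason. */
static uint32_t demo_reason_subsys(uint32_t reason)
{
	return (reason & DEMO_SUBSYS_MASK) >> DEMO_SUBSYS_SHIFT;
}

/* Lower 16 bits: the reason code within that subsystem. */
static uint32_t demo_reason_code(uint32_t reason)
{
	return reason & ~DEMO_SUBSYS_MASK;
}

/* Placeholder values; only the zero/non-zero distinction matters here. */
enum demo_explicit_drop {
	DEMO_DROP_EXPLICIT,
	DEMO_DROP_EXPLICIT_WITH_ERROR,
};

/* Any non-zero userspace error collapses into the "with error" reason. */
static enum demo_explicit_drop demo_squash_error(uint32_t user_error)
{
	return user_error ? DEMO_DROP_EXPLICIT_WITH_ERROR : DEMO_DROP_EXPLICIT;
}
```

For the traced value, 196611 is 0x30003, which splits into subsystem 3 and code 3. In the ovs-dpctl.py syntax, "drop(0)" would map to the no-error reason and "drop(42)" to the with-error one.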
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git
commit ec7bfb5e5a054f1178e8bdbf4f145fdafa5bf804
Author: Adrian Moreno <amorenoz@redhat.com>
Date: Fri Aug 11 16:12:49 2023 +0200
net: openvswitch: add action error drop reason
Add a drop reason for packets that are dropped because an action
returns a non-zero error code.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git
commit 9d802da40b7c820deb9c60fc394457ea565cafc8
Author: Adrian Moreno <amorenoz@redhat.com>
Date: Fri Aug 11 16:12:48 2023 +0200
Create a new drop reason subsystem for openvswitch and add the first
drop reason to represent last-action drops.
Last-action drops happen when a flow has an empty action list or there
is no action that consumes the packet (output, userspace, recirc, etc).
It is the most common way in which OVS drops packets.
Implementation-wise, most of these skb-consuming actions already call
"consume_skb" internally and return directly from within the
do_execute_actions() loop so with minimal changes we can assume that
any skb that exits the loop normally is a packet drop.
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2662
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2203263
Upstream Status: Backport net-next.git commit de9df6c6b27e
Conflicts: none
Backport of upstream commit:
commit de9df6c6b27e22d7bdd20107947ef3a20e687de5
Author: Eelco Chaudron <echaudro@redhat.com>
Date: Tue Jun 6 13:56:35 2023 +0200
net: openvswitch: fix upcall counter access before allocation
Currently, the per cpu upcall counters are allocated after the vport is
created and inserted into the system. This could lead to the datapath
accessing the counters before they are allocated resulting in a kernel
Oops.
Here is an example:
PID: 59693 TASK: ffff0005f4f51500 CPU: 0 COMMAND: "ovs-vswitchd"
...
PID: 58682 TASK: ffff0005b2f0bf00 CPU: 0 COMMAND: "kworker/0:3"
We moved the per cpu upcall counter allocation to the existing vport
alloc and free functions to solve this.
Fixes: 95637d91fefd ("net: openvswitch: release vport resources on failure")
Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall packets")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Paolo Abeni <pabeni@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2188082
Upstream Status: commit e069ba07e6c7
commit e069ba07e6c7af69e119316bc87ff44869095f49
Author: Aaron Conole <aconole@redhat.com>
Date: Fri Jun 9 09:59:55 2023 -0400
net: openvswitch: add support for l4 symmetric hashing
Since its introduction, the ovs module execute_hash action allowed
hash algorithms other than the skb->l4_hash to be used. However,
additional hash algorithms were not implemented. This means flows
requiring different hash distributions weren't able to use the
kernel datapath.
Now, introduce support for symmetric hashing algorithm as an
alternative hash supported by the ovs module using the flow
dissector.
Output of flow using l4_sym hash:
recirc_id(0),in_port(3),eth(),eth_type(0x0800),
ipv4(dst=64.0.0.0/192.0.0.0,proto=6,frag=no), packets:30473425,
bytes:45902883702, used:0.000s, flags:SP.,
actions:hash(sym_l4(0)),recirc(0xd)
Some performance testing with no GRO/GSO, two veths, single flow:
hash(l4(0)): 4.35 GBits/s
hash(l4_sym(0)): 4.24 GBits/s
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
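For readers unfamiliar with the term, a symmetric L4 hash returns the same value for both directions of a connection. The sketch below shows one common way to get that property: order the two endpoints canonically before mixing them. It is not the flow dissector's algorithm. The structure, the mixing function and its constant are all made up for illustration.

```c
#include <stdint.h>

/* Hypothetical IPv4 flow key. */
struct demo_flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

/* Toy mixing step; any reasonable integer hash would do here. */
static uint32_t demo_mix(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 0x9e3779b1u;
	h ^= h >> 15;
	return h;
}

/*
 * Symmetric hash: sort the (addr, port) endpoints so that A->B and B->A
 * feed identical input to the mixer.
 */
static uint32_t demo_sym_l4_hash(const struct demo_flow *f, uint32_t basis)
{
	uint32_t lo_addr = f->saddr, hi_addr = f->daddr;
	uint16_t lo_port = f->sport, hi_port = f->dport;
	uint32_t h = basis;

	if (lo_addr > hi_addr || (lo_addr == hi_addr && lo_port > hi_port)) {
		lo_addr = f->daddr;
		hi_addr = f->saddr;
		lo_port = f->dport;
		hi_port = f->sport;
	}

	h = demo_mix(h, lo_addr);
	h = demo_mix(h, hi_addr);
	h = demo_mix(h, ((uint32_t)lo_port << 16) | hi_port);
	h = demo_mix(h, f->proto);
	return h;
}
```

Hashing a flow and its reverse (addresses and ports swapped) with the same basis gives the same result, so both directions of a connection land in the same bucket.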
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2615
This is a backport of ALSA changes up to 6.4-rc6 kernel for RHEL 9.3.
Bugzilla: https://bugzilla.redhat.com/2179848
This upstream patchset updates the ALSA driver code:
- ALSA core
- ALSA HDA
- ALSA USB
- ALSA PCI
- ALSA SoC (mainly SOF including SoundWire drivers)
- Soundwire bus
- dt-bindings for qcom (Qualcomm) and fsl (Freescale) for automotive boards; NVidia seems to be handled in !2355
The other components are touched to get things in sync with the current upstream:
Some touched drivers are for hardware platforms which are not used in RHEL. Those upstream commits are merged to make future code syncs easier.
Kernel module renames:
- snd-soc-sst-broadwell -> snd-soc-bdw-rt286
- snd-soc-sst-haswell -> snd-soc-hsw-rt5640
Note: The Elf -> ELF patch touches many subsystems. I can remove it on demand.
ARK configuration changes: https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2500 and https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2520
Signed-off-by: Jaroslav Kysela <jkysela@redhat.com>
Approved-by: Adrien Thierry <athierry@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: Jocelyn Falempe <jfalempe@redhat.com>
Approved-by: Eric Chanudet <echanude@redhat.com>
Approved-by: Julia Denham <jdenham@redhat.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2203263
Upstream Status: Backport net-next.git commit de9df6c6b27e
Conflicts: none
Backport of upstream commit:
commit de9df6c6b27e22d7bdd20107947ef3a20e687de5
Author: Eelco Chaudron <echaudro@redhat.com>
Date: Tue Jun 6 13:56:35 2023 +0200
net: openvswitch: fix upcall counter access before allocation
Currently, the per cpu upcall counters are allocated after the vport is
created and inserted into the system. This could lead to the datapath
accessing the counters before they are allocated resulting in a kernel
Oops.
Here is an example:
PID: 59693 TASK: ffff0005f4f51500 CPU: 0 COMMAND: "ovs-vswitchd"
#0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4
#1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc
#2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60
#3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58
#4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388
#5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c
#6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68
#7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch]
...
PID: 58682 TASK: ffff0005b2f0bf00 CPU: 0 COMMAND: "kworker/0:3"
#0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758
#1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994
#2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8
#3 [ffff80000a5d3120] die at ffffb70f0628234c
#4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8
#5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4
#6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4
#7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710
#8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74
#9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac
#10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24
#11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc
#12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch]
#13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch]
#14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch]
#15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch]
#16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch]
#17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at ffffb70f06079f90
We moved the per cpu upcall counter allocation to the existing vport
alloc and free functions to solve this.
Fixes: 95637d91fefd ("net: openvswitch: release vport resources on failure")
Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall packets")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2193170
Conflicts:
* net/netfilter/ipvs/ip_vs_ctl.c
- the change was already applied by RHEL commit 914c1e31d9 ("ipvs:
use u64_stats_t for the per-cpu counters")
* net/core/devlink.c
- hunk was applied in different file (net/devlink/leftover.c)
commit d120d1a63b2c484d6175873d8ee736a633f74b70
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Wed Oct 26 15:22:15 2022 +0200
net: Remove the obsolte u64_stats_fetch_*_irq() users (net).
Now that the 32bit UP oddity is gone and 32bit always uses a sequence
count, there is no need for the fetch_irq() variants anymore.
Convert to the regular interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>