Commit Graph

942 Commits

Author SHA1 Message Date
Rado Vrbovsky 65ee7b65eb Merge: net: visibility patches for 9.6
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5833

JIRA: https://issues.redhat.com/browse/RHEL-68063

Signed-off-by: Antoine Tenart <atenart@redhat.com>

Approved-by: Guillaume Nault <gnault@redhat.com>
Approved-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2025-01-06 08:26:06 +00:00
Petr Oros ab39cead6a genetlink: remove linux/genetlink.h
JIRA: https://issues.redhat.com/browse/RHEL-57756

Upstream commit(s):
commit cd7209628cdb2a7edd7656c126d2455e7102e949
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Mar 29 10:57:10 2024 -0700

    genetlink: remove linux/genetlink.h

    genetlink.h is a shell of what used to be a combined uAPI
    and kernel header over a decade ago. It has fewer than
    10 lines of code. Merge it into net/genetlink.h.
    In some ways it'd be better to keep the combined header
    under linux/ but it would make looking through git history
    harder.

    Acked-by: Sven Eckelmann <sven@narfation.org>
    Link: https://lore.kernel.org/r/20240329175710.291749-4-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Petr Oros <poros@redhat.com>
2024-12-10 10:37:53 +01:00
Petr Oros 8f9545d559 net: openvswitch: remove unnecessary linux/genetlink.h include
JIRA: https://issues.redhat.com/browse/RHEL-57756

Upstream commit(s):
commit f97c9b533a1dc60a77ff329e0117acc5ae17def5
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Mar 29 10:57:09 2024 -0700

    net: openvswitch: remove unnecessary linux/genetlink.h include

    The only legit reason I could think of for net/genetlink.h
    and linux/genetlink.h to be separate would be if one was
    included by other headers and we wanted to keep it lightweight.
    That is not the case, net/openvswitch/meter.h includes
    linux/genetlink.h but for no apparent reason (for struct genl_family
    perhaps? it's not necessary, types of externs do not need
    to be known).
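
The remark about externs can be illustrated with a small stand-alone sketch. It is not taken from the patch: `struct family`, `meter_family`, and both helper functions are hypothetical stand-ins, and the only point shown is that C accepts an `extern` declaration of an incomplete struct type as long as no member is accessed.

```c
#include <assert.h>

/* Forward declaration only: the layout of the struct is unknown here. */
struct family;				/* hypothetical stand-in for struct genl_family */
extern struct family meter_family;	/* legal: extern object of an incomplete type */

/* Code that only passes the address around needs no definition. */
static const struct family *get_family(void)
{
	return &meter_family;
}

/* The complete type is required only where members are accessed. */
struct family {
	const char *name;
	int version;
};

struct family meter_family = { "meter", 1 };

static int family_version(const struct family *f)
{
	assert(f);
	return f->version;
}
```

In a real header the first two declarations would be all that is needed, which is why the extra include could be dropped.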

    Link: https://lore.kernel.org/r/20240329175710.291749-3-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Petr Oros <poros@redhat.com>
2024-12-10 10:37:53 +01:00
Ivan Vecera ad35cce099 tc: adjust network header after 2nd vlan push
JIRA: https://issues.redhat.com/browse/RHEL-57768

commit 938863727076f684abb39d1d0f9dce1924e9028e
Author: Boris Sukholitko <boris.sukholitko@broadcom.com>
Date:   Thu Aug 22 13:35:08 2024 +0300

    tc: adjust network header after 2nd vlan push

    <tldr>
    skb network header of the single-tagged vlan packet continues to point to
    the vlan payload (e.g. IP) after the second vlan tag is pushed by tc
    act_vlan. This causes a problem at the dissector, which expects the
    double-tagged packet network header to point to the inner vlan.

    The fix is to adjust network header in tcf_act_vlan.c but requires
    refactoring of skb_vlan_push function.
    </tldr>

    Consider the following shell script snippet configuring TC rules on the
    veth interface:

    ip link add veth0 type veth peer veth1
    ip link set veth0 up
    ip link set veth1 up

    tc qdisc add dev veth0 clsact

    tc filter add dev veth0 ingress pref 10 chain 0 flower \
            num_of_vlans 2 cvlan_ethtype 0x800 action goto chain 5
    tc filter add dev veth0 ingress pref 20 chain 0 flower \
            num_of_vlans 1 action vlan push id 100 \
            protocol 0x8100 action goto chain 5
    tc filter add dev veth0 ingress pref 30 chain 5 flower \
            num_of_vlans 2 cvlan_ethtype 0x800 action simple sdata "success"

    Sending double-tagged vlan packet with the IP payload inside:

    cat <<ENDS | text2pcap - - | tcpreplay -i veth1 -
    0000  00 00 00 00 00 11 00 00 00 00 00 22 81 00 00 64   ..........."...d
    0010  81 00 00 14 08 00 45 04 00 26 04 d2 00 00 7f 11   ......E..&......
    0020  18 ef 0a 00 00 01 14 00 00 02 00 00 00 00 00 12   ................
    0030  e1 c7 00 00 00 00 00 00 00 00 00 00               ............
    ENDS

    will match rule 10, goto rule 30 in chain 5 and correctly emit "success" to
    the dmesg.

    OTOH, sending single-tagged vlan packet:

    cat <<ENDS | text2pcap - - | tcpreplay -i veth1 -
    0000  00 00 00 00 00 11 00 00 00 00 00 22 81 00 00 14   ..........."....
    0010  08 00 45 04 00 2a 04 d2 00 00 7f 11 18 eb 0a 00   ..E..*..........
    0020  00 01 14 00 00 02 00 00 00 00 00 16 e1 bf 00 00   ................
    0030  00 00 00 00 00 00 00 00 00 00 00 00               ............
    ENDS

    will match rule 20, will push the second vlan tag but will *not* match
    rule 30. IOW, the match at rule 30 fails if the second vlan was freshly
    pushed by the kernel.

    Let's look at __skb_flow_dissect working on the double-tagged vlan packet.
    Here is the relevant code from around net/core/flow_dissector.c:1277
    copy-pasted here for convenience:

            if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX &&
                skb && skb_vlan_tag_present(skb)) {
                    proto = skb->protocol;
            } else {
                    vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
                                                data, hlen, &_vlan);
                    if (!vlan) {
                            fdret = FLOW_DISSECT_RET_OUT_BAD;
                            break;
                    }

                    proto = vlan->h_vlan_encapsulated_proto;
                    nhoff += sizeof(*vlan);
            }

    The "else" clause above gets the protocol of the encapsulated packet from
    the skb data at the network header location. printk debugging has shown
    that in the good double-tagged packet case proto is htons(0x800), i.e.
    htons(ETH_P_IP), as expected. However, in the single-tagged packet case
    proto is garbage, leading to the failure to match tc filter 30.

    proto is being set from the skb header pointed by nhoff parameter which is
    defined at the beginning of __skb_flow_dissect
    (net/core/flow_dissector.c:1055 in the current version):

                    nhoff = skb_network_offset(skb);

    Therefore the culprit seems to be that the skb network offset is different
    between double-tagged packet received from the interface and single-tagged
    packet having its vlan tag pushed by TC.

    Let's look at the interesting points of the lifetime of the single/double
    tagged packets as they traverse our packet flow.

    Both of them will start at __netif_receive_skb_core where the first vlan
    tag will be stripped:

            if (eth_type_vlan(skb->protocol)) {
                    skb = skb_vlan_untag(skb);
                    if (unlikely(!skb))
                            goto out;
            }

    At this stage, in the double-tagged case skb->data points to the second
    vlan tag, while in the single-tagged case skb->data points to the network
    (e.g. IP) header.

    Looking at TC vlan push action (net/sched/act_vlan.c) we have the following
    code at tcf_vlan_act (interesting points are in square brackets):

            if (skb_at_tc_ingress(skb))
    [1]             skb_push_rcsum(skb, skb->mac_len);

            ....

            case TCA_VLAN_ACT_PUSH:
                    err = skb_vlan_push(skb, p->tcfv_push_proto, p->tcfv_push_vid |
                                        (p->tcfv_push_prio << VLAN_PRIO_SHIFT),
                                        0);
                    if (err)
                            goto drop;
                    break;

            ....

    out:
            if (skb_at_tc_ingress(skb))
    [3]             skb_pull_rcsum(skb, skb->mac_len);

    And skb_vlan_push (net/core/skbuff.c:6204) function does:

                    err = __vlan_insert_tag(skb, skb->vlan_proto,
                                            skb_vlan_tag_get(skb));
                    if (err)
                            return err;

                    skb->protocol = skb->vlan_proto;
    [2]             skb->mac_len += VLAN_HLEN;

    in the case of pushing the second tag. Let's look at what happens with
    skb->data of the single-tagged packet at each of the above points:

    1. As a result of the skb_push_rcsum, skb->data is moved back to the start
       of the packet.

    2. First VLAN tag is moved from the skb into packet buffer, skb->mac_len is
       incremented, skb->data still points to the start of the packet.

    3. As a result of the skb_pull_rcsum, skb->data is moved forward by the
       modified skb->mac_len, thus pointing to the network header again.

    Then __skb_flow_dissect will get confused by having double-tagged vlan
    packet with the skb->data at the network header.

    The solution for the bug is to preserve "skb->data at second vlan header"
    semantics in the skb_vlan_push function. We do this by manipulating
    skb->network_header rather than skb->mac_len. skb_vlan_push callers are
    updated to do skb_reset_mac_len.
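
The three numbered steps reduce to simple offset arithmetic, sketched here as a toy model. `struct toy_skb` and `data_after_push()` are hypothetical names, not kernel code; only `ETH_HLEN` (14) and `VLAN_HLEN` (4) match the real constants. The model assumes that, with the fix, `mac_len` is no longer incremented before the final pull.

```c
#include <assert.h>

#define ETH_HLEN  14	/* Ethernet header length */
#define VLAN_HLEN 4	/* one 802.1Q tag */

/* Toy model of the two fields involved; not the real struct sk_buff. */
struct toy_skb {
	int data;	/* offset of skb->data from the start of the frame */
	int mac_len;	/* length that TC ingress pushes and pulls */
};

/* Returns the final data offset after a second tag is pushed at TC ingress. */
static int data_after_push(int bump_mac_len)
{
	/* After the first tag was stripped on receive, data sits at L3. */
	struct toy_skb skb = { .data = ETH_HLEN, .mac_len = ETH_HLEN };

	skb.data -= skb.mac_len;		/* [1] skb_push_rcsum */
	assert(skb.data == 0);			/* back at the start of the frame */
	if (bump_mac_len)
		skb.mac_len += VLAN_HLEN;	/* [2] behaviour before the fix */
	skb.data += skb.mac_len;		/* [3] skb_pull_rcsum */
	return skb.data;
}
```

In this model `data_after_push(1)` lands on offset 18, the IP header of the now double-tagged frame, while `data_after_push(0)` lands on offset 14, the inner vlan tag that the dissector expects.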

    Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-11-22 11:07:15 +01:00
Antoine Tenart dca204658f net: ovs: fix ovs_drop_reasons error
JIRA: https://issues.redhat.com/browse/RHEL-68063
Upstream Status: linux.git

commit 57fb67783c4011581882f32e656d738da1f82042
Author: Menglong Dong <menglong8.dong@gmail.com>
Date:   Wed Aug 21 20:32:52 2024 +0800

    net: ovs: fix ovs_drop_reasons error

    There is something wrong with ovs_drop_reasons. ovs_drop_reasons[0] is
    "OVS_DROP_LAST_ACTION", but OVS_DROP_LAST_ACTION == __OVS_DROP_REASON + 1,
    which means that ovs_drop_reasons[1] should be "OVS_DROP_LAST_ACTION".

    And as Adrian tested, without the patch, adding flow to drop packets
    results in:

    drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
    origin: software
    input port ifindex: 8
    timestamp: Tue Aug 20 10:19:17 2024 859853461 nsec
    protocol: 0x800
    length: 98
    original length: 98
    drop reason: OVS_DROP_ACTION_ERROR

    With the patch, the same results in:

    drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
    origin: software
    input port ifindex: 8
    timestamp: Tue Aug 20 10:16:13 2024 475856608 nsec
    protocol: 0x800
    length: 98
    original length: 98
    drop reason: OVS_DROP_LAST_ACTION

    Fix this by initializing ovs_drop_reasons with index.
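
The off-by-one can be reproduced in a few lines of stand-alone C. This is a sketch under assumptions: `SUBSYS_BASE` and the `TOY_*` names are hypothetical stand-ins for `__OVS_DROP_REASON` and the OVS drop reasons, and the base value is arbitrary.

```c
#include <assert.h>
#include <string.h>

#define SUBSYS_BASE 0x30000	/* hypothetical stand-in for __OVS_DROP_REASON */

enum toy_drop_reason {
	TOY_DROP_LAST_ACTION = SUBSYS_BASE + 1,
	TOY_DROP_ACTION_ERROR,
};

/* Buggy: positional initializers start at index 0, the enum at base + 1. */
static const char *const buggy_names[] = {
	"TOY_DROP_LAST_ACTION",
	"TOY_DROP_ACTION_ERROR",
};

/* Fixed: each name is placed at the index derived from its enum value. */
#define IDX(r) ((r) - SUBSYS_BASE)
static const char *const fixed_names[] = {
	[IDX(TOY_DROP_LAST_ACTION)]  = "TOY_DROP_LAST_ACTION",
	[IDX(TOY_DROP_ACTION_ERROR)] = "TOY_DROP_ACTION_ERROR",
};

static const char *lookup(const char *const *names, enum toy_drop_reason r)
{
	assert(r > SUBSYS_BASE);
	return names[r - SUBSYS_BASE];
}

static int name_matches(const char *const *names, enum toy_drop_reason r,
			const char *want)
{
	return strcmp(lookup(names, r), want) == 0;
}
```

With the positional table, looking up the last-action reason yields the action-error string, which mirrors the misreported drop reason in the first trace.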

    Fixes: 9d802da40b7c ("net: openvswitch: add last-action drop reason")
    Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
    Tested-by: Adrian Moreno <amorenoz@redhat.com>
    Reviewed-by: Adrian Moreno <amorenoz@redhat.com>
    Link: https://patch.msgid.link/20240821123252.186305-1-dongml2@chinatelecom.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2024-11-19 14:30:01 +01:00
Michal Schmidt a672f9b810 netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local
JIRA: https://issues.redhat.com/browse/RHEL-59091

commit 05c1280a2bcfca187fe7fa90bb240602cf54af0a
Author: Alexander Lobakin <aleksander.lobakin@intel.com>
Date:   Thu Aug 29 14:33:38 2024 +0200

    netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local

    "Interface can't change network namespaces" is rather an attribute,
    not a feature, and it can't be changed via Ethtool.
    Make it a "cold" private flag instead of a netdev_feature and free
    one more bit.

    Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Conflicts:
	drivers/net/amt.c
	drivers/net/ethernet/adi/adin1110.c

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
2024-10-03 17:59:51 +02:00
Michal Schmidt 555cb3d84d netdev_features: convert NETIF_F_LLTX to dev->lltx
JIRA: https://issues.redhat.com/browse/RHEL-59091

commit 00d066a4d4edbe559ba6c35153da71d4b2b8a383
Author: Alexander Lobakin <aleksander.lobakin@intel.com>
Date:   Thu Aug 29 14:33:37 2024 +0200

    netdev_features: convert NETIF_F_LLTX to dev->lltx

    NETIF_F_LLTX can't be changed via Ethtool and is not a feature,
    rather an attribute, very similar to IFF_NO_QUEUE (and hot).
    Free one netdev_features_t bit and make it a "hot" private flag.

    Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Conflicts:
	drivers/net/macsec.c
	drivers/net/veth.c
	net/ipv6/ip6_tunnel.c
	- Context.

	drivers/net/amt.c
	drivers/net/netkit.c
	- Non-existent in RHEL 9.

	drivers/net/ethernet/chelsio/cxgb/cxgb2.c
	drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
	- Drivers disabled in RHEL 9. Skipped.

	net/dsa/user.c
	- This is slave.c in RHEL 9, but CONFIG_NET_DSA is disabled,
	  so skipped the hunk.

	net/core/net-sysfs.c
	- Code not present because of missing commit 74293ea1c4db
	  ("net: sysfs: Do not create sysfs for non BQL device")

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
2024-10-03 17:59:44 +02:00
cki-backport-bot e6673b02e6 net: openvswitch: fix overwriting ct original tuple for ICMPv6
JIRA: https://issues.redhat.com/browse/RHEL-44213
CVE: CVE-2024-38558

commit 7c988176b6c16c516474f6fceebe0f055af5eb56
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Thu May 9 11:38:05 2024 +0200

    net: openvswitch: fix overwriting ct original tuple for ICMPv6

    OVS_PACKET_CMD_EXECUTE has 3 main attributes:
     - OVS_PACKET_ATTR_KEY - Packet metadata in a netlink format.
     - OVS_PACKET_ATTR_PACKET - Binary packet content.
     - OVS_PACKET_ATTR_ACTIONS - Actions to execute on the packet.

    OVS_PACKET_ATTR_KEY is parsed first to populate sw_flow_key structure
    with the metadata like conntrack state, input port, recirculation id,
    etc.  Then the packet itself gets parsed to populate the rest of the
    keys from the packet headers.

    Whenever the packet parsing code starts parsing the ICMPv6 header, it
    first zeroes out fields in the key corresponding to Neighbor Discovery
    information even if it is not an ND packet.

    It is an 'ipv6.nd' field.  However, the 'ipv6' is a union that shares
    the space between 'nd' and 'ct_orig' that holds the original tuple
    conntrack metadata parsed from the OVS_PACKET_ATTR_KEY.

    ND packets should not normally have conntrack state, so it's fine to
    share the space, but normal ICMPv6 Echo packets or maybe other types of
    ICMPv6 can have the state attached and it should not be overwritten.

    The issue results in all but the last 4 bytes of the destination
    address being wiped from the original conntrack tuple leading to
    incorrect packet matching and potentially executing wrong actions
    in case this packet recirculates within the datapath or goes back
    to userspace.

    ND fields should not be accessed in non-ND packets, so not clearing
    them should be fine.  Executing memset() only for actual ND packets to
    avoid the issue.
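
A simplified model of the shared storage shows why exactly the last 4 bytes of the destination address survive. The struct layout is an assumption made for illustration (a 16-byte target plus two 6-byte link-layer addresses overlapping two 16-byte addresses), and all `toy_` and `TOY_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct toy_nd {
	uint8_t target[16];	/* ND target address */
	uint8_t sll[6];		/* source link-layer address */
	uint8_t tll[6];		/* target link-layer address */
};

struct toy_ct_orig {
	uint8_t src[16];	/* original tuple source address */
	uint8_t dst[16];	/* original tuple destination address */
};

/* Both views share the same bytes, as in the 'ipv6' union described above. */
struct toy_key {
	union {
		struct toy_nd nd;
		struct toy_ct_orig ct_orig;
	} ipv6;
};

static_assert(sizeof(struct toy_nd) == 28, "nd covers src plus 12 bytes of dst");

#define TOY_ECHO_REQUEST	128	/* ICMPv6 type numbers */
#define TOY_NEIGHBOR_SOLICIT	135
#define TOY_NEIGHBOR_ADVERT	136

static void parse_icmp6(struct toy_key *key, uint8_t type, int fixed)
{
	int is_nd = type == TOY_NEIGHBOR_SOLICIT || type == TOY_NEIGHBOR_ADVERT;

	/* Old behaviour: always clear nd. Fixed: clear it only for ND packets. */
	if (!fixed || is_nd)
		memset(&key->ipv6.nd, 0, sizeof(key->ipv6.nd));
}

/* Counts the bytes of ct_orig.dst that are still non-zero. */
static int dst_bytes_left(const struct toy_key *key)
{
	int n = 0;

	for (size_t i = 0; i < sizeof(key->ipv6.ct_orig.dst); i++)
		n += key->ipv6.ct_orig.dst[i] != 0;
	return n;
}
```

In this model an Echo Request parsed with the old behaviour keeps only 4 of the 16 destination bytes, while the fixed variant leaves the tuple untouched.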

    Initializing the whole thing before parsing is needed because ND packet
    may not contain all the options.

    The issue only affects the OVS_PACKET_CMD_EXECUTE path and doesn't
    affect packets entering OVS datapath from network interfaces, because
    in this case CT metadata is populated from skb after the packet is
    already parsed.

    Fixes: 9dd7f8907c ("openvswitch: Add original direction conntrack tuple to sw_flow_key.")
    Reported-by: Antonin Bas <antonin.bas@broadcom.com>
    Closes: https://github.com/openvswitch/ovs-issues/issues/327
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Eelco Chaudron <echaudro@redhat.com>
    Link: https://lore.kernel.org/r/20240509094228.1035477-1-i.maximets@ovn.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
2024-09-04 11:54:52 -03:00
Aaron Conole 680bab1536 openvswitch: Set the skbuff pkt_type for proper pmtud support.
JIRA: https://issues.redhat.com/browse/RHEL-37650
Upstream Status: commit 30a92c9e3d6b0

commit 30a92c9e3d6b073932762bef2ac66f4ee784c657
Author: Aaron Conole <aconole@redhat.com>
Date:   Thu May 16 16:09:41 2024 -0400

    openvswitch: Set the skbuff pkt_type for proper pmtud support.

    Open vSwitch is originally intended to switch at layer 2, only dealing with
    Ethernet frames.  With the introduction of l3 tunnels support, it crossed
    into the realm of needing to care a bit about some routing details when
    making forwarding decisions.  If an oversized packet would need to be
    fragmented during this forwarding decision, there is a chance for pmtu
    to get involved and generate a routing exception.  This is gated by the
    skbuff->pkt_type field.

    When a flow is already loaded into the openvswitch module this field is
    set up and transitioned properly as a packet moves from one port to
    another.  In the case that a packet execute is invoked after a flow is
    newly installed this field is not properly initialized.  This causes the
    pmtud mechanism to omit sending the required exception messages across
    the tunnel boundary and a second attempt needs to be made to make sure
    that the routing exception is properly setup.  To fix this, we set the
    outgoing packet's pkt_type to PACKET_OUTGOING, since it can only get
    to the openvswitch module via a port device or packet command.

    Even for bridge ports as users, the pkt_type needs to be reset when
    doing the transmit as the packet is truly outgoing and routing needs
    to get involved post packet transformations, in the case of
    VXLAN/GENEVE/udp-tunnel packets.  In general, the pkt_type on output
    gets ignored, since we go straight to the driver, but in the case of
    tunnel ports they go through IP routing layer.

    This issue is periodically encountered in complex setups, such as large
    openshift deployments, where multiple sets of tunnel traversal occur.
    A way to recreate this is with the ovn-heater project that can setup
    a networking environment which mimics such large deployments.  We need
    larger environments for this because we need to ensure that flow
    misses occur.  In these environments, without this patch, we can see:

      ./ovn_cluster.sh start
      podman exec ovn-chassis-1 ip r a 170.168.0.5/32 dev eth1 mtu 1200
      podman exec ovn-chassis-1 ip netns exec sw01p1 ip r flush cache
      podman exec ovn-chassis-1 ip netns exec sw01p1 \
             ping 21.0.0.3 -M do -s 1300 -c2
      PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
      From 21.0.0.3 icmp_seq=2 Frag needed and DF set (mtu = 1142)

      --- 21.0.0.3 ping statistics ---
      ...

    Using tcpdump, we can also see the expected ICMP FRAG_NEEDED message is not
    sent into the server.

    With this patch, setting the pkt_type, we see the following:

      podman exec ovn-chassis-1 ip netns exec sw01p1 \
             ping 21.0.0.3 -M do -s 1300 -c2
      PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
      From 21.0.0.3 icmp_seq=1 Frag needed and DF set (mtu = 1222)
      ping: local error: message too long, mtu=1222

      --- 21.0.0.3 ping statistics ---
      ...

    In this case, the first ping request receives the FRAG_NEEDED message and
    a local routing exception is created.

    Tested-by: Jaime Caamano <jcaamano@redhat.com>
    Reported-at: https://issues.redhat.com/browse/FDP-164
    Fixes: 58264848a5 ("openvswitch: Add vxlan tunneling support.")
    Signed-off-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Eelco Chaudron <echaudro@redhat.com>
    Link: https://lore.kernel.org/r/20240516200941.16152-1-aconole@redhat.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Aaron Conole <aconole@redhat.com>
2024-08-01 11:36:53 -04:00
Lucas Zampieri 03feb1b243 Merge: CVE-2024-27395: net: openvswitch: Fix Use-After-Free in ovs_ct_exit
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4578

JIRA: https://issues.redhat.com/browse/RHEL-36364  
CVE: CVE-2024-27395

```
net: openvswitch: Fix Use-After-Free in ovs_ct_exit

Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
of ovs_ct_limit_exit, is not part of the RCU read critical section, it
is possible that the RCU grace period will pass during the traversal and
the key will be freed.

To prevent this, it should be changed to hlist_for_each_entry_safe.

Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 5ea7b72d4fac2fdbc0425cd8f2ea33abe95235b2)
```

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>

Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-07-29 19:30:30 +00:00
Lucas Zampieri 7941f9b2da Merge: openvswitch: add psample action.
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4658

JIRA: https://issues.redhat.com/browse/RHEL-31876
Upstream-Status: net-next.git
Tested: manual testing + OVS testsuite including psample-specific tests
from [1] + upstream kernel selftests tests including psample-specific
tests.

OpenvSwitch currently supports a feature called "per-flow sampling" by
which a controller such as OVN can configure certain flows that make the
matched packet get "sampled". The sample is sent via IPFIX alongside
OVN-generated metadata. This is very useful to enhance visibility on the
datapath. E.g: it can be used to know what NetworkPolicy impacted a certain
packet (and the packet header contents).

However, a big limitation makes this solution non-production ready:
samples have to go through ovs-vswitchd via upcall (userspace action) sharing
both netlink socket buffer and ovs-vswitchd thread time with actual packet
processing.

This series adds support for a new action called "psample" that, when used by
OVS, allows samples to go directly to some external observer through the
psample netlink multicast group fixing the current limitation and enabling
observability solutions to be built on top of OVS/OVN.

[1]
https://patchwork.ozlabs.org/project/openvswitch/cover/20240707200905.2719071-1-amorenoz@redhat.com/

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-07-25 16:50:28 +00:00
Lucas Zampieri 980b2ba738 Merge: openvswitch: get related ct labels from its master if it is not confirmed
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4535

JIRA: https://issues.redhat.com/browse/RHEL-44560
Tested: compile only

Signed-off-by: Xin Long <lxin@redhat.com>

Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Aaron Conole <aconole@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-07-18 12:21:32 +00:00
Adrian Moreno 38ca257930 net: openvswitch: store sampling probability in cb.
JIRA: https://issues.redhat.com/browse/RHEL-31876
Upstream-Status: net-next.git

commit 71763d8a8203c28178d7be7f18af73d4dddb36ba
Author: Adrian Moreno <amorenoz@redhat.com>
Date:   Thu Jul 4 10:56:57 2024 +0200

    net: openvswitch: store sampling probability in cb.

    When a packet sample is observed, the sampling rate that was used is
    important to estimate the real frequency of such event.

    Store the probability of the parent sample action in the skb's cb area
    and use it in psample action to pass it down to psample module.
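
Why the rate matters can be shown with a short calculation. The sketch assumes the probability is encoded as a 32-bit fraction in which `UINT32_MAX` means that every packet is sampled; `estimate_total()` is a hypothetical helper for an observer, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale the number of observed samples back to an estimate of the real
 * event count. Assumption: probability is a u32 fraction where
 * UINT32_MAX stands for "sample every packet".
 */
static double estimate_total(uint64_t observed, uint32_t probability)
{
	assert(probability != 0);	/* no estimate is possible without a rate */
	return (double)observed * ((double)UINT32_MAX / (double)probability);
}
```

Under that assumption, ten samples taken at roughly half the maximum probability correspond to about twenty real events.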

Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-7-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2024-07-09 16:32:56 +02:00
Adrian Moreno 74c750d315 net: openvswitch: add psample action
JIRA: https://issues.redhat.com/browse/RHEL-31876
Upstream-status: net-next.git

commit aae0b82b46cb5004bdf82a000c004d69a0885c33
Author: Adrian Moreno <amorenoz@redhat.com>
Date:   Thu Jul 4 10:56:56 2024 +0200

    net: openvswitch: add psample action

    Add support for a new action: psample.

    This action accepts a u32 group id and a variable-length cookie and uses
    the psample multicast group to make the packet available for
    observability.

    The maximum length of the user-defined cookie is set to 16, same as
    tc_cookie, to discourage using cookies that will not be offloadable.

Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-6-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2024-07-09 16:31:46 +02:00
cki-backport-bot c29d123d5c net: openvswitch: Fix Use-After-Free in ovs_ct_exit
JIRA: https://issues.redhat.com/browse/RHEL-36364
CVE: CVE-2024-27395

commit 5ea7b72d4fac2fdbc0425cd8f2ea33abe95235b2
Author: Hyunwoo Kim <v4bel@theori.io>
Date:   Mon Apr 22 05:37:17 2024 -0400

    net: openvswitch: Fix Use-After-Free in ovs_ct_exit

    Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
    of ovs_ct_limit_exit, is not part of the RCU read critical section, it
    is possible that the RCU grace period will pass during the traversal and
    the key will be freed.

    To prevent this, it should be changed to hlist_for_each_entry_safe.
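
The idea behind a "safe" traversal can be sketched in userspace C. This is only an analogy: it uses a plain singly linked list and free() instead of the kernel's hlist and kfree_rcu(), and `struct node`, `push()`, and `free_all_safe()` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
	struct node *next;
	int key;
};

/* Prepends a new node and returns the new head. */
static struct node *push(struct node *head, int key)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->key = key;
	n->next = head;
	return n;
}

/*
 * Safe traversal: the successor is saved before the current node is
 * released, so the loop never reads memory that may already be gone.
 * A loop stepping with "n = n->next" after free(n) would do exactly that.
 */
static int free_all_safe(struct node *head)
{
	struct node *n, *tmp;
	int freed = 0;

	for (n = head; n; n = tmp) {
		tmp = n->next;
		free(n);
		freed++;
	}
	return freed;
}
```

The `tmp` pointer plays the role of the extra cursor argument that the `_safe` list iterators take.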

    Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
    Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-06-25 17:40:27 +00:00
Xin Long 08cd1c3932 openvswitch: get related ct labels from its master if it is not confirmed
JIRA: https://issues.redhat.com/browse/RHEL-44560
Tested: compile only

commit a23ac973f67f37e77b3c634e8b1ad5b0164fcc1f
Author: Xin Long <lucien.xin@gmail.com>
Date:   Wed Jun 19 18:08:56 2024 -0400

    openvswitch: get related ct labels from its master if it is not confirmed

    Ilya found a failure in running check-kernel tests with at_groups=144
    (144: conntrack - FTP SNAT orig tuple) in OVS repo. After his further
    investigation, the root cause is that the labels sent to userspace
    for related ct are incorrect.

    The labels for unconfirmed related ct should use its master's labels.
    However, the changes made in commit 8c8b73320805 ("openvswitch: set
    IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
    led to getting labels from this related ct.

    So fix it in ovs_ct_get_labels() by changing to copy labels from its
    master ct if it is an unconfirmed related ct. Note that there is no
    fix needed for ct->mark, as it was already copied from its master
    ct for related ct in init_conntrack().

    Fixes: 8c8b73320805 ("openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
    Reported-by: Ilya Maximets <i.maximets@ovn.org>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
    Tested-by: Ilya Maximets <i.maximets@ovn.org>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Xin Long <lxin@redhat.com>
2024-06-22 16:03:50 -04:00
Ivan Vecera a4a12f7632 ip_tunnel: convert __be16 tunnel flags to bitmaps
JIRA: https://issues.redhat.com/browse/RHEL-40130

Conflicts:
- hunk for non-existing net/ipv4/fou_bpf.c skipped
- conflict in ip_gre.c resolved in the same way as upstream merge
  commit cf1ca1f66d30 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") did
- simple context conflict ip_tunnel.c due to missing commit
  c4794d22251b9 ("ipv4: tunnels: use DEV_STATS_INC()")
- simple context conflict in ip6_gre.c and ip6_tunnel.c due to missing
  commit 2fad1ba354d4a ("ipv6: tunnels: use DEV_STATS_INC()")
- simple conflict in nft_tunnel.c due to missing ffb3d9a30cc67 ("netfilter:
  nf_tables: use correct integer types")

commit 5832c4a77d6931cebf9ba737129ae8f14b66ee1d
Author: Alexander Lobakin <aleksander.lobakin@intel.com>
Date:   Wed Mar 27 16:23:53 2024 +0100

    ip_tunnel: convert __be16 tunnel flags to bitmaps

    Historically, tunnel flags like TUNNEL_CSUM or TUNNEL_ERSPAN_OPT
    have been defined as __be16. Now all of those 16 bits are occupied
    and there's no more free space for new flags.
    It can't be simply switched to a bigger container with no
    adjustments to the values, since it's an explicit Endian storage,
    and on LE systems (__be16)0x0001 equals
    (__be64)0x0001000000000000.
    We could probably define new 64-bit flags depending on the
    Endianness, i.e. (__be64)0x0001 on BE and (__be64)0x00010000... on
    LE, but that would introduce an Endianness dependency and spawn a
    ton of Sparse warnings. To mitigate them, all of those places which
    were adjusted with this change would be touched anyway, so why not
    define stuff properly if there's no choice.

    Define IP_TUNNEL_*_BIT counterparts as a bit number instead of the
    value already coded and a fistful of <16 <-> bitmap> converters and
    helpers. The two flags which have a different bit position are
    SIT_ISATAP_BIT and VTI_ISVTI_BIT, as they were defined not as
    __cpu_to_be16(), but as (__force __be16), i.e. had different
    positions on LE and BE. Now they both have strongly defined places.
    Change all __be16 fields which were used to store those flags, to
    IP_TUNNEL_DECLARE_FLAGS() -> DECLARE_BITMAP(__IP_TUNNEL_FLAG_NUM) ->
    unsigned long[1] for now, and replace all TUNNEL_* occurrences to
    their bitmap counterparts. Use the converters in the places which talk
    to the userspace, hardware (NFP) or other hosts (GRE header). The rest
    must explicitly use the new flags only. This must be done at once,
    otherwise there will be too many conversions throughout the code in
    the intermediate commits.
    Finally, disable the old __be16 flags for use in the kernel code
    (except for the two 'irregular' flags mentioned above), to prevent
    any accidental (mis)use of them. For the userspace, nothing is
    changed, only additions were made.
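
The move from fixed 16-bit values to bit numbers can be sketched in userspace C. Every name here is hypothetical (`TOY_FLAG_NUM`, `struct toy_flags`, the `TOY_*_BIT` constants), and the helpers only imitate what bitmap set and test operations do; they are not the kernel's implementation.

```c
#include <assert.h>
#include <limits.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_FLAG_NUM 17	/* one flag more than a 16-bit field can hold */
#define TOY_LONGS ((TOY_FLAG_NUM + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Flags are identified by bit number, independent of byte order. */
enum { TOY_CSUM_BIT = 0, TOY_KEY_BIT = 2, TOY_NEW_BIT = 16 };

struct toy_flags {
	unsigned long bits[TOY_LONGS];
};

static void toy_set(struct toy_flags *f, unsigned int bit)
{
	assert(bit < TOY_FLAG_NUM);
	f->bits[bit / TOY_BITS_PER_LONG] |= 1UL << (bit % TOY_BITS_PER_LONG);
}

static int toy_test(const struct toy_flags *f, unsigned int bit)
{
	assert(bit < TOY_FLAG_NUM);
	return (f->bits[bit / TOY_BITS_PER_LONG] >> (bit % TOY_BITS_PER_LONG)) & 1UL;
}
```

Because 17 flags still fit into a single `unsigned long`, the array has one element, which matches the "unsigned long[1] for now" remark, and a compiler can reduce these helpers to plain scalar operations.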

    Most noticeable bloat-o-meter difference (.text):

    vmlinux:        307/-1 (306)
    gre.ko:         62/0 (62)
    ip_gre.ko:      941/-217 (724)  [*]
    ip_tunnel.ko:   390/-900 (-510) [**]
    ip_vti.ko:      138/0 (138)
    ip6_gre.ko:     534/-18 (516)   [*]
    ip6_tunnel.ko:  118/-10 (108)

    [*] gre_flags_to_tnl_flags() grew, but still is inlined
    [**] ip_tunnel_find() got uninlined, hence such decrease

    The average code size increase in non-extreme case is 100-200 bytes
    per module, mostly due to sizeof(long) > sizeof(__be16), as
    %__IP_TUNNEL_FLAG_NUM is less than %BITS_PER_LONG and the compilers
    are able to expand the majority of bitmap_*() calls here into direct
    operations on scalars.

    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-06-12 14:49:18 +02:00
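
As an illustration of the flag conversion described in the commit message above, here is a minimal userspace sketch in C. It is not the kernel implementation: `be16_t`, `flags_from_be16()`, `flags_to_be16()` and the `TNL_*_BIT` names are hypothetical stand-ins for `__be16`, the real converters and the `IP_TUNNEL_*_BIT` enum, and a single `unsigned long` plays the role of the bitmap. The bit values are illustrative assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical bit numbers standing in for the IP_TUNNEL_*_BIT enum. */
enum tnl_flag_bit {
	TNL_CSUM_BIT,		/* 0 */
	TNL_ROUTING_BIT,	/* 1 */
	TNL_KEY_BIT,		/* 2 */
	TNL_SEQ_BIT,		/* 3 */
};

/* A big-endian 16-bit value stored byte by byte, so the sketch behaves
 * the same on little-endian and big-endian hosts. b[0] is the MSB. */
typedef struct { uint8_t b[2]; } be16_t;

static be16_t host_to_be16(uint16_t v)
{
	be16_t r = { { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) } };
	return r;
}

static uint16_t be16_to_host(be16_t v)
{
	return (uint16_t)((v.b[0] << 8) | v.b[1]);
}

/* Legacy 16-bit wire flags -> bitmap: bit N of the bitmap is flag N. */
static unsigned long flags_from_be16(be16_t legacy)
{
	return be16_to_host(legacy);
}

/* Bitmap -> legacy wire flags. Fails with -1 (and leaves *out untouched)
 * when a flag beyond bit 15 is set, since it cannot be represented. */
static int flags_to_be16(unsigned long flags, be16_t *out)
{
	if (flags > 0xffffUL)
		return -1;
	*out = host_to_be16((uint16_t)flags);
	return 0;
}

static void set_flag(unsigned long *flags, unsigned int bit)
{
	assert(bit < sizeof(*flags) * CHAR_BIT);
	*flags |= 1UL << bit;
}

static int test_flag(unsigned long flags, unsigned int bit)
{
	assert(bit < sizeof(flags) * CHAR_BIT);
	return (int)((flags >> bit) & 1UL);
}
```

In this sketch the converters are only needed at the boundaries (userspace, hardware, packet headers). Everything else works on bit numbers, which is what lets new flags go beyond bit 15.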
Lucas Zampieri 9cf345119a Merge: net: openvswitch: limit the number of recursions from action sets
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3929

# Merge Request Required Information

## Summary of Changes
JIRA: https://issues.redhat.com/browse/RHEL-23575  
CVE: CVE-2024-1151  

```
commit 6e2f90d31fe09f2b852de25125ca875aabd81367
Author: Aaron Conole <aconole@redhat.com>
Date:   Fri Feb 09 21:54:38 2024 +0100

    net: openvswitch: limit the number of recursions from action sets

    The ovs module allows for some actions to recursively contain an action
    list for complex scenarios, such as sampling, checking lengths, etc.
    When these actions are copied into the internal flow table, they are
    evaluated to validate that such actions make sense, and these calls
    happen recursively.

    The ovs-vswitchd userspace won't emit more than 16 recursion levels
    deep.  However, the module has no such limit and will happily accept
    limits larger than 16 levels nested.  Prevent this by tracking the
    number of recursions happening and manually limiting it to 16 levels
    nested.

    The initial implementation of the sample action would track this depth
    and prevent more than 3 levels of recursion, but this was removed to
    support the clone use case, rather than limited at the current userspace
    limit.

    Fixes: 798c166173 ("openvswitch: Optimize sample action for the clone use cases")
    Signed-off-by: Aaron Conole <aconole@redhat.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240207132416.1488485-2-aconole@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
```

Signed-off-by: Aaron Conole <aconole@redhat.com>


## Approved Development Ticket
All submissions to CentOS Stream must reference an approved ticket in [Red Hat Jira](https://issues.redhat.com/). Please follow the CentOS Stream [contribution documentation](https://docs.centos.org/en-US/stream-contrib/quickstart/) for how to file this ticket and have it approved.

Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-05-16 13:19:49 +00:00
Patrick Talbert ee83f8fca0 Merge: ovs: P1 backports for 9.5
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4003

JIRA: https://issues.redhat.com/browse/RHEL-32143

Signed-off-by: Antoine Tenart <atenart@redhat.com>

Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Davide Caratti <dcaratti@redhat.com>

Merged-by: Patrick Talbert <ptalbert@redhat.com>
2024-05-07 08:23:25 +02:00
Aaron Conole c8ed6bb98d net: openvswitch: limit the number of recursions from action sets
JIRA: https://issues.redhat.com/browse/RHEL-23575
CVE: CVE-2024-1151

commit 6e2f90d31fe09f2b852de25125ca875aabd81367
Author: Aaron Conole <aconole@redhat.com>
Date:   Fri Feb 09 21:54:38 2024 +0100

    net: openvswitch: limit the number of recursions from action sets

    The ovs module allows for some actions to recursively contain an action
    list for complex scenarios, such as sampling, checking lengths, etc.
    When these actions are copied into the internal flow table, they are
    evaluated to validate that such actions make sense, and these calls
    happen recursively.

    The ovs-vswitchd userspace won't emit more than 16 recursion levels
    deep.  However, the module has no such limit and will happily accept
    limits larger than 16 levels nested.  Prevent this by tracking the
    number of recursions happening and manually limiting it to 16 levels
    nested.

    The initial implementation of the sample action would track this depth
    and prevent more than 3 levels of recursion, but this was removed to
    support the clone use case, rather than limited at the current userspace
    limit.

    Fixes: 798c166173 ("openvswitch: Optimize sample action for the clone use cases")
    Signed-off-by: Aaron Conole <aconole@redhat.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240207132416.1488485-2-aconole@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Aaron Conole <aconole@redhat.com>
2024-04-30 15:08:13 -04:00
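
The depth tracking described in the commit message above can be sketched in plain C as follows. This is a simplified model, not the openvswitch code: `struct action`, `validate_actions()` and `MAX_ACTION_DEPTH` are hypothetical names, and returning `-EOVERFLOW` is an assumption about how an over-deep list might be rejected. The top-level list counts as depth 0, so up to 16 nested levels are accepted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed name; the commit message only states the limit of 16. */
#define MAX_ACTION_DEPTH 16

/* Hypothetical action: it may carry a nested action list. */
struct action {
	const struct action *nested;	/* nested list, or NULL */
	size_t n_nested;		/* number of entries in nested */
};

/* Validate a list of actions, recursing into nested lists and
 * refusing to go deeper than MAX_ACTION_DEPTH levels. */
static int validate_actions(const struct action *acts, size_t n,
			    unsigned int depth)
{
	size_t i;

	if (depth > MAX_ACTION_DEPTH)
		return -EOVERFLOW;

	for (i = 0; i < n; i++) {
		int err;

		if (!acts[i].nested)
			continue;

		err = validate_actions(acts[i].nested, acts[i].n_nested,
				       depth + 1);
		if (err)
			return err;
	}

	return 0;
}
```

Passing the depth as an argument keeps the check local to each call, so no global counter has to be reset on error paths.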
Ivan Vecera 25a5e1ea3a genetlink: remove userhdr from struct genl_info
JIRA: https://issues.redhat.com/browse/RHEL-30656

commit bffcc6882a1bb2be8c9420184966f4c2c822078e
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Aug 14 14:47:16 2023 -0700

    genetlink: remove userhdr from struct genl_info

    Only three families use info->userhdr today and going forward
    we discourage using fixed headers in new families.
    So having the pointer to user header in struct genl_info
    is an overkill. Compute the header pointer at runtime.

    Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Link: https://lore.kernel.org/r/20230814214723.2924989-4-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-04-10 09:19:30 +02:00
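
To illustrate "compute the header pointer at runtime" from the commit message above, here is a small userspace sketch. The struct and helper names (`genl_hdr_sketch`, `info_sketch`, `info_userhdr()`) are hypothetical. The sketch assumes only that the family's fixed user header follows the 4-byte-aligned generic netlink header in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the generic netlink header (cmd, version, reserved). */
struct genl_hdr_sketch {
	uint8_t cmd;
	uint8_t version;
	uint16_t reserved;
};

/* Netlink-style 4-byte alignment of header lengths. */
#define SKETCH_ALIGNTO		((size_t)4)
#define SKETCH_ALIGN(len)	(((len) + SKETCH_ALIGNTO - 1) & ~(SKETCH_ALIGNTO - 1))
#define SKETCH_GENL_HDRLEN	SKETCH_ALIGN(sizeof(struct genl_hdr_sketch))

/* Per-request info: note there is no cached user header pointer. */
struct info_sketch {
	struct genl_hdr_sketch *genlhdr;
};

/* Derive the user header location from the generic header on demand. */
static void *info_userhdr(const struct info_sketch *info)
{
	assert(info && info->genlhdr);
	return (unsigned char *)info->genlhdr + SKETCH_GENL_HDRLEN;
}
```

The trade-off is one pointer addition per lookup in exchange for a smaller per-request structure, which matters little when only a few families use a fixed header.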
Antoine Tenart 5b4efc3e3e net: openvswitch: fix unwanted error log on timeout policy probing
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: net.git

commit 4539f91f2a801c0c028c252bffae56030cfb2cae
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Wed Apr 3 22:38:01 2024 +0200

    net: openvswitch: fix unwanted error log on timeout policy probing

    On startup, ovs-vswitchd probes different datapath features including
    support for timeout policies.  While probing, it tries to execute
    certain operations with OVS_PACKET_ATTR_PROBE or OVS_FLOW_ATTR_PROBE
    attributes set.  These attributes tell the openvswitch module to not
    log any errors when they occur as it is expected that some of the
    probes will fail.

    For some reason, setting the timeout policy ignores the PROBE attribute
    and logs a failure anyway.  This is causing the following kernel log
    on each re-start of ovs-vswitchd:

      kernel: Failed to associated timeout policy `ovs_test_tp'

    Fix that by using the same logging macro that all other messages are
    using.  The message will still be printed at info level when needed
    and will be rate limited, but with a net rate limiter instead of
    generic printk one.

    The nf_ct_set_timeout() itself will still print some info messages,
    but at least this change makes logging in openvswitch module more
    consistent.

    Fixes: 06bd2bdf19 ("openvswitch: Add timeout support to ct action")
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
    Acked-by: Eelco Chaudron <echaudro@redhat.com>
    Link: https://lore.kernel.org/r/20240403203803.2137962-1-i.maximets@ovn.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2024-04-08 17:06:24 +02:00
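
The logging behaviour described in the commit message above (stay quiet when the request is only a probe) can be modelled with a short userspace sketch. All names here (`nl_err()`, `set_timeout_policy()`, `messages_emitted`) are hypothetical, the message text is illustrative, and the rate limiting mentioned in the commit is left out for brevity.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts emitted messages so the behaviour can be checked. */
static unsigned int messages_emitted;

/* Log only when the caller allows it, i.e. when not probing. */
static void nl_err(bool logging_allowed, const char *fmt, ...)
{
	va_list ap;

	if (!logging_allowed)
		return;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	messages_emitted++;
}

/* Hypothetical request handler: 'probe' mirrors the presence of a
 * PROBE attribute, 'supported' whether the policy can be applied. */
static int set_timeout_policy(const char *name, bool probe, bool supported)
{
	assert(name);

	if (!supported) {
		nl_err(!probe, "failed to associate timeout policy '%s'", name);
		return -1;
	}

	return 0;
}
```

The failure is still reported to the caller through the return value in both cases; only the log line is suppressed for probes.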
Antoine Tenart f2da139eeb net: openvswitch: Annotate struct mask_array with __counted_by
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: linux.git

commit 7713ec844756a9883ba9a91381369256275de4fb
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat Oct 14 08:34:53 2023 +0200

    net: openvswitch: Annotate struct mask_array with __counted_by

    Prepare for the coming implementation by GCC and Clang of the __counted_by
    attribute. Flexible array members annotated with __counted_by can have
    their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
    (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
    functions).

    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/ca5c8049f58bb933f231afd0816e30a5aaa0eddd.1697264974.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2024-04-08 17:06:24 +02:00
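
For readers unfamiliar with the annotation, the following standalone sketch shows how a flexible array member can be tied to its element counter. It is an illustration, not the openvswitch structure: `struct mask_table`, `COUNTED_BY()` and the helpers are hypothetical names, and the macro expands to nothing on compilers that lack the `counted_by` attribute.

```c
#include <assert.h>
#include <stdlib.h>

/* Fall back gracefully on compilers without __has_attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(counted_by)
#define COUNTED_BY(member) __attribute__((counted_by(member)))
#else
#define COUNTED_BY(member)
#endif

/* Hypothetical table with a flexible array bounded by 'max'. */
struct mask_table {
	int count;			/* entries currently in use */
	int max;			/* allocated capacity of masks[] */
	int masks[] COUNTED_BY(max);
};

static struct mask_table *mask_table_alloc(int max)
{
	struct mask_table *t;

	assert(max >= 0);
	t = calloc(1, sizeof(*t) + (size_t)max * sizeof(t->masks[0]));
	if (!t)
		return NULL;

	/* The counter must be set before masks[] is accessed. */
	t->max = max;
	return t;
}

static int mask_table_add(struct mask_table *t, int mask)
{
	if (t->count >= t->max)
		return -1;	/* table full */

	t->masks[t->count++] = mask;
	return 0;
}
```

With the attribute available and a bounds sanitizer enabled, an index at or beyond `max` can be caught at run-time. Without it the code compiles and behaves the same, only without the extra checking.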
Antoine Tenart 05d7ee7607 net: openvswitch: Annotate struct dp_meter with __counted_by
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: linux.git

commit 16ae53d80c00445c903128f2a64af87b5a03d474
Author: Kees Cook <keescook@chromium.org>
Date:   Fri Sep 22 10:28:54 2023 -0700

    net: openvswitch: Annotate struct dp_meter with __counted_by

    Prepare for the coming implementation by GCC and Clang of the __counted_by
    attribute. Flexible array members annotated with __counted_by can have
    their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
    (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
    functions).

    As found with Coccinelle[1], add __counted_by for struct dp_meter.

    [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

    Cc: Pravin B Shelar <pshelar@ovn.org>
    Cc: dev@openvswitch.org
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Link: https://lore.kernel.org/r/20230922172858.3822653-12-keescook@chromium.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2024-04-08 17:06:24 +02:00
Antoine Tenart 8ce373886e net: openvswitch: Annotate struct dp_meter_instance with __counted_by
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: linux.git

commit e7b34822fa4dcf6101deb3d51a77efd77533571d
Author: Kees Cook <keescook@chromium.org>
Date:   Fri Sep 22 10:28:52 2023 -0700

    net: openvswitch: Annotate struct dp_meter_instance with __counted_by

    Prepare for the coming implementation by GCC and Clang of the __counted_by
    attribute. Flexible array members annotated with __counted_by can have
    their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
    (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
    functions).

    As found with Coccinelle[1], add __counted_by for struct dp_meter_instance.

    [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

    Cc: Pravin B Shelar <pshelar@ovn.org>
    Cc: dev@openvswitch.org
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Link: https://lore.kernel.org/r/20230922172858.3822653-10-keescook@chromium.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2024-04-08 17:01:06 +02:00
Ivan Vecera 49681c8b6d rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link
JIRA: https://issues.redhat.com/browse/RHEL-30344

commit f3a63cce1b4fbde7738395c5a2dea83f05de3407
Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Fri Oct 28 04:42:24 2022 -0400

    rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link

    This patch uses the new helper unregister_netdevice_many_notify() for
    rtnl_delete_link(), so that the kernel can reply via unicast when
    userspace sets the NLM_F_ECHO flag to request the deleted interface info.

    At the same time, the parameters of rtnl_delete_link() need to be updated
    since we need nlmsghdr and portid info.

    Suggested-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Reviewed-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-04-02 11:15:38 +02:00
Davide Caratti 08f7955211 net/sched: act_ct: Always fill offloading tuple iifidx
JIRA: https://issues.redhat.com/browse/RHEL-21360
Upstream Status: net.git commit 9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba

commit 9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Fri Nov 3 16:14:10 2023 +0100

    net/sched: act_ct: Always fill offloading tuple iifidx

    Referenced commit doesn't always set iifidx when offloading the flow to
    hardware. Fix the following cases:

    - nf_conn_act_ct_ext_fill() is called before extension is created with
    nf_conn_act_ct_ext_add() in tcf_ct_act(). This can cause rule offload with
    unspecified iifidx when connection is offloaded after only single
    original-direction packet has been processed by tc data path. Always fill
    the new nf_conn_act_ct_ext instance after creating it in
    nf_conn_act_ct_ext_add().

    - Offloading of unidirectional UDP NEW connections is now supported, but ct
    flow iifidx field is not updated when connection is promoted to
    bidirectional which can result in reply-direction iifidx being zero when
    refreshing the connection. Fill in the extension and update flow iifidx
    before calling flow_offload_refresh().

    Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
    Reviewed-by: Paul Blakey <paulb@nvidia.com>
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections")
    Link: https://lore.kernel.org/r/20231103151410.764271-1-vladbu@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-01-11 15:54:54 +01:00
Jan Stancek f25e7a1141 Merge: ovs: P1 backports from upstream
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3247

JIRA: https://issues.redhat.com/browse/RHEL-14346

Signed-off-by: Antoine Tenart <atenart@redhat.com>

Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-11-24 07:31:07 +01:00
Scott Weaver ad72d6de84 Merge: 9.4 mm changes
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2843

JIRA: https://issues.redhat.com/browse/RHEL-1848

Already in CS9
Omitted-fix: 327b18b7aaed ("mm/kfence: select random number before taking raw lock")
Omitted-fix: bfbfb6182ad1 ("nfsd_splice_actor(): handle compound pages")
Omitted-fix: ac8db824ead0 ("NFSD: Fix reads with a non-zero offset that don't end on a page boundary")
Omitted-fix: b3719108ae60 ("perf kmem: Support legacy tracepoints")
Omitted-fix: dce088ab0d51 ("perf kmem: Support field "node" in evsel__process_alloc_event() coping with recent tracepoint restructuring")
Omitted-fix: c18c20f16219 ("mm, slab: remove duplicate kernel-doc comment for ksize()")
Omitted-fix: cfccd2e63e7e ("mm, compaction: finish pageblocks on complete migration failure")
Omitted-fix: 6342140db660 ("selftests/timens: add a test for vfork+exit")
Omitted-fix: be6667b0db97 ("selftests/vm: dedup hugepage allocation logic")
Omitted-fix: 9d0d94684007 ("selftests/vm: add selftest to verify multi THP collapse")
Omitted-fix: 1370a21fe470 ("selftests/vm: add selftest to verify recollapse of THPs")
Omitted-fix: b25806dcd3d5 ("mm: memcontrol: deprecate swapaccounting=0 mode")
Omitted-fix: b94c4e949c36 ("mm: memcontrol: use do_memsw_account() in a few more places")
Omitted-fix: e55b9f96860f ("mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol")
Omitted-fix: 6f777dcef774 ("docs: kmsan: fix formatting of "Example report"")
Omitted fix: 26e1a0c3277d ("mm: use pmdp_get_lockless() without surplus barrier()")
Omitted-fix: 0cb8fd4d1416 ("mm/migrate: remove cruft from migration_entry_wait()s")

patches resulting in empty commits after conflict resolution
Omitted-fix: 4a7e922587d2 ("selftests: vm: add /dev/userfaultfd test cases to run_vmtests.sh")

patches that are functionally identical
Omitted-fix: 6f777dcef774 ("docs: kmsan: fix formatting of "Example report"")
   Is identical to 436fa4a699bc ("docs: kmsan: fix formatting of "Example report"")

Defer to crypto group
Omitted-fix: f900fde28883 ("crypto: testmgr - fix RNG performance in fuzz tests")

Not including since we're specifically excluding the Maple Tree VMA Iterator
Omitted-fix: 524e00b36e8c ("mm: remove rb tree.")

'series' patches that won't be addressed by this MR
Omitted-fix: 9905eed48e82 ("Merge branch 'af_unix-OOB-fixes'")
Omitted-fix: 2e4b231ac125 ("scsi: NCR5380: Use sc_data_direction instead of rq_data_dir()")
Omitted-fix: 40e16ce7b6fa ("scsi: advansys: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 11bf4ec58073 ("scsi: aha1542: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 3ada9c791b1d ("scsi: dpt_i2o: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 240ec1197786 ("scsi: ips: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: ce425dd7dbc9 ("scsi: mvumi: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 2fd8f23aae36 ("scsi: myrb: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 43b2d1b14ed0 ("scsi: myrs: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 0f8f3ea84a89 ("scsi: ncr53c8xx: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 3f5e62c5e074 ("scsi: qla1280: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: ba4baf0951bb ("scsi: qlogicpti: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: ec808ef9b838 ("scsi: snic: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: bbfa8d7d1283 ("scsi: stex: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 6c5d5422c533 ("scsi: sun3_scsi: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 77ff7756c73e ("scsi: sym53c8xx: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 80ca10b6052d ("scsi: xen-scsifront: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 332f606b32b6 ("ovl: enable RCU'd ->get_acl()")
Omitted-fix: b3b6f5b92255 ("btrfs: handle idmaps in btrfs_new_inode()")
Omitted-fix: ca07274c3da9 ("btrfs: allow idmapped rename inode op")
Omitted-fix: c020d2eaf1a8 ("btrfs: allow idmapped getattr inode op")
Omitted-fix: 72105277dcfc ("btrfs: allow idmapped mknod inode op")
Omitted-fix: e93ca491d03f ("btrfs: allow idmapped create inode op")
Omitted-fix: b0b3e44d346c ("btrfs: allow idmapped mkdir inode op")
Omitted-fix: 5a0521086e5f ("btrfs: allow idmapped symlink inode op")
Omitted-fix: 98b6ab5fc098 ("btrfs: allow idmapped tmpfile inode op")
Omitted-fix: d4d094646142 ("btrfs: allow idmapped setattr inode op")
Omitted-fix: 3bc71ba02cf5 ("btrfs: allow idmapped permission inode op")
Omitted-fix: 5474bf400f16 ("btrfs: check whether fsgid/fsuid are mapped during subvolume creation")
Omitted-fix: 4d4340c912cc ("btrfs: allow idmapped SNAP_CREATE/SUBVOL_CREATE ioctls")
Omitted-fix: c4ed533bdc79 ("btrfs: allow idmapped SNAP_DESTROY ioctls")
Omitted-fix: aabb34e7a31c ("btrfs: relax restrictions for SNAP_DESTROY_V2 with subvolids")
Omitted-fix: e4fed17a32b6 ("btrfs: allow idmapped SET_RECEIVED_SUBVOL ioctls")
Omitted-fix: 39e1674ff035 ("btrfs: allow idmapped SUBVOL_SETFLAGS ioctl")
Omitted-fix: 6623d9a0b0ce ("btrfs: allow idmapped INO_LOOKUP_USER ioctl")
Omitted-fix: 4a8b34afa9c9 ("btrfs: handle ACLs on idmapped mounts")
Omitted-fix: 5b9b26f5d0b8 ("btrfs: allow idmapped mount")
Omitted-fix: 8cc5c54de44c ("docs: update mapping documentation")
Omitted-fix: 02e407991350 ("fs: remove unused low-level mapping helpers")
Omitted-fix: ce70fd9a551a ("scsi: core: Remove the cmd field from struct scsi_request")
Omitted-fix: 5b794f98074a ("scsi: core: Remove the sense and sense_len fields from struct scsi_request")
Omitted-fix: a9a4ea1166d6 ("scsi: core: Move the resid_len field from struct scsi_request to struct scsi_cmnd")
Omitted-fix: dbb4c84d87af ("scsi: core: Move the result field from struct scsi_request to struct scsi_cmnd")
Omitted-fix: 6aded12b10e0 ("scsi: core: Remove struct scsi_request")
Omitted-fix: 264403033105 ("scsi: core: Remove <scsi/scsi_request.h>")
Omitted-fix: cd4b46cdb491 ("scsi: 53c700: Use scsi_cmd_to_rq() instead of scsi_cmnd.request")
Omitted-fix: 417c434aa1b4 ("docs/zh_CN: core-api: Update the translation of cachetlb.rst to 5.19-rc3")
Omitted-fix: 1ebfae49fd44 ("docs/zh_CN: core-api: Update the translation of cpu_hotplug.rst to 5.19-rc3")
Omitted-fix: 722ecdbce68a ("docs/zh_CN: core-api: Update the translation of irq/irq-domain.rst to 5.19-rc3")
Omitted-fix: b2fdf7f080b4 ("docs/zh_CN: core-api: Update the translation of kernel-api.rst to 5.19-rc3")
Omitted-fix: e86a0e297f0b ("docs/zh_CN: core-api: Update the translation of printk-format.rst to 5.19-rc3")
Omitted-fix: c290f175e73f ("docs/zh_CN: core-api: Update the translation of workqueue.rst to 5.19-rc3")
Omitted-fix: 4a6d00a43ef7 ("docs/zh_CN: core-api: Update the translation of xarray.rst to 5.19-rc3")
Omitted-fix: e8f60cd7db24 ("Merge tag 'perf-tools-fixes-for-v6.2-2-2023-01-11' of git://git.kernel.org/pub/scm/linux/ker…")
Omitted-fix: 3a761d72fa62 ("exportfs: support idmapped mounts")
Omitted-fix: 22f289ce1f8b ("ovl: use ovl_lookup_upper() wrapper")
Omitted-fix: 50db8d027355 ("ovl: handle idmappings for layer fileattrs")
Omitted-fix: c85bcc912f4f ("kselftests: memcg: update the oom group leaf events test")
Omitted-fix: be74553f250f ("kselftests: memcg: speed up the memory.high test")
Omitted-fix: 1bd1a4dd3e8c ("MAINTAINERS: add corresponding kselftests to cgroup entry")
Omitted-fix: cdc69458a5f3 ("cgroup: account for memory_recursiveprot in test_memcg_low()")
Omitted-fix: 72b1e03aa725 ("cgroup: account for memory_localevents in test_memcg_oom_group_leaf_events()")
Omitted-fix: 830316807e02 ("cgroup: remove racy check in test_memcg_sock()")
Omitted-fix: c1a31a2f7a9c ("cgroup: fix racy check in alloc_pagecache_max_30M() helper function")
Omitted-fix: c01d4d0a82b7 ("random: quiet urandom warning ratelimit suppression message")
Omitted-fix: 21873bd66b6e ("Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux")
Omitted-fix: ff3b72a5d614 ("selftests: memcg: fix compilation")
Omitted-fix: 1d09069f5313 ("selftests: memcg: expect no low events in unprotected sibling")
Omitted-fix: 63fbdd3c77ec ("net: use DEBUG_NET_WARN_ON_ONCE() in __release_sock()")
Omitted-fix: 76458faeb285 ("net: use DEBUG_NET_WARN_ON_ONCE() in dev_loopback_xmit()")
Omitted-fix: 3e7f2b8d3088 ("net: use WARN_ON_ONCE() in inet_sock_destruct()")
Omitted-fix: 7890e2f09d43 ("net: use DEBUG_NET_WARN_ON_ONCE() in skb_release_head_state()")
Omitted-fix: ee2640df2393 ("net: add debug checks in napi_consume_skb and __napi_alloc_skb()")
Omitted-fix: 39e0f991a62e ("random: mark bootloader randomness code as __init")
Omitted-fix: 6342140db660 ("selftests/timens: add a test for vfork+exit")
Omitted-fix: cf21b355ccb3 ("af_unix: Optimise hash table layout.")
Omitted-fix: c12db92d62bf ("ovl: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 73db6a063c78 ("ovl: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 1e8a9191ccc2 ("f2fs: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: a03a972b26da ("fuse: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 00d369bc2de5 ("fuse: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 276a3f7cf1d9 ("ksmbd: port to vfs{g,u}id_t and associated helpers")
Omitted-fix: 45c311501c77 ("fs: use mount types in iattr")
Omitted-fix: 1f36146a5a3d ("fs: introduce tiny iattr ownership update helpers")
Omitted-fix: 35faf3109a78 ("fs: port to iattr ownership update helpers")
Omitted-fix: 71e7b535b890 ("quota: port quota helpers mount ids")
Omitted-fix: b27c82e12965 ("attr: port attribute changes to new types")
Omitted-fix: cf21b355ccb3 ("af_unix: Optimise hash table layout.")
Omitted-fix: e95ab1d85289 ("selftests: net: af_unix: Test connect() with different netns.")
Omitted-fix: 169005eae2af ("docs/zh_CN: Update the translation of mm-api to 6.1-rc8")
Omitted-fix: 659797dc4d64 ("Docs/zh_CN: Update the translation of iio_configfs to 5.19-rc8")
Omitted-fix: 6a5057e9dc13 ("Docs/zh_CN: Update the translation of sparse to 5.19-rc8")
Omitted-fix: 63c1d2516b05 ("Docs/zh_CN: Update the translation of testing-overview to 5.19-rc8")
Omitted-fix: 83b41bb27b25 ("Docs/zh_CN: Update the translation of usage to 5.19-rc8")
Omitted-fix: c78478e164d4 ("Docs/zh_CN: Update the translation of pci-iov-howto to 5.19-rc8")
Omitted-fix: ce1120076c53 ("Docs/zh_CN: Update the translation of pci to 5.19-rc8")
Omitted-fix: 4116ff79749d ("Docs/zh_CN: Update the translation of sched-stats to 5.19-rc8")
Omitted-fix: 7f02464739da ("9p: convert to advancing variant of iov_iter_get_pages_alloc()")
Omitted-fix: 5b09c9fec086 ("do_proc_readlink(): constify path")
Omitted-fix: ea4af4aa03c3 ("nd_jump_link(): constify path")
Omitted-fix: 20f45ad50d65 ("spufs: constify path")
Omitted-fix: 88569546e8a1 ("ecryptfs: constify path")
Omitted-fix: 9204a97f7ae8 ("sched: Change wait_task_inactive()s match_state")
Omitted-fix: 04c6b79ae4f0 ("btrfs: convert __process_pages_contig() to use filemap_get_folios_contig()")
Omitted-fix: a75b81c3f63b ("btrfs: convert end_compressed_writeback() to use filemap_get_folios()")
Omitted-fix: 47d554199513 ("btrfs: convert process_page_range() to use filemap_get_folios_contig()")
Omitted-fix: 24a1efb4a912 ("nilfs2: convert nilfs_find_uncommited_extent() to use filemap_get_folios_contig()")
Omitted-fix: 7c18b64bba3b ("mips: ralink: mt7621: do not use kzalloc too early")
Omitted-fix: 7d37539037c2 ("fuse: implement ->tmpfile()")
Omitted-fix: f743f16c548b ("treewide: use get_random_{u8,u16}() when possible, part 2")
Omitted-fix: 6ab587e8e8b4 ("docs/zh_CN: Update the translation of delay-accounting to 6.1-rc8")
Omitted-fix: cf306a26cb3a ("docs/zh_CN: Update the translation of kernel-api to 6.1-rc8")
Omitted-fix: e07e9f22259e ("docs/zh_CN: Update the translation of testing-overview to 6.1-rc8")
Omitted-fix: ffdd9bd7a278 ("docs/zh_CN: Update the translation of reclaim to 6.1-rc8")
Omitted-fix: 9a833802a04d ("docs/zh_CN: Update the translation of start to 6.1-rc8")
Omitted-fix: 7cb52d4b3724 ("docs/zh_CN: Update the translation of usage to 6.1-rc8")
Omitted-fix: 03474d581df3 ("docs/zh_CN: Update the translation of msi-howto to 6.1-rc8")
Omitted-fix: 7df047be4363 ("docs/zh_CN: Update the translation of energy-model to 6.1-rc8")
Omitted-fix: e0068090095c ("docs/zh_CN: Update the translation of highmem to 6.1-rc8")
Omitted-fix: 0f3d70cb01da ("docs/zh_CN: Update the translation of ksm to 6.1-rc8")
Omitted-fix: 11018ef90ce7 ("s390/checksum: remove not needed uaccess.h include")
Omitted-fix: 2ea3498980f5 ("mm/damon/core: split out DAMOS-charged region skip logic into a new function")
Omitted-fix: e63a30c51f84 ("mm/damon/core: split damos application logic into a new function")
Omitted-fix: d1cbbf621fc2 ("mm/damon/core: split out scheme stat update logic into a new function")
Omitted-fix: 898810e5ca54 ("mm/damon/core: split out scheme quota adjustment logic into a new function")
Omitted-fix: 789a230613c8 ("mm/damon/sysfs: use damon_addr_range for region's start and end values")
Omitted-fix: 1f71981408ef ("mm/damon/sysfs: remove parameters of damon_sysfs_region_alloc()")
Omitted-fix: 39240595917e ("mm/damon/sysfs: move sysfs_lock to common module")
Omitted-fix: d332fe11debe ("mm/damon/sysfs: move unsigned long range directory to common module")
Omitted-fix: 4acd715ff57f ("mm/damon/sysfs: split out kdamond-independent schemes stats update logic into a new function")
Omitted-fix: c8e7b4d0ba34 ("mm/damon/sysfs: split out schemes directory implementation to separate file")
Omitted fix: dfe843dce775 ("s390/checksum: support GENERIC_CSUM, enable it for KASAN")
Omitted fix: e42ac7789df6 ("s390/checksum: always use cksm instruction")
Omitted fix: 1a167ddd3c56 ("x86: kmsan: pgtable: reduce vmalloc space")
Omitted fix: 7cf8f44a5a1c ("x86: fs: kmsan: disable CONFIG_DCACHE_WORD_ACCESS")
Omitted fix: 1468c6f4558b ("mm: fs: initialize fsdata passed to write_begin/write_end interface")
Omitted fix: 0aa8ea3c5d35 ("mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages")
Omitted fix: 42855f588e18 ("x86/purgatory: disable KMSAN instrumentation")
Omitted fix: 11385b261200 ("x86/uaccess: instrument copy_from_user_nmi()")
Omitted fix: f70da5ee8fe1 ("mm/damon: convert damon_pa_mark_accessed_or_deactivate() to use folios")
Omitted fix: 5a9e34747c9f ("mm/swap: convert deactivate_page() to folio_deactivate()")
Omitted fix: 0aa8ea3c5d35 ("mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages")
Omitted fix: de1f5055523e ("mm/mempolicy: convert queue_pages_pmd() to queue_folios_pmd()")
Omitted fix: 3dae02bbd07f ("mm/mempolicy: convert queue_pages_pte_range() to queue_folios_pte_range()")
Omitted fix: 0a2c1e818316 ("mm/mempolicy: convert queue_pages_hugetlb() to queue_folios_hugetlb()")
Omitted fix: d451b89dcd18 ("mm/mempolicy: convert queue_pages_required() to queue_folio_required()")
Omitted fix: 4a64981dfee9 ("mm/mempolicy: convert migrate_page_add() to migrate_folio_add()")
Omitted fix: 0aa8ea3c5d35 ("mm/compaction: correct comment of fast_find_migrateblock in isolate_migratepages")
Omitted fix: 46c475bd676b ("mm/pgtable: kmap_local_page() instead of kmap_atomic()")
Omitted fix: 0d940a9b270b ("mm/pgtable: allow pte_offset_map[_lock]() to fail")
Omitted fix: 65747aaf42b7 ("mm/filemap: allow pte_offset_map_lock() to fail")
Omitted fix: 45fe85e9811e ("mm/page_vma_mapped: delete bogosity in page_vma_mapped_walk()")
Omitted fix: 90f43b0a13cd ("mm/page_vma_mapped: reformat map_pte() with less indentation")
Omitted fix: 2798bbe75b9c ("mm/page_vma_mapped: pte_offset_map_nolock() not pte_lockptr()")
Omitted fix: 7780d04046a2 ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails")
Omitted fix: be872f83bf57 ("mm/pagewalk: walk_pte_range() allow for pte_offset_map()")
Omitted fix: e5ad581c7f1c ("mm/vmwgfx: simplify pmd & pud mapping dirty helpers")
Omitted fix: 0d1c81edc61e ("mm/vmalloc: vmalloc_to_page() use pte_offset_kernel()")
Omitted fix: 6ec1905f6ec7 ("mm/hmm: retry if pte_offset_map() fails")
Omitted fix: 2b683a4ff6ee ("mm/userfaultfd: retry if pte_offset_map() fails")
Omitted fix: 3622d3cde308 ("mm/userfaultfd: allow pte_offset_map_lock() to fail")
Omitted fix: 9f2bad096d2f ("mm/debug_vm_pgtable,page_table_check: warn pte map fails")
Omitted fix: 04dee9e85cf5 ("mm/various: give up if pte_offset_map[_lock]() fails")
Omitted fix: 670ddd8cdcbd ("mm/mprotect: delete pmd_none_or_clear_bad_unless_trans_huge()")
Omitted fix: a5be621ee292 ("mm/mremap: retry if either pte_offset_map_*lock() fails")
Omitted fix: 179d3e4f3bfa ("mm/madvise: clean up force_shm_swapin_readahead()")
Omitted fix: d850fa729873 ("mm/swapoff: allow pte_offset_map[_lock]() to fail")
Omitted fix: 52fc048320ad ("mm/mglru: allow pte_offset_map_nolock() to fail")
Omitted fix: 4b56069c95d6 ("mm/migrate_device: allow pte_offset_map_lock() to fail")
Omitted fix: 2378118bd9da ("mm/gup: remove FOLL_SPLIT_PMD use of pmd_trans_unstable()")
Omitted fix: c9c1ee20ee84 ("mm/huge_memory: split huge pmd under one pte_offset_map()")
Omitted fix: 895f5ee464cc ("mm/khugepaged: allow pte_offset_map[_lock]() to fail")
Omitted fix: 3db82b9374ca ("mm/memory: allow pte_offset_map[_lock]() to fail")
Omitted fix: c7ad08804fae ("mm/memory: handle_pte_fault() use pte_offset_map_nolock()")
Omitted fix: 20b18aada185 ("madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check")

Coming Soon:
Omitted-fix: 6f0df8e16eb5 ("memcontrol: ensure memcg acquired by id is properly set up")
Omitted-fix: ee40d543e97d ("mm/pagewalk: fix bootstopping regression from extra pte_unmap()")
Omitted-fix: ab048302026d ("ovl: fix failed copyup of fileattr on a symlink")
Omitted-fix: 92fe9dcbe4e1 ("hugetlbfs: clear resv_map pointer if mmap fails")
Omitted-fix: bf4916922c60 ("hugetlbfs: extend hugetlb_vma_lock to private VMAs")
Omitted-fix: 2820b0f09be9 ("hugetlbfs: close race between MADV_DONTNEED and page fault")

Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=56452800
Tested: KT1+mm regression: https://beaker.engineering.redhat.com/jobs/8467307
Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>

Approved-by: Jan Stancek <jstancek@redhat.com>
Approved-by: Mika Penttilä <mpenttil@redhat.com>
Approved-by: Jerry Snitselaar <jsnitsel@redhat.com>
Approved-by: Alex Gladkov <agladkov@redhat.com>
Approved-by: Vladis Dronov <vdronov@redhat.com>
Approved-by: Dean Nelson <dnelson@redhat.com>
Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: Baoquan He <5820488-baoquan_he@users.noreply.gitlab.com>
Approved-by: Jiri Benc <jbenc@redhat.com>
Approved-by: John W. Linville <linville@redhat.com>

Signed-off-by: Scott Weaver <scweaver@redhat.com>
2023-10-25 11:39:20 -04:00
Scott Weaver d05495aca0 Merge: CNB94: tc: update tc subsystem to the upstream v6.5
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3067

JIRA: https://issues.redhat.com/browse/RHEL-1773
Depends: https://issues.redhat.com/browse/RHEL-860
Depends: https://issues.redhat.com/browse/RHEL-3646

Update TC (net/sched) to the upstream v6.5

Omitted-fix: cad7526f33ce ("net: dsa: ocelot: unlock on error in vsc9959_qos_port_tas_set()")
Not needed, DSA as well as ocelot driver is not enabled/supported in RHEL

Commits:
```
1b808993e194 ("flow_dissector: fix false-positive __read_overflow2_field() warning")
f743f16c548b ("treewide: use get_random_{u8,u16}() when possible, part 2")
7e3cf0843fe5 ("treewide: use get_random_{u8,u16}() when possible, part 1")
8032bf1233a7 ("treewide: use get_random_u32_below() instead of deprecated function")
62423bd2d2e2 ("net: sched: remove qdisc_watchdog->last_expires")
c66b2111c9c9 ("selftests: tc-testing: add tests for action binding")
f5fca219ad45 ("net: do not use skb_mac_header() in qdisc_pkt_len_init()")
e495a9673caf ("sch_cake: do not use skb_mac_header() in cake_overhead()")
b3be94885af4 ("net/sched: remove two skb_mac_header() uses")
fcb3a4653bc5 ("net/sched: act_api: use the correct TCA_ACT attributes in dump")
4170f0ef582c ("fix typos in net/sched/")
8b0f256530d9 ("net/sched: sch_mqprio: use netlink payload helpers")
3dd0c16ec93e ("net/sched: mqprio: simplify handling of nlattr portion of TCA_OPTIONS")
57f21bf85400 ("net/sched: mqprio: add extack to mqprio_parse_nlattr()")
ab277d2084ba ("net/sched: mqprio: add an extack message to mqprio_parse_opt()")
c54876cd5961 ("net/sched: pass netlink extack to mqprio and taprio offload")
f62af20bed2d ("net/sched: mqprio: allow per-TC user input of FP adminStatus")
a721c3e54b80 ("net/sched: taprio: allow per-TC user input of FP adminStatus")
8c966a10eb84 ("flow_dissector: Address kdoc warnings")
54e906f1639e ("selftests: forwarding: sch_tbf_*: Add a pre-run hook")
2f0f9465ad9f ("net: sched: Print msecs when transmit queue time out")
5036034572b7 ("net/sched: act_pedit: use NLA_POLICY for parsing 'ex' keys")
0c83c5210e18 ("net/sched: act_pedit: use extack in 'ex' parsing errors")
e1201bc781c2 ("net/sched: act_pedit: check static offsets a priori")
577140180ba2 ("net/sched: act_pedit: remove extra check for key type")
e3c9673e2f6e ("net/sched: act_pedit: rate limit datapath messages")
807cfded92b0 ("net/sched: sch_htb: use extack on errors messages")
c69a9b023f65 ("net/sched: sch_qfq: use extack on errors messages")
25369891fcef ("net/sched: sch_qfq: refactor parsing of netlink parameters")
7eb060a51a3b ("selftests: tc-testing: add more tests for sch_qfq")
1b483d9f5805 ("net/sched: act_pedit: free pedit keys on bail from offset check")
526f28bd0fbd ("net/sched: act_mirred: Add carrier check")
12e7789ad5b4 ("sch_htb: Allow HTB priority parameter in offload mode")
c7cfbd115001 ("net/sched: sch_ingress: Only create under TC_H_INGRESS")
5eeebfe6c493 ("net/sched: sch_clsact: Only create under TC_H_CLSACT")
f85fa45d4a94 ("net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) Qdiscs")
9de95df5d15b ("net/sched: Prohibit regrafting ingress or clsact Qdiscs")
7b4858df3bf7 ("skbuff: bridge: Add layer 2 miss indication")
d5ccfd90df7f ("flow_dissector: Dissect layer 2 miss from tc skb extension")
1a432018c0cd ("net/sched: flower: Allow matching on layer 2 miss")
f4356947f029 ("flow_offload: Reject matching on layer 2 miss")
8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases")
dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")
2d800bc500fb ("net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum")
6c1adb650c8d ("net/sched: taprio: add netlink reporting for offload statistics counters")
a395b8d1c7c3 ("selftests/tc-testing: replace mq with invalid parent ID")
8cde87b007da ("net: sched: wrap tc_skip_wrapper with CONFIG_RETPOLINE")
cd2b8113c2e8 ("net/sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values")
d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
886bc7d6ed33 ("net: sched: move rtm_tca_policy declaration to include file")
682881ee45c8 ("net: sched: act_police: fix sparse errors in tcf_police_dump()")
6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
26e35370b976 ("net/sched: act_pedit: Use kmemdup() to replace kmalloc + memcpy")
2b84960fc5dd ("net/sched: taprio: report class offload stats per TXQ, not per TC")
d7ad70b5ef5a ("net: flow_dissector: add support for cfm packets")
7cfffd5fed3e ("net: flower: add support for matching cfm fields")
1668a55a73f5 ("selftests: net: add tc flower cfm test")
c29e012eae29 ("selftests: forwarding: Fix layer 2 miss test syntax")
aef6e908b542 ("selftests/tc-testing: Fix Error: Specified qdisc kind is unknown.")
b849c566ee9c ("selftests/tc-testing: Fix Error: failed to find target LOG")
b39d8c41c7a8 ("selftests/tc-testing: Fix SFB db test")
11b8b2e70a9b ("selftests/tc-testing: Remove configs that no longer exist")
41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
2d5f6a8d7aef ("net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs")
84ad0af0bccd ("net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting")
e16ad981e2a1 ("net: sched: Remove unused qdisc_l2t()")
ca4fa8743537 ("selftests: tc-testing: add one test for flushing explicitly created chain")
b4ee93380b3c ("net/sched: act_ipt: add sanity checks on table name and hook locations")
b2dc32dcba08 ("net/sched: act_ipt: add sanity checks on skb before calling target")
93d75d475c5d ("net/sched: act_ipt: zero skb->cb before calling target")
30c45b5361d3 ("net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX")
989b52cdc849 ("net: sched: Replace strlcpy with strscpy")
d3f87278bcb8 ("net/sched: flower: Ensure both minimum and maximum ports are specified")
150e33e62c1f ("net/sched: make psched_mtu() RTNL-less safe")
158810b261d0 ("net/sched: sch_qfq: reintroduce lmax bound check for MTU")
c5a06fdc618d ("selftests: tc-testing: add tests for qfq mtu sanity check")
3e337087c3b5 ("net/sched: sch_qfq: account for stab overhead in qfq_enqueue")
137f6219da59 ("selftests: tc-testing: add test for qfq with stab overhead")
d1cca974548d ("pie: fix kernel-doc notation warning")
b3d0e0489430 ("net: sched: cls_matchall: Undo tcf_bind_filter in case of failure after mall_set_parms")
9cb36faedeaf ("net: sched: cls_u32: Undo tcf_bind_filter if u32_replace_hw_knode")
e8d3d78c19be ("net: sched: cls_u32: Undo refcount decrement in case update failed")
26a22194927e ("net: sched: cls_bpf: Undo tcf_bind_filter in case of an error")
ac177a330077 ("net: sched: cls_flower: Undo tcf_bind_filter in case of an error")
fda05798c22a ("selftests: tc: set timeout to 15 minutes")
719b4774a8cb ("selftests: tc: add 'ct' action kconfig dep")
031c99e71fed ("selftests: tc: add ConnTrack procfs kconfig")
4914109a8e1e ("netfilter: allow exp not to be removed in nf_ct_find_expectation")
76622ced50a1 ("net: sched: set IPS_CONFIRMED in tmpl status only when commit is set in act_ct")
8c8b73320805 ("openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
9fe63d5f1da9 ("sch_htb: Allow HTB quantum parameter in offload mode")
6c58c8816abb ("net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64")
4d50e50045aa ("net: flower: fix stack-out-of-bounds in fl_set_key_cfm()")
e68409db9953 ("net: sched: cls_u32: Fix match key mis-addressing")
e739718444f7 ("net/sched: taprio: Limit TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME to INT_MAX.")
21a72166abb9 ("selftests: forwarding: tc_flower_l2_miss: Fix failing test with old libnet")
```

Signed-off-by: Ivan Vecera <ivecera@redhat.com>

Approved-by: José Ignacio Tornos Martínez <jtornosm@redhat.com>
Approved-by: Michal Schmidt <mschmidt@redhat.com>
Approved-by: Davide Caratti <dcaratti@redhat.com>

Signed-off-by: Scott Weaver <scweaver@redhat.com>
2023-10-24 13:29:05 -04:00
Chris von Recklinghausen 1f619343f6 treewide: use get_random_u32() when possible
Conflicts:
	drivers/gpu/drm/tests/drm_buddy_test.c
	drivers/gpu/drm/tests/drm_mm_test.c - We already have
		ce28ab1380e8 ("drm/tests: Add back seed value information")
		so keep calls to kunit_info.
	drop changes to drivers/misc/habanalabs/gaudi2/gaudi2.c
		fs/ntfs3/fslog.c - files not in CS9
	net/sunrpc/auth_gss/gss_krb5_wrap.c - We already have
		7f675ca7757b ("SUNRPC: Improve Kerberos confounder generation")
		so code to change is gone.
	drivers/gpu/drm/i915/i915_gem_gtt.c
	drivers/gpu/drm/i915/selftests/i915_selftest.c
	drivers/gpu/drm/tests/drm_buddy_test.c
	drivers/gpu/drm/tests/drm_mm_test.c
		change added under
		4cb818386e ("Merge DRM changes from upstream v6.0.8..v6.1")

JIRA: https://issues.redhat.com/browse/RHEL-1848

commit a251c17aa558d8e3128a528af5cf8b9d7caae4fd
Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date:   Wed Oct 5 17:43:22 2022 +0200

    treewide: use get_random_u32() when possible

    The prandom_u32() function has been a deprecated inline wrapper around
    get_random_u32() for several releases now, and compiles down to the
    exact same code. Replace the deprecated wrapper with a direct call to
    the real function. The same also applies to get_random_int(), which is
    just a wrapper around get_random_u32(). This was done as a basic find
    and replace.

    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Yury Norov <yury.norov@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
    Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
    Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
    Acked-by: Jakub Kicinski <kuba@kernel.org>
    Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
    Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
    Acked-by: Helge Deller <deller@gmx.de> # for parisc
    Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
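
The "basic find and replace" described in the commit message above can be pictured with a small script. This is only a sketch, not the tooling used for the upstream change; the helper name `modernize` and the sample C line are hypothetical.

```python
import re

# The two deprecated wrappers named in the commit message. The lookahead
# requires a call, so longer names such as prandom_u32_max() are left alone.
_DEPRECATED_CALL = re.compile(r"\b(?:prandom_u32|get_random_int)(?=\s*\()")


def modernize(source: str) -> str:
    """Replace calls to the deprecated wrappers with get_random_u32()."""
    return _DEPRECATED_CALL.sub("get_random_u32", source)


# Hypothetical input line, for illustration only.
print(modernize("seed = prandom_u32() ^ get_random_int();"))
```

A real treewide conversion still needs manual conflict resolution, as the Conflicts section of this backport shows.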
2023-10-20 06:15:03 -04:00
Antoine Tenart 5797656b66 net: openvswitch: Use struct_size()
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: net-next.git

commit df3bf90fef281c630ef06a3d03efb9fe56c8a0fb
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat Oct 14 08:34:52 2023 +0200

    net: openvswitch: Use struct_size()

    Use struct_size() instead of hand writing it.
    This is less verbose and more robust.

    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/e5122b4ff878cbf3ed72653a395ad5c4da04dc1e.1697264974.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
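
As a rough illustration of why struct_size() is considered more robust than hand-written size arithmetic: the kernel helper saturates at SIZE_MAX when the computation overflows, whereas an open-coded expression silently wraps. The Python model below is a sketch under that assumption; it assumes a 64-bit size_t, and the header and element sizes are made up.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t


def struct_size(header_size: int, elem_size: int, count: int) -> int:
    """Model of struct_size(): header plus trailing array, saturating."""
    total = header_size + elem_size * count
    return total if total <= SIZE_MAX else SIZE_MAX


def open_coded(header_size: int, elem_size: int, count: int) -> int:
    """Model of sizeof(*p) + count * sizeof(elem) in plain size_t math."""
    return (header_size + elem_size * count) & SIZE_MAX  # wraps around


# Hypothetical sizes: a 16-byte header followed by 8-byte elements.
print(struct_size(16, 8, 4), open_coded(16, 8, 4))
print(struct_size(16, 8, 2**61) == SIZE_MAX, open_coded(16, 8, 2**61))
```

With a huge element count the open-coded form wraps to a tiny value that an allocator would happily satisfy, while the saturated value makes the allocation fail.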
2023-10-20 10:37:15 +02:00
Antoine Tenart 4e2c80178c openvswitch: reduce stack usage in do_execute_actions
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: net-next.git

commit 06bc3668cc2a6db2831b9086f0e3c6ebda599dba
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Thu Sep 21 21:42:35 2023 +0200

    openvswitch: reduce stack usage in do_execute_actions

    do_execute_actions() function can be called recursively multiple
    times while executing actions that require pipeline forking or
    recirculations.  It may also be re-entered multiple times if the packet
    leaves openvswitch module and re-enters it through a different port.

    Currently, there is a 256-byte array allocated on stack in this
    function that is supposed to hold NSH header.  Compilers tend to
    pre-allocate that space right at the beginning of the function:

         a88:       48 81 ec b0 01 00 00    sub    $0x1b0,%rsp

    NSH is not a very common protocol, but the space is allocated on every
    recursive call or re-entry multiplying the wasted stack space.

    Move the stack allocation to push_nsh() function that is only used
    if NSH actions are actually present.  push_nsh() is also a simple
    function without a possibility for re-entry, so the stack is returned
    right away.

    With this change the preallocated space is reduced by 256 B per call:

         b18:       48 81 ec b0 00 00 00    sub    $0xb0,%rsp

    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
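
The saving quoted in the commit message can be checked from the two `sub ...,%rsp` immediates. The nesting depths below are hypothetical and only show how the saving multiplies with recursion or re-entry.

```python
FRAME_BEFORE = 0x1B0  # from "sub $0x1b0,%rsp" before the change
FRAME_AFTER = 0x0B0   # from "sub $0xb0,%rsp" after the change

SAVED_PER_CALL = FRAME_BEFORE - FRAME_AFTER  # bytes no longer pre-allocated


def stack_saved(depth: int) -> int:
    """Bytes saved when do_execute_actions() is nested `depth` times."""
    return SAVED_PER_CALL * depth


print(SAVED_PER_CALL)
for depth in (1, 4, 8):  # hypothetical nesting depths
    print(depth, stack_saved(depth))
```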
2023-10-20 10:31:29 +02:00
Antoine Tenart 6d85d98a29 net: openvswitch: reject negative ifindex
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: linux.git

commit a552bfa16bab4ce901ee721346a28c4e483f4066
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Aug 14 13:38:40 2023 -0700

    net: openvswitch: reject negative ifindex

    Recent changes in net-next (commit 759ab1edb56c ("net: store netdevs
    in an xarray")) refactored the handling of pre-assigned ifindexes
    and let syzbot surface a latent problem in ovs. ovs does not validate
    ifindex, making it possible to create netdev ports with negative
    ifindex values. It's easy to repro with YNL:

    $ ./cli.py --spec netlink/specs/ovs_datapath.yaml \
             --do new \
             --json '{"upcall-pid": 1, "name":"my-dp"}'
    $ ./cli.py --spec netlink/specs/ovs_vport.yaml \
             --do new \
             --json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}'

    $ ip link show
    -65536: some-port0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 7a:48:21:ad:0b:fb brd ff:ff:ff:ff:ff:ff
    ...

    Validate the inputs. Now the second command correctly returns:

    $ ./cli.py --spec netlink/specs/ovs_vport.yaml \
             --do new \
             --json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}'

    lib.ynl.NlError: Netlink error: Numerical result out of range
    nl_len = 108 (92) nl_flags = 0x300 nl_type = 2
            error: -34      extack: {'msg': 'integer out of range', 'unknown': [[type:4 len:36] b'\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x00\x03\x00\xff\xff\xff\x7f\x00\x00\x00\x00\x08\x00\x01\x00\x08\x00\x00\x00'], 'bad-attr': '.ifindex'}

    Accept 0 since it used to be silently ignored.

    Fixes: 54c4ef34c4b6 ("openvswitch: allow specifying ifindex of new interfaces")
    Reported-by: syzbot+7456b5dcf65111553320@syzkaller.appspotmail.com
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Link: https://lore.kernel.org/r/20230814203840.2908710-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
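
The reproducer above passes 4294901760 as the ifindex, which `ip link show` then reports as -65536. The sketch below shows that reinterpretation as a signed 32-bit value and a range check in the spirit of the fix. The helper names are hypothetical, and the INT_MAX upper bound is an assumption inferred from the extack dump rather than taken from the patch itself.

```python
INT_MAX = 2**31 - 1


def as_s32(value: int) -> int:
    """Reinterpret the low 32 bits of `value` as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT_MAX else value


def ifindex_is_valid(ifindex: int) -> bool:
    """Accept 0 (previously ignored) and positive values; reject negatives."""
    return 0 <= ifindex <= INT_MAX


requested = 4294901760  # value from the YNL reproducer
print(as_s32(requested))
print(ifindex_is_valid(as_s32(requested)))
```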
2023-10-20 10:21:36 +02:00
Antoine Tenart d08cfcdd87 net: openvswitch: Use struct_size()
JIRA: https://issues.redhat.com/browse/RHEL-14346
Upstream Status: linux.git

commit b50a8b0d57ab1ef11492171e98a030f48682eac3
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat May 6 18:04:16 2023 +0200

    net: openvswitch: Use struct_size()

    Use struct_size() instead of hand writing it.
    This is less verbose and more informative.

    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Acked-by: Eelco Chaudron <echaudro@redhat.com>
    Link: https://lore.kernel.org/r/e7746fbbd62371d286081d5266e88bbe8d3fe9f0.1683388991.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2023-10-20 10:21:36 +02:00
Ivan Vecera 946cbed244 openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack
JIRA: https://issues.redhat.com/browse/RHEL-1773

commit 8c8b733208058702da451b7d60a12c0ff90b6879
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sun Jul 16 17:09:19 2023 -0400

    openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack

    Not setting IPS_CONFIRMED in tmpl allows the exp to stay in the
    hashtable on lookup, which lets us simplify the exp processing code
    in openvswitch conntrack considerably.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-10-13 09:03:13 +02:00
Ivan Vecera 497f645693 net: move gso declarations and functions to their own files
JIRA: https://issues.redhat.com/browse/RHEL-12679

commit d457a0e329b0bfd3a1450e0b1a18cd2b47a25a08
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Jun 8 19:17:37 2023 +0000

    net: move gso declarations and functions to their own files

    Move declarations into include/net/gso.h and code into net/core/gso.c

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Stanislav Fomichev <sdf@google.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/r/20230608191738.3947077-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-10-11 13:35:27 +02:00
Adrian Moreno 6a98fafa3d net: openvswitch: add misc error drop reasons
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git

commit 43d95b30cf5793cdd3c7b1c1cd5fead9b469bd60
Author: Adrian Moreno <amorenoz@redhat.com>
Date:   Fri Aug 11 16:12:52 2023 +0200

    net: openvswitch: add misc error drop reasons

    Use drop reasons from include/net/dropreason-core.h when a reasonable
    candidate exists.

    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2023-08-21 08:34:23 +02:00
Adrian Moreno 1008eb8dd9 net: openvswitch: add meter drop reason
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git

commit f329d1bc1a4580e0f8a402b14a6fd024ec8e5c7b
Author: Adrian Moreno <amorenoz@redhat.com>
Date:   Fri Aug 11 16:12:51 2023 +0200

    net: openvswitch: add meter drop reason

    Using an independent drop reason makes it easy to distinguish
    between QoS-triggered and flow-triggered drops.

    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2023-08-21 08:34:23 +02:00
Adrian Moreno 4d3ff090b6 net: openvswitch: add explicit drop action
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git

commit e7bc7db9ba463e763ac6113279cade19da9cb939
Author: Eric Garver <eric@garver.life>
Date:   Fri Aug 11 16:12:50 2023 +0200

    net: openvswitch: add explicit drop action

    From: Eric Garver <eric@garver.life>

    This adds an explicit drop action. This is used by OVS to drop packets
    for which it cannot determine what to do. An explicit action in the
    kernel allows passing the reason _why_ the packet is being dropped or
    zero to indicate no particular error happened (i.e: OVS intentionally
    dropped the packet).

    Since the error codes coming from userspace mean nothing for the kernel,
    we squash all of them into only two drop reasons:
    - OVS_DROP_EXPLICIT_WITH_ERROR to indicate a non-zero value was passed
    - OVS_DROP_EXPLICIT to indicate a zero value was passed (no error)

    e.g. trace all OVS dropped skbs

     # perf trace -e skb:kfree_skb --filter="reason >= 0x30000"
     [..]
     106.023 ping/2465 skb:kfree_skb(skbaddr: 0xffffa0e8765f2000, \
      location:0xffffffffc0d9b462, protocol: 2048, reason: 196611)

    reason: 196611 --> 0x30003 (OVS_DROP_EXPLICIT)

    Also, this patch allows ovs-dpctl.py to add explicit drop actions as:
      "drop"     -> implicit empty-action drop
      "drop(0)"  -> explicit non-error action drop
      "drop(42)" -> explicit error action drop

    Signed-off-by: Eric Garver <eric@garver.life>
    Co-developed-by: Adrian Moreno <amorenoz@redhat.com>
    Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
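
The trace output above reports `reason: 196611`, which the message decodes as 0x30003. The sketch below splits such a value into a subsystem part and a per-subsystem code, and models the two-way squash of userspace error codes described in the message. The 16-bit split is an assumption consistent with the `reason >= 0x30000` filter; the function names are hypothetical.

```python
SUBSYS_SHIFT = 16  # assumption: upper 16 bits select the subsystem


def split_reason(reason: int) -> tuple[int, int]:
    """Return (subsystem, code) for a drop reason value."""
    return reason >> SUBSYS_SHIFT, reason & ((1 << SUBSYS_SHIFT) - 1)


def explicit_drop_reason(error: int) -> str:
    """Squash a userspace error code into one of the two reasons."""
    if error == 0:
        return "OVS_DROP_EXPLICIT"
    return "OVS_DROP_EXPLICIT_WITH_ERROR"


print(hex(196611), split_reason(196611))
print(explicit_drop_reason(0))   # drop(0)
print(explicit_drop_reason(42))  # drop(42)
```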
2023-08-21 08:34:22 +02:00
Adrian Moreno 3b252dd672 net: openvswitch: add action error drop reason
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git

commit ec7bfb5e5a054f1178e8bdbf4f145fdafa5bf804
Author: Adrian Moreno <amorenoz@redhat.com>
Date:   Fri Aug 11 16:12:49 2023 +0200

    net: openvswitch: add action error drop reason

    Add a drop reason for packets that are dropped because an action
    returns a non-zero error code.

    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2023-08-21 08:34:22 +02:00
Adrian Moreno 10015df94f net: openvswitch: add last-action drop reason
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git

commit 9d802da40b7c820deb9c60fc394457ea565cafc8
Author: Adrian Moreno <amorenoz@redhat.com>
Date:   Fri Aug 11 16:12:48 2023 +0200

    net: openvswitch: add last-action drop reason

    Create a new drop reason subsystem for openvswitch and add the first
    drop reason to represent last-action drops.

    Last-action drops happen when a flow has an empty action list or there
    is no action that consumes the packet (output, userspace, recirc, etc).
    It is the most common way in which OVS drops packets.

    Implementation-wise, most of these skb-consuming actions already call
    "consume_skb" internally and return directly from within the
    do_execute_actions() loop so with minimal changes we can assume that
    any skb that exits the loop normally is a packet drop.

    Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
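
The control flow described above (consuming actions return from inside the loop, so a normal loop exit means a drop) can be sketched as follows. This is a simplified model, not the kernel code; the action names and the string results are hypothetical, and real actions carry arguments.

```python
# Actions the commit message lists as consuming the packet.
CONSUMING = {"output", "userspace", "recirc"}


def execute_actions(actions: list[str]) -> str:
    """Walk an action list; return how the packet left the loop."""
    for action in actions:
        if action in CONSUMING:
            # Consuming actions take ownership and return immediately.
            return f"consumed by {action}"
        # Non-consuming actions (e.g. header rewrites) fall through.
    # Reaching this point means nothing consumed the packet.
    return "dropped: last action"


print(execute_actions(["set", "output"]))
print(execute_actions(["set"]))
print(execute_actions([]))
```

An empty action list and a list with no consuming action end up in the same place, which matches the two cases the message calls last-action drops.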
2023-08-21 08:34:17 +02:00
Jan Stancek e3c14423d4 Merge: net: openvswitch: add support for l4 symmetric hashing
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2804

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2188082
Upstream Status: commit e069ba07e6c7
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>

Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Paolo Abeni <pabeni@redhat.com>
Approved-by: Aaron Conole <aconole@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-07-21 17:32:23 +02:00
Jan Stancek c088b1cb17 Merge: net: openvswitch: fix upcall counter access before allocation
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2662

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2203263
Upstream Status: Backport net-next.git commit de9df6c6b27e
Conflicts: none

Backport of upstream commit:

commit de9df6c6b27e22d7bdd20107947ef3a20e687de5
Author: Eelco Chaudron <echaudro@redhat.com>
Date:   Tue Jun 6 13:56:35 2023 +0200

    net: openvswitch: fix upcall counter access before allocation

    Currently, the per cpu upcall counters are allocated after the vport is
    created and inserted into the system. This could lead to the datapath
    accessing the counters before they are allocated resulting in a kernel
    Oops.

    Here is an example:

      PID: 59693    TASK: ffff0005f4f51500  CPU: 0    COMMAND: "ovs-vswitchd"
       ...

      PID: 58682    TASK: ffff0005b2f0bf00  CPU: 0    COMMAND: "kworker/0:3"

    We moved the per cpu upcall counter allocation to the existing vport
    alloc and free functions to solve this.

    Fixes: 95637d91fefd ("net: openvswitch: release vport resources on failure")
    Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall packets")
    Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>

Approved-by: Paolo Abeni <pabeni@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Aaron Conole <aconole@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-07-19 08:47:03 +02:00
Timothy Redaelli 3ba9e9cc48 net: openvswitch: add support for l4 symmetric hashing
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2188082
Upstream Status: commit e069ba07e6c7

commit e069ba07e6c7af69e119316bc87ff44869095f49
Author: Aaron Conole <aconole@redhat.com>
Date:   Fri Jun 9 09:59:55 2023 -0400

    net: openvswitch: add support for l4 symmetric hashing

    Since its introduction, the ovs module execute_hash action allowed
    hash algorithms other than the skb->l4_hash to be used.  However,
    additional hash algorithms were not implemented.  This means flows
    requiring different hash distributions weren't able to use the
    kernel datapath.

    Now, introduce support for symmetric hashing algorithm as an
    alternative hash supported by the ovs module using the flow
    dissector.

    Output of flow using l4_sym hash:

        recirc_id(0),in_port(3),eth(),eth_type(0x0800),
        ipv4(dst=64.0.0.0/192.0.0.0,proto=6,frag=no), packets:30473425,
        bytes:45902883702, used:0.000s, flags:SP.,
        actions:hash(sym_l4(0)),recirc(0xd)

    Some performance testing with no GRO/GSO, two veths, single flow:

        hash(l4(0)):      4.35 GBits/s
        hash(l4_sym(0)):  4.24 GBits/s

    Signed-off-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
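
The defining property of a symmetric L4 hash is that both directions of a connection map to the same value. The sketch below gets that property by ordering the two endpoints before hashing. It is not the flow dissector's algorithm: `zlib.crc32` is only a stand-in hash, and the function name and sample addresses are hypothetical.

```python
import zlib


def l4_sym_hash(src_ip: str, src_port: int,
                dst_ip: str, dst_port: int, proto: int) -> int:
    """Hash a flow so that swapping source and destination changes nothing."""
    # Canonical order: the smaller (ip, port) endpoint always goes first.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{proto}|{low[0]}:{low[1]}|{high[0]}:{high[1]}".encode()
    return zlib.crc32(key)  # stand-in for the kernel's hash function


forward = l4_sym_hash("10.0.0.1", 40000, "10.0.0.2", 443, 6)
reverse = l4_sym_hash("10.0.0.2", 443, "10.0.0.1", 40000, 6)
print(forward == reverse)
```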
2023-07-12 19:15:18 +02:00
Jan Stancek de7da32ded Merge: ALSA - update drivers for 9.3
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2615

This is a backport of ALSA changes up to 6.4-rc6 kernel for RHEL 9.3.

Bugzilla: https://bugzilla.redhat.com/2179848

This upstream patchset updates the ALSA driver code:

- ALSA core
- ALSA HDA
- ALSA USB
- ALSA PCI
- ALSA SoC (mainly SOF including SoundWire drivers)
- Soundwire bus
- dt-bindings for qcom (Qualcomm) and fsl (Freescale) for automotive boards; NVIDIA seems to be handled in !2355

The other components are touched to get things in sync with the current upstream:

Some of the touched drivers are for hardware platforms that are not used in RHEL. Those upstream commits are merged to make future code syncs easier.

Kernel module renames:

- snd-soc-sst-broadwell -> snd-soc-bdw-rt286
- snd-soc-sst-haswell -> snd-soc-hsw-rt5640

Note: The Elf -> ELF patch touches many subsystems. I can remove it on demand.

ARK configuration changes: https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2500 and https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2520

Signed-off-by: Jaroslav Kysela <jkysela@redhat.com>

Approved-by: Adrien Thierry <athierry@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: Jocelyn Falempe <jfalempe@redhat.com>
Approved-by: Eric Chanudet <echanude@redhat.com>
Approved-by: Julia Denham <jdenham@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-07-04 11:15:01 +02:00
Jaroslav Kysela 3e088c880f scripts/spelling.txt: add "exsits" pattern and fix typo instances
Bugzilla: https://bugzilla.redhat.com/2179848

Conflicts: omit iscsi_iser.c (in RH commit b13392d470)

commit 1b381f6fe495fffbbdace1ee530afb74287c809d
Author: Luca Ceresoli <luca.ceresoli@bootlin.com>
Date:   Thu Jan 26 16:22:05 2023 +0100

    scripts/spelling.txt: add "exsits" pattern and fix typo instances

    Fix typos and add the following to the scripts/spelling.txt:

      exsits||exists

    Link: https://lkml.kernel.org/r/20230126152205.959277-1-luca.ceresoli@bootlin.com
    Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Jaroslav Kysela <jkysela@redhat.com>
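
The `exsits||exists` entry above uses the `typo||correction` form. A minimal sketch of reading such entries and applying them to a line of text follows. It is not how the kernel scripts consume the file; the function names are hypothetical, and treating `#` lines as comments is an assumption.

```python
import re


def parse_spelling(text: str) -> dict[str, str]:
    """Parse 'typo||correction' lines into a lookup table."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: '#' is a comment
            continue
        typo, sep, fix = line.partition("||")
        if sep:
            table[typo] = fix
    return table


def fix_typos(text: str, table: dict[str, str]) -> str:
    """Replace whole-word typos in `text` using `table`."""
    if not table:
        return text
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)


table = parse_spelling("exsits||exists\n")
print(fix_typos("check whether the file exsits", table))
```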
2023-06-21 16:20:54 +02:00
Eelco Chaudron 6785b5bdee net: openvswitch: fix upcall counter access before allocation
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2203263
Upstream Status: Backport net-next.git commit de9df6c6b27e
Conflicts: none

Backport of upstream commit:

commit de9df6c6b27e22d7bdd20107947ef3a20e687de5
Author: Eelco Chaudron <echaudro@redhat.com>
Date:   Tue Jun 6 13:56:35 2023 +0200

    net: openvswitch: fix upcall counter access before allocation

    Currently, the per cpu upcall counters are allocated after the vport is
    created and inserted into the system. This could lead to the datapath
    accessing the counters before they are allocated resulting in a kernel
    Oops.

    Here is an example:

      PID: 59693    TASK: ffff0005f4f51500  CPU: 0    COMMAND: "ovs-vswitchd"
       #0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4
       #1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc
       #2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60
       #3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58
       #4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388
       #5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c
       #6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68
       #7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch]
       ...

      PID: 58682    TASK: ffff0005b2f0bf00  CPU: 0    COMMAND: "kworker/0:3"
       #0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758
       #1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994
       #2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8
       #3 [ffff80000a5d3120] die at ffffb70f0628234c
       #4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8
       #5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4
       #6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4
       #7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710
       #8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74
       #9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac
      #10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24
      #11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc
      #12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch]
      #13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch]
      #14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch]
      #15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch]
      #16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch]
      #17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at ffffb70f06079f90

    We moved the per cpu upcall counter allocation to the existing vport
    alloc and free functions to solve this.

    Fixes: 95637d91fefd ("net: openvswitch: release vport resources on failure")
    Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall packets")
    Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2023-06-12 14:31:52 +02:00
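
Editorial note on 6785b5bdee: the fix above is an instance of "finish allocating before publishing". Below is a minimal userspace sketch of that ordering, not the openvswitch code. Every name in it (`vport_alloc`, `vport_publish`, `vport_upcall`, `port_table`, `struct upcall_stats`) is hypothetical, and plain `calloc` stands in for the kernel's per-CPU allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-CPU upcall counters. */
struct upcall_stats {
    uint64_t n_success;
    uint64_t n_fail;
};

/* Hypothetical, heavily simplified port object. */
struct vport {
    int port_no;
    struct upcall_stats *upcall_stats;
};

#define MAX_PORTS 8

/* Lookup table standing in for "inserted into the system". */
static struct vport *port_table[MAX_PORTS];

/* Allocate the port and its counters together, so that a port without
 * counters can never be observed by a lookup. */
static struct vport *vport_alloc(int port_no)
{
    struct vport *vp = calloc(1, sizeof(*vp));

    if (!vp)
        return NULL;

    vp->upcall_stats = calloc(1, sizeof(*vp->upcall_stats));
    if (!vp->upcall_stats) {
        free(vp);
        return NULL;
    }

    vp->port_no = port_no;
    return vp;
}

/* Release the counters in the matching free path. */
static void vport_free(struct vport *vp)
{
    if (!vp)
        return;
    free(vp->upcall_stats);
    free(vp);
}

/* Publish only a fully initialized port. */
static int vport_publish(struct vport *vp)
{
    if (vp->port_no < 0 || vp->port_no >= MAX_PORTS || port_table[vp->port_no])
        return -1;
    port_table[vp->port_no] = vp;
    return 0;
}

/* Fast path: any port found in the table already has its counters. */
static int vport_upcall(int port_no, int ok)
{
    struct vport *vp = NULL;

    if (port_no >= 0 && port_no < MAX_PORTS)
        vp = port_table[port_no];
    if (!vp)
        return -1;

    if (ok)
        vp->upcall_stats->n_success++;
    else
        vp->upcall_stats->n_fail++;
    return 0;
}
```

With this ordering, the fast path needs no NULL check on the counters: publication happens only after allocation has succeeded, and the free path mirrors the alloc path.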
Ivan Vecera 1cb324e3cc net: Remove the obsolte u64_stats_fetch_*_irq() users (net).
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2193170

Conflicts:
* net/netfilter/ipvs/ip_vs_ctl.c
  - the change was already applied by RHEL commit 914c1e31d9 ("ipvs:
    use u64_stats_t for the per-cpu counters")
* net/core/devlink.c
  - hunk was applied in different file (net/devlink/leftover.c)

commit d120d1a63b2c484d6175873d8ee736a633f74b70
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Oct 26 15:22:15 2022 +0200

    net: Remove the obsolte u64_stats_fetch_*_irq() users (net).

    Now that the 32bit UP oddity is gone and 32bit uses always a sequence
    count, there is no need for the fetch_irq() variants anymore.

    Convert to the regular interface.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-06-08 13:38:11 +02:00
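
Editorial note on 1cb324e3cc: the "regular interface" mentioned above is a sequence-count reader loop (begin, copy, retry). The sketch below imitates that pattern in userspace with C11 atomics. It is an illustration under assumptions, not the kernel's `u64_stats_sync` implementation: all names (`stats_sync`, `stats_fetch_begin`, `stats_fetch_retry`, `stats_update`, `stats_read`) are hypothetical, a single writer is assumed, and sequentially consistent atomics are used for simplicity rather than speed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a sequence counter guarding the stats. */
struct stats_sync {
    atomic_uint seq;
};

struct pkt_stats {
    struct stats_sync syncp;
    _Atomic uint64_t packets;
    _Atomic uint64_t bytes;
};

/* Writer side (single writer assumed): the sequence is odd while an
 * update is in progress and even once it has completed. */
static void stats_update(struct pkt_stats *st, uint64_t len)
{
    atomic_fetch_add(&st->syncp.seq, 1);   /* becomes odd: update begins */
    atomic_fetch_add(&st->packets, 1);
    atomic_fetch_add(&st->bytes, len);
    atomic_fetch_add(&st->syncp.seq, 1);   /* becomes even: update done */
}

/* Reader side: wait for an even sequence and remember it. */
static unsigned int stats_fetch_begin(struct stats_sync *s)
{
    unsigned int seq;

    while ((seq = atomic_load(&s->seq)) & 1u)
        ;                                   /* writer active, spin */
    return seq;
}

/* Nonzero if a writer ran since stats_fetch_begin(), so the copy may be torn. */
static int stats_fetch_retry(struct stats_sync *s, unsigned int start)
{
    return atomic_load(&s->seq) != start;
}

/* Take a consistent snapshot of both counters. */
static void stats_read(struct pkt_stats *st, uint64_t *packets, uint64_t *bytes)
{
    unsigned int start;

    do {
        start = stats_fetch_begin(&st->syncp);
        *packets = atomic_load(&st->packets);
        *bytes = atomic_load(&st->bytes);
    } while (stats_fetch_retry(&st->syncp, start));
}
```

The reader never blocks the writer; it simply repeats the copy whenever the sequence changed underneath it, which is why one reader variant suffices once a sequence count is always present.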
Jan Stancek 6318ae37c7 Merge: ovs: stable backports for 9.3 phase 1
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2438

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2190207

Signed-off-by: Antoine Tenart <atenart@redhat.com>

Approved-by: Andrea Claudi <aclaudi@redhat.com>
Approved-by: Davide Caratti <dcaratti@redhat.com>
Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Marcelo Ricardo Leitner <mleitner@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-06-01 07:25:53 +02:00