Commit Graph

165 Commits

Author SHA1 Message Date
Lucas Zampieri 03feb1b243 Merge: CVE-2024-27395: net: openvswitch: Fix Use-After-Free in ovs_ct_exit
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4578

JIRA: https://issues.redhat.com/browse/RHEL-36364  
CVE: CVE-2024-27395

```
net: openvswitch: Fix Use-After-Free in ovs_ct_exit

Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
of ovs_ct_limit_exit, is not part of the RCU read critical section, it
is possible that the RCU grace period will pass during the traversal and
the key will be freed.

To prevent this, it should be changed to hlist_for_each_entry_safe.

Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 5ea7b72d4fac2fdbc0425cd8f2ea33abe95235b2)
```

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>

Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-07-29 19:30:30 +00:00
cki-backport-bot c29d123d5c net: openvswitch: Fix Use-After-Free in ovs_ct_exit
JIRA: https://issues.redhat.com/browse/RHEL-36364
CVE: CVE-2024-27395

commit 5ea7b72d4fac2fdbc0425cd8f2ea33abe95235b2
Author: Hyunwoo Kim <v4bel@theori.io>
Date:   Mon Apr 22 05:37:17 2024 -0400

    net: openvswitch: Fix Use-After-Free in ovs_ct_exit

    Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
    of ovs_ct_limit_exit, is not part of the RCU read critical section, it
    is possible that the RCU grace period will pass during the traversal and
    the key will be freed.

    To prevent this, it should be changed to hlist_for_each_entry_safe.

    Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
    Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-06-25 17:40:27 +00:00
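
The traversal pattern this fix relies on can be sketched in user-space C. This is only an illustration, not the kernel code: `struct node`, `push()` and `free_all_safe()` are hypothetical names, and plain `free()` stands in for `kfree_rcu()`. The point is that a "safe" loop reads the link to the successor before the current entry is released, so it never touches memory that may already be gone.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one entry on a hash-list bucket. */
struct node {
	int zone;
	struct node *next;
};

/* Push a new entry at the head; returns the new head, or the old head
 * unchanged if the allocation fails. */
struct node *push(struct node *head, int zone)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->zone = zone;
	n->next = head;
	return n;
}

/*
 * Free every entry using the "safe" iteration idea: the successor is saved
 * in 'next' before the current entry is released, so the loop step never
 * dereferences freed memory. Returns the number of entries freed.
 */
int free_all_safe(struct node *head)
{
	struct node *cur, *next;
	int freed = 0;

	for (cur = head; cur; cur = next) {
		next = cur->next;	/* read the link while 'cur' is still valid */
		free(cur);
		freed++;
	}
	return freed;
}
```

A loop that instead advanced with `cur = cur->next` after the `free()` call would read from released memory, which is the user-space analogue of the problem described in the commit message.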
Xin Long 08cd1c3932 openvswitch: get related ct labels from its master if it is not confirmed
JIRA: https://issues.redhat.com/browse/RHEL-44560
Tested: compile only

commit a23ac973f67f37e77b3c634e8b1ad5b0164fcc1f
Author: Xin Long <lucien.xin@gmail.com>
Date:   Wed Jun 19 18:08:56 2024 -0400

    openvswitch: get related ct labels from its master if it is not confirmed

    Ilya found a failure in running check-kernel tests with at_groups=144
    (144: conntrack - FTP SNAT orig tuple) in OVS repo. After his further
    investigation, the root cause is that the labels sent to userspace
    for related ct are incorrect.

    The labels for unconfirmed related ct should use its master's labels.
    However, the changes made in commit 8c8b73320805 ("openvswitch: set
    IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
    led to getting labels from this related ct.

    So fix it in ovs_ct_get_labels() by changing to copy labels from its
    master ct if it is an unconfirmed related ct. Note that there is no
    fix needed for ct->mark, as it was already copied from its master
    ct for related ct in init_conntrack().

    Fixes: 8c8b73320805 ("openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
    Reported-by: Ilya Maximets <i.maximets@ovn.org>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
    Tested-by: Ilya Maximets <i.maximets@ovn.org>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Xin Long <lxin@redhat.com>
2024-06-22 16:03:50 -04:00
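
The label selection rule described in this commit can be sketched as follows. This is a simplified user-space model under stated assumptions: `struct conn`, `LABELS_LEN` and `get_labels()` are hypothetical stand-ins, not the kernel's `struct nf_conn` or `ovs_ct_get_labels()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LABELS_LEN 16	/* hypothetical stand-in for the label size */

/* Hypothetical, heavily reduced model of a conntrack entry. */
struct conn {
	bool confirmed;
	struct conn *master;	/* NULL unless this is a related connection */
	bool has_labels;
	unsigned char labels[LABELS_LEN];
};

/*
 * Copy the labels to report for 'ct' into 'out'.
 * An unconfirmed related connection reports its master's labels;
 * everything else reports its own. Missing labels are reported as zeros.
 */
void get_labels(const struct conn *ct, unsigned char out[LABELS_LEN])
{
	if (ct && ct->master && !ct->confirmed)
		ct = ct->master;

	if (ct && ct->has_labels)
		memcpy(out, ct->labels, LABELS_LEN);
	else
		memset(out, 0, LABELS_LEN);
}
```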
Patrick Talbert ee83f8fca0 Merge: ovs: P1 backports for 9.5
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4003

JIRA: https://issues.redhat.com/browse/RHEL-32143

Signed-off-by: Antoine Tenart <atenart@redhat.com>

Approved-by: Eelco Chaudron <echaudro@redhat.com>
Approved-by: Davide Caratti <dcaratti@redhat.com>

Merged-by: Patrick Talbert <ptalbert@redhat.com>
2024-05-07 08:23:25 +02:00
Ivan Vecera 25a5e1ea3a genetlink: remove userhdr from struct genl_info
JIRA: https://issues.redhat.com/browse/RHEL-30656

commit bffcc6882a1bb2be8c9420184966f4c2c822078e
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Aug 14 14:47:16 2023 -0700

    genetlink: remove userhdr from struct genl_info

    Only three families use info->userhdr today and going forward
    we discourage using fixed headers in new families.
    So having the pointer to user header in struct genl_info
    is overkill. Compute the header pointer at runtime.

    Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Link: https://lore.kernel.org/r/20230814214723.2924989-4-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-04-10 09:19:30 +02:00
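
"Compute the header pointer at runtime" can be illustrated with a small sketch. It assumes the family-specific (user) header sits directly after the generic netlink header, which is my reading of the layout and is not stated in the commit message. `struct genl_hdr`, `struct req_info` and `req_info_userhdr()` are hypothetical names, not the kernel's definitions.

```c
#include <stdint.h>

/* Minimal stand-in for the generic netlink header. */
struct genl_hdr {
	uint8_t cmd;
	uint8_t version;
	uint16_t reserved;
};

/* Hypothetical request info: no stored user-header pointer any more,
 * only the pointer to the generic netlink header. */
struct req_info {
	struct genl_hdr *genlhdr;
};

/* Derive the user header location on demand instead of caching it
 * (assumes it immediately follows the generic header). */
void *req_info_userhdr(const struct req_info *info)
{
	return (uint8_t *)info->genlhdr + sizeof(struct genl_hdr);
}
```

The trade-off is one pointer addition per access in exchange for a smaller per-request structure.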
Antoine Tenart 5b4efc3e3e net: openvswitch: fix unwanted error log on timeout policy probing
JIRA: https://issues.redhat.com/browse/RHEL-32143
Upstream Status: net.git

commit 4539f91f2a801c0c028c252bffae56030cfb2cae
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Wed Apr 3 22:38:01 2024 +0200

    net: openvswitch: fix unwanted error log on timeout policy probing

    On startup, ovs-vswitchd probes different datapath features including
    support for timeout policies.  While probing, it tries to execute
    certain operations with OVS_PACKET_ATTR_PROBE or OVS_FLOW_ATTR_PROBE
    attributes set.  These attributes tell the openvswitch module to not
    log any errors when they occur as it is expected that some of the
    probes will fail.

    For some reason, setting the timeout policy ignores the PROBE attribute
    and logs a failure anyway.  This is causing the following kernel log
    on each re-start of ovs-vswitchd:

      kernel: Failed to associated timeout policy `ovs_test_tp'

    Fix that by using the same logging macro that all other messages are
    using.  The message will still be printed at info level when needed
    and will be rate limited, but with a net rate limiter instead of
    generic printk one.

    The nf_ct_set_timeout() itself will still print some info messages,
    but at least this change makes logging in openvswitch module more
    consistent.

    Fixes: 06bd2bdf19 ("openvswitch: Add timeout support to ct action")
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
    Acked-by: Eelco Chaudron <echaudro@redhat.com>
    Link: https://lore.kernel.org/r/20240403203803.2137962-1-i.maximets@ovn.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2024-04-08 17:06:24 +02:00
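
The intent of the PROBE attributes can be sketched as a logging helper that stays silent for probe requests. This is a hypothetical user-space illustration: `log_unless_probe()` is not the macro used by the openvswitch module, and rate limiting is left out.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Emit 'msg' unless the request was flagged as a probe.
 * Returns true if a message was written, false if it was suppressed.
 */
bool log_unless_probe(bool probe, const char *msg)
{
	if (probe)
		return false;	/* probes are expected to fail sometimes */

	fprintf(stderr, "%s\n", msg);
	return true;
}
```

Routing every error message through one helper of this kind is what keeps a single code path from logging during feature probing.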
Davide Caratti 08f7955211 net/sched: act_ct: Always fill offloading tuple iifidx
JIRA: https://issues.redhat.com/browse/RHEL-21360
Upstream Status: net.git commit 9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba

commit 9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Fri Nov 3 16:14:10 2023 +0100

    net/sched: act_ct: Always fill offloading tuple iifidx

    Referenced commit doesn't always set iifidx when offloading the flow to
    hardware. Fix the following cases:

    - nf_conn_act_ct_ext_fill() is called before extension is created with
    nf_conn_act_ct_ext_add() in tcf_ct_act(). This can cause rule offload with
    unspecified iifidx when connection is offloaded after only a single
    original-direction packet has been processed by tc data path. Always fill
    the new nf_conn_act_ct_ext instance after creating it in
    nf_conn_act_ct_ext_add().

    - Offloading of unidirectional UDP NEW connections is now supported, but ct
    flow iifidx field is not updated when connection is promoted to
    bidirectional, which can result in reply-direction iifidx being zero when
    refreshing the connection. Fill in the extension and update flow iifidx
    before calling flow_offload_refresh().

    Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
    Reviewed-by: Paul Blakey <paulb@nvidia.com>
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections")
    Link: https://lore.kernel.org/r/20231103151410.764271-1-vladbu@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-01-11 15:54:54 +01:00
Ivan Vecera 946cbed244 openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack
JIRA: https://issues.redhat.com/browse/RHEL-1773

commit 8c8b733208058702da451b7d60a12c0ff90b6879
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sun Jul 16 17:09:19 2023 -0400

    openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack

    Not setting IPS_CONFIRMED in tmpl allows the exp not to be removed from
    the hashtable on lookup, which lets us simplify the exp processing code a
    lot in openvswitch conntrack.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-10-13 09:03:13 +02:00
Adrian Moreno 6a98fafa3d net: openvswitch: add misc error drop reasons
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2232283
Upstream Status: net-next.git

commit 43d95b30cf5793cdd3c7b1c1cd5fead9b469bd60
Author: Adrian Moreno <amorenoz@redhat.com>
Date:   Fri Aug 11 16:12:52 2023 +0200

    net: openvswitch: add misc error drop reasons

    Use drop reasons from include/net/dropreason-core.h when a reasonable
    candidate exists.

    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2023-08-21 08:34:23 +02:00
Ivan Vecera 2dcc14314d net: extract nf_ct_handle_fragments to nf_conntrack_ovs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2172886

commit 0785407e78d4bce56e04d92a6c961900b3d513dd
Author: Xin Long <lucien.xin@gmail.com>
Date:   Tue Feb 7 17:52:10 2023 -0500

    net: extract nf_ct_handle_fragments to nf_conntrack_ovs

    Now handle_fragments() in OVS and TC have similar code, and
    this patch removes the duplicate code by moving the function
    to nf_conntrack_ovs.

    Note that skb_clear_hash(skb) or skb->ignore_df = 1 should be
    done only when defrag returns 0, as it does in other places
    in kernel.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-05-10 20:48:52 +02:00
Ivan Vecera 247eaef750 openvswitch: move key and ovs_cb update out of handle_fragments
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2172886

commit 1b83bf4489cbc47d88976291cc967a17adb8e118
Author: Xin Long <lucien.xin@gmail.com>
Date:   Tue Feb 7 17:52:08 2023 -0500

    openvswitch: move key and ovs_cb update out of handle_fragments

    This patch has no functional changes and just moves key and ovs_cb update
    out of handle_fragments, and skb_clear_hash() and skb->ignore_df change
    into handle_fragments(), to make it easier to move the duplicate code
    from handle_fragments() into nf_conntrack_ovs later.

    Note that it changes to pass info->family to handle_fragments() instead
    of key for the packet type check, as info->family is set according to
    key->eth.type in ovs_ct_copy_action() when creating the action.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-05-10 20:48:52 +02:00
Ivan Vecera 0ccb6dc8a4 net: extract nf_ct_skb_network_trim function to nf_conntrack_ovs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2172886

commit 67fc5d7ffbd4f9cf52adf166f5bc9a35fef37f24
Author: Xin Long <lucien.xin@gmail.com>
Date:   Tue Feb 7 17:52:07 2023 -0500

    net: extract nf_ct_skb_network_trim function to nf_conntrack_ovs

    There is almost the same code in ovs_skb_network_trim() and
    tcf_ct_skb_network_trim(); this patch extracts it into a function
    nf_ct_skb_network_trim() and moves the function to nf_conntrack_ovs.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-05-10 20:48:52 +02:00
Ivan Vecera e785d89eaf net: move the nat function to nf_nat_ovs for ovs and tc
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2172886

commit ebddb1404900657b7f03a56ee4c34a9d218c4030
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Dec 8 11:56:12 2022 -0500

    net: move the nat function to nf_nat_ovs for ovs and tc

    There are two nat functions that are nearly the same in both OVS and
    TC code, (ovs_)ct_nat_execute() and ovs_ct_nat/tcf_ct_act_nat().

    This patch creates nf_nat_ovs.c under netfilter and moves them
    there then exports nf_ct_nat() so that it can be shared by both
    OVS and TC, and keeps the nat (type) check and nat flag update
    in OVS and TC's own place, as these parts are different between
    OVS and TC.

    Note that in OVS nat function it was using skb->protocol to get
    the proto as it already skips vlans in key_extract(), while it
    doesn't in TC, and TC has to call skb_protocol() to get proto.
    So in nf_ct_nat_execute(), we keep using skb_protocol() which
    works for both OVS and TC conntrack.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-05-10 20:48:49 +02:00
Xin Long 0c18c7cb64 openvswitch: use skb_ip_totlen in conntrack
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2185290
Tested: compile only

commit ec84c955a0d06cef31664bae328d94be7a3e2f03
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sat Jan 28 10:58:32 2023 -0500

    openvswitch: use skb_ip_totlen in conntrack

    IPv4 GSO packets may get processed in ovs_skb_network_trim(),
    and we need to use skb_ip_totlen() to get iph totlen.

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Aaron Conole <aconole@redhat.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Xin Long <lxin@redhat.com>
2023-05-02 10:36:10 -04:00
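
A rough sketch of why a helper is needed here, under an assumption the commit message does not spell out: a very large IPv4 GSO packet may carry a zero total-length field because the real length does not fit in 16 bits, so the length has to be derived from the buffer instead. `struct pkt` and `pkt_ip_totlen()` are hypothetical user-space stand-ins, not the kernel's `skb_ip_totlen()`.

```c
#include <stdint.h>

/* Hypothetical, reduced packet model. */
struct pkt {
	uint16_t ip_tot_len;	 /* IPv4 total length, host byte order; 0 if unusable */
	uint32_t len;		 /* bytes in the whole buffer */
	uint32_t network_offset; /* offset of the IPv4 header in the buffer */
};

/* Use the header field when it is set, otherwise fall back to the
 * buffer length measured from the start of the IPv4 header. */
uint32_t pkt_ip_totlen(const struct pkt *p)
{
	return p->ip_tot_len ? p->ip_tot_len : p->len - p->network_offset;
}
```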
Jan Stancek cb25836a90 Merge: netfilter: conntrack: Fix data-races around ct mark
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2237

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2180943
Upstream Status: All mainline in linux.git.
Conflicts: clean cherry-picks

nf_conn:mark can be read from and written to in parallel. Use
READ_ONCE()/WRITE_ONCE() for reads and writes to prevent unwanted
compiler optimizations.

Also grab the two followup fixes to avoid a compiler warning
and make sure ctnetlink events still include the ctmark in the
delete notification.

Signed-off-by: Florian Westphal <fwestpha@redhat.com>

Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Eelco Chaudron <echaudro@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-03-30 12:36:45 +02:00
Florian Westphal 539491426c netfilter: conntrack: Fix data-races around ct mark
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2180943
Upstream Status: commit 52d1aa8b8249f

commit 52d1aa8b8249ff477aaa38b6f74a8ced780d079c
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Wed Nov 9 12:39:07 2022 -0700

    netfilter: conntrack: Fix data-races around ct mark

    nf_conn:mark can be read from and written to in parallel. Use
    READ_ONCE()/WRITE_ONCE() for reads and writes to prevent unwanted
    compiler optimizations.

    Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Florian Westphal <fwestpha@redhat.com>
2023-03-24 11:20:55 +01:00
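
The access pattern can be sketched in user-space C. The two macros below are simplified stand-ins written for this example (they assume GCC or Clang for `__typeof__`) and are not the kernel's definitions; `struct conn` and the accessor names are hypothetical.

```c
#include <stdint.h>

/* Simplified stand-ins: force exactly one load or store through a
 * volatile-qualified access so the compiler cannot tear, fuse or repeat it. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical, reduced connection entry. */
struct conn {
	uint32_t mark;
};

/* Read the mark with a single load. */
uint32_t conn_get_mark(const struct conn *ct)
{
	return READ_ONCE(ct->mark);
}

/* Update only the bits selected by 'mask', using one load and one store. */
void conn_set_mark(struct conn *ct, uint32_t mark, uint32_t mask)
{
	uint32_t new_mark = (conn_get_mark(ct) & ~mask) | (mark & mask);

	WRITE_ONCE(ct->mark, new_mark);
}
```

Note that this only constrains the compiler. The read-modify-write in `conn_set_mark()` is still not atomic as a whole.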
Ivan Vecera 6fb59586eb genetlink: start to validate reserved header bytes
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2175249

Conflicts:
* kernel/taskstats.c
  context conflict due to missing edc73c7261ca ("kernel: make taskstats
  available from all net namespaces")
* fs/ksmbd/transport_ipc.c
* net/ipv6/ioam6.c
  hunks skipped as the files are not present in RHEL kernel

commit 9c5d03d362519f36cd551aec596388f895c93d2d
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed Aug 24 17:18:30 2022 -0700

    genetlink: start to validate reserved header bytes

    We had historically not checked that genlmsghdr.reserved
    is 0 on input which prevents us from using those precious
    bytes in the future.

    One use case would be to extend the cmd field, which is
    currently just 8 bits wide and 256 is not a lot of commands
    for some core families.

    To make sure that new families do the right thing by default
    put the onus of opting out of validation on existing families.

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-03-06 15:42:45 +01:00
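
The validation idea can be sketched as follows. The names are hypothetical (`struct genl_hdr`, `check_header()`, the `strict` flag), and the `-EINVAL` return value is an assumption for illustration rather than something taken from the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the generic netlink header. */
struct genl_hdr {
	uint8_t cmd;
	uint8_t version;
	uint16_t reserved;
};

/*
 * Reject requests whose reserved bytes are not zero.
 * 'strict' models a family that validates by default; an existing family
 * that opted out of validation would pass false.
 */
int check_header(const struct genl_hdr *hdr, bool strict)
{
	if (strict && hdr->reserved != 0)
		return -EINVAL;
	return 0;
}
```

Rejecting non-zero values now is what makes it possible to give the reserved bytes a meaning later without breaking senders that used to put garbage there.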
Herton R. Krzesinski b8e14263db Merge: net: add helper support in tc act_ct for ovs offloading
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1967

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2106859
Tested: bz reproducer

I didn't backport the cleanups in upstream patchset of:

  "net: eliminate the duplicate code in the ct nat functions of ovs and tc"

as some of the patches have been already in:

  https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1932

I may do the rest backport for the cleanups when that MR is merged.

Signed-off-by: Xin Long <lxin@redhat.com>

Approved-by: Aaron Conole <aconole@redhat.com>
Approved-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Approved-by: Andrea Claudi <aclaudi@redhat.com>

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
2023-02-08 01:40:19 +00:00
Xin Long 1391e064b6 net: move add ct helper function to nf_conntrack_helper for ovs and tc
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2106859
Tested: compile only

commit f96cba2eb923c025014fe74a50e104b7c5234feb
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sun Nov 6 15:34:15 2022 -0500

    net: move add ct helper function to nf_conntrack_helper for ovs and tc

    Move ovs_ct_add_helper from openvswitch to nf_conntrack_helper and
    rename as nf_ct_add_helper, so that it can be used in TC act_ct in
    the next patch.

    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Xin Long <lxin@redhat.com>
2023-01-26 18:14:35 -05:00
Xin Long d1fffd80fc net: move the ct helper function to nf_conntrack_helper for ovs and tc
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2106859
Tested: compile only

commit ca71277f36e0781db663aedeb5fc1e26e7c144c4
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sun Nov 6 15:34:14 2022 -0500

    net: move the ct helper function to nf_conntrack_helper for ovs and tc

    Move ovs_ct_helper from openvswitch to nf_conntrack_helper and rename
    as nf_ct_helper so that it can be used in TC act_ct in the next patch.
    Note that it also adds the checks for the family and proto, as in TC
    act_ct, the packets with correct family and proto are not guaranteed.

    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Xin Long <lxin@redhat.com>
2023-01-26 18:14:35 -05:00
Antoine Tenart 06c5e46444 openvswitch: return NF_DROP when fails to add nat ext in ovs_ct_nat
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2163374
Upstream Status: linux.git

commit 2b85144ab36e0e870f59b5ae55e299179eb8cdb8
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Dec 8 11:56:10 2022 -0500

    openvswitch: return NF_DROP when fails to add nat ext in ovs_ct_nat

    When it fails to allocate nat ext, the packet should be dropped, like
    the memory allocation failures in other places in ovs_ct_nat().

    This patch changes ovs_ct_nat() to return NF_DROP when it fails to add
    the nat ext before doing NAT, which also keeps it consistent with tc
    action ct's processing in tcf_ct_act_nat().

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2023-01-23 14:40:30 +01:00
Antoine Tenart 4f41252a94 openvswitch: return NF_ACCEPT when OVS_CT_NAT is not set in info nat
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2163374
Upstream Status: linux.git

commit 7795928921332fdd52c33eab73f1280d5e58678a
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Dec 8 11:56:09 2022 -0500

    openvswitch: return NF_ACCEPT when OVS_CT_NAT is not set in info nat

    If either OVS_CT_SRC_NAT or OVS_CT_DST_NAT is set, OVS_CT_NAT must be
    set in info->nat. Thus, if OVS_CT_NAT is not set in info->nat, it
    will definitely not do NAT but return NF_ACCEPT in ovs_ct_nat().

    This patch changes nothing functional but only makes this return
    earlier in ovs_ct_nat() to stay consistent with TC's processing
    in tcf_ct_act_nat().

    Reviewed-by: Saeed Mahameed <saeed@kernel.org>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2023-01-23 14:40:25 +01:00
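
The control flow described by the two ovs_ct_nat() changes above (accept early when NAT was not requested, drop when the NAT extension cannot be added) can be sketched like this. All names, the flag values and the `ext_ok` parameter are hypothetical stand-ins for illustration; the actual NAT work is reduced to a counter.

```c
#include <stdbool.h>

/* Stand-ins for the netfilter verdicts. */
enum verdict { VERDICT_DROP = 0, VERDICT_ACCEPT = 1 };

/* Illustrative flag bits for the requested NAT type. */
#define CT_NAT		(1u << 0)
#define CT_SRC_NAT	(1u << 1)
#define CT_DST_NAT	(1u << 2)

int nat_calls;	/* counts how often the (stubbed) NAT step ran */

/* Stub for the real translation step. */
static enum verdict do_nat(unsigned int flags)
{
	(void)flags;
	nat_calls++;
	return VERDICT_ACCEPT;
}

/*
 * 'ext_ok' models whether the NAT extension could be attached to the
 * connection (false stands for an allocation failure).
 */
enum verdict ct_nat(unsigned int nat_flags, bool ext_ok)
{
	if (!(nat_flags & CT_NAT))
		return VERDICT_ACCEPT;	/* NAT not requested: nothing to do */

	if (!ext_ok)
		return VERDICT_DROP;	/* treat like any other allocation failure */

	return do_nat(nat_flags);
}
```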
Antoine Tenart c317e23bd3 openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2163374
Upstream Status: linux.git

commit bf14f4923d516d77320500461c0692c9d4480c30
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Dec 8 11:56:08 2022 -0500

    openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute

    The calls to ovs_ct_nat_execute() are as below:

      ovs_ct_execute()
        ovs_ct_lookup()
          __ovs_ct_lookup()
            ovs_ct_nat()
              ovs_ct_nat_execute()
        ovs_ct_commit()
          __ovs_ct_lookup()
            ovs_ct_nat()
              ovs_ct_nat_execute()

    and since skb_pull_rcsum() and skb_push_rcsum() are already
    called in ovs_ct_execute(), there's no need to do it again
    in ovs_ct_nat_execute().

    Reviewed-by: Saeed Mahameed <saeed@kernel.org>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2023-01-23 14:40:18 +01:00
Antoine Tenart 7c58a58e19 openvswitch: add nf_ct_is_confirmed check before assigning the helper
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2134560
Upstream Status: linux.git

commit 3c1860543fccc1d0cfe3fd6b190e414a418fe60e
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Oct 6 15:45:02 2022 -0400

    openvswitch: add nf_ct_is_confirmed check before assigning the helper

    A WARN_ON call trace would be triggered when 'ct(commit, alg=helper)'
    applies on a confirmed connection:

      WARNING: CPU: 0 PID: 1251 at net/netfilter/nf_conntrack_extend.c:98
      RIP: 0010:nf_ct_ext_add+0x12d/0x150 [nf_conntrack]
      Call Trace:
       <TASK>
       nf_ct_helper_ext_add+0x12/0x60 [nf_conntrack]
       __nf_ct_try_assign_helper+0xc4/0x160 [nf_conntrack]
       __ovs_ct_lookup+0x72e/0x780 [openvswitch]
       ovs_ct_execute+0x1d8/0x920 [openvswitch]
       do_execute_actions+0x4e6/0xb60 [openvswitch]
       ovs_execute_actions+0x60/0x140 [openvswitch]
       ovs_packet_cmd_execute+0x2ad/0x310 [openvswitch]
       genl_family_rcv_msg_doit.isra.15+0x113/0x150
       genl_rcv_msg+0xef/0x1f0

    which can be reproduced with these OVS flows:

      table=0, in_port=veth1,tcp,tcp_dst=2121,ct_state=-trk
      actions=ct(commit, table=1)
      table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
      actions=ct(commit, alg=ftp),normal

    The issue was introduced by commit 248d45f1e1 ("openvswitch: Allow
    attaching helper in later commit") where it somehow removed the check
    of nf_ct_is_confirmed before assigning the helper. This patch is to fix
    it by bringing it back.

    Fixes: 248d45f1e1 ("openvswitch: Allow attaching helper in later commit")
    Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Aaron Conole <aconole@redhat.com>
    Tested-by: Aaron Conole <aconole@redhat.com>
    Link: https://lore.kernel.org/r/c5c9092a22a2194650222bffaf786902613deb16.1665085502.git.lucien.xin@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2022-10-17 16:39:53 +02:00
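
The restored guard can be sketched in a few lines. `struct conn` and `try_assign_helper()` are hypothetical user-space stand-ins; the sketch only shows the ordering of the check, not how the kernel attaches a helper extension.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced connection entry. */
struct conn {
	bool confirmed;
	const char *helper;	/* NULL when no helper is attached */
};

/*
 * Attach 'helper' only while the connection is still unconfirmed.
 * Returns true if the helper was attached, false if the request was skipped.
 */
bool try_assign_helper(struct conn *ct, const char *helper)
{
	if (ct->confirmed)
		return false;	/* confirmed entries must not grow new extensions */
	if (ct->helper)
		return false;	/* already has a helper */

	ct->helper = helper;
	return true;
}
```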
Antoine Tenart 716597aa9a net: openvswitch: allow conntrack in non-initial user namespace
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2134560
Upstream Status: linux.git

commit 59cd7377660a76780bfdd9fd26da058bcca5320d
Author: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Date:   Fri Sep 23 15:38:20 2022 +0200

    net: openvswitch: allow conntrack in non-initial user namespace

    Similar to the previous commit, the Netlink interface of the OVS
    conntrack module was restricted to global CAP_NET_ADMIN by using
    GENL_ADMIN_PERM. This is changed to GENL_UNS_ADMIN_PERM to support
    unprivileged containers in non-initial user namespace.

    Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2022-10-17 16:39:49 +02:00
Antoine Tenart 3fddd8c5e0 net: openvswitch: fix misuse of the cached connection on tuple changes
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2101452
Upstream Status: linux.git

commit 2061ecfdf2350994e5b61c43e50e98a7a70e95ee
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Tue Jun 7 00:11:40 2022 +0200

    net: openvswitch: fix misuse of the cached connection on tuple changes

    If the packet headers changed, the cached nfct is no longer relevant
    for the packet, and an attempt to re-use it leads to incorrect packet
    classification.

    This issue is causing broken connectivity in OpenStack deployments
    with OVS/OVN due to hairpin traffic being unexpectedly dropped.

    The setup has datapath flows with several conntrack actions and tuple
    changes between them:

      actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
              set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
              set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
              ct(zone=8),recirc(0x4)

    After the first ct() action the packet headers are almost fully
    re-written.  The next ct() tries to re-use the existing nfct entry
    and marks the packet as invalid, so it gets dropped later in the
    pipeline.

    Clear the cached conntrack entry whenever the packet tuple is changed
    to avoid the issue.

    The flow key should not be cleared though, because we should still
    be able to match on the ct_state if the recirculation happens after
    the tuple change but before the next ct() action.

    Cc: stable@vger.kernel.org
    Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
    Reported-by: Frode Nordahl <frode.nordahl@canonical.com>
    Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
    Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
    Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2022-06-27 16:42:13 +02:00
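
The rule from this fix (drop the cached connection on a tuple change, but keep the flow key's conntrack state) can be sketched as follows. `struct pkt` and `set_ipv4_src()` are hypothetical user-space stand-ins for the packet, its cached conntrack reference and a header-rewriting action.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced packet model. */
struct pkt {
	uint32_t ip_src;	/* part of the connection tuple */
	void *cached_ct;	/* conntrack entry found by an earlier ct() action */
	uint32_t key_ct_state;	/* ct_state bits stored in the flow key */
};

/*
 * Rewrite the source address. The tuple no longer matches the cached
 * connection, so the cached reference is dropped and the next ct() action
 * has to do a fresh lookup. The flow key state is left alone on purpose so
 * that a recirculation before the next ct() can still match on it.
 */
void set_ipv4_src(struct pkt *p, uint32_t src)
{
	p->ip_src = src;
	p->cached_ct = NULL;
}
```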
Patrick Talbert 164ce13234 Merge: CNB: Update TC subsystem to upstream v5.18
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/971

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2090410
Depends: https://bugzilla.redhat.com/show_bug.cgi?id=2094002
Tested: Using TC related kernel self-tests

The series rebases TC subsystem to upstream v5.18

Commits:
```
f79a3bcb1a50 ("net/sched: Remove unnecessary if statement")
409f386b8e5d ("qdisc: add new field for qdisc_enqueue tracepoint")
56af5e749f20 ("net/sched: act_skbmod: Add SKBMOD_F_ECN option support")
68f9884837c6 ("tc-testing: Add control-plane selftest for skbmod SKBMOD_F_ECN option")
695176bfe5de ("net_sched: refactor TC action init API")
625af9f0298b ("tc-testing: Add control-plane selftests for sch_mq")
a5397d68b2db ("net/sched: cls_api, reset flags on replay")
efe487fce306 ("fix array-index-out-of-bounds in taprio_change")
1e080f17750d ("net: sched: update default qdisc visibility after Tx queue cnt changes")
2e367522ce6b ("netdevsim: add ability to change channel count")
2d6a58996ee2 ("selftests: net: test ethtool -L vs mq")
f7116fb46085 ("net: sched: move and reuse mq_change_real_num_tx()")
b193e15ac69d ("net: prevent user from passing illegal stab size")
69508d43334e ("net_sched: Use struct_size() and flex_array_size() helpers")
129291980f49 ("net: sched: Use struct_size() helper in kvmalloc()")
fbf307c89eb0 ("gen_stats: Add instead Set the value in __gnet_stats_copy_basic().")
448e163f8b9b ("gen_stats: Add gnet_stats_add_queue().")
7361df4606ba ("mq, mqprio: Use gnet_stats_add_queue().")
10940eb746d4 ("gen_stats: Move remaining users to gnet_stats_add_queue().")
f2efdb179289 ("u64_stats: Introduce u64_stats_set()")
67c9e6270f30 ("net: sched: Protect Qdisc::bstats with u64_stats")
f56940daa5a7 ("net: sched: Use _bstats_update/set() instead of raw writes")
50dc9a8572aa ("net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types")
29cbcd858283 ("net: sched: Remove Qdisc::running sequence counter")
4c57e2fac41c ("net: sched: fix logic error in qdisc_run_begin()")
97604c65bcda ("net: sched: remove one pair of atomic operations")
6b3efbfa4e68 ("net: sch_tbf: Add a graft command")
e22db7bd552f ("net: sched: Allow statistics reads from softirq.")
c5c6e589a8c8 ("net: stats: Read the statistics in ___gnet_stats_copy_basic() instead of adding.")
f25c0515c521 ("net: sched: gred: dynamically allocate tc_gred_qopt_offload")
267463823adb ("net: sch: eliminate unnecessary RCU waits in mini_qdisc_pair_swap()")
85c0c3eb9a66 ("net: sch: simplify condtion for selecting mini_Qdisc_pair buffer")
648a991cf316 ("sch_htb: Add extack messages for EOPNOTSUPP errors")
6de6e46d27ef ("cls_flower: Fix inability to match GRE/IPIP packets")
af0a51113cb7 ("selftests: forwarding: Fix packet matching in mirroring selftests")
cb3ef7b00042 ("net: sched: sch_netem: Refactor code in 4-state loss generator")
bdf1565fe03d ("selftests/tc-testing: match any qdisc type")
b43c2793f5e9 ("netfilter: nfnetlink_queue: silence bogus compiler warning")
43332cf97425 ("net/sched: act_ct: Offload only ASSURED connections")
40bd094d65fc ("flow_offload: fill flags to action structure")
144d4c9e800d ("flow_offload: reject to offload tc actions in offload drivers")
5a9959008fb6 ("flow_offload: add index to flow_action_entry structure")
9c1c0e124ca2 ("flow_offload: rename offload functions with offload instead of flow")
c54e1d920f04 ("flow_offload: add ops to tc_action_ops for flow action setup")
8cbfe939abe9 ("flow_offload: allow user to offload tc action to net device")
7adc57651211 ("flow_offload: add skip_hw and skip_sw to control if offload the action")
bcd64368584b ("flow_offload: rename exts stats update functions with hw")
c7a66f8d8a94 ("flow_offload: add process to update action stats from hardware")
e8cb5bcf6ed6 ("net: sched: save full flags for tc action")
13926d19a11e ("flow_offload: add reoffload process to update hw_count")
c86e0209dc77 ("flow_offload: validate flags of filter and actions")
eb473bac4a4b ("selftests: tc-testing: add action offload selftest for action and filter")
c48c94b0ab75 ("net/sched: use min() macro instead of doing it manually")
963178a06352 ("flow_offload: fix suspicious RCU usage when offloading tc action")
9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
b702436a51df ("net: openvswitch: Fill act ct extension")
7d18a07897d0 ("sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc")
c25af830ab26 ("sch_cake: revise Diffserv docs")
719774377622 ("netfilter: conntrack: convert to refcount_t api")
3fce16493dc1 ("netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook")
285c8a7a5815 ("netfilter: make function op structures const")
6ae7989c9af0 ("netfilter: conntrack: avoid useless indirection during conntrack destruction")
408bdcfce8df ("net: prefer nf_ct_put instead of nf_conntrack_put")
fb80445c438c ("net_sched: restore "mpu xxx" handling")
973bf8fdd12f ("net: sched: Clarify error message when qdisc kind is unknown")
bb62a765b1b5 ("netfilter: conntrack: make all extensions 8-byte alignned")
5f31edc0676b ("netfilter: conntrack: move extension sizes into core")
1bc91a5ddf3e ("netfilter: conntrack: handle ->destroy hook via nat_ops instead")
1015c3de23ee ("netfilter: conntrack: remove extension register api")
34243b9ec856 ("netfilter: nft_ct: fix use after free when attaching zone template")
429c3be8a5e2 ("sch_htb: Fail on unsupported parameters when offload is requested")
98b608629746 ("net: sched: remove psched_tdiff_bounded()")
a459bc9a3a68 ("net: sched: remove qdisc_qlen_cpu()")
04c2a47ffb13 ("net: sched: fix use-after-free in tc_new_tfilter()")
35d39fecbc24 ("net/sched: Enable tc skb ext allocation on chain miss only when needed")
4ddc844eb81d ("net/sched: act_police: more accurate MTU policing")
5891cd5ec46c ("net_sched: add __rcu annotation to netdev->qdisc")
5740d0689096 ("net: sched: limit TC_ACT_REPEAT loops")
2f131de361f6 ("net/sched: act_ct: Fix flow table lookup after ct clear or switching zones")
ecf4a24cf978 ("net: sched: avoid newline at end of message in NL_SET_ERR_MSG_MOD")
b8cd5831c61c ("net: flow_offload: add tc police action parameters")
d97b4b105ce7 ("flow_offload: reject offload for all drivers with invalid police parameters")
fcb6aa86532c ("act_ct: Support GRE offload")
db6140e5e35a ("net/sched: act_ct: Fix flow table lookup failure with no originating ifindex")
d922a99b96d0 ("flow_offload: improve extack msg for user when adding invalid filter")
ab95465cde23 ("net/sched: add vlan push_eth and pop_eth action to the hardware IR")
054d5575cd6e ("net/sched: fix incorrect vlan_push_eth dest field")
bcb74e132a76 ("net/sched: act_ct: fix ref leak when switching zones")
2105f700b53c ("net/sched: flower: fix parsing of ethertype following VLAN header")
e65812fd22eb ("net/sched: fix initialization order when updating chain 0 head")
e8a64bbaaad1 ("net/sched: taprio: Check if socket flags are valid")
3db09e762dc7 ("net/sched: cls_u32: fix netns refcount changes in u32_change()")
ec5b0f605b10 ("net/sched: cls_u32: fix possible leak in u32_init_knode()")
8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
4d42d54a7d6a ("net/sched: act_pedit: sanitize shift argument before usage")
86360030cc51 ("net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6()")
```

Signed-off-by: Ivan Vecera <ivecera@redhat.com>

Approved-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Approved-by: Corinna Vinschen <vinschen@redhat.com>
Approved-by: Jarod Wilson <jarod@redhat.com>
Approved-by: Davide Caratti <dcaratti@redhat.com>
Approved-by: Petr Oros <poros@redhat.com>

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
2022-06-21 10:07:08 +02:00
Ivan Vecera 963d0124f4 net: prefer nf_ct_put instead of nf_conntrack_put
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2090410

commit 408bdcfce8dfd6902f75fbcd3b99d8b24b506597
Author: Florian Westphal <fw@strlen.de>
Date:   Fri Jan 7 05:03:26 2022 +0100

    net: prefer nf_ct_put instead of nf_conntrack_put

    It's the same as nf_conntrack_put(), but without the
    need for an indirect call.  The downside is a module dependency on
    nf_conntrack, but all of these already depend on conntrack anyway.

    Cc: Paul Blakey <paulb@mellanox.com>
    Cc: dev@openvswitch.org
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-06 16:31:30 +02:00
Ivan Vecera 8dc405869d netfilter: conntrack: convert to refcount_t api
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2090410

commit 719774377622bc4025d2a74f551b5dc2158c6c30
Author: Florian Westphal <fw@strlen.de>
Date:   Fri Jan 7 05:03:22 2022 +0100

    netfilter: conntrack: convert to refcount_t api

    Convert nf_conn reference counting from atomic_t to refcount_t based api.
    refcount_t api provides more runtime sanity checks and will warn on
    certain constructs, e.g. refcount_inc() on a zero reference count, which
    usually indicates use-after-free.

    For this reason template allocation is changed to init the refcount to
    1, and the subsequent add operations are removed.

    Likewise, init_conntrack() is changed to set the initial refcount to 1
    instead of calling refcount_inc().

    This is safe because the new entry is not (yet) visible to other cpus.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-06 16:31:29 +02:00
Ivan Vecera f723efa3d6 net: openvswitch: Fill act ct extension
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2090410

commit b702436a51dfdf1e2960fb8e228009e09eedb462
Author: Paul Blakey <paulb@nvidia.com>
Date:   Mon Jan 3 13:44:51 2022 +0200

    net: openvswitch: Fill act ct extension

    To give drivers the originating device information for optimized
    connection tracking offload, fill in act ct extension with
    ifindex from skb.

    Signed-off-by: Paul Blakey <paulb@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-06 16:31:27 +02:00
Antoine Tenart 794402f28c openvswitch: always update flow key after nat
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2082155
Upstream Status: linux.git
Tested: Sanity only

commit 60b44ca6bd7518dd38fa2719bc9240378b6172c3
Author: Aaron Conole <aconole@redhat.com>
Date:   Fri Mar 18 08:43:19 2022 -0400

    openvswitch: always update flow key after nat

    During NAT, a tuple collision may occur.  When this happens, openvswitch
    will make a second pass through NAT which will perform additional packet
    modification.  This will update the skb data, but not the flow key that
    OVS uses.  This means that future flow lookups, and packet matches will
    have incorrect data.  This has been supported since
    5d50aa83e2 ("openvswitch: support asymmetric conntrack").

    That commit failed to properly update the sw_flow_key attributes, since
    it only called the ovs_ct_nat_update_key once, rather than each time
    ovs_ct_nat_execute was called.  As these two operations are linked, the
    ovs_ct_nat_execute() function should always make sure that the
    sw_flow_key is updated after a successful call through NAT infrastructure.

    Fixes: 5d50aa83e2 ("openvswitch: support asymmetric conntrack")
    Cc: Dumitru Ceara <dceara@redhat.com>
    Cc: Numan Siddique <nusiddiq@redhat.com>
    Signed-off-by: Aaron Conole <aconole@redhat.com>
    Acked-by: Eelco Chaudron <echaudro@redhat.com>
    Link: https://lore.kernel.org/r/20220318124319.3056455-1-aconole@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2022-05-18 09:30:11 +02:00
Yejune Deng d2792e91de net: openvswitch: Remove unnecessary skb_nfct()
There is no need for the 'if (skb_nfct(skb))' check, because
nf_conntrack_put() already performs it.

Signed-off-by: Yejune Deng <yejunedeng@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-10 14:18:19 -07:00
Jakub Kicinski 8859a44ea0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

MAINTAINERS
 - keep Chandrasekar
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
 - simple fix + trust the code re-added to param.c in -next is fine
include/linux/bpf.h
 - trivial
include/linux/ethtool.h
 - trivial, fix kdoc while at it
include/linux/skmsg.h
 - move to relevant place in tcp.c, comment re-wrapped
net/core/skmsg.c
 - add the sk = sk // sk = NULL around calls
net/tipc/crypto.c
 - trivial

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09 20:48:35 -07:00
Ilya Maximets 4d51419d49 openvswitch: fix send of uninitialized stack memory in ct limit reply
'struct ovs_zone_limit' has more members than are initialized in
ovs_ct_limit_get_default_limit().  The rest of the memory is random
kernel stack content that ends up being sent to userspace.

Fix that by using a designated initializer that will clear all
non-specified fields.
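
The effect of a designated initializer can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct zone_limit_example` and `build_default_reply()` are hypothetical names, not the kernel's definitions, and the field layout is chosen for the example. Members that are not named in the initializer are zero-initialized, so no stale stack bytes remain in them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a reply structure; not the kernel's definition. */
struct zone_limit_example {
	int32_t  zone_id;
	uint32_t limit;
	uint32_t count;
};

/*
 * Build a reply in which only two members carry meaningful values.
 * The designated initializer zero-initializes every member that is
 * not named, so 'count' is 0 rather than leftover stack content.
 */
static void build_default_reply(struct zone_limit_example *out, uint32_t limit)
{
	struct zone_limit_example zl = {
		.zone_id = -1,	/* assumed marker for "default zone" */
		.limit = limit,
	};

	assert(out != NULL);
	*out = zl;
}
```

Without the initializer, a plain `struct zone_limit_example zl;` followed by two member assignments would leave `count` indeterminate.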

Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-05 12:54:42 -07:00
Christophe JAILLET 7d42e84eb9 net: openvswitch: Use 'skb_push_rcsum()' instead of hand coding it
'skb_push()'/'skb_postpush_rcsum()' can be replaced by an equivalent
'skb_push_rcsum()' which is less verbose.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-04 01:43:02 -07:00
wenxu d29334c15d net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct
When openvswitch conntrack is offloaded with the act_ct action, the first
rule performs conntrack in act_ct in the tc subsystem. If the next rule
misses in tc, the packet falls back to the ovs datapath, but the post_ct
flag is not set, which leaves the ct_state key with the -trk flag.

Fixes: 7baf2429a1 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16 15:22:18 -07:00
Zheng Yongjun 5e359044c1 net: openvswitch: conntrack: simplify the return expression of ovs_ct_limit_get_default_limit()
Simplify the return expression.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-08 16:22:54 -08:00
Numan Siddique e2ef5203c8 net: openvswitch: Be liberal in tcp conntrack.
There is no easy way to distinguish if a conntracked tcp packet is
marked invalid because of tcp_in_window() check error or because
it doesn't belong to an existing connection. With this patch,
openvswitch sets the liberal tcp flag for established sessions so
that out-of-window packets are not marked invalid.

A helper function, nf_ct_set_tcp_be_liberal(nf_conn), is added which
sets this flag for both directions of the nf_conn.

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20201116130126.3065077-1-nusiddiq@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 09:53:48 -08:00
Jakub Kicinski 9d49aea13f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Small conflict around locking in rxrpc_process_event() -
channel_lock moved to bundle in next, while state lock
needs _bh() from net.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 15:44:50 -07:00
Dumitru Ceara 8aa7b526dc openvswitch: handle DNAT tuple collision
With multiple DNAT rules it's possible that after destination
translation the resulting tuples collide.

For example, two openvswitch flows:
nw_dst=10.0.0.10,tp_dst=10, actions=ct(commit,table=2,nat(dst=20.0.0.1:20))
nw_dst=10.0.0.20,tp_dst=10, actions=ct(commit,table=2,nat(dst=20.0.0.1:20))

Assuming two TCP clients initiating the following connections:
10.0.0.10:5000->10.0.0.10:10
10.0.0.10:5000->10.0.0.20:10

Both tuples would translate to 10.0.0.10:5000->20.0.0.1:20 causing
nf_conntrack_confirm() to fail because of tuple collision.

Netfilter handles this case by allocating a null binding for SNAT at
egress by default.  Perform the same operation in openvswitch for DNAT
if no explicit SNAT is requested by the user and allocate a null binding
for SNAT for packets in the "original" direction.
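
The collision described above can be reproduced with a small userspace sketch. It is not the openvswitch or netfilter code: `struct conn_tuple`, `apply_dnat()`, `tuples_collide()` and the `IP4()` macro are hypothetical helpers introduced for illustration, and the addresses are the ones from the example flows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a dotted quad into a host-order 32-bit value (illustrative helper). */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical connection tuple; not the kernel's nf_conntrack_tuple. */
struct conn_tuple {
	uint32_t src_ip;
	uint16_t src_port;
	uint32_t dst_ip;
	uint16_t dst_port;
};

/* Rewrite only the destination, as a DNAT rule would. */
static void apply_dnat(struct conn_tuple *t, uint32_t new_dst_ip,
		       uint16_t new_dst_port)
{
	assert(t != NULL);
	t->dst_ip = new_dst_ip;
	t->dst_port = new_dst_port;
}

/* Two tuples collide when all four fields match. */
static int tuples_collide(const struct conn_tuple *a,
			  const struct conn_tuple *b)
{
	return a->src_ip == b->src_ip && a->src_port == b->src_port &&
	       a->dst_ip == b->dst_ip && a->dst_port == b->dst_port;
}
```

Before translation the two connections differ in their destination address. After both are rewritten to 20.0.0.1:20 they are identical, and only a change on the source side (the SNAT null binding) can make them distinct again.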

Reported-at: https://bugzilla.redhat.com/1877128
Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: 05752523e5 ("openvswitch: Interface with NAT.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:20:35 -07:00
Rikard Falkeborn b980b313e5 net: openvswitch: Constify static struct genl_small_ops
The only usage of these is to assign their address to the small_ops field
in the genl_family struct, which is a const pointer, and applying
ARRAY_SIZE() on them. Make them const to allow the compiler to put them
in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 21:13:36 -07:00
Jakub Kicinski 66a9b9287d genetlink: move to smaller ops wherever possible
Bulk of the genetlink users can use smaller ops, move them.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:11 -07:00
Zeng Tao 7b066d173b net: openswitch: reuse the helper variable to improve the code readablity
In the function ovs_ct_limit_exit, there is already a helper variable
that can be reused to improve readability, so reuse it in this
patch.

Signed-off-by: Zeng Tao <prime.zeng@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 14:24:08 -07:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
markings where they are unnecessary.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Peilin Ye 9aba6c5b49 openvswitch: Prevent kernel-infoleak in ovs_ct_put_key()
ovs_ct_put_key() is potentially copying uninitialized kernel stack memory
into socket buffers, since the compiler may leave a 3-byte hole at the end
of `struct ovs_key_ct_tuple_ipv4` and `struct ovs_key_ct_tuple_ipv6`. Fix
it by initializing `orig` with memset().
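
A minimal userspace sketch of the same pattern follows. `struct tuple_example`, `fill_tuple()` and `tail_is_zero()` are hypothetical names, and the layout is an assumption chosen so that 13 bytes of members are typically padded to 16. Clearing the whole object with memset() before assigning the members means the trailing hole holds zeros in practice, instead of whatever was on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tuple layout: 13 bytes of members, usually padded to 16. */
struct tuple_example {
	uint32_t src;
	uint32_t dst;
	uint16_t sport;
	uint16_t dport;
	uint8_t  proto;
	/* the compiler may insert trailing padding here */
};

/* Clear the whole object first, padding included, then set the members. */
static void fill_tuple(struct tuple_example *out, uint32_t src, uint32_t dst,
		       uint16_t sport, uint16_t dport, uint8_t proto)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->src = src;
	out->dst = dst;
	out->sport = sport;
	out->dport = dport;
	out->proto = proto;
}

/* Return 1 if every byte after the last member is zero, 0 otherwise. */
static int tail_is_zero(const struct tuple_example *t)
{
	const unsigned char *p = (const unsigned char *)t;
	size_t i = offsetof(struct tuple_example, proto) + sizeof(t->proto);

	for (; i < sizeof(*t); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Copying `sizeof(struct tuple_example)` bytes out of such an object, as a netlink put does, would otherwise also copy the padding bytes.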

Fixes: 9dd7f8907c ("openvswitch: Add original direction conntrack tuple to sw_flow_key.")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:09:44 -07:00
Tonghao Zhang 27de77cec9 net: openvswitch: ovs_ct_exit to be done under ovs_lock
syzbot wrote:
| =============================
| WARNING: suspicious RCU usage
| 5.7.0-rc1+ #45 Not tainted
| -----------------------------
| net/openvswitch/conntrack.c:1898 RCU-list traversed in non-reader section!!
|
| other info that might help us debug this:
| rcu_scheduler_active = 2, debug_locks = 1
| ...
|
| stack backtrace:
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
| Workqueue: netns cleanup_net
| Call Trace:
| ...
| ovs_ct_exit
| ovs_exit_net
| ops_exit_list.isra.7
| cleanup_net
| process_one_work
| worker_thread

To avoid that warning, invoke ovs_ct_exit under ovs_lock and add
lockdep_ovsl_is_held as the optional lockdep expression.

Link: https://lore.kernel.org/lkml/000000000000e642a905a0cbee6e@google.com
Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
Reported-by: syzbot+7ef50afd3a211f879112@syzkaller.appspotmail.com
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-20 10:53:54 -07:00
Aaron Conole 5d50aa83e2 openvswitch: support asymmetric conntrack
The openvswitch module shares a common conntrack and NAT infrastructure
exposed via netfilter.  It's possible that a packet needs both SNAT and
DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
this because it runs through the NAT table twice - once on ingress and
again after egress.  The openvswitch module doesn't have such capability.

Like netfilter hook infrastructure, we should run through NAT twice to
keep the symmetry.

Fixes: 05752523e5 ("openvswitch: Interface with NAT.")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-04 16:31:15 -08:00
Yi-Hung Wei 248d45f1e1 openvswitch: Allow attaching helper in later commit
This patch allows attaching a conntrack helper to a confirmed conntrack
entry.  Currently, we can only attach an alg helper to a conntrack entry
when it is in the unconfirmed state.  This patch enables a use case
where we first commit a conntrack entry after it passes some
initial conditions.  After that, the processing pipeline further
checks a couple of packets to determine if the connection belongs to
a particular application, and attaches an alg helper to the connection
at a later stage.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-06 15:23:43 +02:00
Greg Rose ad06a566e1 openvswitch: Properly set L4 keys on "later" IP fragments
When IP fragments are reassembled before being sent to conntrack, the
key from the last fragment is used.  Unless there are reordering
issues, the last fragment received will not contain the L4 ports, so the
key for the reassembled datagram won't contain them.  This patch updates
the key once we have a reassembled datagram.

The handle_fragments() function works on L3 headers so we pull the L3/L4
flow key update code from key_extract into a new function
'key_extract_l3l4'.  Then we add another new function
ovs_flow_key_update_l3l4() and export it so that it is accessible by
handle_fragments() for conntrack packet reassembly.

Co-authored-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-28 14:53:51 -07:00
Yi-Hung Wei 7177895154 openvswitch: Fix conntrack cache with timeout
This patch addresses a conntrack cache issue with timeout policy.
Currently, we do not check if the timeout extension is set properly in the
cached conntrack entry.  Thus, after a packet recirculates from the conntrack
action, the timeout policy is not applied properly.  This patch fixes the
aforementioned issue.

Fixes: 06bd2bdf19 ("openvswitch: Add timeout support to ct action")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-25 14:48:43 -07:00