645 Commits

Author SHA1 Message Date
Patrick Talbert d11925dd01 Merge: virtio-net: fix overflow inside virtnet_rq_alloc
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6240

virtio-net: fix overflow inside virtnet_rq_alloc

JIRA: https://issues.redhat.com/browse/RHEL-73638
CVE: CVE-2024-57843
Upstream: Merged

commit 6aacd1484468361d1d04badfe75f264fa5314864
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Tue Oct 29 16:46:12 2024 +0800

    virtio-net: fix overflow inside virtnet_rq_alloc

    When the frag gets only a single page, this may lead to a regression
    on the VM. In particular, if the sysctl
    net.core.high_order_alloc_disable is 1, the frag always gets a single
    page on refill.

    This can cause reliable crashes or scp failures (for example, when
    copying a 100M file to the VM).

    The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
    of a new frag. When the frag size is larger than PAGE_SIZE,
    everything is fine. However, if the frag is only one page and the
    total size of the buffer and virtnet_rq_dma is larger than one page, an
    overflow may occur.

    The commit f9dac92ba908 ("virtio_ring: enable premapped mode whatever
    use_dma_api") introduced this problem. We reverted some commits to
    fix this in the last Linux version. Now we try to re-enable it and
    fix this bug directly.

    Here, when the frag size is not enough, we reduce the buffer len to fix
    this problem.

    Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com>
    Tested-by: Darren Kenny <darren.kenny@oracle.com>
    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Jon Maloy <jmaloy@redhat.com>

Approved-by: Laurent Vivier <lvivier@redhat.com>
Approved-by: Eugenio Pérez <eperezma@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Patrick Talbert <ptalbert@redhat.com>
2025-02-13 02:24:31 -05:00
Jon Maloy 37f65ebfd0 virtio-net: fix overflow inside virtnet_rq_alloc
JIRA: https://issues.redhat.com/browse/RHEL-73638
CVE: CVE-2024-57843
Upstream: Merged

commit 6aacd1484468361d1d04badfe75f264fa5314864
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Tue Oct 29 16:46:12 2024 +0800

    virtio-net: fix overflow inside virtnet_rq_alloc

    When the frag gets only a single page, this may lead to a regression
    on the VM. In particular, if the sysctl
    net.core.high_order_alloc_disable is 1, the frag always gets a single
    page on refill.

    This can cause reliable crashes or scp failures (for example, when
    copying a 100M file to the VM).

    The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
    of a new frag. When the frag size is larger than PAGE_SIZE,
    everything is fine. However, if the frag is only one page and the
    total size of the buffer and virtnet_rq_dma is larger than one page, an
    overflow may occur.

    The commit f9dac92ba908 ("virtio_ring: enable premapped mode whatever
    use_dma_api") introduced this problem. We reverted some commits to
    fix this in the last Linux version. Now we try to re-enable it and
    fix this bug directly.

    Here, when the frag size is not enough, we reduce the buffer len to fix
    this problem.

    Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com>
    Tested-by: Darren Kenny <darren.kenny@oracle.com>
    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Jon Maloy <jmaloy@redhat.com>
2025-01-21 14:28:17 -05:00
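
The length adjustment described in commit 6aacd1484468 above can be illustrated with a small userspace sketch. This is not the driver code: `RQ_DMA_HDR_SIZE` and `clamp_buf_len()` are hypothetical names, the 16-byte figure is taken from the commit message, and `room` stands in for any extra space the caller reserves after the buffer.

```c
#include <stddef.h>

/* Assumed size of the bookkeeping header placed at the start of a new frag
 * (the commit message states 16 bytes). Hypothetical name. */
#define RQ_DMA_HDR_SIZE 16u

/*
 * Model of the fix: when a fresh frag (offset 0) is too small to hold the
 * header plus the requested buffer and its reserved room, shrink the buffer
 * length by the header size instead of writing past the frag.
 */
static size_t clamp_buf_len(size_t frag_size, size_t frag_offset,
                            size_t len, size_t room)
{
    if (frag_offset == 0 && len + room + RQ_DMA_HDR_SIZE > frag_size) {
        if (len < RQ_DMA_HDR_SIZE)
            return 0; /* nothing usable is left for the buffer */
        len -= RQ_DMA_HDR_SIZE;
    }
    return len;
}
```

With a one-page frag of 4096 bytes and a 4096-byte request, the sketch returns 4080, so header plus buffer fit in the page. A larger frag leaves the length untouched.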
Jon Maloy 525ce2f11c virtio_net: Add hash_key_length check
JIRA: https://issues.redhat.com/browse/RHEL-68253
CVE: CVE-2024-53082
Upstream: Merged

commit 3f7d9c1964fcd16d02a8a9d4fd6f6cb60c4cc530
Author: Philo Lu <lulie@linux.alibaba.com>
Date:   Mon Nov 4 16:57:04 2024 +0800

    virtio_net: Add hash_key_length check

    Add hash_key_length check in virtnet_probe() to avoid possible out of
    bound errors when setting/reading the hash key.

    Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
    Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Joe Damato <jdamato@fastly.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Jon Maloy <jmaloy@redhat.com>
2024-12-09 18:05:28 -05:00
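
The probe-time check added by commit 3f7d9c1964fc above amounts to rejecting a device-reported key length that exceeds the driver's key buffer. A minimal sketch follows, assuming a 40-byte buffer; `MAX_RSS_KEY_SIZE` and `check_hash_key_length()` are hypothetical names, not the driver's identifiers.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed capacity of the driver-side hash key buffer (hypothetical). */
#define MAX_RSS_KEY_SIZE 40u

/*
 * Returns 0 when the length reported by the device fits the buffer,
 * -EINVAL otherwise, so that probing can fail early instead of later
 * reading or writing out of bounds.
 */
static int check_hash_key_length(uint8_t device_key_len)
{
    if (device_key_len > MAX_RSS_KEY_SIZE)
        return -EINVAL;
    return 0;
}
```

Validating once at probe time means later code paths that set or read the key can rely on the length without repeating the check.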
Eric Auger 52fe18afb9 virtio_net: fix missing dma unmap for resize
JIRA: https://issues.redhat.com/browse/RHEL-3230

For rq, we have three cases getting buffers from virtio core:

1. virtqueue_get_buf{,_ctx}
2. virtqueue_detach_unused_buf
3. callback for virtqueue_resize

But in commit 295525e29a5b ("virtio_net: merge dma operations when
filling mergeable buffers"), I missed the dma unmap for case #3.

That leaks memory, because the pages referenced by the unused buffers
are never released.

Running a script like the following will eventually drive the system
out of memory:

    while true
    do
            ethtool -G ens4 rx 128
            ethtool -G ens4 rx 256
            free -m
    done

Fixes: 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20231226094333.47740-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 2311e06b9bf3d44e15f9175af177a782806f688f)
Signed-off-by: Eric Auger <eric.auger@redhat.com>

Conflicts:
in drivers/net/virtio_net.c
Contextual conflict in drivers/net/virtio_net.c because we don't have
struct virtio_net_common_hdr downstream, introduced by
dae64749db25 ("virtio_net: Introduce skb_vnet_common_hdr to avoid
typecasting")
2024-08-21 11:53:04 +02:00
Eric Auger 8dc7939efb virtio_net: avoid data-races on dev->stats fields
JIRA: https://issues.redhat.com/browse/RHEL-3230

Use DEV_STATS_INC() and DEV_STATS_READ() which provide
atomicity on paths that can be used concurrently.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d12a26b74fb77434b73fe39022266c4b00907219)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
2024-08-21 11:51:16 +02:00
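
The idea behind replacing plain counter updates with atomic ones, as commit d12a26b74fb7 above does through DEV_STATS_INC() and DEV_STATS_READ(), can be sketched in userspace with C11 atomics. The struct and function names below are hypothetical, and C11 `stdatomic.h` is used here only as a stand-in for the kernel's own primitives.

```c
#include <stdatomic.h>

/* Hypothetical model of a statistics block updated from several contexts. */
struct dev_stats_model {
    atomic_ulong rx_dropped;
};

/* Atomic increment: concurrent callers cannot lose an update. */
static void stats_inc(atomic_ulong *field)
{
    atomic_fetch_add_explicit(field, 1, memory_order_relaxed);
}

/* Atomic read: the value is never observed half-written. */
static unsigned long stats_read(atomic_ulong *field)
{
    return atomic_load_explicit(field, memory_order_relaxed);
}
```

Relaxed ordering is chosen in this sketch because a statistics counter only needs each update to be indivisible; it does not order other memory accesses.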
Eric Auger 6f87782127 virtio_net: fix the missing of the dma cpu sync
JIRA: https://issues.redhat.com/browse/RHEL-3230

Commit 295525e29a5b ("virtio_net: merge dma operations when filling
mergeable buffers") unmaps the buffer with DMA_ATTR_SKIP_CPU_SYNC when
the dma->ref is zero. We do that with DMA_ATTR_SKIP_CPU_SYNC, because we
do not want to do the sync for the entire page_frag. But that misses the
sync for the current area.

This patch does cpu sync regardless of whether the ref is zero or not.

Fixes: 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers")
Reported-by: Michael Roth <michael.roth@amd.com>
Closes: http://lore.kernel.org/all/20230926130451.axgodaa6tvwqs3ut@amd.com
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 5720c43d5216b5dbd9ab25595f7c61e55d36d4fc)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
2024-08-21 11:51:01 +02:00
Eric Auger bb0f29d172 virtio_net: merge dma operations when filling mergeable buffers
JIRA: https://issues.redhat.com/browse/RHEL-3230

Currently, the virtio core performs a dma operation for each
buffer, even though the same page may be operated on multiple times.

With this patch, the driver performs the dma operation and manages the
dma address itself, based on the premapped feature of the virtio core.

This way, we can perform only one dma operation for the pages of the
alloc frag. This is beneficial for the iommu device.

kernel command line: intel_iommu=on iommu.passthrough=0

       |  strict=0  | strict=1
Before |  775496pps | 428614pps
After  | 1109316pps | 742853pps

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-13-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 295525e29a5b5694a6e96864f0c1365f79639863)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
2024-08-21 11:50:41 +02:00
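
The "one dma operation per alloc frag" scheme from commit 295525e29a5b above can be modeled with a reference count: the frag is mapped once when its first buffer is carved out, and unmapped when the last buffer is released. The sketch below only counts operations and assumes that offset 0 marks a fresh frag; all names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for one page frag. */
struct frag_dma_model {
    unsigned int ref;    /* buffers currently carved from this frag */
    unsigned int maps;   /* how many map operations were issued */
    unsigned int unmaps; /* how many unmap operations were issued */
};

/* Carve a buffer: map only when the frag is fresh (offset 0). */
static void buf_alloc(struct frag_dma_model *d, size_t frag_offset)
{
    if (frag_offset == 0)
        d->maps++;
    d->ref++;
}

/* Release a buffer: unmap only when the last reference goes away. */
static void buf_release(struct frag_dma_model *d)
{
    if (d->ref == 0)
        return; /* nothing outstanding */
    if (--d->ref == 0)
        d->unmaps++;
}
```

In this model, any path that hands buffers back without calling the release step leaves the count above zero, which corresponds to the leak that the later "fix missing dma unmap for resize" commit addresses.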
Lucas Zampieri a2c77db22b Merge: CNB95: ethtool: Support symmetric-xor RSS hash
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3990

JIRA: https://issues.redhat.com/browse/RHEL-31889  
Tested: Preverified by QA  
Depends: !3939 

Commits:
```
b9335a757232 ("net/mlx5e: Make flow classification filters static")
fb6e30a72539 ("net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops")
dcd8dbf9e734 ("net: ethtool: get rid of get/set_rxfh_context functions")
13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
7c402f77e8cb ("net: ethtool: copy input_xfrm to user-space in ethtool_get_rxfh")
0dd415d15505 ("net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm")
501869fecfbc ("net: ethtool: Fix symmetric-xor RSS RX flow hash check")
948f97f9d8d2 ("net: ethtool: reject unsupported RSS input xfrm values")
```

Signed-off-by: Ivan Vecera <ivecera@redhat.com>

Approved-by: Michal Schmidt <mschmidt@redhat.com>
Approved-by: Corinna Vinschen <vinschen@redhat.com>
Approved-by: Antoine Tenart <atenart@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-05-16 13:34:27 +00:00
Ivan Vecera 6555e8128e net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops
JIRA: https://issues.redhat.com/browse/RHEL-31889

Conflicts:
- drivers/net/ethernet/fungible/funeth/funeth_ethtool.c
  hunk removed as the file does not exist in RHEL

commit fb6e30a72539ce28c1323aef4190d35aac106f6f
Author: Ahmed Zaki <ahmed.zaki@intel.com>
Date:   Tue Dec 12 17:33:14 2023 -0700

    net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops

    The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters
    as direct function arguments. This will force us to change the API (and
    all drivers' functions) every time some new parameters are added.

    This is part 1/2 of the fix, as suggested in [1]:

    - First simplify the code by always providing a pointer to all params
       (indir, key and func); the fact that some of them may be NULL seems
       like a weird historic thing or a premature optimization.
       It will simplify the drivers if all pointers are always present.

     - Then make the functions take a dev pointer, and a pointer to a
       single struct wrapping all arguments. The set_* should also take
       an extack.

    Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1]
    Suggested-by: Jakub Kicinski <kuba@kernel.org>
    Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
    Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-04-10 09:30:33 +02:00
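
The motivation in commit fb6e30a72539 above, passing one pointer to a parameter struct instead of a growing list of arguments, can be sketched as follows. The struct layout, the `model_` names and the sample values are hypothetical and simplified; they are not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bundle of RSS parameters handed to a driver callback. */
struct rxfh_param_model {
    uint32_t *indir;   /* indirection table buffer */
    size_t indir_size; /* entries available in indir */
    uint8_t *key;      /* hash key buffer */
    size_t key_size;   /* bytes available in key */
    uint8_t hfunc;     /* hash function id, filled in by the driver */
};

/* Made-up device state for the example. */
static const uint32_t model_indir[2] = { 0, 1 };
static const uint8_t model_key[4] = { 0x6d, 0x5a, 0x56, 0xda };

/* A new field can be added to the struct without touching this signature. */
static int model_get_rxfh(struct rxfh_param_model *p)
{
    size_t n;

    p->hfunc = 1;
    if (p->indir) { /* defensive check kept in this sketch */
        n = p->indir_size < 2 ? p->indir_size : 2;
        memcpy(p->indir, model_indir, n * sizeof(model_indir[0]));
    }
    if (p->key) {
        n = p->key_size < sizeof(model_key) ? p->key_size : sizeof(model_key);
        memcpy(p->key, model_key, n);
    }
    return 0;
}
```

The design benefit is that adding a parameter later changes only the struct and the drivers that care about it, rather than every driver's function signature.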
Ivan Vecera 139012e61c net: move struct netdev_rx_queue out of netdevice.h
JIRA: https://issues.redhat.com/browse/RHEL-31916

Conflicts:
* include/linux/netdevice.h
  Adjusted due to KABI reservations made by RHEL
  commit 3b3a52715a ("net: exclude BPF/XDP from kABI")

commit 49e47a5b6145d86c30022fe0e949bbb24bae28ba
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed Aug 2 18:02:29 2023 -0700

    net: move struct netdev_rx_queue out of netdevice.h

    struct netdev_rx_queue is touched in only a few places
    and having it defined in netdevice.h brings in the dependency
    on xdp.h, because struct xdp_rxq_info gets embedded in
    struct netdev_rx_queue.

    In prep for removal of xdp.h from netdevice.h move all
    the netdev_rx_queue stuff to a new header.

    We could technically break the new header up to avoid
    the sysfs.h include but it's so rarely included it
    doesn't seem to be worth it at this point.

    Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
    Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-04-05 16:03:26 +02:00
Ivan Vecera b4aa21f5ad net: introduce and use skb_frag_fill_page_desc()
JIRA: https://issues.redhat.com/browse/RHEL-12625

Conflicts:
* drivers/net/ethernet/freescale/enetc/enetc.c
- context due to missing 8feb020f92a5 ("net: ethernet: enetc: unlock
  XDP_REDIRECT for XDP non-linear buffers")
* drivers/net/ethernet/fungible/funeth/funeth_rx.c
  - removed hunk for non-existing file
* drivers/net/ethernet/marvell/mvneta.c
  - context due to missing 76a676947b56 ("net: mvneta: update frags bit
    before passing the xdp buffer to eBPF layer")
* drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
  - adjusted due to missing 27602319e328 ("net/mlx5e: RX, Take shared
    info fragment addition into a function")

commit b51f4113ebb02011f0ca86abc3134b28d2071b6a
Author: Yunsheng Lin <linyunsheng@huawei.com>
Date:   Thu May 11 09:12:12 2023 +0800

    net: introduce and use skb_frag_fill_page_desc()

    Most users use __skb_frag_set_page()/skb_frag_off_set()/
    skb_frag_size_set() to fill the page desc for a skb frag.

    Introduce skb_frag_fill_page_desc() to do that.

    net/bpf/test_run.c does not call skb_frag_off_set() to
    set the offset, "copy_from_user(page_address(page), ...)"
    and 'shinfo' being part of the 'data' kzalloced in
    bpf_test_init() suggest that it is assuming offset to be
    initialized as zero, so call skb_frag_fill_page_desc()
    with offset being zero for this case.

    Also, skb_frag_set_page() is not used anymore, so remove
    it.

    Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2023-10-11 12:38:04 +02:00
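
The helper introduced by commit b51f4113ebb0 above replaces three separate setter calls with one call that fills page, offset and size together. A userspace model of that pattern is shown below; `frag_model` and `frag_fill_page_desc()` are hypothetical names that only mirror the shape of the idea.

```c
#include <stddef.h>

/* Hypothetical stand-in for a fragment descriptor. */
struct frag_model {
    void *page;
    unsigned int offset;
    unsigned int size;
};

/* Fill all three fields in one place so no caller can forget one of them. */
static void frag_fill_page_desc(struct frag_model *frag, void *page,
                                unsigned int offset, unsigned int size)
{
    frag->page = page;
    frag->offset = offset;
    frag->size = size;
}
```

A caller that previously relied on the offset being zero-initialized, as the commit message notes for net/bpf/test_run.c, would pass 0 explicitly.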
Laurent Vivier e48e032661 virtio_net: use control_buf for coalesce params
JIRA: https://issues.redhat.com/browse/RHEL-346

commit accc1bf23068c1cdc4c2b015320ba856e210dd98
Author: Brett Creeley <brett.creeley@amd.com>
Date:   Mon Jun 5 12:59:25 2023 -0700

    virtio_net: use control_buf for coalesce params

    Commit 699b045a8e43 ("net: virtio_net: notifications coalescing
    support") added coalescing command support for virtio_net. However,
    the coalesce commands are using buffers on the stack, which is causing
    the device to see DMA errors. There should also be a complaint from
    check_for_stack() in debug_dma_map_xyz(). Fix this by adding and using
    coalesce params from the control_buf struct, which aligns with other
    commands.

    Cc: stable@vger.kernel.org
    Fixes: 699b045a8e43 ("net: virtio_net: notifications coalescing support")
    Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
    Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
    Signed-off-by: Brett Creeley <brett.creeley@amd.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Link: https://lore.kernel.org/r/20230605195925.51625-1-brett.creeley@amd.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:16 +02:00
Laurent Vivier d3540e8bcd virtio_net: Fix error unwinding of XDP initialization
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 5306623a9826aa7d63b32c6a3803c798a765474d
Author: Feng Liu <feliu@nvidia.com>
Date:   Fri May 12 11:18:12 2023 -0400

    virtio_net: Fix error unwinding of XDP initialization

    When initializing XDP in virtnet_open(), the XDP initialization of
    some rq may hit an error, causing the net device open to fail.
    However, previous rqs have already initialized XDP and enabled NAPI,
    which is not the expected behavior. The previous rq initialization
    needs to be rolled back to avoid leaks in the error unwinding of the
    init code.

    Also extract helper functions of disable and enable queue pairs.
    Use newly introduced disable helper function in error unwinding and
    virtnet_close. Use enable helper function in virtnet_open.

    Fixes: 754b8a21a9 ("virtio_net: setup xdp_rxq_info")
    Signed-off-by: Feng Liu <feliu@nvidia.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: William Tu <witu@nvidia.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:16 +02:00
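
The rollback described in commit 5306623a9826 above follows a common unwinding pattern: enable queue pairs one by one and, on failure, disable the ones that were already enabled, in reverse order. The sketch below models that pattern only; `qp_model`, `enable_qp()`, `disable_qp()` and `open_all()` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical queue pair with an injectable failure for the example. */
struct qp_model {
    bool enabled;
    bool fail_on_enable;
};

static int enable_qp(struct qp_model *q)
{
    if (q->fail_on_enable)
        return -1;
    q->enabled = true;
    return 0;
}

static void disable_qp(struct qp_model *q)
{
    q->enabled = false;
}

/* Enable every pair, or leave none enabled if any of them fails. */
static int open_all(struct qp_model *qs, int n)
{
    int i, err;

    for (i = 0; i < n; i++) {
        err = enable_qp(&qs[i]);
        if (err < 0)
            goto err_enable;
    }
    return 0;

err_enable:
    /* Undo only the pairs that were successfully enabled before i. */
    for (i--; i >= 0; i--)
        disable_qp(&qs[i]);
    return err;
}
```

Sharing the disable helper between the error path and the close path, as the commit message describes, keeps the two teardown sequences from drifting apart.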
Laurent Vivier 154460c561 virtio_net: introduce virtnet_build_skb()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 21e26a71f5d3cae2551abcda318263a6a8c15206
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:17 2023 +0800

    virtio_net: introduce virtnet_build_skb()

    This logic is used in multiple places, now we separate it into
    a helper.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:15 +02:00
Laurent Vivier fe0e091ded virtio_net: introduce receive_small_build_xdp
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 19e8c85e336d17ae43cf730590fe6337ea238af0
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:16 2023 +0800

    virtio_net: introduce receive_small_build_xdp

    Simplifying receive_small() function. Bringing the logic relating to
    build_skb together.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:15 +02:00
Laurent Vivier f334c379c5 virtio_net: small: remove skip_xdp
JIRA: https://issues.redhat.com/browse/RHEL-346

commit aef76506bc64bbf567490cbe437c26f1aadeee90
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:15 2023 +0800

    virtio_net: small: remove skip_xdp

    Because the skb build code is not shared between xdp and non-xdp, and
    the xdp code in receive_small() is simpler, so "skip_xdp" is not needed.
    We can remove it.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:15 +02:00
Laurent Vivier 77e6aa85b2 virtio_net: small: avoid code duplication in xdp scenarios
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 7af70fc169bd254aea780115b0f355956f84902b
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:14 2023 +0800

    virtio_net: small: avoid code duplication in xdp scenarios

    Avoid recalculating some variables (headroom and so on) when
    processing xdp.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:15 +02:00
Laurent Vivier 959fb3d699 virtio_net: small: remove the delta
JIRA: https://issues.redhat.com/browse/RHEL-346

commit fc8ce84b09bcc7306f3128f783967ecbfa617207
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:13 2023 +0800

    virtio_net: small: remove the delta

    In the XDP-PASS case, skb_reserve used "delta" for compatibility
    with the non-XDP path. Now that this code is not shared between xdp
    and non-xdp, we can remove this logic.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:15 +02:00
Laurent Vivier c8b09bee30 virtio_net: introduce receive_small_xdp()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit c5f3e72f04c02cc2a0671adbb16224e61dc4bd8a
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:12 2023 +0800

    virtio_net: introduce receive_small_xdp()

    The purpose of this patch is to simplify the receive_small().
    Separate all the logic of XDP of small into a function.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:15 +02:00
Laurent Vivier 4c5d25305d virtio_net: merge: remove skip_xdp
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 59ba3b1a88a8251d8fd0ef847afb59aa83a47126
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:11 2023 +0800

    virtio_net: merge: remove skip_xdp

    Now, the logic of merge xdp process is simple, we can remove the
    skip_xdp.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:15 +02:00
Laurent Vivier 20f4bdbaca virtio_net: introduce receive_mergeable_xdp()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit d8f2835a4746f26523cb512dc17e2b0a00dd31a9
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:10 2023 +0800

    virtio_net: introduce receive_mergeable_xdp()

    The purpose of this patch is to simplify the receive_mergeable().
    Separate all the logic of XDP into a function.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:14 +02:00
Laurent Vivier d1609e0599 virtio_net: virtnet_build_xdp_buff_mrg() auto release xdp shinfo
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 4cb00b13c064088352a4f2ca4a8279010ad218a8
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:09 2023 +0800

    virtio_net: virtnet_build_xdp_buff_mrg() auto release xdp shinfo

    virtnet_build_xdp_buff_mrg() now releases the xdp shinfo itself, so
    the caller no longer needs to take care of the xdp shinfo.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:14 +02:00
Laurent Vivier 794daae9eb virtio_net: separate the logic of freeing the rest mergeable buf
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 80f50f918c6e2d3f46aae717e6b04271298ab1c0
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:08 2023 +0800

    virtio_net: separate the logic of freeing the rest mergeable buf

    This patch introduces a new function that frees the remaining
    mergeable bufs. The subsequent patch will reuse this function.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:14 +02:00
Laurent Vivier 9f601a342f virtio_net: separate the logic of freeing xdp shinfo
JIRA: https://issues.redhat.com/browse/RHEL-346

commit bb2c1e9e75be4a059fa795aac58b805091c9dae7
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:07 2023 +0800

    virtio_net: separate the logic of freeing xdp shinfo

    This patch introduces a new function that releases the
    xdp shinfo. The subsequent patch will reuse this function.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:14 +02:00
Laurent Vivier a98f4a5278 virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdp
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 00765f8ed74240419091a1708c195405b55fe243
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:06 2023 +0800

    virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdp

    At present, we have two similar pieces of logic that run the XDP
    prog.

    Therefore, this patch separates out the code that executes XDP, which
    makes later maintenance easier.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:14 +02:00
Laurent Vivier 34c19a214e virtio_net: optimize mergeable_xdp_get_buf()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit dbe4fec2447dd215964aad88b0e06f96c6958ee9
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:05 2023 +0800

    virtio_net: optimize mergeable_xdp_get_buf()

    The previous patch made no modifications, in order to facilitate
    review. This patch applies some optimizations on top of it.

    * remove some repeated logics in this function.
    * add fast check for passing without any alloc.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:14 +02:00
Laurent Vivier eac8aaba95 virtio_net: introduce mergeable_xdp_get_buf()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit ad4858beb824aeba53deeae660ea7fab9e624bc0
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:04 2023 +0800

    virtio_net: introduce mergeable_xdp_get_buf()

    Separating the logic of preparation for xdp from receive_mergeable.

    The purpose of this is to simplify the logic of execution of XDP.

    The main logic here is that when headroom is insufficient, we need to
    allocate a new page and calculate offset. It should be noted that if
    there is new page, the variable page will refer to the new page.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:14 +02:00
Laurent Vivier abbc95223e virtio_net: mergeable xdp: put old page immediately
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 363d8ce4b94719a87dad865e2829f1dba0f7ef71
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon May 8 14:14:03 2023 +0800

    virtio_net: mergeable xdp: put old page immediately

    The xdp implementation for virtio-net mergeable buffers always checks
    whether two pages are in use and selects one page to release. This
    complicates the handling of each action and requires care.

    In the entire process, we follow these principles:
    * If xdp_page is used (PASS, TX, Redirect), then we release the old
      page.
    * If it is a drop case, we release both. The old page obtained from
      buf is released inside err_xdp, and xdp_page needs to be released
      by us.

    But in fact, when we allocate a new page, we can release the old page
    immediately. Then only one page is in use, and we just need to release
    the new page in the drop case. On the drop path, err_xdp will release
    the variable "page", so we only need to let "page" point to the new
    xdp_page in advance.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:13 +02:00
Laurent Vivier 1e65e5ec40 virtio_net: suppress cpu stall when free_unused_bufs
JIRA: https://issues.redhat.com/browse/RHEL-346

commit f8bb5104394560e29017c25bcade4c6b7aabd108
Author: Wenliang Wang <wangwenliang.1995@bytedance.com>
Date:   Thu May 4 10:27:06 2023 +0800

    virtio_net: suppress cpu stall when free_unused_bufs

    For multi-queue and large ring-size use case, the following error
    occurred when free_unused_bufs:
    rcu: INFO: rcu_sched self-detected stall on CPU.

    Fixes: 986a4f4d45 ("virtio_net: multiqueue support")
    Signed-off-by: Wenliang Wang <wangwenliang.1995@bytedance.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:13 +02:00
Laurent Vivier a289f80a10 virtio_net: bugfix overflow inside xdp_linearize_page()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 853618d5886bf94812f31228091cd37d308230f7
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Fri Apr 14 14:08:35 2023 +0800

    virtio_net: bugfix overflow inside xdp_linearize_page()

    Here we copy the data from the original buf to the new page, but we
    do not check whether the copy may overflow.

    If the received size (including vnethdr) is greater than 3840
    (PAGE_SIZE - VIRTIO_XDP_HEADROOM), the memcpy will overflow.

    This is entirely possible when the MTU is large, such as 4096. In our
    test environment, this causes a crash. Since the crash is caused by
    the overwritten memory, the trace is not meaningful, so I do not
    include it.

    Fixes: 72979a6c35 ("virtio_net: xdp, add slowpath case for non contiguous buffers")
    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:11 +02:00
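
The overflow condition in commit 853618d5886b above comes down to copying into a page at a nonzero offset without checking the remaining space. A minimal bounds-checked copy is sketched below, assuming a 4096-byte page and a 256-byte headroom (which gives the 3840 figure from the commit message). The `MODEL_` constants and `linearize_copy()` are hypothetical names, and the real fix may reserve additional space that this sketch does not model.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE    4096u /* assumed page size */
#define MODEL_XDP_HEADROOM  256u /* assumed headroom before the data */

/*
 * Copy len bytes to page + page_off, refusing the copy when it would run
 * past the end of the page. Returns 0 on success, -1 when it does not fit.
 */
static int linearize_copy(unsigned char *page, size_t page_off,
                          const unsigned char *src, size_t len)
{
    if (page_off > MODEL_PAGE_SIZE || len > MODEL_PAGE_SIZE - page_off)
        return -1;
    memcpy(page + page_off, src, len);
    return 0;
}
```

Starting at the headroom offset, a 3840-byte copy fits exactly, while 3841 bytes is rejected instead of writing beyond the page.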
Laurent Vivier b38b50a7cc virtio_net: free xdp shinfo frags when build_skb_from_xdp_buff() fails
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 1a3bd6eabae35afc5c6dbe2651f21467cf8ad3fd
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Wed Mar 15 09:52:23 2023 +0800

    virtio_net: free xdp shinfo frags when build_skb_from_xdp_buff() fails

    build_skb_from_xdp_buff() may return NULL, in this case
    we need to free the frags of xdp shinfo.

    Fixes: fab89bafa95b ("virtio-net: support multi-buffer xdp")
    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:10 +02:00
Laurent Vivier ac1b6a2aa6 virtio_net: fix page_to_skb() miss headroom
JIRA: https://issues.redhat.com/browse/RHEL-346

commit fa0f1ba7c8233118b6fdaa65e2f5ded563d3e1fa
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Wed Mar 15 09:52:22 2023 +0800

    virtio_net: fix page_to_skb() miss headroom

    Because headroom is not passed to page_to_skb(), the shinfo ends up
    beyond the valid range. The frags of shinfo are then changed by
    another process.

    [  157.724634] stack segment: 0000 [#1] PREEMPT SMP NOPTI
    [  157.725358] CPU: 3 PID: 679 Comm: xdp_pass_user_f Tainted: G            E      6.2.0+ #150
    [  157.726401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/4
    [  157.727820] RIP: 0010:skb_release_data+0x11b/0x180
    [  157.728449] Code: 44 24 02 48 83 c3 01 39 d8 7e be 48 89 d8 48 c1 e0 04 41 80 7d 7e 00 49 8b 6c 04 30 79 0c 48 89 ef e8 89 b
    [  157.730751] RSP: 0018:ffffc90000178b48 EFLAGS: 00010202
    [  157.731383] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000
    [  157.732270] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff888100dd0b00
    [  157.733117] RBP: 5d5d76010f6e2408 R08: ffff888100dd0b2c R09: 0000000000000000
    [  157.734013] R10: ffffffff82effd30 R11: 000000000000a14e R12: ffff88810981ffc0
    [  157.734904] R13: ffff888100dd0b00 R14: 0000000000000002 R15: 0000000000002310
    [  157.735793] FS:  00007f06121d9740(0000) GS:ffff88842fcc0000(0000) knlGS:0000000000000000
    [  157.736794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  157.737522] CR2: 00007ffd9a56c084 CR3: 0000000104bda001 CR4: 0000000000770ee0
    [  157.738420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  157.739283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  157.740146] PKRU: 55555554
    [  157.740502] Call Trace:
    [  157.740843]  <IRQ>
    [  157.741117]  kfree_skb_reason+0x50/0x120
    [  157.741613]  __udp4_lib_rcv+0x52b/0x5e0
    [  157.742132]  ip_protocol_deliver_rcu+0xaf/0x190
    [  157.742715]  ip_local_deliver_finish+0x77/0xa0
    [  157.743280]  ip_sublist_rcv_finish+0x80/0x90
    [  157.743834]  ip_list_rcv_finish.constprop.0+0x16f/0x190
    [  157.744493]  ip_list_rcv+0x126/0x140
    [  157.744952]  __netif_receive_skb_list_core+0x29b/0x2c0
    [  157.745602]  __netif_receive_skb_list+0xed/0x160
    [  157.746190]  ? udp4_gro_receive+0x275/0x350
    [  157.746732]  netif_receive_skb_list_internal+0xf2/0x1b0
    [  157.747398]  napi_gro_receive+0xd1/0x210
    [  157.747911]  virtnet_receive+0x75/0x1c0
    [  157.748422]  virtnet_poll+0x48/0x1b0
    [  157.748878]  __napi_poll+0x29/0x1b0
    [  157.749330]  net_rx_action+0x27a/0x340
    [  157.749812]  __do_softirq+0xf3/0x2fb
    [  157.750298]  do_softirq+0xa2/0xd0
    [  157.750745]  </IRQ>
    [  157.751563]  <TASK>
    [  157.752329]  __local_bh_enable_ip+0x6d/0x80
    [  157.753178]  virtnet_xdp_set+0x482/0x860
    [  157.754159]  ? __pfx_virtnet_xdp+0x10/0x10
    [  157.755129]  dev_xdp_install+0xa4/0xe0
    [  157.756033]  dev_xdp_attach+0x20b/0x5e0
    [  157.756933]  do_setlink+0x82e/0xc90
    [  157.757777]  ? __nla_validate_parse+0x12b/0x1e0
    [  157.758744]  rtnl_setlink+0xd8/0x170
    [  157.759549]  ? mod_objcg_state+0xcb/0x320
    [  157.760328]  ? security_capable+0x37/0x60
    [  157.761209]  ? security_capable+0x37/0x60
    [  157.762072]  rtnetlink_rcv_msg+0x145/0x3d0
    [  157.762929]  ? ___slab_alloc+0x327/0x610
    [  157.763754]  ? __alloc_skb+0x141/0x170
    [  157.764533]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
    [  157.765422]  netlink_rcv_skb+0x58/0x110
    [  157.766229]  netlink_unicast+0x21f/0x330
    [  157.766951]  netlink_sendmsg+0x240/0x4a0
    [  157.767654]  sock_sendmsg+0x93/0xa0
    [  157.768434]  ? sockfd_lookup_light+0x12/0x70
    [  157.769245]  __sys_sendto+0xfe/0x170
    [  157.770079]  ? handle_mm_fault+0xe9/0x2d0
    [  157.770859]  ? preempt_count_add+0x51/0xa0
    [  157.771645]  ? up_read+0x3c/0x80
    [  157.772340]  ? do_user_addr_fault+0x1e9/0x710
    [  157.773166]  ? kvm_read_and_reset_apf_flags+0x49/0x60
    [  157.774087]  __x64_sys_sendto+0x29/0x30
    [  157.774856]  do_syscall_64+0x3c/0x90
    [  157.775518]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
    [  157.776382] RIP: 0033:0x7f06122def70

    Fixes: 18117a842ab0 ("virtio-net: remove xdp related info from page_to_skb()")
    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:09 +02:00
Laurent Vivier 111b5c89a9 net: virtio_net: implement exact header length guest feature
JIRA: https://issues.redhat.com/browse/RHEL-346

commit be50da3e9d4ad1958f7b11322d44d94d5c25a4c1
Author: Jiri Pirko <jiri@nvidia.com>
Date:   Thu Mar 9 10:45:59 2023 +0100

    net: virtio_net: implement exact header length guest feature

    Virtio spec introduced a feature VIRTIO_NET_F_GUEST_HDRLEN which, when
    set, implies that the device benefits from knowing the exact size
    of the header. For compatibility, to signal to the device that
    the header is reliable, the driver also needs to set this feature.
    Without this feature set by the driver, the device has to figure
    out the header size itself.

    Quoting the original virtio spec:
    "hdr_len is a hint to the device as to how much of the header needs to
     be kept to copy into each packet"

    It might not be clear to the reader what "a hint" means: whether it is
    "maybe like that" or "exactly like that". This feature makes it
    crystal clear and lets the device count on hdr_len being filled in
    with the exact length of the header.

    Also note the spec already has the following note about hdr_len:
    "Due to various bugs in implementations, this field is not useful
     as a guarantee of the transport header size."

    Without this feature the device needs to parse the header in core
    data path handling. Accurate information helps the device to eliminate
    such header parsing and directly use the hardware accelerators
    for GSO operation.

    virtio_net_hdr_from_skb() sets hdr_len to skb_headlen(skb), so the
    driver already fills in the correct value. Introduce the
    feature and advertise it.

    Note that the virtio spec also includes the following note for device
    implementation:
    "Caution should be taken by the implementation so as to prevent
     a malicious driver from attacking the device by setting
     an incorrect hdr_len."

    There is a plan to support this feature in our emulated device.
    A device of SolidRun offers this feature bit. They claim this feature
    will save the device a few cycles for every GSO packet.

    Link: https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.html#x1-230006x3
    Signed-off-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Parav Pandit <parav@nvidia.com>
    Reviewed-by: Alvaro Karsz <alvaro.karsz@solid-run.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Acked-by: Willem de Bruijn <willemb@google.com>
    Link: https://lore.kernel.org/r/20230309094559.917857-1-jiri@resnulli.us
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:09 +02:00
Laurent Vivier 9654bc6435 virtio_net: add checking sq is full inside xdp xmit
JIRA: https://issues.redhat.com/browse/RHEL-346

commit cd1c604aa1d8c641f5edcb58b76352d4eba06ec1
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Wed Mar 8 10:49:35 2023 +0800

    virtio_net: add checking sq is full inside xdp xmit

    If xdp xmit does not use an independent queue, then once xdp xmit has
    used all the descriptors, a transmit from __dev_queue_xmit() may hit
    the following error.

    net ens4: Unexpected TXQ (0) queue failure: -28

    This patch adds a check in xdp xmit for whether the sq is full.

    Fixes: 56434a01b1 ("virtio_net: add XDP_TX support")
    Reported-by: Yichun Zhang <yichun@openresty.com>
    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:09 +02:00
Laurent Vivier c465f824d0 virtio_net: separate the logic of checking whether sq is full
JIRA: https://issues.redhat.com/browse/RHEL-346

commit b8ef4809bc7faa22e63de921ef56de21ed191af0
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Wed Mar 8 10:49:34 2023 +0800

    virtio_net: separate the logic of checking whether sq is full

    Separate the logic of checking whether sq is full. The subsequent patch
    will reuse this func.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:09 +02:00
Laurent Vivier 31b288fd1b virtio_net: reorder some funcs
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 25074a44ac4e5dae5b4a25dcb9bbfcbd00f15ae2
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Wed Mar 8 10:49:33 2023 +0800

    virtio_net: reorder some funcs

    The purpose of this is to facilitate the subsequent addition of new
    functions without introducing a separate declaration.

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:09 +02:00
Laurent Vivier 546937f524 virtio-net: Maintain reverse cleanup order
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 27369c9c2b722617063d6b80c758ab153f1d95d4
Author: Parav Pandit <parav@nvidia.com>
Date:   Fri Feb 3 15:37:38 2023 +0200

    virtio-net: Maintain reverse cleanup order

    To make the code easier to audit, keep the device stop()
    sequence a mirror of the device open() sequence.

    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Signed-off-by: Parav Pandit <parav@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:06 +02:00
Laurent Vivier 65fcc9abba virtio-net: Keep stop() to follow mirror sequence of open()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 63b114042d8a9c02d9939889177c36dbdb17a588
Author: Parav Pandit <parav@nvidia.com>
Date:   Thu Feb 2 18:35:16 2023 +0200

    virtio-net: Keep stop() to follow mirror sequence of open()

    The commit cited in the Fixes tag frees rxq xdp info while RQ NAPI is
    still enabled and packet processing may be ongoing.

    Follow the mirror sequence of open() in the stop() callback.
    This ensures that when rxq info is unregistered, no rx
    packet processing is ongoing.

    Fixes: 754b8a21a9 ("virtio_net: setup xdp_rxq_info")
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Signed-off-by: Parav Pandit <parav@nvidia.com>
    Link: https://lore.kernel.org/r/20230202163516.12559-1-parav@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:06 +02:00
Laurent Vivier 14ac52174f virtio-net: fix possible unsigned integer overflow
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 981f14d42a7f1610292a1ef0f7cd00138fff361d
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Tue Jan 31 16:50:04 2023 +0800

    virtio-net: fix possible unsigned integer overflow

    When the single-buffer xdp is loaded and after xdp_linearize_page()
    is called, *num_buf becomes 0 and (*num_buf - 1) may overflow into
    a large integer in virtnet_build_xdp_buff_mrg(), resulting in
    unexpected packet dropping.

    Fixes: ef75cb51f139 ("virtio-net: build xdp_buff with multi buffers")
    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Link: https://lore.kernel.org/r/20230131085004.98687-1-hengqi@linux.alibaba.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:06 +02:00
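Note on 14ac52174f above: the wraparound it describes can be reproduced in isolation. The sketch below is not the driver's code. It assumes a 16-bit unsigned buffer counter, and both helper names are hypothetical. It only shows why subtracting 1 from an unsigned counter that is already 0 yields a large value, and one way to guard against it.

```c
#include <stdint.h>

/* Hypothetical helper: unguarded "buffers left after this one".
 * When num_buf is already 0, the result wraps to the maximum value of
 * the unsigned type instead of becoming negative. */
static uint16_t bufs_left_unguarded(uint16_t num_buf)
{
	return (uint16_t)(num_buf - 1);
}

/* Hypothetical helper: handle the "nothing left" case before subtracting. */
static uint16_t bufs_left_guarded(uint16_t num_buf)
{
	return num_buf ? (uint16_t)(num_buf - 1) : 0;
}
```

A comparison such as "buffers left > some limit" is then spuriously true in the unguarded case, which matches the unexpected packet dropping the commit message describes.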
Laurent Vivier 8a3fc76621 virtio-net: execute xdp_do_flush() before napi_complete_done()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit ad7e615f646c9b5b2cf655cdfb9d91a28db4f25a
Author: Magnus Karlsson <magnus.karlsson@intel.com>
Date:   Wed Jan 25 08:48:59 2023 +0100

    virtio-net: execute xdp_do_flush() before napi_complete_done()

    Make sure that xdp_do_flush() is always executed before
    napi_complete_done(). This is important for two reasons. First, a
    redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
    napi context X on CPU Y will be followed by a xdp_do_flush() from the
    same napi context and CPU. This is not guaranteed if the
    napi_complete_done() is executed before xdp_do_flush(), as it tells
    the napi logic that it is fine to schedule napi context X on another
    CPU. Details from a production system triggering this bug using the
    veth driver can be found following the first link below.

    The second reason is that the XDP_REDIRECT logic in itself relies on
    being inside a single NAPI instance through to the xdp_do_flush() call
    for RCU protection of all in-kernel data structures. Details can be
    found in the second link below.

    Fixes: 186b3c998c ("virtio-net: support XDP_REDIRECT")
    Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
    Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
    Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:06 +02:00
Laurent Vivier 24db9d9cdc virtio-net: Reduce debug name field size to 16 bytes
JIRA: https://issues.redhat.com/browse/RHEL-346

commit d0671115869d19ec76d658c4bf86d3211a8ea121
Author: Parav Pandit <parav@nvidia.com>
Date:   Mon Jan 23 05:55:11 2023 +0200

    virtio-net: Reduce debug name field size to 16 bytes

    The virtio queue index can be at most 65535. 16 bytes are enough to
    store the vq name with the existing string prefix.

    With this change, the send queue struct saves 24 bytes and the receive
    queue saves a whole cache line (64 bytes) per structure, thanks to
    the saved alignment bytes.

    Pahole results before:

    pahole -s drivers/net/virtio_net.o | \
        grep -e "send_queue" -e "receive_queue"
    send_queue      1112    0
    receive_queue   1280    1

    Pahole results after:
    pahole -s drivers/net/virtio_net.o | \
        grep -e "send_queue" -e "receive_queue"
    send_queue      1088    0
    receive_queue   1216    1

    Signed-off-by: Parav Pandit <parav@nvidia.com>
    Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:05 +02:00
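Note on 24db9d9cdc above: the 16-byte claim can be checked with a little arithmetic. The sketch below assumes the name has the form "<prefix>.<index>" with prefixes "input" and "output". The commit message only says "the existing string prefix", so those strings, the macro, and the helper name are assumptions made for illustration.

```c
#include <stdio.h>

#define VQ_NAME_LEN 16	/* field size chosen by the commit */

/* Hypothetical helper: format "<prefix>.<index>" into a 16-byte field.
 * Returns the bytes needed including the trailing NUL, or -1 if the
 * name would not fit. */
static int vq_name_bytes(const char *prefix, unsigned int index)
{
	char name[VQ_NAME_LEN];
	int n = snprintf(name, sizeof(name), "%s.%u", prefix, index);

	if (n < 0 || n >= (int)sizeof(name))
		return -1;
	return n + 1;
}
```

Under these assumptions the longest name, "output.65535", needs 13 bytes including the NUL, so it fits in 16.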
Laurent Vivier 4e77d9036e virtio-net: correctly enable callback during start_xmit
JIRA: https://issues.redhat.com/browse/RHEL-346

commit d71ebe8114b4bf622804b810f5e274069060a174
Author: Jason Wang <jasowang@redhat.com>
Date:   Tue Jan 17 11:47:07 2023 +0800

    virtio-net: correctly enable callback during start_xmit

    Commit a7766ef18b33 ("virtio_net: disable cb aggressively") enables
    virtqueue callback via the following statement:

            do {
                    if (use_napi)
                            virtqueue_disable_cb(sq->vq);

                    free_old_xmit_skbs(sq, false);

            } while (use_napi && kick &&
                   unlikely(!virtqueue_enable_cb_delayed(sq->vq)));

    When NAPI is used and kick is false, the callback won't be enabled
    here. When the virtqueue is then about to be full, tx is disabled
    but the tx interrupt is still not enabled, which causes a TX hang.
    This could be observed when using pktgen with burst enabled.

    To be consistent with the logic that tries to disable cb only for
    NAPI, fix this by trying to enable the delayed callback, only when
    NAPI is enabled, when the queue is about to be full.

    Fixes: a7766ef18b ("virtio_net: disable cb aggressively")
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Tested-by: Laurent Vivier <lvivier@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:05 +02:00
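Note on 4e77d9036e above: the short-circuit in the quoted loop condition can be modelled with a toy state machine. This is a sketch, not driver code. struct toy_vq and both functions are hypothetical stand-ins, and the toy enable function always succeeds, which the real virtqueue_enable_cb_delayed() does not guarantee.

```c
#include <stdbool.h>

/* Toy stand-in for a virtqueue: tracks only whether the callback is on. */
struct toy_vq {
	bool cb_enabled;
};

/* Toy stand-in for enabling the delayed callback; always succeeds here. */
static bool toy_enable_cb_delayed(struct toy_vq *vq)
{
	vq->cb_enabled = true;
	return true;
}

/* Mirrors the shape of the quoted do/while: with use_napi set and kick
 * clear, the && chain short-circuits and the enable call never runs. */
static bool cb_enabled_after_cleanup(bool use_napi, bool kick)
{
	struct toy_vq vq = { .cb_enabled = true };

	do {
		if (use_napi)
			vq.cb_enabled = false;
		/* old transmit buffers would be freed here */
	} while (use_napi && kick && !toy_enable_cb_delayed(&vq));

	return vq.cb_enabled;
}
```

In this model, cb_enabled_after_cleanup(true, false) leaves the callback disabled, which is the state the commit message identifies as leading to the TX hang once the queue fills.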
Laurent Vivier fcebb9361f virtio_net: Reuse buffer free function
JIRA: https://issues.redhat.com/browse/RHEL-346

commit eb1d929f1551f226f59e38465c542df9071166d6
Author: Parav Pandit <parav@nvidia.com>
Date:   Mon Jan 16 22:27:08 2023 +0200

    virtio_net: Reuse buffer free function

    virtnet_rq_free_unused_buf() helper function to free the buffer
    already exists. Avoid code duplication by reusing existing function.

    Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Signed-off-by: Parav Pandit <parav@nvidia.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:05 +02:00
Laurent Vivier 7456bdcef3 virtio-net: support multi-buffer xdp
JIRA: https://issues.redhat.com/browse/RHEL-346

commit fab89bafa95b6f333b0e2dca0ae07a0b3600e954
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Sat Jan 14 16:22:29 2023 +0800

    virtio-net: support multi-buffer xdp

    Driver can pass the skb to stack by build_skb_from_xdp_buff().

    Driver forwards multi-buffer packets using the send queue
    on XDP_TX and XDP_REDIRECT, and drops the references to the
    pages on XDP_DROP.

    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:05 +02:00
Laurent Vivier 86f51201f5 virtio-net: remove xdp related info from page_to_skb()
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 18117a842ab029df139f776bf31eebff6d0e0730
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Sat Jan 14 16:22:28 2023 +0800

    virtio-net: remove xdp related info from page_to_skb()

    To keep the construction of xdp_buff clear, we remove the xdp processing
    interleaved with page_to_skb(). Now, the logic of xdp and building
    skb from xdp are separate and independent.

    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:05 +02:00
Laurent Vivier 2a18caa1e6 virtio-net: build skb from multi-buffer xdp
JIRA: https://issues.redhat.com/browse/RHEL-346

commit b26aa481b4b710051dd663c89ed31f705a7a67eb
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Sat Jan 14 16:22:27 2023 +0800

    virtio-net: build skb from multi-buffer xdp

    This converts the xdp_buff directly to a skb, including
    multi-buffer and single buffer xdp. We'll isolate the
    construction of skb based on xdp from page_to_skb().

    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:05 +02:00
Laurent Vivier 7a114089d1 virtio-net: transmit the multi-buffer xdp
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 97717e8dbda1dede65c4df12891332502df632f3
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Sat Jan 14 16:22:26 2023 +0800

    virtio-net: transmit the multi-buffer xdp

    This serves as the basis for XDP_TX and XDP_REDIRECT
    to send a multi-buffer xdp_frame.

    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:05 +02:00
Laurent Vivier 5284ea5142 virtio-net: construct multi-buffer xdp in mergeable
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 22174f79a44baf5e46faafff1d7b21363431b93a
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Sat Jan 14 16:22:25 2023 +0800

    virtio-net: construct multi-buffer xdp in mergeable

    Build multi-buffer xdp using virtnet_build_xdp_buff_mrg().

    For buffers prefilled before xdp is set, we will probably use
    vq reset in the future. At the same time, virtio net currently
    uses compound pages, and bpf_xdp_frags_increase_tail() needs to calculate
    the tailroom of the last frag. That calculation involves the offset of
    the corresponding page and yields a negative value, so we disable tail
    increase by not setting xdp_rxq->frag_size.

    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:04 +02:00
Laurent Vivier 5425621682 virtio-net: build xdp_buff with multi buffers
JIRA: https://issues.redhat.com/browse/RHEL-346

commit ef75cb51f13941cf7633227ade43c4d86ecbc336
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Sat Jan 14 16:22:24 2023 +0800

    virtio-net: build xdp_buff with multi buffers

    Support xdp for multi buffer packets in mergeable mode.

    Put the first buffer in as the linear part of xdp_buff, and the
    rest of the buffers as non-linear fragments in the struct
    skb_shared_info in the tailroom belonging to xdp_buff.

    Let 'truesize' return to its literal meaning, that is, when
    xdp is set, it includes the length of headroom and tailroom.

    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:04 +02:00
Laurent Vivier 31a304b99a virtio-net: update bytes calculation for xdp_frame
JIRA: https://issues.redhat.com/browse/RHEL-346

commit 50bd14bc98fa0c86ea1e688d93ef1ffe8f1572a0
Author: Heng Qi <hengqi@linux.alibaba.com>
Date:   Sat Jan 14 16:22:23 2023 +0800

    virtio-net: update bytes calculation for xdp_frame

    Update the relevant recorded byte value for xdp_frame as the basis
    for multi-buffer xdp transmission.

    Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2023-08-08 22:16:04 +02:00