550 Commits

Author SHA1 Message Date
CKI Backport Bot 10e16692d5 af_packet: do not call packet_read_pending() from tpacket_destruct_skb()
JIRA: https://issues.redhat.com/browse/RHEL-78307

commit 581073f626e387d3e7eed55c48c8495584ead7ba
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed May 15 16:33:58 2024 +0000

    af_packet: do not call packet_read_pending() from tpacket_destruct_skb()

    trafgen performance considerably sank on hosts with many cores
    after the blamed commit.

    packet_read_pending() is very expensive, and calling it
    in the af_packet fast path defeats Daniel's intent in commit
    b013840810 ("packet: use percpu mmap tx frame pending refcount")

    tpacket_destruct_skb() makes room for one packet, so we can immediately
    wake up a producer; there is no need to completely drain the tx ring.

    Fixes: 89ed5b5190 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Neil Horman <nhorman@tuxdriver.com>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://lore.kernel.org/r/20240515163358.4105915-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2025-02-07 13:57:37 +00:00
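
The cost asymmetry described in the commit above can be pictured with a small single-threaded model of a per-CPU counter. This is only a sketch: `pending_model`, `MODEL_NR_CPUS`, and the three helpers are hypothetical names, not kernel code. Incrementing or decrementing touches one slot, while reading the total has to visit every slot, which is why a read in the fast path gets more expensive as the core count grows.

```c
#include <stddef.h>

/* Hypothetical stand-in for the number of possible CPUs. */
#define MODEL_NR_CPUS 64

/* One counter slot per CPU (a plain array here; this is only a model). */
struct pending_model {
    long slot[MODEL_NR_CPUS];
};

/* Fast path: only the caller's own slot is touched. */
static void pending_inc(struct pending_model *p, size_t cpu)
{
    p->slot[cpu % MODEL_NR_CPUS]++;
}

static void pending_dec(struct pending_model *p, size_t cpu)
{
    p->slot[cpu % MODEL_NR_CPUS]--;
}

/* Slow path: the total is only known after summing every slot. */
static long pending_read(const struct pending_model *p)
{
    long sum = 0;

    for (size_t i = 0; i < MODEL_NR_CPUS; i++)
        sum += p->slot[i];
    return sum;
}
```

A single slot may go negative when a frame is queued on one CPU and completed on another; only the sum is meaningful.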
Antoine Tenart 651f9c3281 af_packet: use sk_skb_reason_drop to free rx packets
JIRA: https://issues.redhat.com/browse/RHEL-48648
Upstream Status: net-next.git

commit e2e7d78d9a25c78dc829da400bcec857b8c41b78
Author: Yan Zhai <yan@cloudflare.com>
Date:   Mon Jun 17 11:09:27 2024 -0700

    af_packet: use sk_skb_reason_drop to free rx packets

    Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving
    socket to the tracepoint.

    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/r/202406011859.Aacus8GV-lkp@intel.com/
    Signed-off-by: Yan Zhai <yan@cloudflare.com>
    Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Antoine Tenart <atenart@redhat.com>
2024-07-16 17:29:42 +02:00
Lucas Zampieri 2a4499bdc5 Merge: CVE-2024-26862: packet: annotate data-races around ignore_outgoing
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4579

JIRA: https://issues.redhat.com/browse/RHEL-33238  
CVE: CVE-2024-26862

```
packet: annotate data-races around ignore_outgoing

ignore_outgoing is read locklessly from dev_queue_xmit_nit()
and packet_getsockopt()

Add appropriate READ_ONCE()/WRITE_ONCE() annotations.

syzbot reported:

BUG: KCSAN: data-race in dev_queue_xmit_nit / packet_setsockopt

write to 0xffff888107804542 of 1 bytes by task 22618 on cpu 0:
 packet_setsockopt+0xd83/0xfd0 net/packet/af_packet.c:4003
 do_sock_setsockopt net/socket.c:2311 [inline]
 __sys_setsockopt+0x1d8/0x250 net/socket.c:2334
 __do_sys_setsockopt net/socket.c:2343 [inline]
 __se_sys_setsockopt net/socket.c:2340 [inline]
 __x64_sys_setsockopt+0x66/0x80 net/socket.c:2340
 do_syscall_64+0xd3/0x1d0
 entry_SYSCALL_64_after_hwframe+0x6d/0x75

read to 0xffff888107804542 of 1 bytes by task 27 on cpu 1:
 dev_queue_xmit_nit+0x82/0x620 net/core/dev.c:2248
 xmit_one net/core/dev.c:3527 [inline]
 dev_hard_start_xmit+0xcc/0x3f0 net/core/dev.c:3547
 __dev_queue_xmit+0xf24/0x1dd0 net/core/dev.c:4335
 dev_queue_xmit include/linux/netdevice.h:3091 [inline]
 batadv_send_skb_packet+0x264/0x300 net/batman-adv/send.c:108
 batadv_send_broadcast_skb+0x24/0x30 net/batman-adv/send.c:127
 batadv_iv_ogm_send_to_if net/batman-adv/bat_iv_ogm.c:392 [inline]
 batadv_iv_ogm_emit net/batman-adv/bat_iv_ogm.c:420 [inline]
 batadv_iv_send_outstanding_bat_ogm_packet+0x3f0/0x4b0 net/batman-adv/bat_iv_ogm.c:1700
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0x465/0x990 kernel/workqueue.c:3335
 worker_thread+0x526/0x730 kernel/workqueue.c:3416
 kthread+0x1d1/0x210 kernel/kthread.c:388
 ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243

value changed: 0x00 -> 0x01

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 27 Comm: kworker/u8:1 Tainted: G        W          6.8.0-syzkaller-08073-g480e035fc4c7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet

Fixes: fa788d986a ("packet: add sockopt to ignore outgoing packets")
Reported-by: syzbot+c669c1136495a2e7c31f@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/CANn89i+Z7MfbkBLOv=p7KZ7=K1rKHO4P1OL5LYDCtBiyqsa9oQ@mail.gmail.com/T/#t
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6ebfad33161afacb3e1e59ed1c2feefef70f9f97)
```

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>

Approved-by: Xin Long <lxin@redhat.com>
Approved-by: Hangbin Liu <haliu@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-07-08 12:58:53 +00:00
cki-backport-bot 2e00a59dfd packet: annotate data-races around ignore_outgoing
JIRA: https://issues.redhat.com/browse/RHEL-33238
CVE: CVE-2024-26862

commit 6ebfad33161afacb3e1e59ed1c2feefef70f9f97
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 14 14:18:16 2024 +0000

    packet: annotate data-races around ignore_outgoing

    ignore_outgoing is read locklessly from dev_queue_xmit_nit()
    and packet_getsockopt()

    Add appropriate READ_ONCE()/WRITE_ONCE() annotations.

    syzbot reported:

    BUG: KCSAN: data-race in dev_queue_xmit_nit / packet_setsockopt

    write to 0xffff888107804542 of 1 bytes by task 22618 on cpu 0:
     packet_setsockopt+0xd83/0xfd0 net/packet/af_packet.c:4003
     do_sock_setsockopt net/socket.c:2311 [inline]
     __sys_setsockopt+0x1d8/0x250 net/socket.c:2334
     __do_sys_setsockopt net/socket.c:2343 [inline]
     __se_sys_setsockopt net/socket.c:2340 [inline]
     __x64_sys_setsockopt+0x66/0x80 net/socket.c:2340
     do_syscall_64+0xd3/0x1d0
     entry_SYSCALL_64_after_hwframe+0x6d/0x75

    read to 0xffff888107804542 of 1 bytes by task 27 on cpu 1:
     dev_queue_xmit_nit+0x82/0x620 net/core/dev.c:2248
     xmit_one net/core/dev.c:3527 [inline]
     dev_hard_start_xmit+0xcc/0x3f0 net/core/dev.c:3547
     __dev_queue_xmit+0xf24/0x1dd0 net/core/dev.c:4335
     dev_queue_xmit include/linux/netdevice.h:3091 [inline]
     batadv_send_skb_packet+0x264/0x300 net/batman-adv/send.c:108
     batadv_send_broadcast_skb+0x24/0x30 net/batman-adv/send.c:127
     batadv_iv_ogm_send_to_if net/batman-adv/bat_iv_ogm.c:392 [inline]
     batadv_iv_ogm_emit net/batman-adv/bat_iv_ogm.c:420 [inline]
     batadv_iv_send_outstanding_bat_ogm_packet+0x3f0/0x4b0 net/batman-adv/bat_iv_ogm.c:1700
     process_one_work kernel/workqueue.c:3254 [inline]
     process_scheduled_works+0x465/0x990 kernel/workqueue.c:3335
     worker_thread+0x526/0x730 kernel/workqueue.c:3416
     kthread+0x1d1/0x210 kernel/kthread.c:388
     ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243

    value changed: 0x00 -> 0x01

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 27 Comm: kworker/u8:1 Tainted: G        W          6.8.0-syzkaller-08073-g480e035fc4c7 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
    Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet

    Fixes: fa788d986a ("packet: add sockopt to ignore outgoing packets")
    Reported-by: syzbot+c669c1136495a2e7c31f@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/netdev/CANn89i+Z7MfbkBLOv=p7KZ7=K1rKHO4P1OL5LYDCtBiyqsa9oQ@mail.gmail.com/T/#t
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-06-25 17:47:49 +00:00
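
The annotation pattern named in the commit above can be sketched in userspace as follows. The `MODEL_READ_ONCE`/`MODEL_WRITE_ONCE` macros are simplified stand-ins (the kernel macros are more elaborate), `__typeof__` is a GCC/Clang extension, and `model_sock` with its helpers is a hypothetical type invented for illustration. The point is only that both the lockless reader and the writer go through a marked access, so the compiler does not tear, fuse, or re-read the field.

```c
#include <stdbool.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define MODEL_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define MODEL_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical socket state holding the one-byte option. */
struct model_sock {
    unsigned char ignore_outgoing;
};

/* Writer side, standing in for the setsockopt path. */
static void model_set_ignore_outgoing(struct model_sock *s, bool on)
{
    MODEL_WRITE_ONCE(s->ignore_outgoing, on ? 1 : 0);
}

/* Lockless reader side, standing in for the transmit tap path. */
static bool model_skip_outgoing(struct model_sock *s)
{
    return MODEL_READ_ONCE(s->ignore_outgoing) != 0;
}
```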
Davide Caratti 433f0446bb af_packet: do not use READ_ONCE() in packet_bind()
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 6ffc57ea004234d9373c57b204fd10370a69f392

commit 6ffc57ea004234d9373c57b204fd10370a69f392
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri May 26 15:43:42 2023 +0000

    af_packet: do not use READ_ONCE() in packet_bind()

    A recent patch added READ_ONCE() in packet_bind() and packet_bind_spkt()

    This is better handled by reading pkt_sk(sk)->num later
    in packet_do_bind() while appropriate lock is held.

    READ_ONCE() in a writer is often evidence that something is wrong.

    Fixes: 822b5a1c17df ("af_packet: Fix data-races of pkt_sk(sk)->num.")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://lore.kernel.org/r/20230526154342.2533026-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti 6d6eba2687 af_packet: Fix data-races of pkt_sk(sk)->num.
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 822b5a1c17df7e338b9f05d1cfe5764e37c7f74f

commit 822b5a1c17df7e338b9f05d1cfe5764e37c7f74f
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Wed May 24 16:29:34 2023 -0700

    af_packet: Fix data-races of pkt_sk(sk)->num.

    syzkaller found a data race of pkt_sk(sk)->num.

    The value is changed under lock_sock() and po->bind_lock, so we
    need READ_ONCE() to access pkt_sk(sk)->num without these locks in
    packet_bind_spkt(), packet_bind(), and sk_diag_fill().

    Note that WRITE_ONCE() is already added by commit c7d2ef5dd4
    ("net/packet: annotate accesses to po->bind").

    BUG: KCSAN: data-race in packet_bind / packet_do_bind

    write (marked) to 0xffff88802ffd1cee of 2 bytes by task 7322 on cpu 0:
     packet_do_bind+0x446/0x640 net/packet/af_packet.c:3236
     packet_bind+0x99/0xe0 net/packet/af_packet.c:3321
     __sys_bind+0x19b/0x1e0 net/socket.c:1803
     __do_sys_bind net/socket.c:1814 [inline]
     __se_sys_bind net/socket.c:1812 [inline]
     __x64_sys_bind+0x40/0x50 net/socket.c:1812
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x72/0xdc

    read to 0xffff88802ffd1cee of 2 bytes by task 7318 on cpu 1:
     packet_bind+0xbf/0xe0 net/packet/af_packet.c:3322
     __sys_bind+0x19b/0x1e0 net/socket.c:1803
     __do_sys_bind net/socket.c:1814 [inline]
     __se_sys_bind net/socket.c:1812 [inline]
     __x64_sys_bind+0x40/0x50 net/socket.c:1812
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x72/0xdc

    value changed: 0x0300 -> 0x0000

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 7318 Comm: syz-executor.4 Not tainted 6.3.0-13380-g7fddb5b5300c #4
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

    Fixes: 96ec632714 ("packet: Diag core and basic socket info dumping")
    Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
    Reported-by: syzkaller <syzkaller@googlegroups.com>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://lore.kernel.org/r/20230524232934.50950-1-kuniyu@amazon.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti ff0c2e3658 net/packet: convert po->pressure to an atomic flag
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 791a3e9f1a86fe8eb09173c9788493b8b5c957f4

commit 791a3e9f1a86fe8eb09173c9788493b8b5c957f4
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:14 2023 +0000

    net/packet: convert po->pressure to an atomic flag

    Not only does this remove some READ_ONCE()/WRITE_ONCE(),
    it also removes one integer.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti a62234c386 net/packet: convert po->running to an atomic flag
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 61edf479818e63978cabd243b82ca80f8948a313

commit 61edf479818e63978cabd243b82ca80f8948a313
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:13 2023 +0000

    net/packet: convert po->running to an atomic flag

    Instead of consuming 32 bits for po->running, use
    one available bit in po->flags.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti 239ba4fd45 net/packet: convert po->has_vnet_hdr to an atomic flag
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 50d935eafee292fc432d5ac8c8715a6492961abc

commit 50d935eafee292fc432d5ac8c8715a6492961abc
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:12 2023 +0000

    net/packet: convert po->has_vnet_hdr to an atomic flag

    po->has_vnet_hdr can be read locklessly.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti b2de794ec1 net/packet: convert po->tp_loss to an atomic flag
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 164bddace2e03f6005e650cb88f101a66ebdc05a

commit 164bddace2e03f6005e650cb88f101a66ebdc05a
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:11 2023 +0000

    net/packet: convert po->tp_loss to an atomic flag

    tp_loss can be read locklessly.

    Convert it to an atomic flag to avoid races.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti d737f9b391 net/packet: convert po->tp_tx_has_off to an atomic flag
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 7438344660fa55b33b8234c1797c886eb73667a7

commit 7438344660fa55b33b8234c1797c886eb73667a7
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:10 2023 +0000

    net/packet: convert po->tp_tx_has_off to an atomic flag

    This is to use existing space in po->flags, and reclaim
    the storage used by the non atomic bit fields.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti 80441b75f9 net/packet: annotate accesses to po->tp_tstamp
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit 1051ce4ab64db91f7b62369ddc321ba8747f8c84

commit 1051ce4ab64db91f7b62369ddc321ba8747f8c84
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:09 2023 +0000

    net/packet: annotate accesses to po->tp_tstamp

    tp_tstamp is read locklessly.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti 54f2f5c9a9 net/packet: convert po->auxdata to an atomic flag
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit fd53c297aa7b077ae98a3d3d2d3aa278a1686ba6

commit fd53c297aa7b077ae98a3d3d2d3aa278a1686ba6
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:08 2023 +0000

    net/packet: convert po->auxdata to an atomic flag

    po->auxdata can be read while another thread
    is changing its value, potentially raising KCSAN splat.

    Convert it to PACKET_SOCK_AUXDATA flag.

    Fixes: 8dc4194474 ("[PACKET]: Add optional checksum computation for recvmsg")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
Davide Caratti 952dcc9a03 net/packet: convert po->origdev to an atomic flag
JIRA: https://issues.redhat.com/browse/RHEL-33410
Upstream Status: net.git commit ee5675ecdf7a4e713ed21d98a70c2871d6ebed01

commit ee5675ecdf7a4e713ed21d98a70c2871d6ebed01
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 16 01:10:07 2023 +0000

    net/packet: convert po->origdev to an atomic flag

    syzbot/KCSAN reported that po->origdev can be read
    while another thread is changing its value.

    We can avoid this splat by converting this field
    to an actual bit.

    Following patches will convert remaining 1bit fields.

    Fixes: 80feaacb8a ("[AF_PACKET]: Add option to return orig_dev to userspace.")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-06-06 11:39:13 +02:00
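
The conversion from 1-bit fields to bits in a shared flags word, which the series above performs, can be sketched in portable C11. This is a userspace model under stated assumptions: `model_packet_sock`, `model_flag_set`, `model_flag_test`, and the bit numbers are hypothetical, and C11 atomics stand in for the kernel's atomic bit operations.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers, chosen only for this illustration. */
enum model_flag {
    MODEL_FLAG_ORIGDEV = 0,
    MODEL_FLAG_AUXDATA = 1,
};

/* Hypothetical socket state: every boolean option is one bit in flags. */
struct model_packet_sock {
    atomic_ulong flags;
};

/* Set or clear one bit atomically, leaving the other bits untouched. */
static void model_flag_set(struct model_packet_sock *po, enum model_flag f,
                           bool val)
{
    unsigned long mask = 1UL << f;

    if (val)
        atomic_fetch_or(&po->flags, mask);
    else
        atomic_fetch_and(&po->flags, ~mask);
}

/* Lockless read of one bit. */
static bool model_flag_test(struct model_packet_sock *po, enum model_flag f)
{
    return (atomic_load(&po->flags) & (1UL << f)) != 0;
}
```

Because each update is a single atomic read-modify-write on the whole word, a reader never observes a half-written value and concurrent updates to different bits do not overwrite each other.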
Ivan Vecera 27201dba76 packet: add a generic drop reason for receive
JIRA: https://issues.redhat.com/browse/RHEL-36218

commit 2f57dd94bdef083855366138646b26b05f410d99
Author: Yan Zhai <yan@cloudflare.com>
Date:   Mon Dec 4 11:33:28 2023 -0800

    packet: add a generic drop reason for receive

    Commit da37845fdc ("packet: uses kfree_skb() for errors.") switched
    from consume_skb to kfree_skb to improve error handling. However, this
    can add a lot of noise when we monitor real packet drops in
    kfree_skb[1], because in tpacket_rcv or packet_rcv only packet clones
    can be freed, not actual packets.

    Add a generic drop reason so that these "clone drops" can be distinguished.

    [1]: https://lore.kernel.org/netdev/CABWYdi00L+O30Q=Zah28QwZ_5RU-xcxLFUK2Zj08A8MrLk9jzg@mail.gmail.com/
    Fixes: da37845fdc ("packet: uses kfree_skb() for errors.")
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
    Signed-off-by: Yan Zhai <yan@cloudflare.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://lore.kernel.org/r/ZW4piNbx3IenYnuw@debian.debian
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2024-05-14 13:13:23 +02:00
Davide Caratti d39eb7e09a net/packet: annotate data-races around tp->status
JIRA: https://issues.redhat.com/browse/RHEL-14526
Upstream Status: net.git commit 8a9896177784063d01068293caea3f74f6830ff6

commit 8a9896177784063d01068293caea3f74f6830ff6
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Aug 3 14:56:00 2023 +0000

    net/packet: annotate data-races around tp->status

    Another syzbot report [1] is about tp->status lockless reads
    from __packet_get_status()

    [1]
    BUG: KCSAN: data-race in __packet_rcv_has_room / __packet_set_status

    write to 0xffff888117d7c080 of 8 bytes by interrupt on cpu 0:
    __packet_set_status+0x78/0xa0 net/packet/af_packet.c:407
    tpacket_rcv+0x18bb/0x1a60 net/packet/af_packet.c:2483
    deliver_skb net/core/dev.c:2173 [inline]
    __netif_receive_skb_core+0x408/0x1e80 net/core/dev.c:5337
    __netif_receive_skb_one_core net/core/dev.c:5491 [inline]
    __netif_receive_skb+0x57/0x1b0 net/core/dev.c:5607
    process_backlog+0x21f/0x380 net/core/dev.c:5935
    __napi_poll+0x60/0x3b0 net/core/dev.c:6498
    napi_poll net/core/dev.c:6565 [inline]
    net_rx_action+0x32b/0x750 net/core/dev.c:6698
    __do_softirq+0xc1/0x265 kernel/softirq.c:571
    invoke_softirq kernel/softirq.c:445 [inline]
    __irq_exit_rcu+0x57/0xa0 kernel/softirq.c:650
    sysvec_apic_timer_interrupt+0x6d/0x80 arch/x86/kernel/apic/apic.c:1106
    asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
    smpboot_thread_fn+0x33c/0x4a0 kernel/smpboot.c:112
    kthread+0x1d7/0x210 kernel/kthread.c:379
    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

    read to 0xffff888117d7c080 of 8 bytes by interrupt on cpu 1:
    __packet_get_status net/packet/af_packet.c:436 [inline]
    packet_lookup_frame net/packet/af_packet.c:524 [inline]
    __tpacket_has_room net/packet/af_packet.c:1255 [inline]
    __packet_rcv_has_room+0x3f9/0x450 net/packet/af_packet.c:1298
    tpacket_rcv+0x275/0x1a60 net/packet/af_packet.c:2285
    deliver_skb net/core/dev.c:2173 [inline]
    dev_queue_xmit_nit+0x38a/0x5e0 net/core/dev.c:2243
    xmit_one net/core/dev.c:3574 [inline]
    dev_hard_start_xmit+0xcf/0x3f0 net/core/dev.c:3594
    __dev_queue_xmit+0xefb/0x1d10 net/core/dev.c:4244
    dev_queue_xmit include/linux/netdevice.h:3088 [inline]
    can_send+0x4eb/0x5d0 net/can/af_can.c:276
    bcm_can_tx+0x314/0x410 net/can/bcm.c:302
    bcm_tx_timeout_handler+0xdb/0x260
    __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
    __hrtimer_run_queues+0x217/0x700 kernel/time/hrtimer.c:1749
    hrtimer_run_softirq+0xd6/0x120 kernel/time/hrtimer.c:1766
    __do_softirq+0xc1/0x265 kernel/softirq.c:571
    run_ksoftirqd+0x17/0x20 kernel/softirq.c:939
    smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
    kthread+0x1d7/0x210 kernel/kthread.c:379
    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

    value changed: 0x0000000000000000 -> 0x0000000020000081

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 6.4.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023

    Fixes: 69e3c75f4d ("net: TX_RING and packet mmap")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://lore.kernel.org/r/20230803145600.2937518-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2023-10-24 15:12:11 +02:00
Jan Stancek de5c823f78 Merge: net/other: phase-2 backports for RHEL-9.3
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2763

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2219326
Upstream Status: all mainline in net.git
Tested: boot-tested only
Conflicts: None

Signed-off-by: Davide Caratti <dcaratti@redhat.com>

Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: Sabrina Dubroca <sdubroca@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-07-21 17:32:19 +02:00
Davide Caratti 223af5a247 af_packet: Don't send zero-byte data in packet_sendmsg_spkt().
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2219326
Upstream Status: net.git commit 6a341729fb31

commit 6a341729fb31b4c5df9f74f24b4b1c98410c9b87
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon May 1 13:28:57 2023 -0700

    af_packet: Don't send zero-byte data in packet_sendmsg_spkt().

    syzkaller reported a warning below [0].

    We can reproduce it by sending 0-byte data from the (AF_PACKET,
    SOCK_PACKET) socket via some devices whose dev->hard_header_len
    is 0.

        struct sockaddr_pkt addr = {
            .spkt_family = AF_PACKET,
            .spkt_device = "tun0",
        };
        int fd;

        fd = socket(AF_PACKET, SOCK_PACKET, 0);
        sendto(fd, NULL, 0, 0, (struct sockaddr *)&addr, sizeof(addr));

    We have a similar fix for the (AF_PACKET, SOCK_RAW) socket as
    commit dc633700f00f ("net/af_packet: check len when min_header_len
    equals to 0").

    Let's add the same test for the SOCK_PACKET socket.

    [0]:
    skb_assert_len
    WARNING: CPU: 1 PID: 19945 at include/linux/skbuff.h:2552 skb_assert_len include/linux/skbuff.h:2552 [inline]
    WARNING: CPU: 1 PID: 19945 at include/linux/skbuff.h:2552 __dev_queue_xmit+0x1f26/0x31d0 net/core/dev.c:4159
    Modules linked in:
    CPU: 1 PID: 19945 Comm: syz-executor.0 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    RIP: 0010:skb_assert_len include/linux/skbuff.h:2552 [inline]
    RIP: 0010:__dev_queue_xmit+0x1f26/0x31d0 net/core/dev.c:4159
    Code: 89 de e8 1d a2 85 fd 84 db 75 21 e8 64 a9 85 fd 48 c7 c6 80 2a 1f 86 48 c7 c7 c0 06 1f 86 c6 05 23 cf 27 04 01 e8 fa ee 56 fd <0f> 0b e8 43 a9 85 fd 0f b6 1d 0f cf 27 04 31 ff 89 de e8 e3 a1 85
    RSP: 0018:ffff8880217af6e0 EFLAGS: 00010282
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90001133000
    RDX: 0000000000040000 RSI: ffffffff81186922 RDI: 0000000000000001
    RBP: ffff8880217af8b0 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000001 R11: 0000000000000001 R12: ffff888030045640
    R13: ffff8880300456b0 R14: ffff888030045650 R15: ffff888030045718
    FS:  00007fc5864da640(0000) GS:ffff88806cd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020005740 CR3: 000000003f856003 CR4: 0000000000770ee0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     <TASK>
     dev_queue_xmit include/linux/netdevice.h:3085 [inline]
     packet_sendmsg_spkt+0xc4b/0x1230 net/packet/af_packet.c:2066
     sock_sendmsg_nosec net/socket.c:724 [inline]
     sock_sendmsg+0x1b4/0x200 net/socket.c:747
     ____sys_sendmsg+0x331/0x970 net/socket.c:2503
     ___sys_sendmsg+0x11d/0x1c0 net/socket.c:2557
     __sys_sendmmsg+0x18c/0x430 net/socket.c:2643
     __do_sys_sendmmsg net/socket.c:2672 [inline]
     __se_sys_sendmmsg net/socket.c:2669 [inline]
     __x64_sys_sendmmsg+0x9c/0x100 net/socket.c:2669
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3c/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x72/0xdc
    RIP: 0033:0x7fc58791de5d
    Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
    RSP: 002b:00007fc5864d9cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00000000004bbf80 RCX: 00007fc58791de5d
    RDX: 0000000000000001 RSI: 0000000020005740 RDI: 0000000000000004
    RBP: 00000000004bbf80 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 000000000000000b R14: 00007fc58797e530 R15: 0000000000000000
     </TASK>
    ---[ end trace 0000000000000000 ]---
    skb len=0 headroom=16 headlen=0 tailroom=304
    mac=(16,0) net=(16,-1) trans=-1
    shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
    csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
    hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
    dev name=sit0 feat=0x00000006401d7869
    sk family=17 type=10 proto=0

    Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2023-07-03 10:57:40 +02:00
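
The test added by the commit above can be modelled in a few lines. This is a userspace sketch, not the kernel change itself: `model_spkt_check` is a hypothetical name, `header_ok` stands in for the device header validation, and returning `-EINVAL` is an assumption about the error code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of the send-side check: a packet is rejected when the header
 * validation fails or when there is no data at all. The second condition
 * is what catches zero-byte sends on devices whose hard_header_len is 0.
 */
static int model_spkt_check(bool header_ok, size_t len)
{
    if (!header_ok || len == 0)
        return -EINVAL; /* assumed error code */
    return 0;
}
```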
Paolo Abeni e4256bf256 net: add vlan_get_protocol_and_depth() helper
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2217529
Tested: LNST, Tier1

Upstream commit:
commit 4063384ef762cc5946fc7a3f89879e76c6ec51e2
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue May 9 13:18:57 2023 +0000

    net: add vlan_get_protocol_and_depth() helper

    Before the blamed commit, pskb_may_pull() was used instead
    of skb_header_pointer() in __vlan_get_protocol() and friends.

    A few callers depended on skb->head being populated with the MAC header;
    syzbot caught one of them (skb_mac_gso_segment()).

    Add vlan_get_protocol_and_depth() to make the intent clearer
    and use it where sensible.

    This is a more generic fix than commit e9d3f80935b6
    ("net/af_packet: make sure to pull mac header") which was
    dealing with a similar issue.

    kernel BUG at include/linux/skbuff.h:2655 !
    invalid opcode: 0000 [#1] SMP KASAN
    CPU: 0 PID: 1441 Comm: syz-executor199 Not tainted 6.1.24-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
    RIP: 0010:__skb_pull include/linux/skbuff.h:2655 [inline]
    RIP: 0010:skb_mac_gso_segment+0x68f/0x6a0 net/core/gro.c:136
    Code: fd 48 8b 5c 24 10 44 89 6b 70 48 c7 c7 c0 ae 0d 86 44 89 e6 e8 a1 91 d0 00 48 c7 c7 00 af 0d 86 48 89 de 31 d2 e8 d1 4a e9 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
    RSP: 0018:ffffc90001bd7520 EFLAGS: 00010286
    RAX: ffffffff8469736a RBX: ffff88810f31dac0 RCX: ffff888115a18b00
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffffc90001bd75e8 R08: ffffffff84697183 R09: fffff5200037adf9
    R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000012
    R13: 000000000000fee5 R14: 0000000000005865 R15: 000000000000fed7
    FS: 000055555633f300(0000) GS:ffff8881f6a00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000000 CR3: 0000000116fea000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    <TASK>
    [<ffffffff847018dd>] __skb_gso_segment+0x32d/0x4c0 net/core/dev.c:3419
    [<ffffffff8470398a>] skb_gso_segment include/linux/netdevice.h:4819 [inline]
    [<ffffffff8470398a>] validate_xmit_skb+0x3aa/0xee0 net/core/dev.c:3725
    [<ffffffff84707042>] __dev_queue_xmit+0x1332/0x3300 net/core/dev.c:4313
    [<ffffffff851a9ec7>] dev_queue_xmit+0x17/0x20 include/linux/netdevice.h:3029
    [<ffffffff851b4a82>] packet_snd net/packet/af_packet.c:3111 [inline]
    [<ffffffff851b4a82>] packet_sendmsg+0x49d2/0x6470 net/packet/af_packet.c:3142
    [<ffffffff84669a12>] sock_sendmsg_nosec net/socket.c:716 [inline]
    [<ffffffff84669a12>] sock_sendmsg net/socket.c:736 [inline]
    [<ffffffff84669a12>] __sys_sendto+0x472/0x5f0 net/socket.c:2139
    [<ffffffff84669c75>] __do_sys_sendto net/socket.c:2151 [inline]
    [<ffffffff84669c75>] __se_sys_sendto net/socket.c:2147 [inline]
    [<ffffffff84669c75>] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147
    [<ffffffff8551d40f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<ffffffff8551d40f>] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
    [<ffffffff85600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

    Fixes: 469aceddfa ("vlan: consolidate VLAN parsing code and limit max parsing depth")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Toke Høiland-Jørgensen <toke@redhat.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-26 16:58:59 +02:00
Jan Stancek 9c11981ed8 Merge: net: support ipv4 big tcp
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2404

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2185290
Tested: big tcp selftest
Depends: !2458

v1->v2:
- add patch 1/13 to fix the UAPI break caused by the backport of dca56c3038c3 ("net: expose devlink port over rtnetlink").
- adjust patch 10/13 based on the latest rhel-9 kernel.
- add patch 13/13 to fix a crash for both IPv4 and IPv6 BIG TCP.
v2->v3:
- removed the 1/13, and rebase on top of !2458.

Signed-off-by: Xin Long <lxin@redhat.com>

Approved-by: Ivan Vecera <ivecera@redhat.com>
Approved-by: Andrea Claudi <aclaudi@redhat.com>
Approved-by: Sabrina Dubroca <sdubroca@redhat.com>
Approved-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-05-10 10:51:42 +02:00
Xin Long 0504322d7e packet: add TP_STATUS_GSO_TCP for tp_status
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2185290
Tested: compile only

commit 8e08bb75b60f7f9ed319185cef80188b87d9b43a
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sat Jan 28 10:58:37 2023 -0500

    packet: add TP_STATUS_GSO_TCP for tp_status

    Introduce TP_STATUS_GSO_TCP tp_status flag to tell the af_packet user
    that this is a TCP GSO packet. When parsing IPv4 BIG TCP packets in
    tcpdump/libpcap, it can use tp_len as the IPv4 packet len when this
    flag is set, as iph tot_len is set to 0 for IPv4 BIG TCP packets.
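
A minimal user-space sketch of the length selection described above. The helper name `ipv4_packet_len` and the `l2_len` parameter are hypothetical and not part of libpcap or the kernel; the flag value is taken from the upstream `<linux/if_packet.h>` and is defined here only so the sketch builds on its own. Subtracting the link-layer bytes from `tp_len` is an assumption of this sketch.

```c
#include <stdint.h>

/* Value from the upstream UAPI header; local fallback for older headers. */
#ifndef TP_STATUS_GSO_TCP
#define TP_STATUS_GSO_TCP (1U << 8)
#endif

/*
 * Hypothetical helper: choose the IPv4 packet length for a ring frame.
 *   tp_status, tp_len: fields of the tpacket frame header
 *   iph_tot_len:       IPv4 tot_len, already in host byte order
 *   l2_len:            link-layer bytes counted in tp_len (assumption)
 */
static uint32_t ipv4_packet_len(uint32_t tp_status, uint32_t tp_len,
                                uint16_t iph_tot_len, uint32_t l2_len)
{
    /* BIG TCP: tot_len is 0, so fall back to the frame length. */
    if (iph_tot_len == 0 && (tp_status & TP_STATUS_GSO_TCP) &&
        tp_len >= l2_len)
        return tp_len - l2_len;

    /* Ordinary packet: trust the IPv4 header. */
    return iph_tot_len;
}
```

A capture tool would call this once per frame; for cooked sockets with no link-layer header in the frame, `l2_len` would be 0.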

    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Xin Long <lxin@redhat.com>
2023-05-02 10:36:11 -04:00
Davide Caratti 49d4258224 packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2190429
Upstream Status: net.git commit b85f628aa158

commit b85f628aa158a653c006e9c1405a117baef8c868
Author: Willem de Bruijn <willemb@google.com>
Date:   Mon Nov 28 11:18:12 2022 -0500

    packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE

    CHECKSUM_COMPLETE signals that skb->csum stores the sum over the
    entire packet. It does not imply that an embedded l4 checksum
    field has been validated.
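
The effect on the status word can be sketched in user space as follows. The function name `tpacket_csum_status` is hypothetical, the constants mirror the kernel's `skbuff.h` and `<linux/if_packet.h>` values, and the kernel's `skb_csum_unnecessary()` check is simplified here to a comparison against `CHECKSUM_UNNECESSARY`.

```c
#include <stdint.h>

/* Constants copied from the kernel headers for illustration. */
enum {
    CHECKSUM_NONE        = 0,
    CHECKSUM_UNNECESSARY = 1,
    CHECKSUM_COMPLETE    = 2,
    CHECKSUM_PARTIAL     = 3,
};
#define TP_STATUS_CSUMNOTREADY (1U << 3)
#define TP_STATUS_CSUM_VALID   (1U << 7)

/*
 * Hypothetical helper: checksum-related tp_status bits after the fix.
 * CHECKSUM_COMPLETE no longer yields TP_STATUS_CSUM_VALID, because it
 * only says skb->csum covers the packet, not that the L4 checksum
 * field was verified.
 */
static uint32_t tpacket_csum_status(int ip_summed, int outgoing)
{
    if (ip_summed == CHECKSUM_PARTIAL)
        return TP_STATUS_CSUMNOTREADY;
    if (!outgoing && ip_summed == CHECKSUM_UNNECESSARY)
        return TP_STATUS_CSUM_VALID;
    return 0;
}
```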

    Fixes: 682f048bd4 ("af_packet: pass checksum validation status to the user")
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Link: https://lore.kernel.org/r/20221128161812.640098-1-willemdebruijn.kernel@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2023-04-28 14:11:35 +02:00
Íñigo Huguet 3a91b473a8 net: rename reference+tracking helpers
Bugzilla: https://bugzilla.redhat.com/2175258

Conflicts:
 - Removed chunks of unsupported protocol AX.25
 - Renamed the functions also in ipvlan. Commit 40b9d1ab63f5 ("ipvlan: hold lower
   dev to avoid possible use-after-free") was backported out of order so it had
   to use the old function names.

commit d62607c3fe45911b2331fac073355a8c914bbde2
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue Jun 7 21:39:55 2022 -0700

    net: rename reference+tracking helpers

    Netdev reference helpers have a dev_ prefix for historic
    reasons. Renaming the old helpers would be too much churn
    but we can rename the tracking ones which are relatively
    recent and should be the default for new code.

    Rename:
     dev_hold_track()    -> netdev_hold()
     dev_put_track()     -> netdev_put()
     dev_replace_track() -> netdev_ref_replace()

    Link: https://lore.kernel.org/r/20220608043955.919359-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2023-03-23 16:19:21 +01:00
Frantisek Hrbata 34b02be423 Merge: CNB: net: remove noblock parameter from skb_recv_datagram()
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1655

Bugzilla: https://bugzilla.redhat.com/2143360
Tested: build, boot

Conflicts:
 - isotp: missing many commits, such as:
   30ffd5332e06 ("can: isotp: return -EADDRNOTAVAIL when reading from unbound socket")
   42bf50a1795a ("can: isotp: support MSG_TRUNC flag when reading from socket")
   e382fea8ae54 ("can: isotp: restore accidentally removed MSG_PEEK feature")
 - removed chunks of non existent net/mctp

```
commit f4b41f062c424209e3939a81e6da022e049a45f2
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date:   Mon Apr 4 18:30:22 2022 +0200

    net: remove noblock parameter from skb_recv_datagram()

    skb_recv_datagram() has two parameters 'flags' and 'noblock' that are
    merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)'

    As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags'
    into 'flags' and 'noblock' with finally obsolete bit operations like this:

    skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc);

    And this is not even done consistently with the 'flags' parameter.

    This patch removes the obsolete and costly splitting into two parameters
    and only performs bit operations when really needed on the caller side.

    One missing conversion was thankfully reported by the kernel test robot;
    I had not enabled the kunit tests needed to build the mctp code.

    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
```

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>

Approved-by: Ivan Vecera <ivecera@redhat.com>
Approved-by: Xin Long <lxin@redhat.com>

Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
2022-11-30 08:10:47 -05:00
Íñigo Huguet e24462420c net: remove noblock parameter from skb_recv_datagram()
Bugzilla: https://bugzilla.redhat.com/2143360

Conflicts:
 - isotp: missing many commits, such as:
   30ffd5332e06 ("can: isotp: return -EADDRNOTAVAIL when reading from unbound socket")
   42bf50a1795a ("can: isotp: support MSG_TRUNC flag when reading from socket")
   e382fea8ae54 ("can: isotp: restore accidentally removed MSG_PEEK feature")
 - removed chunks of non existent net/mctp

commit f4b41f062c424209e3939a81e6da022e049a45f2
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date:   Mon Apr 4 18:30:22 2022 +0200

    net: remove noblock parameter from skb_recv_datagram()
    
    skb_recv_datagram() has two parameters 'flags' and 'noblock' that are
    merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)'
    
    As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags'
    into 'flags' and 'noblock' with finally obsolete bit operations like this:
    
    skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc);
    
    And this is not even done consistently with the 'flags' parameter.
    
    This patch removes the obsolete and costly splitting into two parameters
    and only performs bit operations when really needed on the caller side.
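
A user-space sketch of why the split was redundant. The three function names are hypothetical stand-ins for the old callee, the new callee, and an old-style caller; only `MSG_DONTWAIT` and `MSG_PEEK` from `<sys/socket.h>` are real.

```c
#include <sys/socket.h>

/* Old convention: the callee merges the two arguments back together. */
static unsigned int effective_flags_old(unsigned int flags, int noblock)
{
    return flags | (noblock ? MSG_DONTWAIT : 0);
}

/* New convention: a single flags word is passed through unchanged. */
static unsigned int effective_flags_new(unsigned int flags)
{
    return flags;
}

/* Old caller pattern: split the word, only for the callee to rejoin it. */
static unsigned int old_caller(unsigned int flags)
{
    return effective_flags_old(flags & ~MSG_DONTWAIT,
                               flags & MSG_DONTWAIT);
}
```

For any input, `old_caller(flags)` and `effective_flags_new(flags)` give the same result, which is the redundancy the patch removes.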
    
    One missing conversion was thankfully reported by the kernel test robot;
    I had not enabled the kunit tests needed to build the mctp code.
    
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
2022-11-18 11:18:14 +01:00
Jiri Benc c387356f8d net: Handle delivery_time in skb->tstamp during network tapping with af_packet
Bugzilla: https://bugzilla.redhat.com/2120966

commit 27942a15209f564ed8ee2a9e126cb7b105181355
Author: Martin KaFai Lau <kafai@fb.com>
Date:   Wed Mar 2 11:55:38 2022 -0800

    net: Handle delivery_time in skb->tstamp during network tapping with af_packet

    A later patch will set skb->mono_delivery_time to flag that skb->tstamp
    is used as the mono delivery_time (EDT) instead of the (rcv) timestamp.
    skb_clear_tstamp() will then keep this delivery_time during forwarding.

    This patch makes network tapping (with af_packet) handle
    the delivery_time stored in skb->tstamp.

    Regardless of tapping at the ingress or egress,  the tapped skb is
    received by the af_packet socket, so it is ingress to the af_packet
    socket and it expects the (rcv) timestamp.

    When tapping at egress, dev_queue_xmit_nit() is used.  It has already
    expected skb->tstamp may have delivery_time,  so it does
    skb_clone()+net_timestamp_set() to ensure the cloned skb has
    the (rcv) timestamp before passing to the af_packet sk.
    This patch only adds to clear the skb->mono_delivery_time
    bit in net_timestamp_set().

    When tapping at ingress, it currently expects the skb->tstamp is either 0
    or the (rcv) timestamp.  Meaning, the tapping at ingress path
    has already expected the skb->tstamp could be 0 and it will get
    the (rcv) timestamp by ktime_get_real() when needed.

    There are two cases for tapping at ingress:

    One case is af_packet queues the skb to its sk_receive_queue.
    The skb is either not shared or new clone created.  The newly
    added skb_clear_delivery_time() is called to clear the
    delivery_time (if any) and set the (rcv) timestamp if
    needed before the skb is queued to the sk_receive_queue.

    Another case, the ingress skb is directly copied to the rx_ring
    and tpacket_get_timestamp() is used to get the (rcv) timestamp.
    The newly added skb_tstamp() is used in tpacket_get_timestamp()
    to check the skb->mono_delivery_time bit before returning skb->tstamp.
    As mentioned earlier, the tapping@ingress has already expected
    the skb may not have the (rcv) timestamp (because no sk has asked
    for it) and has handled this case by directly calling ktime_get_real().
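
The rx_ring case can be sketched in user space as below. `struct tapped_skb`, `skb_rcv_tstamp()` and `tap_timestamp()` are hypothetical stand-ins: the first models only the two sk_buff fields discussed here, the second follows the behaviour described for skb_tstamp(), and the `now` argument takes the place of ktime_get_real().

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the two fields relevant to this commit (illustrative). */
struct tapped_skb {
    int64_t tstamp;             /* rcv timestamp or delivery time */
    bool    mono_delivery_time; /* set when tstamp is a delivery time */
};

/* A delivery time is not a receive timestamp, so report "none". */
static int64_t skb_rcv_tstamp(const struct tapped_skb *skb)
{
    if (skb->mono_delivery_time)
        return 0;
    return skb->tstamp;
}

/* Fall back to the current time when no rcv timestamp is available. */
static int64_t tap_timestamp(const struct tapped_skb *skb, int64_t now)
{
    int64_t ts = skb_rcv_tstamp(skb);

    return ts ? ts : now;
}
```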

    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Jiri Benc <jbenc@redhat.com>
2022-10-25 14:58:00 +02:00
Petr Oros 21e2fb0e83 net: Don't include filter.h from net/sock.h
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2101792

Conflicts:
drivers/infiniband/core/cache.c
- adjusted context conflict due to missing b74525f21e33ab ("RDMA/core:
  Delete useless module.h include")
drivers/infiniband/hw/mlx5/fs.c
- missing upstream commit ffa501ef196312 ("RDMA/mlx5: Add steering support in
  optional flow counters") adding net/inet_ecn.h. Without inet_ecn.h missing
  declarations for ether_addr_copy() and is_multicast_ether_addr()
  We add net/inet_ecn.h include in this commit.
drivers/net/amt.c
- Unmerged because file missing in RHEL

Upstream commit(s):
commit b6459415b384cb829f0b2a4268f211c789f6cf0b
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue Dec 28 16:49:13 2021 -0800

    net: Don't include filter.h from net/sock.h

    sock.h is pretty heavily used (5k objects rebuilt on x86 after
    it's touched). We can drop the include of filter.h from it and
    add a forward declaration of struct sk_filter instead.
    This decreases the number of rebuilt objects when bpf.h
    is touched from ~5k to ~1k.

    There's a lot of missing includes this was masking. Primarily
    in networking tho, this time.

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
    Acked-by: Florian Fainelli <f.fainelli@gmail.com>
    Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Acked-by: Stefano Garzarella <sgarzare@redhat.com>
    Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org

Signed-off-by: Petr Oros <poros@redhat.com>
2022-07-13 10:49:16 +02:00
Ivan Vecera 23c617e70c af_packet: fix tracking issues in packet_do_bind()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2096377

commit bf44077c1b3ae86668bce02d9466e7134a6569ec
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Jan 7 10:39:53 2022 -0800

    af_packet: fix tracking issues in packet_do_bind()

    It appears that my changes in packet_do_bind() were
    slightly wrong.

    syzbot found that calling bind() twice would trigger
    a false positive.

    Remove proto_curr/dev_curr variables and rewrite things
    to be less confusing (like not having to use netdev_tracker_alloc(),
    and instead use the standard dev_hold_track())

    Fixes: f1d9268e0618 ("net: add net device refcount tracker to struct packet_type")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Link: https://lore.kernel.org/r/20220107183953.3886647-1-eric.dumazet@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-13 18:39:17 +02:00
Ivan Vecera f0390026af net: add net device refcount tracker to struct packet_type
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2096377

commit f1d9268e061863ead77b07f5a6807d063e28a1c2
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Dec 14 07:09:33 2021 -0800

    net: add net device refcount tracker to struct packet_type

    Most notable changes are in af_packet, tipc ones are trivial.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Jon Maloy <jmaloy@redhat.com>
    Cc: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-13 18:39:15 +02:00
Ivan Vecera fa0c210030 net: drop nopreempt requirement on sock_prot_inuse_add()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2096377

commit b3cb764aa1d753cf6a58858f9e2097ba71e8100b
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Nov 15 09:11:50 2021 -0800

    net: drop nopreempt requirement on sock_prot_inuse_add()

    This is distracting really, let's make this simpler,
    because many callers had to take care of this
    by themselves, even if on x86 this adds more
    code than really needed.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-13 18:35:56 +02:00
Hangbin Liu dca27c0c94 net/af_packet: make sure to pull mac header
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2089566
Upstream Status: net.git commit e9d3f80935b6

commit e9d3f80935b6607dcdc5682b00b1d4b28e0a0c5d
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Jun 2 09:18:59 2022 -0700

    net/af_packet: make sure to pull mac header

    GSO assumes skb->head contains link layer headers.

    tun device in some case can provide base 14 bytes,
    regardless of VLAN being used or not.

    After the blamed commit, we can end up setting a network
    header offset of 18+, so we had better pull the missing
    bytes to avoid a possible crash in GSO.

    syzbot report was:
    kernel BUG at include/linux/skbuff.h:2699!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN
    CPU: 1 PID: 3601 Comm: syz-executor210 Not tainted 5.18.0-syzkaller-11338-g2c5ca23f7414 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:__skb_pull include/linux/skbuff.h:2699 [inline]
    RIP: 0010:skb_mac_gso_segment+0x48f/0x530 net/core/gro.c:136
    Code: 00 48 c7 c7 00 96 d4 8a c6 05 cb d3 45 06 01 e8 26 bb d0 01 e9 2f fd ff ff 49 c7 c4 ea ff ff ff e9 f1 fe ff ff e8 91 84 19 fa <0f> 0b 48 89 df e8 97 44 66 fa e9 7f fd ff ff e8 ad 44 66 fa e9 48
    RSP: 0018:ffffc90002e2f4b8 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000000000
    RDX: ffff88805bb58000 RSI: ffffffff8760ed0f RDI: 0000000000000004
    RBP: 0000000000005dbc R08: 0000000000000004 R09: 0000000000000fe0
    R10: 0000000000000fe4 R11: 0000000000000000 R12: 0000000000000fe0
    R13: ffff88807194d780 R14: 1ffff920005c5e9b R15: 0000000000000012
    FS:  000055555730f300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000200015c0 CR3: 0000000071ff8000 CR4: 0000000000350ee0
    Call Trace:
     <TASK>
     __skb_gso_segment+0x327/0x6e0 net/core/dev.c:3411
     skb_gso_segment include/linux/netdevice.h:4749 [inline]
     validate_xmit_skb+0x6bc/0xf10 net/core/dev.c:3669
     validate_xmit_skb_list+0xbc/0x120 net/core/dev.c:3719
     sch_direct_xmit+0x3d1/0xbe0 net/sched/sch_generic.c:327
     __dev_xmit_skb net/core/dev.c:3815 [inline]
     __dev_queue_xmit+0x14a1/0x3a00 net/core/dev.c:4219
     packet_snd net/packet/af_packet.c:3071 [inline]
     packet_sendmsg+0x21cb/0x5550 net/packet/af_packet.c:3102
     sock_sendmsg_nosec net/socket.c:714 [inline]
     sock_sendmsg+0xcf/0x120 net/socket.c:734
     ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492
     ___sys_sendmsg+0xf3/0x170 net/socket.c:2546
     __sys_sendmsg net/socket.c:2575 [inline]
     __do_sys_sendmsg net/socket.c:2584 [inline]
     __se_sys_sendmsg net/socket.c:2582 [inline]
     __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x46/0xb0
    RIP: 0033:0x7f4b95da06c9
    Code: 28 c3 e8 4a 15 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007ffd7defc4c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007ffd7defc4f0 RCX: 00007f4b95da06c9
    RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
    RBP: 0000000000000003 R08: bb1414ac00000050 R09: bb1414ac00000050
    R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007ffd7defc4e0 R14: 00007ffd7defc4d8 R15: 00007ffd7defc4d4
     </TASK>

    Fixes: dfed913e8b55 ("net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Acked-by: Hangbin Liu <liuhangbin@gmail.com>
    Acked-by: Willem de Bruijn <willemb@google.com>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Hangbin Liu <haliu@redhat.com>
2022-06-06 10:54:08 +08:00
Hangbin Liu 13fff61813 net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2089566
Upstream Status: net-next.git commit dfed913e8b55

commit dfed913e8b55a0c2c4906f1242fd38fd9a116e49
Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Mon Apr 25 09:45:02 2022 +0800

    net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO

    Currently, the kernel drops GSO VLAN tagged packet if it's created with
    socket(AF_PACKET, SOCK_RAW, 0) plus virtio_net_hdr.

    The reason is that AF_PACKET doesn't adjust the skb network header if there
    is a VLAN tag. Then, after virtio_net_hdr_set_proto() is called, skb->protocol
    will be set to ETH_P_IP/IPv6. Later, in inet/ipv6_gso_segment(), the skb
    is dropped because the network header position is invalid.

    Let's handle VLAN packets by adjusting network header position in
    packet_parse_headers(). The adjustment is safe and does not affect the
    later xmit as tap device also did that.
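
The header adjustment can be illustrated with a user-space sketch that walks the VLAN tags of a raw Ethernet frame. `network_header_offset()` and `be16_at()` are hypothetical helpers, not the kernel's packet_parse_headers(); the constants match the standard Ethernet and 802.1Q/802.1ad values. Unlike the kernel, this sketch bounds the walk only by the frame length, not by a maximum tag depth.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14      /* dst MAC + src MAC + EtherType */
#define VLAN_HLEN    4       /* TCI + encapsulated EtherType */
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Read a big-endian 16-bit value. */
static uint16_t be16_at(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the offset of the network header and store the inner
 * EtherType in *proto. Returns 0 if the frame is too short.
 */
static size_t network_header_offset(const uint8_t *frame, size_t len,
                                    uint16_t *proto)
{
    size_t off = ETH_HLEN;
    uint16_t type;

    if (len < ETH_HLEN)
        return 0;

    type = be16_at(frame + 12);
    while (type == ETH_P_8021Q || type == ETH_P_8021AD) {
        if (len < off + VLAN_HLEN)
            return 0;
        type = be16_at(frame + off + 2); /* EtherType after the TCI */
        off += VLAN_HLEN;
    }

    *proto = type;
    return off;
}
```

For an untagged IPv4 frame the offset is 14; with one VLAN tag it becomes 18, which is the position the GSO code needs in order to find the IP header.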

    In packet_snd(), packet_parse_headers() needs to be moved before the call
    to virtio_net_hdr_set_proto(), so we can set the correct skb->protocol and
    network header first.

    There is no need to update tpacket_snd() as it calls packet_parse_headers()
    in tpacket_fill_skb(), which is already before calling virtio_net_hdr_*
    functions.

    The skb->no_fcs setting is also moved up to keep all skb settings together
    and stay consistent with packet_sendmsg_spkt().

    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Acked-by: Willem de Bruijn <willemb@google.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Link: https://lore.kernel.org/r/20220425014502.985464-1-liuhangbin@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Hangbin Liu <haliu@redhat.com>
2022-06-06 10:54:08 +08:00
Patrick Talbert f311aab772 Merge: net: backport core fixes from upstream
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/832

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2081920

A bunch of fixes for net core path.

Signed-off-by: Hangbin Liu <haliu@redhat.com>

Approved-by: Antoine Tenart <atenart@redhat.com>
Approved-by: Jarod Wilson <jarod@redhat.com>

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
2022-05-18 10:58:56 +02:00
Hangbin Liu c1cf36bbb0 net: fix information leakage in /proc/net/ptype
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2081920
Upstream Status: net.git commit 47934e06b656

commit 47934e06b65637c88a762d9c98329ae6e3238888
Author: Congyu Liu <liu3101@purdue.edu>
Date:   Tue Jan 18 14:20:13 2022 -0500

    net: fix information leakage in /proc/net/ptype

    In one net namespace, after creating a packet socket without binding
    it to a device, users in other net namespaces can observe the new
    `packet_type` added by this packet socket by reading `/proc/net/ptype`
    file. This is minor information leakage as packet socket is
    namespace aware.

    Add a net pointer in `packet_type` to keep the net namespace of
    the corresponding packet socket. In `ptype_seq_show`, this net pointer
    must be checked when it is not NULL.

    Fixes: 2feb27dbe0 ("[NETNS]: Minor information leak via /proc/net/ptype file.")
    Signed-off-by: Congyu Liu <liu3101@purdue.edu>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Hangbin Liu <haliu@redhat.com>
2022-05-05 12:26:41 +08:00
Xin Long fb94651cf5 net/packet: fix packet_sock xmit return value checking
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2080477
Tested: compile only

commit 29e8e659f984be00d75ec5fef4e37c88def72712
Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Thu Apr 14 16:49:25 2022 +0800

    net/packet: fix packet_sock xmit return value checking

    packet_sock xmit could be dev_queue_xmit, which also returns negative
    errors. So only checking positive errors is not enough, or userspace
    sendmsg may return success while the packet is not sent out.

    Move the net_xmit_errno() assignment into the braces, since checkpatch.pl
    warns against assignments inside an if condition.
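
A user-space sketch contrasting the two checks. `check_old()` and `check_new()` are hypothetical names; the NET_XMIT_* values and the net_xmit_errno() macro are copied from the kernel's `netdevice.h` for illustration.

```c
#include <errno.h>

/* Copied from include/linux/netdevice.h for illustration. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01
#define NET_XMIT_CN      0x02
#define net_xmit_errno(e) ((e) != NET_XMIT_CN ? -ENOBUFS : 0)

/* Old check: negative errno values from xmit are silently ignored. */
static int check_old(int err)
{
    if (err > 0 && (err = net_xmit_errno(err)) != 0)
        return err;
    return 0;
}

/* New check: any non-zero result is examined. */
static int check_new(int err)
{
    if (err != 0) {
        if (err > 0)
            err = net_xmit_errno(err); /* map NET_XMIT_* codes */
        if (err)
            return err;                /* includes negative errnos */
    }
    return 0;
}
```

With a negative return such as `-ENETDOWN`, `check_old()` reports success while `check_new()` propagates the error.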

    Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
    Reported-by: Flavio Leitner <fbl@redhat.com>
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Xin Long <lxin@redhat.com>
2022-05-03 10:49:18 -04:00
Xin Long 84379502c6 net/packet: fix slab-out-of-bounds access in packet_recvmsg()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2080477
Tested: compile only

commit c700525fcc06b05adfea78039de02628af79e07a
Author: Eric Dumazet <edumazet@google.com>
Date:   Sat Mar 12 15:29:58 2022 -0800

    net/packet: fix slab-out-of-bounds access in packet_recvmsg()

    syzbot found that when an AF_PACKET socket is using PACKET_COPY_THRESH
    and mmap operations, tpacket_rcv() is queueing skbs with
    garbage in skb->cb[], triggering a too big copy [1]

    Presumably, users of af_packet using mmap() already get correct
    metadata from the mapped buffer, so we can simply make sure
    to clear 12 bytes that might be copied to user space later.

    BUG: KASAN: stack-out-of-bounds in memcpy include/linux/fortify-string.h:225 [inline]
    BUG: KASAN: stack-out-of-bounds in packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
    Write of size 165 at addr ffffc9000385fb78 by task syz-executor233/3631

    CPU: 0 PID: 3631 Comm: syz-executor233 Not tainted 5.17.0-rc7-syzkaller-02396-g0b3660695e80 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:88 [inline]
     dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
     print_address_description.constprop.0.cold+0xf/0x336 mm/kasan/report.c:255
     __kasan_report mm/kasan/report.c:442 [inline]
     kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
     check_region_inline mm/kasan/generic.c:183 [inline]
     kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
     memcpy+0x39/0x60 mm/kasan/shadow.c:66
     memcpy include/linux/fortify-string.h:225 [inline]
     packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
     sock_recvmsg_nosec net/socket.c:948 [inline]
     sock_recvmsg net/socket.c:966 [inline]
     sock_recvmsg net/socket.c:962 [inline]
     ____sys_recvmsg+0x2c4/0x600 net/socket.c:2632
     ___sys_recvmsg+0x127/0x200 net/socket.c:2674
     __sys_recvmsg+0xe2/0x1a0 net/socket.c:2704
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7fdfd5954c29
    Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007ffcf8e71e48 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fdfd5954c29
    RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000005
    RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcf8e71e60
    R13: 00000000000f4240 R14: 000000000000c1ff R15: 00007ffcf8e71e54
     </TASK>

    addr ffffc9000385fb78 is located in stack of task syz-executor233/3631 at offset 32 in frame:
     ____sys_recvmsg+0x0/0x600 include/linux/uio.h:246

    this frame has 1 object:
     [32, 160) 'addr'

    Memory state around the buggy address:
     ffffc9000385fa80: 00 04 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00
     ffffc9000385fb00: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
    >ffffc9000385fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3
                                                                    ^
     ffffc9000385fc00: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1
     ffffc9000385fc80: f1 f1 f1 00 f2 f2 f2 00 f2 f2 f2 00 00 00 00 00
    ==================================================================

    Fixes: 0fb375fb9b ("[AF_PACKET]: Allow for > 8 byte hardware addresses.")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Link: https://lore.kernel.org/r/20220312232958.3535620-1-eric.dumazet@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Xin Long <lxin@redhat.com>
2022-05-03 10:49:18 -04:00
Xin Long d9dfae5ce4 af_packet: fix data-race in packet_setsockopt / packet_setsockopt
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2080477
Tested: compile only

commit e42e70ad6ae2ae511a6143d2e8da929366e58bd9
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Jan 31 18:23:58 2022 -0800

    af_packet: fix data-race in packet_setsockopt / packet_setsockopt

    When packet_setsockopt( PACKET_FANOUT_DATA ) reads po->fanout,
    no lock is held, meaning that another thread can change po->fanout.

    Given that po->fanout can only be set once during the socket lifetime
    (it is only cleared from fanout_release()), we can use
    READ_ONCE()/WRITE_ONCE() to document the race.
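
A simplified user-space sketch of the annotation pattern. The macros below are reduced approximations of the kernel's READ_ONCE()/WRITE_ONCE() and rely on the GCC/Clang `__typeof__` extension; the struct and function names are hypothetical. In the kernel the writer side is additionally serialized by a mutex, which this sketch does not model.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced approximations of the kernel macros (GCC/Clang only). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct fanout_sketch { int id; };
struct packet_sock_sketch { struct fanout_sketch *fanout; };

/* Writer: the pointer is set at most once per socket lifetime. */
static int fanout_attach(struct packet_sock_sketch *po,
                         struct fanout_sketch *f)
{
    if (READ_ONCE(po->fanout))
        return -EALREADY;
    WRITE_ONCE(po->fanout, f);
    return 0;
}

/* Lockless reader: a single marked load of the shared pointer. */
static int fanout_is_set(struct packet_sock_sketch *po)
{
    return READ_ONCE(po->fanout) != NULL;
}
```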

    BUG: KCSAN: data-race in packet_setsockopt / packet_setsockopt

    write to 0xffff88813ae8e300 of 8 bytes by task 14653 on cpu 0:
     fanout_add net/packet/af_packet.c:1791 [inline]
     packet_setsockopt+0x22fe/0x24a0 net/packet/af_packet.c:3931
     __sys_setsockopt+0x209/0x2a0 net/socket.c:2180
     __do_sys_setsockopt net/socket.c:2191 [inline]
     __se_sys_setsockopt net/socket.c:2188 [inline]
     __x64_sys_setsockopt+0x62/0x70 net/socket.c:2188
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x44/0xae

    read to 0xffff88813ae8e300 of 8 bytes by task 14654 on cpu 1:
     packet_setsockopt+0x691/0x24a0 net/packet/af_packet.c:3935
     __sys_setsockopt+0x209/0x2a0 net/socket.c:2180
     __do_sys_setsockopt net/socket.c:2191 [inline]
     __se_sys_setsockopt net/socket.c:2188 [inline]
     __x64_sys_setsockopt+0x62/0x70 net/socket.c:2188
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x44/0xae

    value changed: 0x0000000000000000 -> 0xffff888106f8c000

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 14654 Comm: syz-executor.3 Not tainted 5.16.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

    Fixes: 47dceb8ecd ("packet: add classic BPF fanout mode")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Link: https://lore.kernel.org/r/20220201022358.330621-1-eric.dumazet@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Xin Long <lxin@redhat.com>
2022-05-03 10:49:18 -04:00
Hangbin Liu d2de7fb941 net/packet: rx_owner_map depends on pg_vec
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2050329
Upstream Status: net.git commit ec6af094ea28
CVE: CVE-2021-22600

commit ec6af094ea28f0f2dda1a6a33b14cd57e36a9755
Author: Willem de Bruijn <willemb@google.com>
Date:   Wed Dec 15 09:39:37 2021 -0500

    net/packet: rx_owner_map depends on pg_vec

    Packet sockets may switch ring versions. Avoid misinterpreting state
    between versions, whose fields share a union. rx_owner_map is only
    allocated with a packet ring (pg_vec) and both are swapped together.
    If pg_vec is NULL, meaning no packet ring was allocated, then neither
    was rx_owner_map. And the field may be old state from a tpacket_v3.

    Fixes: 61fad6816f ("net/packet: tpacket_rcv: avoid a producer race condition")
    Reported-by: Syzbot <syzbot+1ac0994a0a0c55151121@syzkaller.appspotmail.com>
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20211215143937.106178-1-willemdebruijn.kernel@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Hangbin Liu <haliu@redhat.com>
2022-02-14 10:37:48 +08:00
Petr Oros ea6b084bc4 net: Remove redundant if statements
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2037315

Upstream commit(s):
commit 1160dfa178eb848327e9dec39960a735f4dc1685
Author: Yajun Deng <yajun.deng@linux.dev>
Date:   Thu Aug 5 19:55:27 2021 +0800

    net: Remove redundant if statements

    The 'if (dev)' check has already been moved into dev_{put,hold}, so
    remove the redundant if statements.

    Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Petr Oros <poros@redhat.com>
2022-01-10 16:20:08 +01:00
Linus Torvalds dbe69e4337 Networking changes for 5.14.
Core:
 
  - BPF:
    - add syscall program type and libbpf support for generating
      instructions and bindings for in-kernel BPF loaders (BPF loaders
      for BPF), this is a stepping stone for signed BPF programs
    - infrastructure to migrate TCP child sockets from one listener
      to another in the same reuseport group/map to improve flexibility
      of service hand-off/restart
    - add broadcast support to XDP redirect
 
  - allow bypass of the lockless qdisc to improve performance
    (for pktgen: +23% with one thread, +44% with 2 threads)
 
  - add a simpler version of "DO_ONCE()" which does not require
    jump labels, intended for slow-path usage
 
  - virtio/vsock: introduce SOCK_SEQPACKET support
 
  - add getsockopt to retrieve netns cookie
 
  - ip: treat lowest address of an IPv4 subnet as ordinary unicast address
        allowing reclaiming of precious IPv4 addresses
 
  - ipv6: use prandom_u32() for ID generation
 
  - ip: add support for more flexible field selection for hashing
        across multi-path routes (w/ offload to mlxsw)
 
  - icmp: add support for extended RFC 8335 PROBE (ping)
 
  - seg6: add support for SRv6 End.DT46 behavior
 
  - mptcp:
     - DSS checksum support (RFC 8684) to detect middlebox meddling
     - support Connection-time 'C' flag
     - time stamping support
 
  - sctp: packetization Layer Path MTU Discovery (RFC 8899)
 
  - xfrm: speed up state addition with seq set
 
  - WiFi:
     - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
     - aggregation handling improvements for some drivers
     - minstrel improvements for no-ack frames
     - deferred rate control for TXQs to improve reaction times
     - switch from round robin to virtual time-based airtime scheduler
 
  - add trace points:
     - tcp checksum errors
     - openvswitch - action execution, upcalls
     - socket errors via sk_error_report
 
 Device APIs:
 
  - devlink: add rate API for hierarchical control of max egress rate
             of virtual devices (VFs, SFs etc.)
 
  - don't require RCU read lock to be held around BPF hooks
    in NAPI context
 
  - page_pool: generic buffer recycling
 
 New hardware/drivers:
 
  - mobile:
     - iosm: PCIe Driver for Intel M.2 Modem
     - support for Qualcomm MSM8998 (ipa)
 
  - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
 
  - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
 
  - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
 
  - NXP SJA1110 Automotive Ethernet 10-port switch
 
  - Qualcomm QCA8327 switch support (qca8k)
 
  - Mikrotik 10/25G NIC (atl1c)
 
 Driver changes:
 
  - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP
    (our first foray into MAC/PHY description via ACPI)
 
  - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
 
  - Mellanox/Nvidia NIC (mlx5)
    - NIC VF offload of L2 bridging
    - support IRQ distribution to Sub-functions
 
  - Marvell (prestera):
     - add flower and match all
     - devlink trap
     - link aggregation
 
  - Netronome (nfp): connection tracking offload
 
  - Intel 1GE (igc): add AF_XDP support
 
  - Marvell DPU (octeontx2): ingress ratelimit offload
 
  - Google vNIC (gve): new ring/descriptor format support
 
  - Qualcomm mobile (rmnet & ipa): inline checksum offload support
 
  - MediaTek WiFi (mt76)
     - mt7915 MSI support
     - mt7915 Tx status reporting
     - mt7915 thermal sensors support
     - mt7921 decapsulation offload
     - mt7921 enable runtime pm and deep sleep
 
  - Realtek WiFi (rtw88)
     - beacon filter support
     - Tx antenna path diversity support
     - firmware crash information via devcoredump
 
  - Qualcomm 60GHz WiFi (wcn36xx)
     - Wake-on-WLAN support with magic packets and GTK rekeying
 
  - Micrel PHY (ksz886x/ksz8081): add cable test support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - BPF:
      - add syscall program type and libbpf support for generating
        instructions and bindings for in-kernel BPF loaders (BPF loaders
        for BPF), this is a stepping stone for signed BPF programs
      - infrastructure to migrate TCP child sockets from one listener to
        another in the same reuseport group/map to improve flexibility
        of service hand-off/restart
      - add broadcast support to XDP redirect

   - allow bypass of the lockless qdisc to improve performance (for
     pktgen: +23% with one thread, +44% with 2 threads)

   - add a simpler version of "DO_ONCE()" which does not require jump
     labels, intended for slow-path usage

   - virtio/vsock: introduce SOCK_SEQPACKET support

   - add getsockopt to retrieve netns cookie

   - ip: treat lowest address of an IPv4 subnet as ordinary unicast
     address allowing reclaiming of precious IPv4 addresses

   - ipv6: use prandom_u32() for ID generation

   - ip: add support for more flexible field selection for hashing
     across multi-path routes (w/ offload to mlxsw)

   - icmp: add support for extended RFC 8335 PROBE (ping)

   - seg6: add support for SRv6 End.DT46 behavior

   - mptcp:
      - DSS checksum support (RFC 8684) to detect middlebox meddling
      - support Connection-time 'C' flag
      - time stamping support

   - sctp: Packetization Layer Path MTU Discovery (RFC 8899)

   - xfrm: speed up state addition with seq set

   - WiFi:
      - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
      - aggregation handling improvements for some drivers
      - minstrel improvements for no-ack frames
      - deferred rate control for TXQs to improve reaction times
      - switch from round robin to virtual time-based airtime scheduler

   - add trace points:
      - tcp checksum errors
      - openvswitch - action execution, upcalls
      - socket errors via sk_error_report

  Device APIs:

   - devlink: add rate API for hierarchical control of max egress rate
     of virtual devices (VFs, SFs etc.)

   - don't require RCU read lock to be held around BPF hooks in NAPI
     context

   - page_pool: generic buffer recycling

  New hardware/drivers:

   - mobile:
      - iosm: PCIe Driver for Intel M.2 Modem
      - support for Qualcomm MSM8998 (ipa)

   - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

   - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

   - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

   - NXP SJA1110 Automotive Ethernet 10-port switch

   - Qualcomm QCA8327 switch support (qca8k)

   - Mikrotik 10/25G NIC (atl1c)

  Driver changes:

   - ACPI support for some MDIO, MAC and PHY devices from Marvell and
     NXP (our first foray into MAC/PHY description via ACPI)

   - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

   - Mellanox/Nvidia NIC (mlx5)
      - NIC VF offload of L2 bridging
      - support IRQ distribution to Sub-functions

   - Marvell (prestera):
      - add flower and match all
      - devlink trap
      - link aggregation

   - Netronome (nfp): connection tracking offload

   - Intel 1GE (igc): add AF_XDP support

   - Marvell DPU (octeontx2): ingress ratelimit offload

   - Google vNIC (gve): new ring/descriptor format support

   - Qualcomm mobile (rmnet & ipa): inline checksum offload support

   - MediaTek WiFi (mt76)
      - mt7915 MSI support
      - mt7915 Tx status reporting
      - mt7915 thermal sensors support
      - mt7921 decapsulation offload
      - mt7921 enable runtime pm and deep sleep

   - Realtek WiFi (rtw88)
      - beacon filter support
      - Tx antenna path diversity support
      - firmware crash information via devcoredump

   - Qualcomm WiFi (wcn36xx)
      - Wake-on-WLAN support with magic packets and GTK rekeying

   - Micrel PHY (ksz886x/ksz8081): add cable test support"

* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
  tcp: change ICSK_CA_PRIV_SIZE definition
  tcp_yeah: check struct yeah size at compile time
  gve: DQO: Fix off by one in gve_rx_dqo()
  stmmac: intel: set PCI_D3hot in suspend
  stmmac: intel: Enable PHY WOL option in EHL
  net: stmmac: option to enable PHY WOL with PMT enabled
  net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
  net: use netdev_info in ndo_dflt_fdb_{add,del}
  ptp: Set lookup cookie when creating a PTP PPS source.
  net: sock: add trace for socket errors
  net: sock: introduce sk_error_report
  net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
  net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
  net: dsa: include fdb entries pointing to bridge in the host fdb list
  net: dsa: include bridge addresses which are local in the host fdb list
  net: dsa: sync static FDB entries on foreign interfaces to hardware
  net: dsa: install the host MDB and FDB entries in the master's RX filter
  net: dsa: reference count the FDB addresses at the cross-chip notifier level
  net: dsa: introduce a separate cross-chip notifier type for host FDBs
  net: dsa: reference count the MDB entries at the cross-chip notifier level
  ...
2021-06-30 15:51:09 -07:00
Alexander Aring e3ae2365ef net: sock: introduce sk_error_report
This patch introduces a wrapper function that calls the sk_error_report
callback. It prepares the ground for additional handling whenever
sk_error_report is called, for example tracing socket errors.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29 11:28:21 -07:00
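The wrapper idea from e3ae2365ef can be sketched in userspace as follows. This is only an illustration of routing a callback through one function so that extra handling has a single home. `struct demo_sock`, `demo_error_report()` and the two counters are hypothetical names, and the counter increment merely stands in for whatever tracing the kernel adds later.

```c
#include <assert.h>

/* Hypothetical, heavily simplified socket: only what the sketch needs. */
struct demo_sock {
	int err;
	void (*error_report)(struct demo_sock *sk);	/* per-socket callback */
};

static int demo_traced_errors;	/* stands in for a tracepoint */
static int demo_callback_calls;	/* counts callback invocations */

static void demo_default_report(struct demo_sock *sk)
{
	(void)sk;
	demo_callback_calls++;
}

/*
 * Wrapper: callers use this instead of invoking sk->error_report()
 * directly, so additional handling only has to be added here.
 */
static void demo_error_report(struct demo_sock *sk)
{
	demo_traced_errors++;	/* extra handling, e.g. tracing */
	sk->error_report(sk);
}

/* Usage: one report goes through the wrapper exactly once. */
static void demo_wrapper_usage(void)
{
	struct demo_sock sk = { .err = 104, .error_report = demo_default_report };
	int traced_before = demo_traced_errors;
	int calls_before = demo_callback_calls;

	demo_error_report(&sk);
	assert(demo_traced_errors == traced_before + 1);
	assert(demo_callback_calls == calls_before + 1);
}
```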
Linus Torvalds 8ec035ac4a fallthrough fixes for Clang for 5.14-rc1
Hi Linus,
 
 Please, pull the following patches that fix many fall-through warnings
 when building with Clang 12.0.0 and this[1] change reverted. Notice
 that in order to enable -Wimplicit-fallthrough for Clang, such change[1]
 is meant to be reverted at some point. So, these patches help to move
 in that direction.
 
 Thanks!
 
 [1] commit e2079e93f5 ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now")

Merge tag 'fallthrough-fixes-clang-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull fallthrough fixes from Gustavo Silva:
 "Fix many fall-through warnings when building with Clang 12.0.0 and
  '-Wimplicit-fallthrough' so that we at some point will be able to
  enable that warning by default"

* tag 'fallthrough-fixes-clang-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (26 commits)
  rxrpc: Fix fall-through warnings for Clang
  drm/nouveau/clk: Fix fall-through warnings for Clang
  drm/nouveau/therm: Fix fall-through warnings for Clang
  drm/nouveau: Fix fall-through warnings for Clang
  xfs: Fix fall-through warnings for Clang
  xfrm: Fix fall-through warnings for Clang
  tipc: Fix fall-through warnings for Clang
  sctp: Fix fall-through warnings for Clang
  rds: Fix fall-through warnings for Clang
  net/packet: Fix fall-through warnings for Clang
  net: netrom: Fix fall-through warnings for Clang
  ide: Fix fall-through warnings for Clang
  hwmon: (max6621) Fix fall-through warnings for Clang
  hwmon: (corsair-cpro) Fix fall-through warnings for Clang
  firewire: core: Fix fall-through warnings for Clang
  braille_console: Fix fall-through warnings for Clang
  ipv4: Fix fall-through warnings for Clang
  qlcnic: Fix fall-through warnings for Clang
  bnxt_en: Fix fall-through warnings for Clang
  netxen_nic: Fix fall-through warnings for Clang
  ...
2021-06-28 20:03:38 -07:00
Jakub Kicinski adc2e56ebe Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflicts in net/can/isotp.c and
tools/testing/selftests/net/mptcp/mptcp_connect.sh

scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c
to include/linux/ptp_clock_kernel.h in -next so re-apply
the fix there.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-18 19:47:02 -07:00
Eric Dumazet e032f7c9c7 net/packet: annotate accesses to po->ifindex
Like the prior patch, we need to annotate lockless accesses to po->ifindex.
For instance, packet_getname() is reading po->ifindex (twice) while
another thread is able to change po->ifindex.

KCSAN reported:

BUG: KCSAN: data-race in packet_do_bind / packet_getname

write to 0xffff888143ce3cbc of 4 bytes by task 25573 on cpu 1:
 packet_do_bind+0x420/0x7e0 net/packet/af_packet.c:3191
 packet_bind+0xc3/0xd0 net/packet/af_packet.c:3255
 __sys_bind+0x200/0x290 net/socket.c:1637
 __do_sys_bind net/socket.c:1648 [inline]
 __se_sys_bind net/socket.c:1646 [inline]
 __x64_sys_bind+0x3d/0x50 net/socket.c:1646
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff888143ce3cbc of 4 bytes by task 25578 on cpu 0:
 packet_getname+0x5b/0x1a0 net/packet/af_packet.c:3525
 __sys_getsockname+0x10e/0x1a0 net/socket.c:1887
 __do_sys_getsockname net/socket.c:1902 [inline]
 __se_sys_getsockname net/socket.c:1899 [inline]
 __x64_sys_getsockname+0x3e/0x50 net/socket.c:1899
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x00000000 -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 25578 Comm: syz-executor.5 Not tainted 5.13.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:48:18 -07:00
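The double read of po->ifindex described in e032f7c9c7 can be illustrated with a userspace approximation: the reader takes one snapshot and derives everything from it, so a concurrent writer cannot make the two uses disagree. `demo_read_once()` and `demo_write_once()` are hypothetical helpers that only mimic the spirit of the kernel's READ_ONCE()/WRITE_ONCE() with volatile accesses, and the structs are invented for the sketch. The actual kernel change may look different.

```c
#include <assert.h>

/* Hypothetical stand-ins for the socket and the returned address. */
struct demo_packet_sock {
	int ifindex;
};

struct demo_sockaddr {
	int ifindex;
	int has_dev;
};

/* Volatile accesses: the compiler may not split, repeat or elide them. */
static int demo_read_once(const int *p)
{
	return *(const volatile int *)p;
}

static void demo_write_once(int *p, int val)
{
	*(volatile int *)p = val;
}

/* Writer side, as in a bind() path. */
static void demo_bind(struct demo_packet_sock *po, int ifindex)
{
	demo_write_once(&po->ifindex, ifindex);
}

/* Reader side: one snapshot, then only the local copy is used. */
static void demo_getname(struct demo_packet_sock *po,
			 struct demo_sockaddr *out)
{
	int ifindex = demo_read_once(&po->ifindex);

	out->ifindex = ifindex;
	out->has_dev = ifindex != 0;	/* consistent with out->ifindex */
}
```

In portable userspace code, C11 atomics would be the usual tool. The volatile casts are used here only because they resemble how the kernel macros are commonly described.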
Eric Dumazet c7d2ef5dd4 net/packet: annotate accesses to po->bind
tpacket_snd(), packet_snd(), packet_getname() and packet_seq_show()
can read po->num without holding a lock. This means other threads
can change po->num at the same time.

KCSAN complained about this known fact [1].
Add READ_ONCE()/WRITE_ONCE() to address the issue.

[1] BUG: KCSAN: data-race in packet_do_bind / packet_sendmsg

write to 0xffff888131a0dcc0 of 2 bytes by task 24714 on cpu 0:
 packet_do_bind+0x3ab/0x7e0 net/packet/af_packet.c:3181
 packet_bind+0xc3/0xd0 net/packet/af_packet.c:3255
 __sys_bind+0x200/0x290 net/socket.c:1637
 __do_sys_bind net/socket.c:1648 [inline]
 __se_sys_bind net/socket.c:1646 [inline]
 __x64_sys_bind+0x3d/0x50 net/socket.c:1646
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff888131a0dcc0 of 2 bytes by task 24719 on cpu 1:
 packet_snd net/packet/af_packet.c:2899 [inline]
 packet_sendmsg+0x317/0x3570 net/packet/af_packet.c:3040
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmsg+0x1ed/0x270 net/socket.c:2433
 __do_sys_sendmsg net/socket.c:2442 [inline]
 __se_sys_sendmsg net/socket.c:2440 [inline]
 __x64_sys_sendmsg+0x42/0x50 net/socket.c:2440
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000 -> 0x1200

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 24719 Comm: syz-executor.5 Not tainted 5.13.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:48:18 -07:00
Eric Dumazet d1b5bee4c8 net/packet: annotate data race in packet_sendmsg()
There is a known race in packet_sendmsg(), addressed
in commit 32d3182cd2 ("net/packet: fix race in tpacket_snd()").

Now that we have data_race(), we can use it to avoid a future KCSAN warning,
as syzbot loves stressing af_packet sockets :)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:12:54 -07:00
Jakub Kicinski 5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
Gustavo A. R. Silva 5af5a020dd net/packet: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of letting the code fall
through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-05-17 19:57:18 -05:00
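The kind of change made in 5af5a020dd can be sketched generically: a case that used to run into the following label gets an explicit break. The enum and function below are hypothetical and unrelated to the actual af_packet code; they only show that behavior is unchanged when the label fallen into does nothing but break.

```c
#include <assert.h>

/* Hypothetical event type, for illustration only. */
enum demo_event { DEMO_EV_UP, DEMO_EV_DOWN, DEMO_EV_OTHER };

static int demo_handle(enum demo_event ev)
{
	int handled = 0;

	switch (ev) {
	case DEMO_EV_UP:
		handled = 1;
		break;
	case DEMO_EV_DOWN:
		handled = 2;
		break;	/* explicit, instead of falling into default */
	default:
		break;
	}

	return handled;
}

/* Usage: each event maps to its own result, unknown events to 0. */
static void demo_fallthrough_usage(void)
{
	assert(demo_handle(DEMO_EV_UP) == 1);
	assert(demo_handle(DEMO_EV_DOWN) == 2);
	assert(demo_handle(DEMO_EV_OTHER) == 0);
}
```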
Jiapeng Chong 25c55b38d8 net/packet: Remove redundant assignment to ret
Variable ret is set to '0' or '-EBUSY', but this value is never read
as it is not used later on, hence it is a redundant assignment and
can be removed.

Clean up the following clang-analyzer warning:

net/packet/af_packet.c:3936:4: warning: Value stored to 'ret' is never
read [clang-analyzer-deadcode.DeadStores].

net/packet/af_packet.c:3933:4: warning: Value stored to 'ret' is never
read [clang-analyzer-deadcode.DeadStores].

No functional change.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 16:03:56 -07:00
Richard Sanger 171c3b1511 net: packetmmap: fix only tx timestamp on request
The packetmmap tx ring should only return timestamps if requested via
setsockopt PACKET_TIMESTAMP, as documented. This allows compatibility
with non-timestamp-aware user-space code which checks
tp_status == TP_STATUS_AVAILABLE and does not expect additional timestamp
flags to be set in tp_status.

Fixes: b9c32fb271 ("packet: if hw/sw ts enabled in rx/tx ring, report which ts we got")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Richard Sanger <rsanger@wand.net.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-12 14:00:04 -07:00
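The compatibility problem described in 171c3b1511 can be illustrated with two ways of testing a tx frame's status word. The DEMO_ constants below are local copies that are assumed to mirror the TP_STATUS_* values in linux/if_packet.h; verify them against the header before relying on them. Both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror linux/if_packet.h; defined locally for portability. */
#define DEMO_TP_STATUS_AVAILABLE	0u
#define DEMO_TP_STATUS_SEND_REQUEST	(1u << 0)
#define DEMO_TP_STATUS_TS_SOFTWARE	(1u << 29)
#define DEMO_TP_STATUS_TS_RAW_HARDWARE	(1u << 31)

/* Check used by code that is not timestamp-aware: exact equality. */
static int demo_frame_free_naive(uint32_t tp_status)
{
	return tp_status == DEMO_TP_STATUS_AVAILABLE;
}

/* Timestamp-aware check: ignore the timestamp flag bits first. */
static int demo_frame_free_ts_aware(uint32_t tp_status)
{
	uint32_t ts_bits = DEMO_TP_STATUS_TS_SOFTWARE |
			   DEMO_TP_STATUS_TS_RAW_HARDWARE;

	return (tp_status & ~ts_bits) == DEMO_TP_STATUS_AVAILABLE;
}

/* Usage: a completed frame that carries a software timestamp flag. */
static void demo_tp_status_usage(void)
{
	uint32_t status = DEMO_TP_STATUS_AVAILABLE | DEMO_TP_STATUS_TS_SOFTWARE;

	assert(!demo_frame_free_naive(status));	/* frame looks busy */
	assert(demo_frame_free_ts_aware(status));	/* frame is free */
}
```

With the fix, a socket that never set PACKET_TIMESTAMP should not see the timestamp bits at all, so the exact-equality check keeps working for such code.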