135 Commits

Author SHA1 Message Date
cki-backport-bot 9586b781a6 tls: fix missing memory barrier in tls_init
JIRA: https://issues.redhat.com/browse/RHEL-44477
CVE: CVE-2024-36489

commit 91e61dd7a0af660408e87372d8330ceb218be302
Author: Dae R. Jeong <threeearcat@gmail.com>
Date:   Tue May 21 19:34:38 2024 +0900

    tls: fix missing memory barrier in tls_init

    In tls_init(), a write memory barrier is missing, and store-store
    reordering may cause NULL dereference in tls_{setsockopt,getsockopt}.

    CPU0                               CPU1
    -----                              -----
    // In tls_init()
    // In tls_ctx_create()
    ctx = kzalloc()
    ctx->sk_proto = READ_ONCE(sk->sk_prot) -(1)

    // In update_sk_prot()
    WRITE_ONCE(sk->sk_prot, tls_prots)     -(2)

                                       // In sock_common_setsockopt()
                                       READ_ONCE(sk->sk_prot)->setsockopt()

                                       // In tls_{setsockopt,getsockopt}()
                                       ctx->sk_proto->setsockopt()    -(3)

    In the above scenario, when (1) and (2) are reordered, (3) can observe
    the NULL value of ctx->sk_proto, causing NULL dereference.

    To fix it, we rely on rcu_assign_pointer() which implies the release
    barrier semantic. By moving rcu_assign_pointer() after ctx->sk_proto is
    initialized, we can ensure that ctx->sk_proto is visible when
    changing sk->sk_prot.
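
The ordering argument above can be illustrated in userspace with C11 atomics. This is only a sketch under stated assumptions: `demo_ctx`, `demo_publish` and `demo_read_proto_id` are hypothetical names, and a release store paired with an acquire load stands in for the kernel's rcu_assign_pointer() and its reader side. It is not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the TLS context; not the kernel layout. */
struct demo_ctx {
	int proto_id; /* plays the role of ctx->sk_proto */
};

/* Pointer through which readers discover the context. */
static _Atomic(struct demo_ctx *) demo_published;

/* Writer: initialize the field first, then publish with release semantics,
 * so a reader that observes the pointer also observes the initialized field. */
static void demo_publish(struct demo_ctx *ctx, int proto_id)
{
	assert(ctx);
	ctx->proto_id = proto_id;                          /* step (1) */
	atomic_store_explicit(&demo_published, ctx,
			      memory_order_release);       /* step (2) */
}

/* Reader: the acquire load pairs with the release store above.
 * Returns -1 while nothing has been published yet. */
static int demo_read_proto_id(void)
{
	struct demo_ctx *ctx =
		atomic_load_explicit(&demo_published, memory_order_acquire);

	return ctx ? ctx->proto_id : -1;
}
```

Publishing before step (1), or using a relaxed store for step (2), would reintroduce the window in which a reader sees the pointer but not the initialized field.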

    Fixes: d5bee7374b ("net/tls: Annotate access to sk_prot with READ_ONCE/WRITE_ONCE")
    Signed-off-by: Yewon Choi <woni9911@gmail.com>
    Signed-off-by: Dae R. Jeong <threeearcat@gmail.com>
    Link: https://lore.kernel.org/netdev/ZU4OJG56g2V9z_H7@dragonet/T/
    Link: https://lore.kernel.org/r/Zkx4vjSFp0mfpjQ2@libra05
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: cki-backport-bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-06-25 19:15:43 +00:00
Davide Caratti d9c18bd98a mptcp: fix lockless access in subflow ULP diag
JIRA: https://issues.redhat.com/browse/RHEL-32669
Upstream Status: net.git commit b8adb69a7d29c2d33eb327bca66476fb6066516b

commit b8adb69a7d29c2d33eb327bca66476fb6066516b
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Thu Feb 15 19:25:30 2024 +0100

    mptcp: fix lockless access in subflow ULP diag

    Since the introduction of the subflow ULP diag interface, the
    dump callback accessed all the subflow data locklessly.

    We need either to annotate all the read and write operations accordingly,
    or acquire the subflow socket lock. Let's do the latter, even if slower, to
    avoid a diffstat havoc.

    Fixes: 5147dfb508 ("mptcp: allow dumping subflow context to userspace")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2024-04-18 17:25:35 +02:00
Sabrina Dubroca ad048bc0eb net: add reserved fields to tls_crypto_context
JIRA: https://issues.redhat.com/browse/RHEL-21356
Upstream Status: RHEL-only

union tls_crypto_context is embedded inside struct tls_context and
protected under kABI. Artificially grow the union by adding a large
padding field.

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2024-01-12 14:27:38 +01:00
Sabrina Dubroca a74e692506 tls: validate crypto_info in a separate helper
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 1cf7fbcee60af932f815af5fc0ca5e7e8544ef82
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Oct 9 22:50:52 2023 +0200

    tls: validate crypto_info in a separate helper

    Simplify do_tls_setsockopt_conf a bit.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:48 +01:00
Sabrina Dubroca e1a47432ca tls: remove tls_context argument from tls_set_device_offload
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 4f4866991847738a216bb5920b3d3902cee13fd0
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Oct 9 22:50:51 2023 +0200

    tls: remove tls_context argument from tls_set_device_offload

    It's not really needed since we end up refetching it as tls_ctx. We
    can also remove the NULL check, since we have already dereferenced ctx
    in do_tls_setsockopt_conf.

    While at it, fix up the reverse xmas tree ordering.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:48 +01:00
Sabrina Dubroca 78f2836ff2 tls: remove tls_context argument from tls_set_sw_offload
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit b6a30ec9239a1fa1a622608176bb78646a539608
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Oct 9 22:50:50 2023 +0200

    tls: remove tls_context argument from tls_set_sw_offload

    It's not really needed since we end up refetching it as tls_ctx. We
    can also remove the NULL check, since we have already dereferenced ctx
    in do_tls_setsockopt_conf.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:48 +01:00
Sabrina Dubroca c2d3c601d7 tls: store iv directly within cipher_context
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 1c1cb3110d7ed2897e65d9a352a8fb709723e057
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Oct 9 22:50:45 2023 +0200

    tls: store iv directly within cipher_context

    TLS_MAX_IV_SIZE + TLS_MAX_SALT_SIZE is 20B, we don't get much benefit
    in cipher_context's size and can simplify the init code a bit.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:48 +01:00
Sabrina Dubroca e9b62cd0a3 tls: rename MAX_IV_SIZE to TLS_MAX_IV_SIZE
JIRA: https://issues.redhat.com/browse/RHEL-14902

Conflicts: tls_decrypt_ctx doesn't have the sk member, missing commit
    8d338c76f7cf  ("tls: Only use data field in crypto completion function")

commit bee6b7b30706e7693d91cb28c8ff3cb69e094f65
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Oct 9 22:50:44 2023 +0200

    tls: rename MAX_IV_SIZE to TLS_MAX_IV_SIZE

    It's defined in include/net/tls.h, avoid using an overly generic name.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:48 +01:00
Sabrina Dubroca 53cd5f4c4b tls: store rec_seq directly within cipher_context
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 6d5029e54700b2427581513c533232b02ce05043
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Oct 9 22:50:43 2023 +0200

    tls: store rec_seq directly within cipher_context

    TLS_MAX_REC_SEQ_SIZE is 8B, we don't get anything by using kmalloc.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:47 +01:00
Sabrina Dubroca 4d29158ce2 tls: use tls_cipher_desc to simplify do_tls_getsockopt_conf
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 077e05d135489e144d9e0d01454886bf613d32a4
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 25 23:35:19 2023 +0200

    tls: use tls_cipher_desc to simplify do_tls_getsockopt_conf

    Every cipher uses the same code to update its crypto_info struct based
    on the values contained in the cctx, with only the struct type and
    size/offset changing. We can get those from tls_cipher_desc, and use
    a single pair of memcpy and final copy_to_user.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/c21a904b91e972bdbbf9d1c6d2731ccfa1eedf72.1692977948.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:47 +01:00
Sabrina Dubroca d6b3b44b2e tls: get crypto_info size from tls_cipher_desc in do_tls_setsockopt_conf
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 5f309ade49c7068b1149ecf825c4c16e56a3b865
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 25 23:35:18 2023 +0200

    tls: get crypto_info size from tls_cipher_desc in do_tls_setsockopt_conf

    We can simplify do_tls_setsockopt_conf using tls_cipher_desc. Also use
    get_cipher_desc's result to check if the cipher_type coming from
    userspace is valid.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/e97658eb4c6a5832f8ba20a06c4f36a77763c59e.1692977948.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:47 +01:00
Sabrina Dubroca 3158d34991 tls: validate cipher descriptions at compile time
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 0d98cc02022d60004f78f6e7e6cc1bd39db80ef9
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 25 23:35:14 2023 +0200

    tls: validate cipher descriptions at compile time

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/b38fb8cf60e099e82ae9979c3c9c92421042417c.1692977948.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:46 +01:00
Sabrina Dubroca 2bd88890c6 tls: extend tls_cipher_desc to fully describe the ciphers
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 176a3f50bc6a327c82c6b051b0bedd19917081a2
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 25 23:35:13 2023 +0200

    tls: extend tls_cipher_desc to fully describe the ciphers

    - add nonce, usually equal to iv_size but not for chacha
    - add offsets into the crypto_info for each field
    - add algorithm name
    - add offloadable flag

    Also add helpers to access each field of a crypto_info struct
    described by a tls_cipher_desc.
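
As a rough illustration of offset-based field access, the sketch below describes two hypothetical crypto_info-like structs with one descriptor type and fills both through the same helpers. All names (`demo_info_128`, `demo_info_256`, `demo_cipher_desc`, `DEMO_DESC`, `demo_fill`) and all field sizes are assumptions made for the example; they do not reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two hypothetical per-cipher info structs that differ only in key size. */
struct demo_info_128 {
	unsigned short version, cipher_type;
	unsigned char iv[8];
	unsigned char key[16];
};

struct demo_info_256 {
	unsigned short version, cipher_type;
	unsigned char iv[8];
	unsigned char key[32];
};

/* Descriptor: total size plus offset and length of each field. */
struct demo_cipher_desc {
	size_t info_size, iv_off, iv_len, key_off, key_len;
};

#define DEMO_DESC(T) {                                  \
	sizeof(T),                                      \
	offsetof(T, iv),  sizeof(((T *)0)->iv),         \
	offsetof(T, key), sizeof(((T *)0)->key) }

static const struct demo_cipher_desc demo_desc_128 = DEMO_DESC(struct demo_info_128);
static const struct demo_cipher_desc demo_desc_256 = DEMO_DESC(struct demo_info_256);

/* Accessors that work for any struct described by a descriptor. */
static unsigned char *demo_iv(void *info, const struct demo_cipher_desc *d)
{
	return (unsigned char *)info + d->iv_off;
}

static unsigned char *demo_key(void *info, const struct demo_cipher_desc *d)
{
	return (unsigned char *)info + d->key_off;
}

/* One code path fills every cipher's struct; only the descriptor changes. */
static void demo_fill(void *info, const struct demo_cipher_desc *d,
		      const unsigned char *iv, const unsigned char *key)
{
	assert(info && d && iv && key);
	memcpy(demo_iv(info, d), iv, d->iv_len);
	memcpy(demo_key(info, d), key, d->key_len);
}
```

A caller would pass `&demo_desc_128` or `&demo_desc_256` together with the matching struct, which is what removes the need for a per-cipher switch statement.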

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/39d5f476d63c171097764e8d38f6f158b7c109ae.1692977948.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:46 +01:00
Sabrina Dubroca f6eee1183c tls: rename tls_cipher_size_desc to tls_cipher_desc
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 8db44ab26bebe969851468bea6072d9a094b2ace
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 25 23:35:12 2023 +0200

    tls: rename tls_cipher_size_desc to tls_cipher_desc

    We're going to add other fields to it to fully describe a cipher, so
    the "_size" name won't match the contents.

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/76ca6c7686bd6d1534dfa188fb0f1f6fabebc791.1692977948.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:46 +01:00
Sabrina Dubroca 8ef2226d61 tls: reduce size of tls_cipher_size_desc
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 037303d6760751fdb95ba62cf448ecbc1ac29c98
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 25 23:35:11 2023 +0200

    tls: reduce size of tls_cipher_size_desc

    tls_cipher_size_desc indexes ciphers by their type, but we're not
    using indices 0..50 of the array. Each struct tls_cipher_size_desc is
    20B, so that's a lot of unused memory. We can reindex the array
    starting at the lowest used cipher_type.

    Introduce the get_cipher_size_desc helper to find the right item and
    avoid out-of-bounds accesses, and make tls_cipher_size_desc's size
    explicit so that gcc reminds us to update TLS_CIPHER_MIN/MAX when we
    add a new cipher.
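
The reindexing idea can be sketched as follows. The bounds `DEMO_CIPHER_MIN`/`DEMO_CIPHER_MAX`, the struct `demo_size_desc`, the helper `demo_get_desc` and every size value are hypothetical and chosen only for illustration. Because the array length is explicit, an initializer for a type above `DEMO_CIPHER_MAX` fails to compile until the bound is updated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cipher type range; the lowest used type becomes index 0. */
enum { DEMO_CIPHER_MIN = 51, DEMO_CIPHER_MAX = 53 };

static_assert(DEMO_CIPHER_MAX >= DEMO_CIPHER_MIN, "cipher range must not be empty");

struct demo_size_desc {
	unsigned int iv, key, tag; /* illustrative sizes only */
};

#define DEMO_IDX(type) ((type) - DEMO_CIPHER_MIN)

/* Explicit length: no slots are wasted on the unused types below MIN. */
static const struct demo_size_desc demo_desc[DEMO_CIPHER_MAX + 1 - DEMO_CIPHER_MIN] = {
	[DEMO_IDX(51)] = { .iv = 8, .key = 16, .tag = 16 },
	[DEMO_IDX(52)] = { .iv = 8, .key = 32, .tag = 16 },
	[DEMO_IDX(53)] = { .iv = 8, .key = 16, .tag = 16 },
};

/* Bounds-checked lookup: returns NULL for types outside [MIN, MAX]. */
static const struct demo_size_desc *demo_get_desc(unsigned int type)
{
	if (type < DEMO_CIPHER_MIN || type > DEMO_CIPHER_MAX)
		return NULL;
	return &demo_desc[DEMO_IDX(type)];
}
```

Callers go through `demo_get_desc()` rather than indexing the array directly, so an unexpected type from userspace yields NULL instead of an out-of-bounds read.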

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/5e054e370e240247a5d37881a1cd93a67c15f4ca.1692977948.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:46 +01:00
Sabrina Dubroca 606245d75e tls: add TLS_CIPHER_ARIA_GCM_* to tls_cipher_size_desc
JIRA: https://issues.redhat.com/browse/RHEL-14902

commit 200e23165109a173ffde3310dffa5ef5e502d97f
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 25 23:35:10 2023 +0200

    tls: add TLS_CIPHER_ARIA_GCM_* to tls_cipher_size_desc

    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/b2e0fb79e6d0a4478be9bf33781dc9c9281c9d56.1692977948.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:27:46 +01:00
Sabrina Dubroca 3dbfc27101 tls: suppress wakeups unless we have a full record
JIRA: https://issues.redhat.com/browse/RHEL-14902

Conflicts: context, commit 662fbcec32f4 ("net/tls: implement ->read_sock()")
    backported out of order

commit 121dca784fc0f6c022493a5d23d86b3cc20380f4
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed May 31 08:35:50 2023 -0700

    tls: suppress wakeups unless we have a full record

    TLS does not override .poll() so a TLS-enabled socket will generate
    an event whenever data arrives at the TCP socket. This leads to
    unnecessary wakeups on slow connections.
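
To make the "full record" condition concrete, here is a minimal sketch of such a check on a byte buffer. It relies only on the TLS record header layout (1-byte content type, 2-byte version, 2-byte big-endian length). The names `DEMO_TLS_HDR_LEN` and `demo_has_full_record` are hypothetical, and this is not how the kernel patch itself is written.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TLS_HDR_LEN 5 /* type (1) + version (2) + length (2) */

/* Returns 1 if buf[0..avail) holds at least one complete TLS record,
 * 0 otherwise. A reader would only be woken up when this returns 1. */
static int demo_has_full_record(const unsigned char *buf, size_t avail)
{
	size_t payload;

	assert(buf);
	if (avail < DEMO_TLS_HDR_LEN)
		return 0; /* header itself is still incomplete */

	/* The record length is stored big-endian in bytes 3 and 4. */
	payload = ((size_t)buf[3] << 8) | buf[4];
	return avail >= DEMO_TLS_HDR_LEN + payload;
}
```

With a 4-byte payload, the check stays false for the first 8 bytes received and turns true only once the ninth byte arrives.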

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-12-13 14:26:27 +01:00
Jeffrey Layton fdcc17d962 net/tls: implement ->read_sock()
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit 662fbcec32f4af6bdcf5b4006b792ebe9543d945
Author: Hannes Reinecke <hare@suse.de>
Date:   Wed Jul 26 21:15:56 2023 +0200

    net/tls: implement ->read_sock()

    Implement ->read_sock() function for use with nvme-tcp.

    Signed-off-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Reviewed-by: Jakub Kicinski <kuba@kernel.org>
    Cc: Boris Pismenny <boris.pismenny@gmail.com>
    Link: https://lore.kernel.org/r/20230726191556.41714-7-hare@suse.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:25 -05:00
Paolo Abeni 0240ed7c51 tcp: allow again tcp_disconnect() when threads are waiting
JIRA: https://issues.redhat.com/browse/RHEL-12593
Tested: vs bz reproducer
Conflicts: the tls_sw chunk is mangled and applied in \
  tls_rx_reader_acquire(), as rhel lacks the upstream commit \
  f9ae3204fb45 ("net/tls: split tls_rx_reader_lock"). \
  the wait_on_pending_writer() chunk did not contain the ONCE \
  annotation, as rhel lacks the upstream commit d0ac89f6f987 ("net: \
  deal with most data-races in sk_wait_event()"). The same for \
  sk_stream_wait_memory() chunk.

Upstream commit:
commit 419ce133ab928ab5efd7b50b2ef36ddfd4eadbd2
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Wed Oct 11 09:20:55 2023 +0200

    tcp: allow again tcp_disconnect() when threads are waiting

    As reported by Tom, .NET and applications built on top of it rely
    on connect(AF_UNSPEC) to async cancel pending I/O operations on TCP
    sockets.

    The blamed commit below caused a regression, as such cancellation
    can now fail.

    As suggested by Eric, this change addresses the problem by explicitly
    causing blocking I/O operations to terminate immediately (with an error)
    when a concurrent disconnect() is executed.

    Instead of tracking the number of threads blocked on a given socket,
    track the number of disconnect() calls issued on that socket. If the
    counter changes after a blocking operation releases and re-acquires the
    socket lock, error out the current operation.
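
The counter scheme can be modelled in a few lines of single-threaded C. This is a sketch under stated assumptions: `demo_sock`, `demo_blocking_op` and the error values are hypothetical, and the `while_unlocked` callback merely simulates what another thread might do while the socket lock is dropped.

```c
#include <assert.h>

/* Hypothetical socket model; field names are illustrative only. */
struct demo_sock {
	unsigned int disconnects; /* bumped by every disconnect() */
	int data_ready;           /* condition the blocked caller waits for */
};

#define DEMO_ERR_DISCONNECTED (-1) /* illustrative error values */
#define DEMO_ERR_AGAIN        (-2)

/* Events that may happen while the waiter has released the lock. */
static void demo_disconnect(struct demo_sock *sk)   { sk->disconnects++; }
static void demo_data_arrives(struct demo_sock *sk) { sk->data_ready = 1; }

static int demo_blocking_op(struct demo_sock *sk,
			    void (*while_unlocked)(struct demo_sock *))
{
	unsigned int seen;

	assert(sk);
	seen = sk->disconnects;   /* snapshot taken under the lock */

	/* Lock released: another thread may run here. */
	if (while_unlocked)
		while_unlocked(sk);
	/* Lock re-acquired. */

	if (sk->disconnects != seen)
		return DEMO_ERR_DISCONNECTED; /* a disconnect() intervened */
	return sk->data_ready ? 0 : DEMO_ERR_AGAIN;
}
```

The waiter does not need to register itself anywhere. Comparing the snapshot with the current counter is enough to detect that a disconnect happened in between.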

    Fixes: 4faeee0cf8a5 ("tcp: deny tcp_disconnect() when threads are waiting")
    Reported-by: Tom Deseyn <tdeseyn@redhat.com>
    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1886305
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/f3b95e47e3dbed840960548aebaa8d954372db41.1697008693.git.pabeni@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-23 16:47:41 +02:00
Sabrina Dubroca 35a3e9f3c6 net: tls: fix possible race condition between do_tls_getsockopt_conf() and do_tls_setsockopt_conf()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2179816
CVE: CVE-2023-28466
Tested: tls selftests

commit 49c47cc21b5b7a3d8deb18fc57b0aa2ab1286962
Author: Hangyu Hua <hbh25y@gmail.com>
Date:   Tue Feb 28 10:33:44 2023 +0800

    net: tls: fix possible race condition between do_tls_getsockopt_conf() and do_tls_setsockopt_conf()

    ctx->crypto_send.info is not protected by lock_sock in
    do_tls_getsockopt_conf(). A race condition between do_tls_getsockopt_conf()
    and error paths of do_tls_setsockopt_conf() may lead to a use-after-free
    or null-deref.

    More discussion: https://lore.kernel.org/all/Y/ht6gQL+u6fj3dG@hog/

    Fixes: 3c4d755915 ("tls: kernel TLS support")
    Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
    Link: https://lore.kernel.org/r/20230228023344.9623-1-hbh25y@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-05-10 16:58:47 +02:00
Sabrina Dubroca 40cc5f29da net: tls: Add ARIA-GCM algorithm
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183538
Tested: tls selftests

commit 62e56ef57c04c0cacb33433d7984a4d71b690b3f
Author: Taehee Yoo <ap420073@gmail.com>
Date:   Sun Sep 25 15:00:33 2022 +0000

    net: tls: Add ARIA-GCM algorithm

    RFC 6209 describes ARIA for TLS 1.2.
    ARIA-128-GCM and ARIA-256-GCM are defined in RFC 6209.

    This patch would offer a performance improvement and an opportunity for
    hardware offload.

    Benchmark results:
    iperf-ssl is used.
    CPU: intel i3-12100.

      TLS(openssl-3.0-dev)
    [  3]  0.0- 1.0 sec   185 MBytes  1.55 Gbits/sec
    [  3]  1.0- 2.0 sec   186 MBytes  1.56 Gbits/sec
    [  3]  2.0- 3.0 sec   186 MBytes  1.56 Gbits/sec
    [  3]  3.0- 4.0 sec   186 MBytes  1.56 Gbits/sec
    [  3]  4.0- 5.0 sec   186 MBytes  1.56 Gbits/sec
    [  3]  0.0- 5.0 sec   927 MBytes  1.56 Gbits/sec
      kTLS(aria-generic)
    [  3]  0.0- 1.0 sec   198 MBytes  1.66 Gbits/sec
    [  3]  1.0- 2.0 sec   194 MBytes  1.62 Gbits/sec
    [  3]  2.0- 3.0 sec   194 MBytes  1.63 Gbits/sec
    [  3]  3.0- 4.0 sec   194 MBytes  1.63 Gbits/sec
    [  3]  4.0- 5.0 sec   194 MBytes  1.62 Gbits/sec
    [  3]  0.0- 5.0 sec   974 MBytes  1.63 Gbits/sec
      kTLS(aria-avx with GFNI)
    [  3]  0.0- 1.0 sec   632 MBytes  5.30 Gbits/sec
    [  3]  1.0- 2.0 sec   657 MBytes  5.51 Gbits/sec
    [  3]  2.0- 3.0 sec   657 MBytes  5.51 Gbits/sec
    [  3]  3.0- 4.0 sec   656 MBytes  5.50 Gbits/sec
    [  3]  4.0- 5.0 sec   656 MBytes  5.50 Gbits/sec
    [  3]  0.0- 5.0 sec  3.18 GBytes  5.47 Gbits/sec

    Signed-off-by: Taehee Yoo <ap420073@gmail.com>
    Reviewed-by: Vadim Fedorenko <vfedorenko@novek.ru>
    Link: https://lore.kernel.org/r/20220925150033.24615-1-ap420073@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-04-06 15:13:51 +02:00
Sabrina Dubroca a58a7a2122 net/tls: Describe ciphers sizes by const structs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183538
Tested: tls selftests

commit 2d2c5ea24243eb3ed12f232b2aef43981fa15360
Author: Tariq Toukan <tariqt@nvidia.com>
Date:   Tue Sep 20 16:01:47 2022 +0300

    net/tls: Describe ciphers sizes by const structs

    Introduce a cipher sizes descriptor. It helps reduce the amount of code
    duplication and the repeated switch/cases that assign the proper sizes
    according to the cipher type.

    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Signed-off-by: Gal Pressman <gal@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2023-04-06 15:13:50 +02:00
Sabrina Dubroca 68df2ddfe7 tls: rx: do not use the standard strparser
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit 84c61fe1a75b4255df1e1e7c054c9e6d048da417
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Jul 22 16:50:33 2022 -0700

    tls: rx: do not use the standard strparser

    TLS is a relatively poor fit for strparser. We pause the input
    every time a message is received, wait for a read which will
    decrypt the message, start the parser, repeat. strparser is
    built to delineate the messages, wrap them in individual skbs
    and let them float off into the stack or a different socket.
    TLS wants the data pages and nothing else. There's no need
    for TLS to keep cloning (and occasionally skb_unclone()'ing)
    the TCP rx queue.

    This patch uses a pre-allocated skb and attaches the skbs
    from the TCP rx queue to it as frags. TLS is careful never
    to modify the input skb without CoW'ing / detaching it first.

    Since we call TCP rx queue cleanup directly we also get back
    the benefit of skb deferred free.

    Overall this results in a 6% gain in my benchmarks.

    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-12-02 08:54:44 +01:00
Sabrina Dubroca fcc8960586 net/tls: Check for errors in tls_device_init
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

Conflicts: tls_device_init was moved from include/net/tls.h to
    net/tls/tls.h in commit 587903142308 ("tls: create an internal
    header")

commit 3d8c51b25a235e283e37750943bbf356ef187230
Author: Tariq Toukan <tariqt@nvidia.com>
Date:   Thu Jul 14 10:07:54 2022 +0300

    net/tls: Check for errors in tls_device_init

    Add missing error checks in tls_device_init.

    Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
    Reported-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Link: https://lore.kernel.org/r/20220714070754.1428-1-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:43:04 +01:00
Sabrina Dubroca 03c5f36f97 tls: rx: fix the NoPad getsockopt
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit 57128e98c33d79285adc523e670fe02d11b7e5da
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Jul 8 19:52:54 2022 -0700

    tls: rx: fix the NoPad getsockopt

    Maxim reports do_tls_getsockopt_no_pad() will
    always return an error. Indeed, it looks like a refactoring
    gone wrong - remove err and use value.

    Reported-by: Maxim Mikityanskiy <maximmi@nvidia.com>
    Fixes: 88527790c079 ("tls: rx: add sockopt for enabling optimistic decrypt with TLS 1.3")
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:43:04 +01:00
Sabrina Dubroca 4502d7fd21 tls: create an internal header
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

Conflicts: tls_sw_recvmsg still has the nonblock argument, missing
    commit ec095263a965 ("net: remove noblock parameter from recvmsg()
    entities")

commit 5879031423089b2e19b769f30fc618af742264c3
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Thu Jul 7 18:03:13 2022 -0700

    tls: create an internal header

    include/net/tls.h is getting a little long, and is probably hard
    for driver authors to navigate. Split out the internals into a
    header which will live under net/tls/. While at it move some
    static inlines with a single user into the source files, add
    a few tls_ prefixes and fix spelling of 'proccess'.

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:43:03 +01:00
Sabrina Dubroca 62b380612a tls: rx: add sockopt for enabling optimistic decrypt with TLS 1.3
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit 88527790c079fb1ea41cbcfa4450ee37906a2fb0
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue Jul 5 16:59:24 2022 -0700

    tls: rx: add sockopt for enabling optimistic decrypt with TLS 1.3

    Since optimistic decrypt may add extra load in case of retries,
    require the socket owner to explicitly opt in.

    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:43:01 +01:00
Sabrina Dubroca e67165f465 tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit b489a6e5871690735752f8875f411e4d0cd8e5df
Author: Maxim Mikityanskiy <maximmi@nvidia.com>
Date:   Wed Jun 8 18:34:25 2022 +0300

    tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX

    To embrace possible future optimizations of TLS, rename zerocopy
    sendfile definitions to more generic ones:

    * setsockopt: TLS_TX_ZEROCOPY_SENDFILE -> TLS_TX_ZEROCOPY_RO
    * sock_diag: TLS_INFO_ZC_SENDFILE -> TLS_INFO_ZC_RO_TX

    RO stands for readonly and emphasizes that the application shouldn't
    modify the data being transmitted with zerocopy to avoid potential
    disconnection.

    Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()")
    Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
    Link: https://lore.kernel.org/r/20220608153425.3151146-1-maximmi@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:43:00 +01:00
Sabrina Dubroca 24b0e059d4 tls: Add opt-in zerocopy mode of sendfile()
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit c1318b39c7d36bd5139a9c71044ff2b2d3c6f9d8
Author: Boris Pismenny <borisp@nvidia.com>
Date:   Wed May 18 12:27:31 2022 +0300

    tls: Add opt-in zerocopy mode of sendfile()

    TLS device offload copies sendfile data to a bounce buffer before
    transmitting. This allows maintaining a valid MAC on TLS records when
    the file contents change and a part of a TLS record has to be
    retransmitted on TCP level.

    In many common use cases (like serving static files over HTTPS) the file
    contents are not changed on the fly. In many use cases breaking the
    connection is totally acceptable if the file is changed during
    transmission, because it would be received corrupted in any case.

    This commit allows optimizing performance for such use cases by
    providing a new optional mode of TLS sendfile(), in which the extra copy
    is skipped. Removing this copy improves performance significantly, as
    TLS and TCP sendfile perform the same operations, and the only overhead
    is TLS header/trailer insertion.

    The new mode can only be enabled with the new socket option named
    TLS_TX_ZEROCOPY_SENDFILE on a per-socket basis. It preserves backwards
    compatibility with existing applications that rely on the copying
    behavior.

    The new mode is safe, meaning that unsolicited modifications of the file
    being sent can't break integrity of the kernel. The worst thing that can
    happen is sending a corrupted TLS record, which is in any case not
    forbidden when using regular TCP sockets.

    Sockets other than TLS device offload are not affected by the new socket
    option. The actual status of zerocopy sendfile can be queried with
    sock_diag.

    Performance numbers in a single-core test with 24 HTTPS streams on
    nginx, under 100% CPU load:

    * non-zerocopy: 33.6 Gbit/s
    * zerocopy: 79.92 Gbit/s

    CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz

    Signed-off-by: Boris Pismenny <borisp@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
    Reviewed-by: Jakub Kicinski <kuba@kernel.org>
    Link: https://lore.kernel.org/r/20220518092731.1243494-1-maximmi@nvidia.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:42:59 +01:00
Sabrina Dubroca d52b61191a net/tls: remove unnecessary jump instructions in do_tls_setsockopt_conf()
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit 1ddcbfbf9dc9b59258dc5a4429f607a6828863d0
Author: Ziyang Xuan <william.xuanziyang@huawei.com>
Date:   Sat Mar 19 11:14:33 2022 +0800

    net/tls: remove unnecessary jump instructions in do_tls_setsockopt_conf()

    Avoid using "goto" jump instruction unconditionally when we
    can return directly. Remove unnecessary jump instructions in
    do_tls_setsockopt_conf().

    Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:42:36 +01:00
Sabrina Dubroca 3672acd413 net/tls: getsockopt supports complete algorithm list
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit 3fb59a5de5cbb04de76915d9f5bff01d16aa1fc4
Author: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Date:   Mon Oct 25 21:05:00 2021 +0800

    net/tls: getsockopt supports complete algorithm list

    AES_CCM_128 and CHACHA20_POLY1305 are already supported by tls.
    Similar to setsockopt, getsockopt also needs to support these
    two algorithms.

    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:42:35 +01:00
Sabrina Dubroca ea8afa0d63 net/tls: support SM4 GCM/CCM algorithm
Tested: selftests
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2143700

commit 227b9644ab16d2ecd98d593edbe15c32c0c9620a
Author: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Date:   Thu Sep 16 11:37:38 2021 +0800

    net/tls: support SM4 GCM/CCM algorithm

    The RFC8998 specification defines the use of the ShangMi algorithm
    cipher suites in TLS 1.3, and also supports the GCM/CCM mode using
    the SM4 algorithm.

    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Acked-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-11-30 23:42:34 +01:00
Hangbin Liu 92ae9687c5 sock: redo the psock vs ULP protection check
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2101278
Upstream Status: net.git commit e34a07c0ae39

commit e34a07c0ae3906f97eb18df50902e2a01c1015b6
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Jun 20 12:13:53 2022 -0700

    sock: redo the psock vs ULP protection check

    Commit 8a59f9d1e3 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
    has moved the inet_csk_has_ulp(sk) check from sk_psock_init() to
    the new tcp_bpf_update_proto() function. I'm guessing that this
    was done to allow creating psocks for non-inet sockets.

    Unfortunately the destruction path for psock includes the ULP
    unwind, so we need to fail the sk_psock_init() itself.
    Otherwise if ULP is already present we'll notice that later,
    and call tcp_update_ulp() with the sk_proto of the ULP
    itself, which will most likely result in the ULP looping
    its callbacks.

    Fixes: 8a59f9d1e3 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: John Fastabend <john.fastabend@gmail.com>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/r/20220620191353.1184629-2-kuba@kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Hangbin Liu <haliu@redhat.com>
2022-06-27 14:15:22 +08:00
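Note on "sock: redo the psock vs ULP protection check" above: the idea of failing the init itself, rather than detecting the ULP later, can be sketched in user-space C. This is only an illustration under stated assumptions: `struct demo_sock`, `demo_psock_init()` and the chosen error codes are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a socket; not the kernel's struct sock. */
struct demo_sock {
    bool has_ulp;    /* a ULP such as TLS is already attached */
    bool has_psock;  /* psock state has already been created */
};

/*
 * Refuse to create psock state when a ULP is present. Because the check
 * lives in the init path, no psock exists afterwards, so the psock
 * destruction path (which includes the ULP unwind) never runs for it.
 * The error codes are assumptions made for this sketch.
 */
int demo_psock_init(struct demo_sock *sk)
{
    if (sk->has_ulp)
        return -EINVAL;
    if (sk->has_psock)
        return -EBUSY;
    sk->has_psock = true;
    return 0;
}
```

With this shape a caller sees the failure before any state is created, so there is nothing to unwind.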
Patrick Talbert 8c5b3f7fd9 Merge: XDP and networking eBPF rebase to v5.15
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/674

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2071618

Depends: !572

Tested: Using bpf selftests, everything passes.

This rebases XDP and networking eBPF to upstream kernel version 5.15.

Signed-off-by: Jiri Benc <jbenc@redhat.com>

Approved-by: Hangbin Liu <haliu@redhat.com>
Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: Toke Høiland-Jørgensen <toke@redhat.com>
Approved-by: Íñigo Huguet <ihuguet@redhat.com>

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
2022-06-03 09:26:25 +02:00
Jiri Benc df10d51307 net: Rename ->stream_memory_read to ->sock_is_readable
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2071618

Conflicts:
- [minor] Context difference in struct proto due to missing 6c302e799a0d
  "net: forward_alloc_get depends on CONFIG_MPTCP".
- [minor] Context difference in sock.h due to out of order backport of
  4c1e34c0dbff "vsock: Enable y2038 safe timeval for timeout".

commit 7b50ecfcc6cdfe87488576bc3ed443dc8d083b90
Author: Cong Wang <cong.wang@bytedance.com>
Date:   Fri Oct 8 13:33:03 2021 -0700

    net: Rename ->stream_memory_read to ->sock_is_readable

    The proto ops ->stream_memory_read() is currently only used
    by TCP to check whether psock queue is empty or not. We need
    to rename it before reusing it for non-TCP protocols, and
    adjust the existing users accordingly.

    Signed-off-by: Cong Wang <cong.wang@bytedance.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211008203306.37525-2-xiyou.wangcong@gmail.com

Signed-off-by: Jiri Benc <jbenc@redhat.com>
2022-05-12 17:29:53 +02:00
Sabrina Dubroca 5d58b13f10 tls: fix replacing proto_ops
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2080356
Tested: selftests

commit f3911f73f51d1534f4db70b516cc1fcb6be05bae
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed Nov 24 15:25:56 2021 -0800

    tls: fix replacing proto_ops

    We replace proto_ops whenever TLS is configured for RX. But our
    replacement also overrides sendpage_locked, which will crash
    unless TX is also configured. Similarly we plug both of those
    in for TLS_HW (NIC crypto offload) even though TLS_HW has a completely
    different implementation for TX.

    Last but not least we always plug in something based on inet_stream_ops
    even though a few of the callbacks differ for IPv6 (getname, release,
    bind).

    Use a callback building method similar to what we do for struct proto.

    Fixes: c46234ebb4 ("tls: RX path for ktls")
    Fixes: d4ffb02dee ("net/tls: enable sk_msg redirect to tls socket egress")
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Sabrina Dubroca <sdubroca@redhat.com>
2022-04-29 16:16:29 +02:00
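Note on "tls: fix replacing proto_ops" above: a callback building method of the kind the message describes might look roughly like the sketch below. Every name here (`struct demo_ops`, `demo_build_ops()`, the `DEMO_*` constants) is hypothetical, and overriding `splice_read` for RX is an assumption made for illustration. The point is only that each (TX, RX) combination starts from the address family's own base ops and overrides just the callbacks that side configures.

```c
#include <assert.h>

enum demo_conf { DEMO_BASE, DEMO_SW, DEMO_NUM_CONF };

/* Hypothetical, reduced ops table; the real struct proto_ops is much larger. */
struct demo_ops {
    int (*sendpage_locked)(void);
    int (*splice_read)(void);
    int (*getname)(void);
};

int demo_base_sendpage_locked(void) { return 1; }
int demo_base_splice_read(void)     { return 2; }
int demo_tls_sendpage_locked(void)  { return 11; }
int demo_tls_splice_read(void)      { return 12; }
int demo_v6_getname(void)           { return 6; }

/* Fill ops[tx][rx] from the family-specific base, then override per side. */
void demo_build_ops(struct demo_ops ops[DEMO_NUM_CONF][DEMO_NUM_CONF],
                    const struct demo_ops *base)
{
    int tx, rx;

    for (tx = 0; tx < DEMO_NUM_CONF; tx++)
        for (rx = 0; rx < DEMO_NUM_CONF; rx++)
            ops[tx][rx] = *base;  /* keeps getname etc. of this family */

    /* TX-side override applies only where TX is configured. */
    for (rx = 0; rx < DEMO_NUM_CONF; rx++)
        ops[DEMO_SW][rx].sendpage_locked = demo_tls_sendpage_locked;

    /* RX-side override applies only where RX is configured. */
    for (tx = 0; tx < DEMO_NUM_CONF; tx++)
        ops[tx][DEMO_SW].splice_read = demo_tls_splice_read;
}
```

Building one such table per address family means an RX-only socket keeps the base `sendpage_locked`, and an IPv6 socket keeps its own `getname`.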
Maxim Mikityanskiy c55dcdd435 net/tls: Fix use-after-free after the TLS device goes down and up
When a netdev with active TLS offload goes down, tls_device_down is
called to stop the offload and tear down the TLS context. However, the
socket stays alive, and it still points to the TLS context, which is now
deallocated. If a netdev goes up, while the connection is still active,
and the data flow resumes after a number of TCP retransmissions, it will
lead to a use-after-free of the TLS context.

This commit addresses this bug by keeping the context alive until its
normal destruction, and implements the necessary fallbacks, so that the
connection can resume in software (non-offloaded) kTLS mode.

On the TX side tls_sw_fallback is used to encrypt all packets. The RX
side already has all the necessary fallbacks, because receiving
non-decrypted packets is supported. The thing needed on the RX side is
to block resync requests, which are normally produced after receiving
non-decrypted packets.

The necessary synchronization is implemented for a graceful teardown:
first the fallbacks are deployed, then the driver resources are released
(it used to be possible to have a tls_dev_resync after tls_dev_del).

A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
mode. It's used to skip the RX resync logic completely, as it becomes
useless, and some objects may be released (for example, resync_async,
which is allocated and freed by the driver).

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:58:05 -07:00
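Note on "net/tls: Fix use-after-free after the TLS device goes down and up" above: the role of the degraded flag can be sketched as follows. This is a single-threaded illustration with hypothetical names (`struct demo_offload_ctx`, `DEMO_RX_DEV_DEGRADED`, `demo_device_down()`, `demo_rx_resync()`); it omits the synchronization the commit message describes and is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit, modeled on the TLS_RX_DEV_DEGRADED idea. */
#define DEMO_RX_DEV_DEGRADED (1UL << 0)

struct demo_offload_ctx {
    unsigned long flags;
    int resync_requests;  /* counts resync requests handed to the driver */
};

/* Device goes down: keep the context alive, only mark it degraded. */
void demo_device_down(struct demo_offload_ctx *ctx)
{
    ctx->flags |= DEMO_RX_DEV_DEGRADED;
}

/* Returns true if a resync request was issued, false if it was skipped. */
bool demo_rx_resync(struct demo_offload_ctx *ctx)
{
    if (ctx->flags & DEMO_RX_DEV_DEGRADED)
        return false;  /* software fallback: resync is useless here */
    ctx->resync_requests++;
    return true;
}
```

Once the flag is set, the RX path no longer touches driver-owned resync state, which may already have been released.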
Vadim Fedorenko 74ea610602 net/tls: add CHACHA20-POLY1305 configuration
Add ChaCha-Poly specific configuration code.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Julia Lawall 0403a2b53c net/tls: use semicolons rather than commas to separate statements
Replace commas with semicolons.  Commas introduce unnecessary
variability in the code structure and are hard to see.  What is done
is essentially described by the following Coccinelle semantic patch
(http://coccinelle.lip6.fr/):

// <smpl>
@@ expression e1,e2; @@
e1
-,
+;
e2
... when any
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/1602412498-32025-6-git-send-email-Julia.Lawall@inria.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-13 17:11:52 -07:00
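Note on "net/tls: use semicolons rather than commas to separate statements" above: the kind of code the semantic patch targets can be illustrated with a minimal, hypothetical example (`struct demo_pair` and both functions are made up for this note). Both versions behave the same; the comma form is simply one expression statement that is easy to misread.

```c
#include <assert.h>

struct demo_pair {
    int a;
    int b;
};

/* Before: one expression statement joined by the comma operator. */
void demo_init_comma(struct demo_pair *p)
{
    p->a = 1,
    p->b = 2;
}

/* After: two separate statements, as the semantic patch produces. */
void demo_init_semicolon(struct demo_pair *p)
{
    p->a = 1;
    p->b = 2;
}
```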
Yutaro Hayakawa ffa81fa46e net/tls: Implement getsockopt SOL_TLS TLS_RX
Implement the getsockopt SOL_TLS TLS_RX which is currently missing. The
primary use case is to use it in conjunction with TCP_REPAIR to
checkpoint/restore the TLS record layer state.

TLS connection state usually lives in the user space library, so it is
easy to extract from there. That is not the case once the TLS
connection is delegated to kTLS: we need a way to extract the TLS
state from the kernel for both the TX and RX sides.

The new TLS_RX getsockopt copies the crypto_info to user in the same
way as TLS_TX does.

We have described use cases in our research work in Netdev 0x14
Transport Workshop [1].

Also, there is a TLS implementation called tlse [2] which supports
TLS connection migration. It supports kTLS, and its code
shows that it expects future support for this option.

[1] https://speakerdeck.com/yutarohayakawa/prism-proxies-without-the-pain
[2] https://github.com/eduardsui/tlse

Signed-off-by: Yutaro Hayakawa <yhayakawa3720@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-01 11:47:12 -07:00
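Note on "net/tls: Implement getsockopt SOL_TLS TLS_RX" above: from user space, reading the RX state might look like the sketch below. It assumes a Linux system with `<linux/tls.h>` available and a connection configured with AES-GCM-128; the helper name `demo_get_tls_rx_info()` is hypothetical, and the `SOL_TLS` fallback definition is there only in case the libc headers do not provide it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tls.h>

#ifndef SOL_TLS
#define SOL_TLS 282  /* fallback for libc headers that lack it */
#endif

/*
 * Copy the kernel's RX crypto_info for fd into *info.
 * Returns 0 on success, -1 with errno set on failure (for example when
 * fd is not a socket or kTLS RX is not configured on it).
 * Assumes the cipher is AES-GCM-128; other ciphers need their own struct.
 */
int demo_get_tls_rx_info(int fd, struct tls12_crypto_info_aes_gcm_128 *info)
{
    socklen_t len = sizeof(*info);

    memset(info, 0, sizeof(*info));
    return getsockopt(fd, SOL_TLS, TLS_RX, info, &len);
}
```

The same call with `TLS_TX` in place of `TLS_RX` reads the transmit side, which is what makes checkpointing both directions possible.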
Christoph Hellwig d3c4815151 net: remove sockptr_advance
sockptr_advance never properly worked.  Replace it with _offset variants
of copy_from_sockptr and copy_to_sockptr.

Fixes: ba423fdaa5 ("net: add a new sockptr_t type")
Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 13:43:40 -07:00
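Note on "net: remove sockptr_advance" above: the difference between advancing a pointer and copying at an offset can be modeled in plain user-space C. The types and names below (`struct demo_optbuf`, `demo_copy_from_offset()`) are hypothetical, and the bounds check is an addition of this sketch rather than something the commit message states. The source buffer is passed by value and never modified; the caller names the offset explicitly on each copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an option buffer: base pointer plus length. */
struct demo_optbuf {
    const unsigned char *base;
    size_t len;
};

/*
 * Copy size bytes starting at offset within src into dst.
 * Returns 0 on success or -EFAULT if the range falls outside the buffer.
 */
int demo_copy_from_offset(void *dst, struct demo_optbuf src,
                          size_t offset, size_t size)
{
    if (offset > src.len || size > src.len - offset)
        return -EFAULT;
    memcpy(dst, src.base + offset, size);
    return 0;
}
```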
Christoph Hellwig a7b75c5a8c net: pass a sockptr_t into ->setsockopt
Rework the remaining setsockopt code to pass a sockptr_t instead of a
plain user pointer.  This removes the last remaining set_fs(KERNEL_DS)
outside of architecture specific code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154]
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 15:41:54 -07:00
Linus Torvalds 4152d146ee Merge branch 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux
Pull READ/WRITE_ONCE rework from Will Deacon:
 "This the READ_ONCE rework I've been working on for a while, which
  bumps the minimum GCC version and improves code-gen on arm64 when
  stack protector is enabled"

[ Side note: I'm _really_ tempted to raise the minimum gcc version to
  4.9, so that we can just say that we require _Generic() support.

  That would allow us to more cleanly handle a lot of the cases where we
  depend on very complex macros with 'sizeof' or __builtin_choose_expr()
  with __builtin_types_compatible_p() etc.

  This branch has a workaround for sparse not handling _Generic(),
  either, but that was already fixed in the sparse development branch,
  so it's really just gcc-4.9 that we'd require.   - Linus ]

* 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
  compiler_types.h: Use unoptimized __unqual_scalar_typeof for sparse
  compiler_types.h: Optimize __unqual_scalar_typeof compilation time
  compiler.h: Enforce that READ_ONCE_NOCHECK() access size is sizeof(long)
  compiler-types.h: Include naked type in __pick_integer_type() match
  READ_ONCE: Fix comment describing 2x32-bit atomicity
  gcov: Remove old GCC 3.4 support
  arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros
  locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
  READ_ONCE: Drop pointer qualifiers when reading from scalar types
  READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
  READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
  arm64: csum: Disable KASAN for do_csum()
  fault_inject: Don't rely on "return value" from WRITE_ONCE()
  net: tls: Avoid assigning 'const' pointer to non-const pointer
  netfilter: Avoid assigning 'const' pointer to non-const pointer
  compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
2020-06-10 14:46:54 -07:00
Will Deacon 9a8939490d net: tls: Avoid assigning 'const' pointer to non-const pointer
tls_build_proto() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

  | net/tls/tls_main.c: In function ‘tls_build_proto’:
  | ./include/linux/compiler.h:229:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  | net/tls/tls_main.c:640:4: note: in expansion of macro ‘smp_store_release’
  |   640 |    smp_store_release(&saved_tcpv6_prot, prot);
  |       |    ^~~~~~~~~~~~~~~~~

Drop the const qualifier from the local 'prot' variable, as it isn't
needed.

Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Aviad Yehezkel <aviadye@mellanox.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Will Deacon <will@kernel.org>
2020-04-15 21:36:41 +01:00
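Note on "net: tls: Avoid assigning 'const' pointer to non-const pointer" above: the warning and the fix can be reproduced in miniature. The names below (`struct demo_proto`, `demo_saved_prot`, `demo_save_prot()`) are hypothetical. The sketch shows the approach of dropping `const` from the local; making the destination pointer `const` instead is the other way to keep the two sides consistent.

```c
#include <assert.h>

struct demo_proto {
    int id;
};

/* Non-const global, standing in for a saved protocol pointer. */
struct demo_proto *demo_saved_prot;

void demo_save_prot(struct demo_proto *base)
{
    /*
     * Declaring this local as `const struct demo_proto *prot` would make
     * the store below discard the qualifier and trigger
     * -Wdiscarded-qualifiers, just like a plain assignment.
     */
    struct demo_proto *prot = base;

    demo_saved_prot = prot;
}
```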
Arnd Bergmann f691a25ce5 net/tls: fix const assignment warning
Building with some experimental patches, I came across a warning
in the tls code:

include/linux/compiler.h:215:30: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
  215 |  *(volatile typeof(x) *)&(x) = (val);  \
      |                              ^
net/tls/tls_main.c:650:4: note: in expansion of macro 'smp_store_release'
  650 |    smp_store_release(&saved_tcpv4_prot, prot);

This appears to be a legitimate warning about assigning a const pointer
into the non-const 'saved_tcpv4_prot' global. Annotate both the ipv4 and
ipv6 pointers 'const' to make the code internally consistent.

Fixes: 5bb4c45d46 ("net/tls: Read sk_prot once when building tls proto ops")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-08 14:34:02 -07:00
Jakub Sitnicki d5bee7374b net/tls: Annotate access to sk_prot with READ_ONCE/WRITE_ONCE
sockmap performs lockless writes to sk->sk_prot on the following paths:

tcp_bpf_{recvmsg|sendmsg} / sock_map_unref
  sk_psock_put
    sk_psock_drop
      sk_psock_restore_proto
        WRITE_ONCE(sk->sk_prot, proto)

To prevent load/store tearing [1], and to make tooling aware of intentional
shared access [2], we need to annotate other sites that access sk_prot with
READ_ONCE/WRITE_ONCE macros.

Change done with Coccinelle using the following semantic patch:

@@
expression E;
identifier I;
struct sock *sk;
identifier sk_prot =~ "^sk_prot$";
@@
(
 E =
-sk->sk_prot
+READ_ONCE(sk->sk_prot)
|
-sk->sk_prot = E
+WRITE_ONCE(sk->sk_prot, E)
|
-sk->sk_prot
+READ_ONCE(sk->sk_prot)
 ->I
)

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21 20:08:17 -07:00
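Note on "net/tls: Annotate access to sk_prot with READ_ONCE/WRITE_ONCE" above: a rough user-space approximation of the annotations is sketched below. The `DEMO_*` macros are simplified stand-ins built on a volatile cast and the GCC/Clang `__typeof__` extension, not the kernel's definitions, and the struct and function names are hypothetical. They force a single access and so guard against tearing and repeated loads, but they provide no ordering barrier.

```c
#include <assert.h>

/* Simplified stand-ins; each forces exactly one load or store. */
#define DEMO_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct demo_proto {
    int (*op)(void);
};

struct demo_sock {
    const struct demo_proto *sk_prot;
};

int demo_tcp_op(void) { return 1; }
int demo_tls_op(void) { return 2; }

/* Writer side: replace the ops pointer with one annotated store. */
void demo_set_proto(struct demo_sock *sk, const struct demo_proto *p)
{
    DEMO_WRITE_ONCE(sk->sk_prot, p);
}

/* Reader side: load the pointer once into a local, then use the local. */
int demo_call_op(struct demo_sock *sk)
{
    const struct demo_proto *prot = DEMO_READ_ONCE(sk->sk_prot);

    return prot->op();
}
```

Loading into a local first also avoids sprinkling the annotation over every later use of the pointer.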
Jakub Sitnicki 5bb4c45d46 net/tls: Read sk_prot once when building tls proto ops
Apart from being a "tremendous" win when it comes to generated machine
code (see bloat-o-meter output for x86-64 below) this mainly prepares
ground for annotating access to sk_prot with READ_ONCE, so that we don't
pepper the code with access annotations and needlessly repeat loads.

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-46 (-46)
Function                                     old     new   delta
tls_init                                     851     805     -46
Total: Before=21063, After=21017, chg -0.22%

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21 20:08:17 -07:00
Jakub Sitnicki f13fe3e60c net/tls: Constify base proto ops used for building tls proto
The helper that builds kTLS proto ops doesn't need to and should not modify
the base proto ops. Annotate the parameter as read-only.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21 20:08:17 -07:00
Jakub Sitnicki b8e202d1d1 net, sk_msg: Annotate lockless access to sk_prot on clone
sk_msg and ULP frameworks override protocol callbacks pointer in
sk->sk_prot, while tcp accesses it locklessly when cloning the listening
socket, that is with neither sk_lock nor sk_callback_lock held.

Once we enable use of listening sockets with sockmap (and hence sk_msg),
there will be shared access to sk->sk_prot if socket is getting cloned
while being inserted/deleted to/from the sockmap from another CPU:

Read side:

tcp_v4_rcv
  sk = __inet_lookup_skb(...)
  tcp_check_req(sk)
    inet_csk(sk)->icsk_af_ops->syn_recv_sock
      tcp_v4_syn_recv_sock
        tcp_create_openreq_child
          inet_csk_clone_lock
            sk_clone_lock
              READ_ONCE(sk->sk_prot)

Write side:

sock_map_ops->map_update_elem
  sock_map_update_elem
    sock_map_update_common
      sock_map_link_no_progs
        tcp_bpf_init
          tcp_bpf_update_sk_prot
            sk_psock_update_proto
              WRITE_ONCE(sk->sk_prot, ops)

sock_map_ops->map_delete_elem
  sock_map_delete_elem
    __sock_map_delete
     sock_map_unref
       sk_psock_put
         sk_psock_drop
           sk_psock_restore_proto
             tcp_update_ulp
               WRITE_ONCE(sk->sk_prot, proto)

Mark the shared access with READ_ONCE/WRITE_ONCE annotations.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200218171023.844439-2-jakub@cloudflare.com
2020-02-21 22:29:45 +01:00
John Fastabend 33bfe20dd7 bpf: Sockmap/tls, push write_space updates through ulp updates
When a sockmap sock with TLS enabled is removed we clean up bpf/psock state
and call tcp_update_ulp() to push updates to the TLS ULP on top. However, we
don't push the write_space callback up and instead simply overwrite the
op with the previous op stored in the psock. This may or may not be correct, so
to ensure we don't overwrite the TLS write space hook, pass this field to
the ULP and have it fix up the ctx.

This completes a previous fix that pushed the ops through to the ULP
but at the time missed doing this for write_space, presumably because
write_space TLS hook was added around the same time.

Fixes: 95fa145479 ("bpf: sockmap/tls, close can race with map free")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com
2020-01-15 23:26:13 +01:00
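Note on "bpf: Sockmap/tls, push write_space updates through ulp updates" above: the behaviour described can be sketched as below. All types and names are hypothetical simplifications (`struct demo_sock`, `struct demo_tls_ctx`, `demo_update_ulp()`). When a ULP context sits on top, the restored ops and write_space callback are recorded in that context instead of overwriting the socket's own hooks.

```c
#include <assert.h>
#include <stddef.h>

struct demo_proto {
    int id;
};

typedef void (*demo_write_space_t)(void);

/* Hypothetical ULP context: remembers what lies underneath the ULP. */
struct demo_tls_ctx {
    const struct demo_proto *sk_proto;
    demo_write_space_t sk_write_space;
};

struct demo_sock {
    const struct demo_proto *sk_prot;
    demo_write_space_t sk_write_space;
    struct demo_tls_ctx *ulp_ctx;  /* NULL when no ULP is attached */
};

void demo_tcp_write_space(void) { }
void demo_tls_write_space(void) { }

/* Push restored ops and write_space either to the ULP ctx or the socket. */
void demo_update_ulp(struct demo_sock *sk, const struct demo_proto *proto,
                     demo_write_space_t write_space)
{
    if (sk->ulp_ctx) {
        /* ULP on top: leave its hooks installed, fix up its saved state. */
        sk->ulp_ctx->sk_proto = proto;
        sk->ulp_ctx->sk_write_space = write_space;
        return;
    }
    sk->sk_prot = proto;
    sk->sk_write_space = write_space;
}
```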