Commit Graph

146 Commits

Author SHA1 Message Date
Scott Mayhew e1dd216a42 NFS: enable nconnect for RDMA
JIRA: https://issues.redhat.com/browse/RHEL-59704

commit b326df4a8ec6ef53e2e2f1c2cbf14f8a20e85baa
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Tue Feb 13 13:31:47 2024 -0500

    NFS: enable nconnect for RDMA

    It appears that in certain cases, RDMA capable transports can benefit
    from the ability to establish multiple connections to increase their
    throughput. This patch therefore enables the use of the "nconnect" mount
    option for those use cases.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2024-10-25 12:36:08 -04:00
Benjamin Coddington 01e831819b NFSv4.1 another fix for EXCHGID4_FLAG_USE_PNFS_DS for DS server
JIRA: https://issues.redhat.com/browse/RHEL-53004

commit 4840c00003a2275668a13b82c9f5b1aed80183aa
Author: Olga Kornievskaia <kolga@netapp.com>
Date:   Mon Jun 24 09:28:27 2024 -0400

    NFSv4.1 another fix for EXCHGID4_FLAG_USE_PNFS_DS for DS server

    Previously, in order to mark the communication with the DS server,
    we tried to use NFS_CS_DS in cl_flags. However, this flag would
    only be saved for the DS server, and in the case where DS equals MDS,
    the client would not find a matching nfs_client in nfs_match_client
    that represents the MDS (but is also a DS).

    Instead of relying on NFS_CS_DS, use NFS_CS_PNFS.

    Fixes: 379e4adfddd6 ("NFSv4.1: fixup use EXCHGID4_FLAG_USE_PNFS_DS for DS server")
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
2024-08-06 09:32:37 -04:00
Jeffrey Layton 8c7f797688 NFSv4.1: fix pnfs MDS=DS session trunking
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit 806a3bc421a115fbb287c1efce63a48c54ee804b
Author: Olga Kornievskaia <kolga@netapp.com>
Date:   Wed Aug 30 15:29:34 2023 -0400

    NFSv4.1: fix pnfs MDS=DS session trunking

    Currently, when GETDEVICEINFO returns multiple locations where each
    is a different IP but the server's identity is the same as the MDS's,
    nfs4_set_ds_client() finds the existing nfs_client structure, which
    has the MDS's max_connect value. If that value is 1, the 1st IP
    on the DS's list gets dropped due to MDS trunking rules. Other
    IPs are added, as they fall under the pnfs trunking rules.

    For the list of IPs the 1st goes thru calling nfs4_set_ds_client()
    which will eventually call nfs4_add_trunk() and call into
    rpc_clnt_test_and_add_xprt() which has the check for MDS trunking.
    The other IPs (after the 1st one), would call rpc_clnt_add_xprt()
    which doesn't go thru that check.

    nfs4_add_trunk() is called when MDS trunking is happening and it
    needs to enforce the usage of max_connect mount option of the
    1st mount. However, this shouldn't be applied to pnfs flow.

    Instead, this patch proposes to treat MDS=DS as DS trunking and
    make sure that the MDS's max_connect limit does not apply to the
    1st IP returned in the GETDEVICEINFO list. It does so by
    marking the newly created client with a new flag, NFS_CS_PNFS,
    which is then used to pass the max_connect value to use into
    rpc_clnt_test_and_add_xprt() instead of the existing rpc
    client's max_connect value set by the MDS connection.

    For example, if the mount was done without a max_connect value set,
    the MDS's rpc client has cl_max_connect=1. Upon calling into
    rpc_clnt_test_and_add_xprt(), instead of using the rpc client's
    value, the caller passes in the max_connect value that was
    previously set in the pnfs path (as part of handling the
    GETDEVICEINFO list of IPs) in nfs4_set_ds_client().

    However, when NFS_CS_PNFS flag is not set and we know we
    are doing MDS trunking, comparing a new IP of the same
    server, we then set the max_connect value to the
    existing MDS's value and pass that into
    rpc_clnt_test_and_add_xprt().

    Fixes: dc48e0abee24 ("SUNRPC enforce creation of no more than max_connect xprts")
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:43 -05:00
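The limit selection described in the MDS=DS trunking commit above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: `SKETCH_CS_PNFS` (and its bit position), `sketch_trunk_max_connect()` and `sketch_may_add_xprt()` are hypothetical names, and the `nr_xprts < max_connect` comparison is an assumption about how the limit is enforced.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for NFS_CS_PNFS. */
#define SKETCH_CS_PNFS (1u << 0)

/*
 * Pick the transport limit for a trunking attempt:
 * the pNFS (DS) path uses the limit supplied by the pnfs code,
 * MDS trunking keeps the limit of the existing MDS rpc client.
 */
unsigned int sketch_trunk_max_connect(unsigned int init_flags,
                                      unsigned int pnfs_max_connect,
                                      unsigned int mds_max_connect)
{
    if (init_flags & SKETCH_CS_PNFS)
        return pnfs_max_connect;
    return mds_max_connect;
}

/* Assumed enforcement: a new transport is allowed only below the limit. */
bool sketch_may_add_xprt(unsigned int nr_xprts, unsigned int max_connect)
{
    return nr_xprts < max_connect;
}
```

With an MDS limit of 1 and one transport already present, the first DS address would be rejected under the MDS rule but accepted when the client is marked as a pNFS client and carries a larger limit.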
Jeffrey Layton aadc181cda NFSv4.1: use EXCHGID4_FLAG_USE_PNFS_DS for DS server
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit 51d674a5e4889f1c8e223ac131cf218e1631e423
Author: Olga Kornievskaia <kolga@netapp.com>
Date:   Thu Jul 13 13:02:38 2023 -0400

    NFSv4.1: use EXCHGID4_FLAG_USE_PNFS_DS for DS server

    After receiving the location(s) of the DS server(s) in the
    GETDEVICEINFO, create the request for the clientid to such
    server and indicate that the client is connecting to a DS.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:38 -05:00
Jeffrey Layton ae10e2ae2d NFS: Add sysfs links to sunrpc clients for nfs_clients
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit e13b549319a684dd80c4cc25e9567a5c84007e32
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Thu Jun 15 14:07:27 2023 -0400

    NFS: Add sysfs links to sunrpc clients for nfs_clients

    For the general and state management nfs_client under each mount, create
    symlinks to their respective rpc_client sysfs entries.

    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:35 -05:00
Jeffrey Layton 595d9c14a8 NFS: add superblock sysfs entries
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit 1c7251187dc067a6d460cf33ca67da9c1dd87807
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Thu Jun 15 14:07:26 2023 -0400

    NFS: add superblock sysfs entries

    Create a sysfs directory for each mount that corresponds to the mount's
    nfs_server struct.  As the mount is being constructed, use the name
    "server-n", but rename it to the "MAJOR:MINOR" of the mount after assigning
    a device_id. The rename approach allows us to populate the mount's directory
    with links to the various rpc_client objects during the mount's
    construction.  The naming convention (MAJOR:MINOR) can be used to reference
    a particular NFS mount's sysfs tree.

    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:35 -05:00
Jeffrey Layton 1cd44bd5eb NFS: Add an "xprtsec=" NFS mount option
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit c8407f2e560c53c4c73e77cb5604c8a408dbe7f7
Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jun 7 10:00:09 2023 -0400

    NFS: Add an "xprtsec=" NFS mount option

    After some discussion, we decided that controlling transport layer
    security policy should be separate from the setting for the user
    authentication flavor. To accomplish this, add a new NFS mount
    option to select a transport layer security policy for RPC
    operations associated with the mount point.

      xprtsec=none     - Transport layer security is forced off.

      xprtsec=tls      - Establish an encryption-only TLS session. If
                         the initial handshake fails, the mount fails.
                         If TLS is not available on a reconnect, drop
                         the connection and try again.

      xprtsec=mtls     - Both sides authenticate and an encrypted
                         session is created. If the initial handshake
                         fails, the mount fails. If TLS is not available
                         on a reconnect, drop the connection and try
                         again.

    To support client peer authentication (mtls), the handshake daemon
    will have configurable default authentication material (certificate
    or pre-shared key). In the future, mount options can be added that
    can provide this material on a per-mount basis.

    Updates to mount.nfs (to support xprtsec=auto) and nfs(5) will be
    sent under separate cover.

    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:34 -05:00
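The three "xprtsec=" values listed in the commit above could be mapped to a policy roughly as follows. This is a userspace sketch under stated assumptions: the enum and `sketch_parse_xprtsec()` are hypothetical names and do not reflect how the kernel's mount option parser is structured.

```c
#include <string.h>

/* Hypothetical policy values mirroring the three documented settings. */
enum sketch_xprtsec {
    SKETCH_XPRTSEC_NONE,    /* transport layer security forced off */
    SKETCH_XPRTSEC_TLS,     /* encryption-only TLS session */
    SKETCH_XPRTSEC_MTLS,    /* both sides authenticate, session encrypted */
    SKETCH_XPRTSEC_INVALID  /* anything else */
};

/* Parse a single "xprtsec=<value>" option string. */
enum sketch_xprtsec sketch_parse_xprtsec(const char *opt)
{
    static const char prefix[] = "xprtsec=";
    const char *val;

    if (strncmp(opt, prefix, sizeof(prefix) - 1) != 0)
        return SKETCH_XPRTSEC_INVALID;

    val = opt + sizeof(prefix) - 1;
    if (strcmp(val, "none") == 0)
        return SKETCH_XPRTSEC_NONE;
    if (strcmp(val, "tls") == 0)
        return SKETCH_XPRTSEC_TLS;
    if (strcmp(val, "mtls") == 0)
        return SKETCH_XPRTSEC_MTLS;
    return SKETCH_XPRTSEC_INVALID;
}
```

The sketch rejects "xprtsec=auto" because the commit message describes that value as a later mount.nfs addition rather than one of the three kernel settings.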
Jeffrey Layton 905b73d2e4 NFS: Have struct nfs_client carry a TLS policy field
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit 6c0a8c5fcf7158e889dbdd077f67c81984704710
Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Jun 7 09:59:42 2023 -0400

    NFS: Have struct nfs_client carry a TLS policy field

    The new field is used to match struct nfs_clients that have the same
    TLS policy setting.

    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:34 -05:00
Scott Mayhew dc6fba107d NFS: Avoid memcpy() run-time warning for struct sockaddr overflows
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183621

commit cf0d7e7f4520814f45e1313872ad5777ed504004
Author: Kees Cook <keescook@chromium.org>
Date:   Sun Oct 16 21:36:50 2022 -0700

    NFS: Avoid memcpy() run-time warning for struct sockaddr overflows

    The 'nfs_server' and 'mount_server' structures include a union of
    'struct sockaddr' (with the older 16 bytes max address size) and
    'struct sockaddr_storage' which is large enough to hold all the
    supported sa_family types (128 bytes max size). The runtime memcpy()
    buffer overflow checker is seeing attempts to write beyond the 16
    bytes as an overflow, but the actual expected size is that of 'struct
    sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more
    completely through-out NFS, which results in adjusting the memcpy()
    buffers to the correct union members. Avoids this false positive run-time
    warning under CONFIG_FORTIFY_SOURCE:

      memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16)

    Reported-by: kernel test robot <yujie.liu@intel.com>
    Link: https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com
    Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
    Cc: Anna Schumaker <anna@kernel.org>
    Cc: linux-nfs@vger.kernel.org
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2023-05-08 10:41:17 -04:00
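The union layout described in the sockaddr commit above can be illustrated in userspace C. This is a sketch, not the kernel structure: `struct sketch_server` and `sketch_set_address()` are hypothetical, and the point is only that the copy targets the storage-sized union member, so the destination object is as large as the write.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in for a structure holding a server address. */
struct sketch_server {
    union {
        struct sockaddr         address;   /* legacy, small view */
        struct sockaddr_storage _address;  /* large enough for any family */
    };
    size_t addrlen;
};

/* Copy an address of any supported family into the server structure. */
void sketch_set_address(struct sketch_server *srv,
                        const struct sockaddr_storage *src, size_t len)
{
    if (len > sizeof(srv->_address))
        len = sizeof(srv->_address);

    /* Destination is the storage-sized member, not the small one. */
    memcpy(&srv->_address, src, len);
    srv->addrlen = len;
}
```

An IPv6 address (`struct sockaddr_in6`) is larger than `struct sockaddr`, which is the kind of write that a size check against the small member would flag.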
Scott Mayhew 323c45f1ed NFS: move from strlcpy with unused retval to strscpy
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183621

commit 0dd7439f382518e9997cfa7ca9d06799dbeb33fa
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date:   Thu Aug 18 23:01:15 2022 +0200

    NFS: move from strlcpy with unused retval to strscpy

    Follow the advice of the below link and prefer 'strscpy' in this
    subsystem. Conversion is 1:1 because the return value is not used.
    Generated by a coccinelle script.

    Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
    Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2023-05-08 10:41:12 -04:00
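The commit above notes that the strlcpy-to-strscpy conversion is 1:1 only because the return value is unused. A userspace approximation of the strscpy() contract shows where the two differ: strscpy() returns the number of characters copied, or -E2BIG on truncation, whereas strlcpy() returns the length of the source. `sketch_strscpy()` is a hypothetical illustration, not the kernel implementation.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy src into dst (at most size bytes including the terminator).
 * Returns the number of characters copied, or -E2BIG if src did not fit.
 * dst is always NUL-terminated when size > 0.
 */
long sketch_strscpy(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -E2BIG;

    for (i = 0; i < size; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return (long)i;
    }

    /* Ran out of room: terminate what was copied and report truncation. */
    dst[size - 1] = '\0';
    return -E2BIG;
}
```

Unlike strlcpy(), this never reads the source beyond `size` bytes, so an unterminated source cannot cause an over-read.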
Benjamin Coddington 77cc1917c2 nfs4: Fix kmemleak when allocate slot failed
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2154879

commit 7e8436728e22181c3f12a5dbabd35ed3a8b8c593
Author: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Date:   Thu Oct 20 11:20:54 2022 +0800

    nfs4: Fix kmemleak when allocate slot failed

    If one of the slot allocations fails, all the other allocated slots
    should be cleaned up; otherwise, the allocated slots will leak:

      unreferenced object 0xffff8881115aa100 (size 64):
        comm "mount.nfs", pid 679, jiffies 4294744957 (age 115.037s)
        hex dump (first 32 bytes):
          00 cc 19 73 81 88 ff ff 00 a0 5a 11 81 88 ff ff  ...s......Z.....
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000007a4c434a>] nfs4_find_or_create_slot+0x8e/0x130
          [<000000005472a39c>] nfs4_realloc_slot_table+0x23f/0x270
          [<00000000cd8ca0eb>] nfs40_init_client+0x4a/0x90
          [<00000000128486db>] nfs4_init_client+0xce/0x270
          [<000000008d2cacad>] nfs4_set_client+0x1a2/0x2b0
          [<000000000e593b52>] nfs4_create_server+0x300/0x5f0
          [<00000000e4425dd2>] nfs4_try_get_tree+0x65/0x110
          [<00000000d3a6176f>] vfs_get_tree+0x41/0xf0
          [<0000000016b5ad4c>] path_mount+0x9b3/0xdd0
          [<00000000494cae71>] __x64_sys_mount+0x190/0x1d0
          [<000000005d56bdec>] do_syscall_64+0x35/0x80
          [<00000000687c9ae4>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

    Fixes: abf79bb341 ("NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking")
    Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
2023-03-02 13:55:26 -05:00
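The cleanup rule from the kmemleak commit above (free everything already allocated when one allocation fails) can be sketched in userspace C. All names here are hypothetical; `sketch_fail_after` and `sketch_live` exist only so the failure path can be exercised and checked.

```c
#include <stdlib.h>

struct sketch_slot {
    struct sketch_slot *next;
    unsigned int nr;
};

static int sketch_live;             /* slots currently allocated */
static int sketch_fail_after = -1;  /* successes allowed before failing; -1 = never fail */

static struct sketch_slot *sketch_slot_alloc(void)
{
    struct sketch_slot *s;

    if (sketch_fail_after == 0)
        return NULL;                /* injected allocation failure */
    if (sketch_fail_after > 0)
        sketch_fail_after--;

    s = malloc(sizeof(*s));
    if (s)
        sketch_live++;
    return s;
}

void sketch_free_slots(struct sketch_slot *head)
{
    while (head) {
        struct sketch_slot *next = head->next;

        free(head);
        sketch_live--;
        head = next;
    }
}

/* Build a list of count slots; on failure, release the partial list. */
int sketch_alloc_slots(unsigned int count, struct sketch_slot **out)
{
    struct sketch_slot *head = NULL, **tail = &head;
    unsigned int i;

    for (i = 0; i < count; i++) {
        struct sketch_slot *s = sketch_slot_alloc();

        if (!s) {
            sketch_free_slots(head);    /* without this, the partial list leaks */
            *out = NULL;
            return -1;
        }
        s->nr = i;
        s->next = NULL;
        *tail = s;
        tail = &s->next;
    }
    *out = head;
    return 0;
}
```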
Benjamin Coddington 1ac84d4498 NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZE
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2107347

commit 940261a195080cf1cdcd56948d363fe363b69da1
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date:   Fri Jun 17 16:23:36 2022 -0400

    NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZE

    Previously, we required this value to be a power of 2 for UDP related
    reasons. This patch keeps the power of 2 rule for UDP but allows more
    flexibility for TCP and RDMA.

    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
2022-12-21 09:49:09 -05:00
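The sizing rule in the rsize / wsize commit above (power of 2 for UDP, multiple of PAGE_SIZE otherwise) can be sketched as follows. This is a simplified userspace illustration: `SKETCH_PAGE_SIZE` is an assumed 4096-byte page, the function names are hypothetical, and the kernel's minimum, default and maximum clamping is deliberately left out.

```c
#include <stdbool.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Largest power of two that is <= n (n must be non-zero). */
static unsigned int sketch_round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p <= n / 2)
        p *= 2;
    return p;
}

/*
 * UDP (and sizes below one page) keep the power-of-2 rule;
 * other transports only round down to a whole number of pages.
 */
unsigned int sketch_io_size(unsigned int requested, bool is_udp)
{
    if (requested == 0)
        return 0;
    if (is_udp || requested < SKETCH_PAGE_SIZE)
        return sketch_round_down_pow2(requested);
    return requested - (requested % SKETCH_PAGE_SIZE);
}
```

Under these assumptions a request of 12288 bytes (three pages) is kept as-is for TCP or RDMA but rounded down to 8192 for UDP.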
Benjamin Coddington 7fe7e73fcd nfs: nfs4clinet: check the return value of kstrdup()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072

commit fbd2057e5329d3502a27491190237b6be52a1cb6
Author: Xiaoke Wang <xkernel.wang@foxmail.com>
Date:   Fri Dec 17 01:01:33 2021 +0800

    nfs: nfs4clinet: check the return value of kstrdup()

    kstrdup() returns NULL when memory allocation fails, so check its
    return value to catch the error in time.

    Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
2022-09-26 09:34:04 -04:00
Scott Mayhew 14a7c6428d NFS: Replace calls to nfs_probe_fsinfo() with nfs_probe_server()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200

commit 4d4cf8d2d6ccb43c68bc5925dc83500b81b50f9e
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date:   Thu Oct 14 13:55:06 2021 -0400

    NFS: Replace calls to nfs_probe_fsinfo() with nfs_probe_server()

    Clean up. There are a few places where we want to probe the server, but
    don't actually care about the fsinfo result. Change these to use
    nfs_probe_server(), which handles the fattr allocation for us.

    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2022-02-03 11:44:01 -05:00
Scott Mayhew 6b2ba6e244 NFS: Move nfs_probe_destination() into the generic client
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200

commit e5731131fb6fefaa69064ca511b7c4971d6cf54f
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date:   Thu Oct 14 13:55:05 2021 -0400

    NFS: Move nfs_probe_destination() into the generic client

    And rename it to nfs_probe_server(). I also change it to take the nfs_fh
    as an argument so callers can choose what filehandle to probe.

    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2022-02-03 11:44:01 -05:00
Scott Mayhew 5afae24456 NFS: Create an nfs4_server_set_init_caps() function
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200

commit 01dde76e471229e3437a2686c572f4980b2c483e
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date:   Thu Oct 14 13:55:04 2021 -0400

    NFS: Create an nfs4_server_set_init_caps() function

    And call it before doing an FSINFO probe to reset to the baseline
    capabilities before probing.

    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2022-02-03 11:44:01 -05:00
Scott Mayhew 94a9530f6d NFSv4.1 add network transport when session trunking is detected
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200

commit 2a7a451a9084877a0b9d335c77d57e4cda1e5882
Author: Olga Kornievskaia <kolga@netapp.com>
Date:   Fri Aug 27 14:37:19 2021 -0400

    NFSv4.1 add network transport when session trunking is detected

    After trunking is discovered in nfs4_discover_server_trunking(),
    add the transport to the old client structure if the allowed limit
    of transports has not been reached.

    An example: there exists a multi-homed server, and the client mounts
    one server address and some volume, and then does another mount to
    a different address of the same server and perhaps a different
    volume. Previously, the client checked that this is a session
    trunkable server (same server) and removed the newly created
    client structure along with its transport. Now, the client
    adds the connection from the 2nd mount into the xprt switch of
    the existing client (which leads to having 2 available connections).

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2022-02-03 11:39:49 -05:00
Scott Mayhew e867c58e87 NFSv4 introduce max_connect mount options
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200

commit 7e134205f62955369619021a695cd78fefd32451
Author: Olga Kornievskaia <kolga@netapp.com>
Date:   Fri Aug 27 14:37:17 2021 -0400

    NFSv4 introduce max_connect mount options

    This option controls how many xprts the client can establish to
    the server, each with a distinct address (that means nconnect
    connections are not counted towards this new limit). This patch
    sets up the nfs structures to keep track of the max_connect
    limit (it does not enforce it).

    The default value is kept at 1 so that current mounts that don't
    want any additional connections are not affected. The maximum
    value is set at 16.

    Mounts to a DS are not limited to the default value of 1 but are
    instead set to the maximum value of 16 (NFS_MAX_TRANSPORTS).

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
2022-02-03 11:39:49 -05:00
Trond Myklebust dd99e9f98f NFSv4: Initialise connection to the server in nfs4_alloc_client()
Set up the connection to the NFSv4 server in nfs4_alloc_client(), before
we've added the struct nfs_client to the net-namespace's nfs_client_list
so that a downed server won't cause other mounts to hang in the trunking
detection code.

Reported-by: Michael Wakabayashi <mwakabayashi@vmware.com>
Fixes: 5c6e5b60aa ("NFS: Fix an Oops in the pNFS files and flexfiles connection setup to the DS")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-13 19:36:49 -04:00
Anna Schumaker 476bdb04c5 NFS: Fix use-after-free in nfs4_init_client()
KASAN reports a use-after-free when attempting to mount two different
exports through two different NICs that belong to the same server.

Olga was able to hit this with kernels starting somewhere between 5.7
and 5.10, but I traced the patch that introduced the clear_bit() call to
4.13. So something must have changed in the refcounting of the clp
pointer to make this call to nfs_put_client() the very last one.

Fixes: 8dcbec6d20 ("NFSv41: Handle EXCHID4_FLAG_CONFIRMED_R during NFSv4.1 migration")
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Gustavo A. R. Silva ffb81717a1 nfs: Fix fall-through warnings for Clang
In preparation for enabling -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding break/goto/return/fallthrough
statements instead of just letting the code fall through to the next
case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-02-01 13:32:32 -05:00
Sargun Dhillon d3ff46fe69 NFSv4: Refactor to use user namespaces for nfs4idmap
In several patches work has been done to enable NFSv4 to use user
namespaces:
58002399da65: NFSv4: Convert the NFS client idmapper to use the container user namespace
3b7eb5e35d0f: NFS: When mounting, don't share filesystems between different user namespaces

Unfortunately, the userspace APIs were only such that the userspace facing
side of the filesystem (superblock s_user_ns) could be set to a non init
user namespace. This furthers the fs_context related refactoring, and
piggybacks on top of that logic, so the superblock user namespace, and the
NFS user namespace are the same.

Users can still use rpc.idmapd if they choose to, but there are complexities
with user namespaces and request-key that have yet to be addressed.

Eventually, we will need to at least:
  * Come up with an upcall mechanism that can be triggered inside of the container,
    or safely triggered outside, with the requisite context to do the right
    mapping.
  * Handle whatever refactoring needs to be done in net/sunrpc.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Tested-by: Alban Crequy <alban.crequy@gmail.com>
Fixes: 62a55d088c ("NFS: Additional refactoring for fs_context conversion")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 14:05:54 -05:00
Anna Schumaker c567552612 NFS: Add READ_PLUS data segment support
This patch adds client support for decoding a single NFS4_CONTENT_DATA
segment returned by the server. This is the simplest implementation
possible, since it does not account for any hole segments in the reply.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-10-07 14:28:39 -04:00
Olga Kornievskaia dbc4fec6b6 NFSv4.0 allow nconnect for v4.0
It looks like this "else" is just a typo.  It turns off nconnect for
NFSv4.0 even though it works for every other version.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-17 13:16:23 -04:00
Frank van der Linden 04a5da690e NFSv4.2: define limits and sizes for user xattr handling
Set limits for extended attributes (attribute value size and listxattr
buffer size), based on the fs-independent limits (XATTR_*_MAX).

Define the maximum XDR sizes for the RFC 8276 XATTR operations.
In the case of operations that carry a larger payload (SETXATTR,
GETXATTR, LISTXATTR), these exclude that payload, which is added
as separate pages, like other operations do.

Define, much like for read and write operations, the maximum overhead
sizes for get/set/listxattr, and use them to limit the maximum payload
size for those operations, in combination with the channel attributes.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-13 17:52:45 -04:00
Scott Mayhew 55dee1bc0d nfs: add minor version to nfs_server_key for fscache
An NFS client that mounts multiple exports from the same NFS
server with higher NFSv4 versions disabled (i.e. 4.2) and without
forcing a specific NFS version results in fscache index cookie
collisions and the following messages:
[  570.004348] FS-Cache: Duplicate cookie detected

Each nfs_client structure should have its own fscache index cookie,
so add the minorversion to nfs_server_key.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200145
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-02-25 13:53:24 -05:00
Scott Mayhew 62a55d088c NFS: Additional refactoring for fs_context conversion
Split out from commit "NFS: Add fs_context support."

This patch adds additional refactoring for the conversion of NFS to use
fs_context, namely:

 (*) Merge nfs_mount_info and nfs_clone_mount into nfs_fs_context.
     nfs_clone_mount has had several fields removed, and nfs_mount_info
     has been removed altogether.
 (*) Various functions now take an fs_context as an argument instead
     of nfs_mount_info, nfs_fs_context, etc.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
David Howells 5eb005caf5 NFS: Rename struct nfs_parsed_mount_data to struct nfs_fs_context
Rename struct nfs_parsed_mount_data to struct nfs_fs_context and rename
pointers to it to "ctx".  At some point this will be pointed to by an
fs_context struct's fs_private pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
Al Viro 0c38f2131d nfs: don't pass nfs_subversion to ->create_server()
pick it from mount_info

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Trond Myklebust 807ce06c24 Merge branch 'linux-ssc-for-5.5'
2019-11-06 08:55:23 -05:00
Trond Myklebust e6237b6feb NFSv4.1: Don't rebind to the same source port when reconnecting to the server
NFSv2, v3 and NFSv4 servers often have duplicate reply caches that look
at the source port when deciding whether or not an RPC call is a replay
of a previous call. This requires clients to perform strange TCP gymnastics
in order to ensure that when they reconnect to the server, they bind
to the same source port.

NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
that do not look at the source port of the connection. This patch therefore
ensures they can ignore the rebind requirement.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:45 -05:00
Trond Myklebust d0372b679c NFS: Use non-atomic bit ops when initialising struct nfs_client_initdata
We don't need atomic bit ops when initialising a local structure on the
stack.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:44 -05:00
Olga Kornievskaia 0491567b51 NFS: add COPY_NOTIFY operation
Try using the delegation stateid, then the open stateid.

Only NL4_NETADDR is supported; there is no support for NL4_NAME and NL4_URL.
Allow only one source server address to be returned for now.

To distinguish between a same-server copy offload ("intra") and
a copy between different servers ("inter"), check the server
owner identity and also make sure the server is capable of doing
a copy offload.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-10-09 12:05:45 -04:00
Trond Myklebust c77e22834a NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
John Hubbard reports seeing the following stack trace:

nfs4_do_reclaim
   rcu_read_lock /* we are now in_atomic() and must not sleep */
       nfs4_purge_state_owners
           nfs4_free_state_owner
               nfs4_destroy_seqid_counter
                   rpc_destroy_wait_queue
                       cancel_delayed_work_sync
                           __cancel_work_timer
                               __flush_work
                                   start_flush_work
                                       might_sleep:
                                        (kernel/workqueue.c:2975: BUG)

The solution is to separate out the freeing of the state owners
from nfs4_purge_state_owners(), and perform that outside the atomic
context.

Reported-by: John Hubbard <jhubbard@nvidia.com>
Fixes: 0aaaf5c424 ("NFS: Cache state owners after files are closed")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
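The fix described in the nfs4_do_reclaim() commit above follows a common two-step pattern: unlink objects while in atomic context, then free them afterwards. A userspace sketch of that pattern is given below; every name is hypothetical, and `sketch_in_atomic` merely models the RCU read-side section so that a misplaced free can be detected.

```c
#include <stdlib.h>

struct sketch_owner {
    struct sketch_owner *next;
};

static int sketch_in_atomic;   /* models rcu_read_lock() being held */
static int sketch_bad_frees;   /* frees attempted while "atomic" */

/* Stand-in for a free routine that may sleep. */
static void sketch_free_owner(struct sketch_owner *o)
{
    if (sketch_in_atomic)
        sketch_bad_frees++;
    free(o);
}

/* Step 1: move every cached owner onto a private list. Never sleeps. */
struct sketch_owner *sketch_purge_owners(struct sketch_owner **cache)
{
    struct sketch_owner *doomed = *cache;

    *cache = NULL;
    return doomed;
}

/* Step 2: free the private list. Call only outside atomic context. */
void sketch_free_owners(struct sketch_owner *doomed)
{
    while (doomed) {
        struct sketch_owner *next = doomed->next;

        sketch_free_owner(doomed);
        doomed = next;
    }
}

/* Caller: purge inside the atomic section, free after leaving it. */
void sketch_reclaim(struct sketch_owner **cache)
{
    struct sketch_owner *doomed;

    sketch_in_atomic = 1;                  /* rcu_read_lock() */
    doomed = sketch_purge_owners(cache);
    sketch_in_atomic = 0;                  /* rcu_read_unlock() */

    sketch_free_owners(doomed);
}
```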
Trond Myklebust bb71e4a5d7 pNFS: Allow multiple connections to the DS
If the user specifies -onconnect=<number> mount option, and the transport
protocol is TCP, then set up <number> connections to the pNFS data server
as well. The connections will all go to the same IP address.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2019-07-06 14:54:50 -04:00
Trond Myklebust 6619079d05 NFSv4: Allow multiple connections to NFSv4.x (x>0) servers
If the user specifies the -onconnect=<number> mount option, and the transport
protocol is TCP, then set up <number> connections to the server. The
connections will all go to the same IP address.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2019-07-06 14:54:50 -04:00
Thomas Gleixner 457c899653 treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Trond Myklebust 1a58e8a0e5 NFS: Store the credential of the mount process in the nfs_server
Store the credential of the mount process so that we can determine
information such as the user namespace.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-04-26 16:11:54 -04:00
Julia Lawall 45bb8d8027 NFS: drop useless LIST_HEAD
Drop LIST_HEAD where the variable it declares has never
been used.

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
identifier x;
@@
- LIST_HEAD(x);
  ... when != x
// </smpl>

Fixes: 0e20162ed1 ("NFSv4.1 Use MDS auth flavor for data server connection")
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20 17:33:55 -05:00
Trond Myklebust 302fad7bd5 NFS: Fix up documentation warnings
Fix up some compiler warnings about function parameters, etc not being
correctly described or formatted.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20 15:14:21 -05:00
NeilBrown a52458b48a NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.
SUNRPC has two sorts of credentials, both of which appear as
"struct rpc_cred".
There are "generic credentials" which are supplied by clients
such as NFS and passed in 'struct rpc_message' to indicate
which user should be used to authorize the request, and there
are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
which describe the credential to be sent over the wires.

This patch replaces all the generic credentials by 'struct cred'
pointers - the credential structure used throughout Linux.

For machine credentials, there is a special 'struct cred *' pointer
which is statically allocated and recognized where needed as
having a special meaning.  A look-up of a low-level cred will
map this to a machine credential.

Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:46 -05:00
Trond Myklebust 943cff67b8 NFSv4.1: Fix the r/wsize checking
The intention of nfs4_session_set_rwsize() was to cap the r/wsize to the
buffer sizes negotiated by the CREATE_SESSION. The initial code had a
bug whereby we would not check the values negotiated by nfs_probe_fsinfo()
(the assumption being that CREATE_SESSION will always negotiate buffer values
that are sane w.r.t. the server's preferred r/wsizes) but would only check
values set by the user in the 'mount' command.

The code was changed in 4.11 to _always_ set the r/wsize, meaning that we
now never use the server preferred r/wsizes. This is the regression that
this patch fixes.
Also rename the function to nfs4_session_limit_rwsize() in order to avoid
future confusion.
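
The capping behaviour described above can be sketched in userspace C. This is
a minimal illustration, not the kernel code: the struct and function names are
hypothetical stand-ins, and the per-operation overhead that a real client would
subtract from the session buffer sizes is left out.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct session_limits {
	uint32_t max_resp_sz;	/* largest reply the session can carry */
	uint32_t max_rqst_sz;	/* largest request the session can carry */
};

struct mount_sizes {
	uint32_t rsize;		/* read size from the user or from fsinfo */
	uint32_t wsize;		/* write size from the user or from fsinfo */
};

/*
 * Cap, never raise: a size already within the session limit is kept,
 * so the server's preferred r/wsize survives when it fits.
 */
static void limit_rwsize(struct mount_sizes *m, const struct session_limits *s)
{
	if (m->rsize > s->max_resp_sz)
		m->rsize = s->max_resp_sz;
	if (m->wsize > s->max_rqst_sz)
		m->wsize = s->max_rqst_sz;
}
```

Because the function only lowers values, it behaves the same whether the
sizes came from the 'mount' command or from the fsinfo probe.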

Fixes: 033853325f ("NFSv4.1 respect server's max size in CREATE_SESSION")
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30 15:35:17 -04:00
Olga Kornievskaia bc0c9079b4 NFS handle COPY reply CB_OFFLOAD call race
The server may send the CB_OFFLOAD call and the COPY reply at nearly the
same time, so the client can process CB_OFFLOAD before the reply to COPY.
To handle this, keep a list of pending callback stateids received, and
check that list before waiting on completion.

Clean up any pending copies on client shutdown.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-09 12:56:39 -04:00
Calum Mackay 23a88ade71 nfs: Referrals not inheriting proto setting from parent
Commit 530ea42192 ("nfs: Referrals should use the same proto setting
as their parent") encloses the fix with #ifdef CONFIG_SUNRPC_XPRT_RDMA.

CONFIG_SUNRPC_XPRT_RDMA is a tristate option, so it should be tested
with #if IS_ENABLED().
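
A small userspace sketch of why the plain #ifdef misses the modular case. It
assumes the option is built as a module (=m), in which case the generated
configuration defines CONFIG_SUNRPC_XPRT_RDMA_MODULE rather than
CONFIG_SUNRPC_XPRT_RDMA. The RDMA_* macros and helper functions are
hypothetical, and the "either symbol" test is a simplified stand-in for the
kernel's IS_ENABLED() macro, not its real definition.

```c
/* Assumption for this sketch: the option was configured as =m. */
#define CONFIG_SUNRPC_XPRT_RDMA_MODULE 1

/* What "#ifdef CONFIG_SUNRPC_XPRT_RDMA" sees. */
#ifdef CONFIG_SUNRPC_XPRT_RDMA
#define RDMA_SEEN_BY_IFDEF 1
#else
#define RDMA_SEEN_BY_IFDEF 0
#endif

/* Simplified stand-in for IS_ENABLED(): built-in (=y) or module (=m). */
#if defined(CONFIG_SUNRPC_XPRT_RDMA) || defined(CONFIG_SUNRPC_XPRT_RDMA_MODULE)
#define RDMA_SEEN_BY_IS_ENABLED 1
#else
#define RDMA_SEEN_BY_IS_ENABLED 0
#endif

/* Hypothetical helpers so the two results can be inspected at run time. */
static int rdma_seen_by_ifdef(void)
{
	return RDMA_SEEN_BY_IFDEF;
}

static int rdma_seen_by_is_enabled(void)
{
	return RDMA_SEEN_BY_IS_ENABLED;
}
```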

Fixes: 530ea42192 ("nfs: Referrals should use the same proto setting as their parent")
Reported-by: Helen Chao <helen.chao@oracle.com>
Tested-by: Helen Chao <helen.chao@oracle.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Bill Baker <bill.baker@oracle.com>
Signed-off-by: Calum Mackay <calum.mackay@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-07-30 13:19:40 -04:00
Bill.Baker@oracle.com ad86f605c5 nfs: system crashes after NFS4ERR_MOVED recovery
nfs4_update_server unconditionally releases the nfs_client for the
source server. If migration fails, this can cause the source server's
nfs_client struct to be left with a low reference count, resulting in
use-after-free.  Also, adjust reference count handling for ELOOP.

NFS: state manager: migration failed on NFSv4 server nfsvmu10 with error 6
WARNING: CPU: 16 PID: 17960 at fs/nfs/client.c:281 nfs_put_client+0xfa/0x110 [nfs]()
	nfs_put_client+0xfa/0x110 [nfs]
	nfs4_run_state_manager+0x30/0x40 [nfsv4]
	kthread+0xd8/0xf0

BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
	nfs4_xdr_enc_write+0x6b/0x160 [nfsv4]
	rpcauth_wrap_req+0xac/0xf0 [sunrpc]
	call_transmit+0x18c/0x2c0 [sunrpc]
	__rpc_execute+0xa6/0x490 [sunrpc]
	rpc_async_schedule+0x15/0x20 [sunrpc]
	process_one_work+0x160/0x470
	worker_thread+0x112/0x540
	? rescuer_thread+0x3f0/0x3f0
	kthread+0xd8/0xf0

This bug was introduced by 32e62b7c ("NFS: Add nfs4_update_server"),
but the fix applies cleanly to 52442f9b ("NFS4: Avoid migration loops")

Reported-by: Helen Chao <helen.chao@oracle.com>
Fixes: 52442f9b11 ("NFS4: Avoid migration loops")
Signed-off-by: Bill Baker <bill.baker@oracle.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-02-22 12:17:42 -05:00
Chuck Lever 801b564309 nfs: Update server port after referral or migration
After traversing a referral or recovering from a migration event,
ensure that the server port reported in /proc/mounts is updated
to the correct port setting for the new submount.

Reported-by: Helen Chao <helen.chao@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14 23:06:30 -05:00
Chuck Lever 530ea42192 nfs: Referrals should use the same proto setting as their parent
Helen Chao <helen.chao@oracle.com> noticed that when a user
traverses a referral on an NFS/RDMA mount, the resulting submount
always uses TCP.

This behavior does not match the vers= setting when traversing
a referral (vers=4.1 is preserved). It also does not match the
behavior of crossing from the pseudofs into a real filesystem
(proto=rdma is preserved in that case).

The Linux NFS client does not currently support the
fs_locations_info attribute. The situation is similar for all
NFSv4 servers I know of. Therefore until the community has broad
support for fs_locations_info, when following a referral:

 - First try to connect with RPC-over-RDMA. This will fail quickly
   if the client has no RDMA-capable interfaces.

 - If connecting with RPC-over-RDMA fails, or the RPC-over-RDMA
   transport is not available, use TCP.
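
The two steps above can be sketched as a simple fallback loop. This is an
illustration under stated assumptions, not the client's actual code: the enum,
the connect_fn callback type, and the two fake connectors are all hypothetical
names invented for the example.

```c
/* Hypothetical transport identifiers for this sketch. */
enum xprt_proto { PROTO_TCP, PROTO_RDMA };

/* Hypothetical callback: returns 0 on success, non-zero on failure. */
typedef int (*connect_fn)(enum xprt_proto proto);

/* Try RPC-over-RDMA first; fall back to TCP if that fails. */
static int connect_referral(connect_fn try_connect, enum xprt_proto *used)
{
	if (try_connect(PROTO_RDMA) == 0) {
		*used = PROTO_RDMA;
		return 0;
	}
	if (try_connect(PROTO_TCP) == 0) {
		*used = PROTO_TCP;
		return 0;
	}
	return -1;	/* neither transport could connect */
}

/* Fake connector: models a client with no RDMA-capable interface. */
static int fake_connect_tcp_only(enum xprt_proto proto)
{
	return proto == PROTO_TCP ? 0 : -1;
}

/* Fake connector: models a client where every transport works. */
static int fake_connect_any(enum xprt_proto proto)
{
	(void)proto;
	return 0;
}
```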

Reported-by: Helen Chao <helen.chao@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-14 23:06:30 -05:00
Scott Mayhew c156618e15 nfs: fix a deadlock in nfs client initialization
The following deadlock can occur between a process waiting for a client
to initialize while walking the client list during NFSv4 server trunking
detection and another process waiting for the nfs_clid_init_mutex so it
can initialize that client:

Process 1                               Process 2
---------                               ---------
spin_lock(&nn->nfs_client_lock);
list_add_tail(&CLIENTA->cl_share_link,
        &nn->nfs_client_list);
spin_unlock(&nn->nfs_client_lock);
                                        spin_lock(&nn->nfs_client_lock);
                                        list_add_tail(&CLIENTB->cl_share_link,
                                                &nn->nfs_client_list);
                                        spin_unlock(&nn->nfs_client_lock);
                                        mutex_lock(&nfs_clid_init_mutex);
                                        nfs41_walk_client_list(clp, result, cred);
                                        nfs_wait_client_init_complete(CLIENTA);
(waiting for nfs_clid_init_mutex)

Make sure nfs_match_client() only evaluates clients that have completed
initialization in order to prevent that deadlock.

This patch also fixes v4.0 trunking behavior by not marking the client
NFS_CS_READY until the clientid has been confirmed.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-12-15 14:31:49 -05:00
Thomas Meyer 6089dd0d73 NFS: Fix bool initialization/comparison
Bool initializations should use true and false. Bool tests don't need
comparisons.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-11-17 16:43:43 -05:00
Elena Reshetova 212bf41d88 fs, nfs: convert nfs_client.cl_count from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable nfs_client.cl_count is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
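
The properties listed above can be illustrated with a userspace sketch built
on C11 atomics. It is not the kernel's refcount_t implementation: the
sketch_ref type and its functions are hypothetical names, and saturating at
UINT_MAX is an assumption chosen here to show how an overflow can be turned
into a leak instead of a use-after-free.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace reference counter. */
struct sketch_ref {
	atomic_uint refs;
};

/* A new object starts with one reference held by its creator. */
static void sketch_ref_init(struct sketch_ref *r)
{
	atomic_init(&r->refs, 1);
}

/* Take a reference unless the count already dropped to zero. */
static bool sketch_ref_get_not_zero(struct sketch_ref *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false;	/* object is being freed */
		if (old == UINT_MAX)
			return true;	/* saturated: stay pinned */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));
	return true;
}

/* Drop a reference; returns true when the caller must free the object. */
static bool sketch_ref_put_and_test(struct sketch_ref *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0 || old == UINT_MAX)
			return false;	/* refuse underflow, keep saturation */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));
	return old == 1;
}
```

A plain atomic counter would let the final two calls in a sequence such as
put, put, get, put resurrect or double-free the object; here they are refused.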

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-11-17 13:48:01 -05:00