JIRA: https://issues.redhat.com/browse/RHEL-59704
commit 7f296b25f2a6453bf052b03ed0676a18bee312a5
Author: Christoph Hellwig <hch@lst.de>
Date: Fri Jul 5 07:38:38 2024 +0200
nfs: remove nfs_page_length
The nfs_page_length is not used anywhere, remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-59704
Conflicts:
fs/nfs/write.c: Context difference due to RHEL not having
600f111ef51d ("fs: Rename mapping private members")
commit 7e8e78a0ba00c88f0ded86de64bdddc82e06b196
Author: Christoph Hellwig <hch@lst.de>
Date: Mon Jul 1 07:26:48 2024 +0200
nfs: remove dead code for the old swap over NFS implementation
Remove the code testing folio_test_swapcache either explicitly or
implicitly in pagemap.h headers, as is now handled using the direct I/O
path and not the buffered I/O path that these helpers are located in.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-59704
commit bf95f82e6a569f41bae1e37204b219a5e1e8b971
Author: Chen Hanxiao <chenhx.fnst@fujitsu.com>
Date: Tue Apr 2 18:33:55 2024 +0800
NFS: make sure lock/nolock overriding local_lock mount option
Currently, mount option lock/nolock and local_lock option
may override NFS_MOUNT_LOCAL_FLOCK NFS_MOUNT_LOCAL_FCNTL flags
when passing in different order:
mount -o vers=3,local_lock=all,lock:
local_lock=none
mount -o vers=3,lock,local_lock=all:
local_lock=all
This patch will let lock/nolock override local_lock option
as nfs(5) suggested.
Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-59704
commit 1548036ef1204df65ca5a16e8b199c858cb80075
Author: Josef Bacik <josef@toxicpanda.com>
Date: Thu Feb 15 14:57:32 2024 -0500
nfs: make the rpc_stat per net namespace
Now that we're exposing the rpc stats on a per-network namespace basis,
move this struct into struct nfs_net and use that to make sure only the
per-network namespace stats are exposed.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus
Conflicts: For consistency drop btrfs hunks because it isn't supported in
CentOS Stream and other backports also drop such hunks.
The cifs source has been moved in CentOS Stream so manually
apply rejected hunks to fs/smb/client/cifsfs.h and
fs/smb/client/dir.c.
Dropped hunks for ntfs3 because the source is not present in the
CentOS Stream source tree.
CentOS Stream commit 892da692fa ("shmem: support idmapped
mounts for tmpfs") is present, which cuases hunks #2-#4 to be
rejected, manually apply the hunks.
CentOS Stream commit f0f830cd7e ("ceph: create symlinks with
encrypted and base64-encoded targets") is present and resulted
in fuzz against fs/ceph/dir.c hunk #2.
Upstream commit 863f144f12add ("vfs: open inside ->tmpfile()")
is missing causing fuzz against fs/ext2/namei.c.
Upstream commit 7d37539037c2f ("fuse: implement ->tmpfile()")
is missing causing fuzz in hunk #4 against fs/fuse/dir.c.
CentOS Stream commit 892da692fa ("shmem: support idmapped
mounts for tmpfs") is present, so a patch reorder was needed
with appropriate adjustments.
commit 5ebb29bee8d5fc173b774e0755be8cb335503ee3
Author: Christian Brauner <brauner@kernel.org>
Date: Fri Jan 13 12:49:16 2023 +0100
fs: port ->mknod() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Ian Kent <ikent@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus
Conflicts: For consistency drop btrfs hunks because it isn't supported in
CentOS Stream and other backports also drop such hunks.
The cifs source has been moved in CentOS Stream so manually
apply rejected hunks to fs/smb/client/cifsfs.h and
fs/smb/client/inode.c.
Dropped hunks for ntfs3 because the source is not present in the
CentOS Stream source tree.
Upstream commit cc14d24026704 ("hpfs: Convert symlinks to
read_folio") is not present which causes fuzz 1 for hunk #1.
CentOS Stream commit 892da692fa ("shmem: support idmapped
mounts for tmpfs") is present, so a patch reorder was needed
with appropriate adjustments.
commit e18275ae55e07a2937e48134589c2f4c1d99a369
Author: Christian Brauner <brauner@kernel.org>
Date: Fri Jan 13 12:49:17 2023 +0100
fs: port ->rename() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Ian Kent <ikent@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus
Conflicts: For consistency drop btrfs hunks because it isn't supported in
CentOS Stream and other backports also drop such hunks.
The cifs source has been moved in CentOS Stream so manually
apply rejected hunks to fs/smb/client/cifsfs.h and
fs/smb/client/inode.c.
Dropped hunks for ntfs3 because the source is not present in the
CentOS Stream source tree.
commit c54bd91e9eaba43f09aadc25b52ea869ff3b5587
Author: Christian Brauner <brauner@kernel.org>
Date: Fri Jan 13 12:49:15 2023 +0100
fs: port ->mkdir() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Ian Kent <ikent@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus
Conflicts: The cifs source has been moved in CentOS Stream so manually
apply rejected hunks to fs/smb/client/cifsfs.h and
fs/smb/client/link.c.
Dropped hunks for ntfs3 because the source is not present in the
CentOS Stream source tree.
CentOS Stream commit f0f830cd7e ("ceph: create symlinks with
encrypted and base64-encoded targets") is present and resulted
in fuzz against fs/ceph/dir.c.
commit 7a77db95511c39be4b2db2ceca152ef589adc2dc
Author: Christian Brauner <brauner@kernel.org>
Date: Fri Jan 13 12:49:14 2023 +0100
fs: port ->symlink() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Ian Kent <ikent@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus
Conflicts: For consistency drop btrfs hunks because it isn't supported in
CentOS Stream and other backports also drop such hunks.
The cifs source has been moved in CentOS Stream so manually
apply rejected hunks to fs/smb/client/cifsfs.h and
fs/smb/client/dir.c.
Dropped hunks for ntfs3 because the source is not present in the
CentOS Stream source tree.
CentOS Stream commit 892da692fa ("shmem: support idmapped
mounts for tmpfs") is present, which cuases fuzz in mm/shmem.c.
commit 6c960e68aaed335a0040f16654f3c5e5bfcf9249
Author: Christian Brauner <brauner@kernel.org>
Date: Fri Jan 13 12:49:13 2023 +0100
fs: port ->create() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Ian Kent <ikent@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-53004
commit 3c0a2e0b0ae661457c8505fecc7be5501aa7a715
Author: Sergey Shtylyov <s.shtylyov@omp.ru>
Date: Fri May 10 23:24:04 2024 +0300
nfs: fix undefined behavior in nfs_block_bits()
Shifting *signed int* typed constant 1 left by 31 bits causes undefined
behavior. Specify the correct *unsigned long* type by using 1UL instead.
Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.
Cc: stable@vger.kernel.org
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-34875
commit 1fd5394e6ab8b11465a5d0867f188fad1835a762
Author: Benjamin Coddington <bcodding@redhat.com>
Date: Fri Nov 17 06:25:14 2023 -0500
NFS: drop unused nfs_direct_req bytes_left
Now that we're calculating how large a remaining IO should be based
on the current request's offset, we no longer need to track bytes_left on
each struct nfs_direct_req. Drop the field, and clean up the direct
request tracepoints.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-34875
commit 8a6291bf3b0eae1bf26621e6419a91682f2d6227
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Fri Nov 17 06:25:13 2023 -0500
pNFS: Fix the pnfs block driver's calculation of layoutget size
Instead of relying on the value of the 'bytes_left' field, we should
calculate the layout size based on the offset of the request that is
being written out.
Reported-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fixes: 954998b60caa ("NFS: Fix error handling for O_DIRECT write scheduling")
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-7936
commit 537935f72eb28a3dd0097386f06402e25e66359a
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Sat Aug 19 17:32:25 2023 -0400
NFS/pNFS: Set the connect timeout for the pNFS flexfiles driver
Ensure that the connect timeout for the pNFS flexfiles driver is of the
same order as the I/O timeout, so that we can fail over quickly when
trying to read from a data server that is down.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-7936
commit 303a78052091c81e9003915c521fdca1c7e117af
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Fri Jun 9 15:26:25 2023 -0400
NFSv4.2: Rework scratch handling for READ_PLUS (again)
I found that the read code might send multiple requests using the same
nfs_pgio_header, but nfs4_proc_read_setup() is only called once. This is
how we ended up occasionally double-freeing the scratch buffer, but also
means we set a NULL pointer but non-zero length to the xdr scratch
buffer. This results in an oops the first time decoding needs to copy
something to scratch, which frequently happens when decoding READ_PLUS
hole segments.
I fix this by moving scratch handling into the pageio read code. I
provide a function to allocate scratch space for decoding read replies,
and free the scratch buffer when the nfs_pgio_header is freed.
Fixes: fbd2a05f29a9 (NFSv4.2: Rework scratch handling for READ_PLUS)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-7936
commit c8407f2e560c53c4c73e77cb5604c8a408dbe7f7
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Wed Jun 7 10:00:09 2023 -0400
NFS: Add an "xprtsec=" NFS mount option
After some discussion, we decided that controlling transport layer
security policy should be separate from the setting for the user
authentication flavor. To accomplish this, add a new NFS mount
option to select a transport layer security policy for RPC
operations associated with the mount point.
xprtsec=none - Transport layer security is forced off.
xprtsec=tls - Establish an encryption-only TLS session. If
the initial handshake fails, the mount fails.
If TLS is not available on a reconnect, drop
the connection and try again.
xprtsec=mtls - Both sides authenticate and an encrypted
session is created. If the initial handshake
fails, the mount fails. If TLS is not available
on a reconnect, drop the connection and try
again.
To support client peer authentication (mtls), the handshake daemon
will have configurable default authentication material (certificate
or pre-shared key). In the future, mount options can be added that
can provide this material on a per-mount basis.
Updates to mount.nfs (to support xprtsec=auto) and nfs(5) will be
sent under separate cover.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-7936
commit 6c0a8c5fcf7158e889dbdd077f67c81984704710
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Wed Jun 7 09:59:42 2023 -0400
NFS: Have struct nfs_client carry a TLS policy field
The new field is used to match struct nfs_clients that have the same
TLS policy setting.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2063818
commit e59fb6749ed833deee5b3cfd7e89925296d41f49
Author: Jeff Layton <jlayton@kernel.org>
Date: Fri Mar 3 07:16:02 2023 -0500
nfs: move nfs_fhandle_hash to common include file
lockd needs to be able to hash filehandles for tracepoints. Move the
nfs_fhandle_hash() helper to a common nfs include file.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2129854
commit 000dbe0bec058cbf2ca9e156e4a5584f5158b0f9
Author: Dave Wysochanski <dwysocha@redhat.com>
Date: Mon Feb 20 08:43:06 2023 -0500
NFS: Convert buffered read paths to use netfs when fscache is enabled
Convert the NFS buffered read code paths to corresponding netfs APIs,
but only when fscache is configured and enabled.
The netfs API defines struct netfs_request_ops which must be filled
in by the network filesystem. For NFS, we only need to define 5 of
the functions, the main one being the issue_read() function.
The issue_read() function is called by the netfs layer when a read
cannot be fulfilled locally, and must be sent to the server (either
the cache is not active, or it is active but the data is not available).
Once the read from the server is complete, netfs requires a call to
netfs_subreq_terminated() which conveys either how many bytes were read
successfully, or an error. Note that issue_read() is called with a
structure, netfs_io_subrequest, which defines the IO requested, and
contains a start and a length (both in bytes), and assumes the underlying
netfs will return a either an error on the whole region, or the number
of bytes successfully read.
The NFS IO path is page based and the main APIs are the pgio APIs defined
in pagelist.c. For the pgio APIs, there is no way for the caller to
know how many RPCs will be sent and how the pages will be broken up
into underlying RPCs, each of which will have their own completion and
return code. In contrast, netfs is subrequest based, a single
subrequest may contain multiple pages, and a single subrequest is
initiated with issue_read() and terminated with netfs_subreq_terminated().
Thus, to utilze the netfs APIs, NFS needs some way to accommodate
the netfs API requirement on the single response to the whole
subrequest, while also minimizing disruptive changes to the NFS
pgio layer.
The approach taken with this patch is to allocate a small structure
for each nfs_netfs_issue_read() call, store the final error and number
of bytes successfully transferred in the structure, and update these values
as each RPC completes. The refcount on the structure is used as a marker
for the last RPC completion, is incremented in nfs_netfs_read_initiate(),
and decremented inside nfs_netfs_read_completion(), when a nfs_pgio_header
contains a valid pointer to the data. On the final put (which signals
the final outstanding RPC is complete) in nfs_netfs_read_completion(),
call netfs_subreq_terminated() with either the final error value (if
one or more READs complete with an error) or the number of bytes
successfully transferred (if all RPCs complete successfully). Note
that when all RPCs complete successfully, the number of bytes transferred
is capped to the length of the subrequest. Capping the transferred length
to the subrequest length prevents "Subreq overread" warnings from netfs.
This is due to the "aligned_len" in nfs_pageio_add_page(), and the
corner case where NFS requests a full page at the end of the file,
even when i_size reflects only a partial page (NFS overread).
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Tested-by: Daire Byrne <daire@dneg.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Conflicts:
There were two upstream commits that changed the contents of fscache.c,
but then the changes were removed by this commit, so both of these
commits were omitted from the backport:
de4eda9de2d9 use less confusing names for iov_iter direction initializers
8bb7cd842c44 nfs: use bvec_set_page to initialize bvecs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183621
commit 0c493b5cf16e28d761b6e77c7c32aa0e7af70813
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Thu Jan 19 16:33:43 2023 -0500
NFS: Convert buffered writes to use folios
Mostly mechanical conversion of struct page and functions into struct
folio equivalents.
The lack of support for folios in write_cache_pages(), means we still
only support order 0 folio allocations. However the rest of the
writeback code should now be ready for order n > 0.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183621
commit ab75bff1140733f1b43e81f055acd7d27af7ac05
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Thu Jan 19 16:33:41 2023 -0500
NFS: Convert buffered reads to use folios
Perform a largely mechanical conversion of references to struct page and
page-specific functions to use the folio equivalents.
Note that the fscache functionality remains untouched. Instead we just
pass in the folio page. This should be OK, as long as we use order 0
folios together with fscache.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183621
commit eb9f2a5a5e85fd24949480d1d02c2a497f26e154
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Thu Jan 19 16:33:36 2023 -0500
NFS: Support folios in nfs_generic_pgio()
Add support for multi-page folios in the generic NFS i/o engine.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183621
commit cf0d7e7f4520814f45e1313872ad5777ed504004
Author: Kees Cook <keescook@chromium.org>
Date: Sun Oct 16 21:36:50 2022 -0700
NFS: Avoid memcpy() run-time warning for struct sockaddr overflows
The 'nfs_server' and 'mount_server' structures include a union of
'struct sockaddr' (with the older 16 bytes max address size) and
'struct sockaddr_storage' which is large enough to hold all the
supported sa_family types (128 bytes max size). The runtime memcpy()
buffer overflow checker is seeing attempts to write beyond the 16
bytes as an overflow, but the actual expected size is that of 'struct
sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more
completely through-out NFS, which results in adjusting the memcpy()
buffers to the correct union members. Avoids this false positive run-time
warning under CONFIG_FORTIFY_SOURCE:
memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16)
Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2183621
commit a035618caf8718a1d4e840ec39dfc5fce0dcdee1
Author: Gaosheng Cui <cuigaosheng1@huawei.com>
Date: Fri Sep 9 14:24:11 2022 +0800
nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration
nfs_write_prepare() has been removed since
commit a4cdda5911 ("NFS: Create a common pgio_rpc_prepare
function"), so remove it.
nfs_wait_atomic_killable() has been removed since
commit 723c921e7d ("sched/wait, fs/nfs: Convert wait_on_atomic_t()
usage to the new wait_var_event() API"), so remove it.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2160210
commit 4ae84a80475144f739f77ed8bc789bc7feaa08ce
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date: Mon Jun 6 09:22:19 2022 -0400
nfs: Convert to migrate_folio
Use a folio throughout this function. migrate_page() will be converted
later.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2107347
commit a60214c2465493aac0b014d87ee19327b6204c42
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Wed Nov 30 15:30:47 2022 -0500
NFS: Allow very small rsize & wsize again
940261a19508 introduced nfs_io_size() to clamp the iosize to a multiple
of PAGE_SIZE. This had the unintended side effect of no longer allowing
iosizes less than a page, which could be useful in some situations.
UDP already has an exception that causes it to fall back on the
power-of-two style sizes instead. This patch adds an additional
exception for very small iosizes.
Reported-by: Jeff Layton <jlayton@kernel.org>
Fixes: 940261a19508 ("NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZE")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2107347
commit 940261a195080cf1cdcd56948d363fe363b69da1
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Fri Jun 17 16:23:36 2022 -0400
NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZE
Previously, we required this to value to be a power of 2 for UDP related
reasons. This patch keeps the power of 2 rule for UDP but allows more
flexibility for TCP and RDMA.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072
commit d7a5118635e725d195843bda80cc5c964d93ef31
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Wed Sep 7 16:34:21 2022 -0400
NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE
The fallocate call invalidates suid and sgid bits as part of normal
operation. We need to mark the mode bits as invalid when using fallocate
with an suid so these will be updated the next time the user looks at them.
This fixes xfstests generic/683 and generic/684.
Reported-by: Yue Cui <cuiyue-fnst@fujitsu.com>
Fixes: 913eca1aea ("NFS: Fallocate should use the nfs4_fattr_bitmap")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072
commit 452284407c18d8a522c3039339b1860afa0025a8
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Sat May 14 10:08:10 2022 -0400
NFS: Memory allocation failures are not server fatal errors
We need to filter out ENOMEM in nfs_error_is_fatal_on_server(), because
running out of memory on our client is not a server error.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 2dc23afffb ("NFS: ENOMEM should also be a fatal error.")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072
commit b243874f6f9568b2daf1a00e9222cacdc15e159c
Author: ChenXiaoSong <chenxiaosong2@huawei.com>
Date: Tue Mar 29 19:32:08 2022 +0800
NFSv4: fix open failure with O_ACCMODE flag
open() with O_ACCMODE|O_DIRECT flags secondly will fail.
Reproducer:
1. mount -t nfs -o vers=4.2 $server_ip:/ /mnt/
2. fd = open("/mnt/file", O_ACCMODE|O_DIRECT|O_CREAT)
3. close(fd)
4. fd = open("/mnt/file", O_ACCMODE|O_DIRECT)
Server nfsd4_decode_share_access() will fail with error nfserr_bad_xdr when
client use incorrect share access mode of 0.
Fix this by using NFS4_SHARE_ACCESS_BOTH share access mode in client,
just like firstly opening.
Fixes: ce4ef7c0a8 ("NFS: Split out NFS v4 file operations")
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072
commit 515dcdcd48736576c6f5c197814da6f81c60a21e
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Mon Mar 21 12:34:19 2022 -0400
NFS: nfsiod should not block forever in mempool_alloc()
The concern is that since nfsiod is sometimes required to kick off a
commit, it can get locked up waiting forever in mempool_alloc() instead
of failing gracefully and leaving the commit until later.
Try to allocate from the slab first, with GFP_KERNEL | __GFP_NORETRY,
then fall back to a non-blocking attempt to allocate from the memory
pool.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072
commit 230bc98f7a2a49eb472d184bdec91fd3096384b3
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Thu Feb 17 11:08:24 2022 -0500
NFS: Improve heuristic for readdirplus
The heuristic for readdirplus is designed to try to detect 'ls -l' and
similar patterns. It does so by looking for cache hit/miss patterns in
both the attribute cache and in the dcache of the files in a given
directory, and then sets a flag for the readdirplus code to interpret.
The problem with this approach is that a single attribute or dcache miss
can cause the NFS code to force a refresh of the attributes for the
entire set of files contained in the directory.
To be able to make a more nuanced decision, let's sample the number of
hits and misses in the set of open directory descriptors. That allows us
to set thresholds at which we start preferring READDIRPLUS over regular
READDIR, or at which we start to force a re-read of the remaining
readdir cache using READDIRPLUS.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072
commit 84631f84ac95b6ff6f08a41ffba1f93eaab4e9c7
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Wed Feb 23 15:43:26 2022 -0500
NFS: Clean up NFSv4.2 xattrs
Add a helper for the xattr mask so that we can get rid of the inlined
ifdefs.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2094072
commit 00bdadc7accfce944dc30fbc205cd28a7eed657b
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Fri Dec 17 15:36:57 2021 -0500
NFS: Add a helper to remove case-insensitive aliases
When dealing with case insensitive names, the client has no idea how the
server performs the mapping, so cannot collapse the dentries into a
single representative. So both rename and unlink need to deal with the
fact that there could be several dentries representing the file, and
have to somehow force them to be revalidated. Use d_prune_aliases() as a
big hammer approach.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200
commit d755ad8dc752d44545613ea04d660aed674e540d
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Fri Oct 22 13:11:00 2021 -0400
NFS: Create a new nfs_alloc_fattr_with_label() function
For creating fattrs with the label field already allocated for us. I
also update nfs_free_fattr() to free the label in the end.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200
commit 5fe1210d259542f966bab130830ece08e97f68f5
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Thu Oct 14 13:55:08 2021 -0400
NFS: Unexport nfs_probe_fsinfo()
All the callers are now in client.c so we can remove the
EXPORT_SYMBOL_GPL() and make it static.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200
commit e5731131fb6fefaa69064ca511b7c4971d6cf54f
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Thu Oct 14 13:55:05 2021 -0400
NFS: Move nfs_probe_destination() into the generic client
And rename it to nfs_probe_server(). I also change it to take the nfs_fh
as an argument so callers can choose what filehandle to probe.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200
commit 01dde76e471229e3437a2686c572f4980b2c483e
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Thu Oct 14 13:55:04 2021 -0400
NFS: Create an nfs4_server_set_init_caps() function
And call it before doing an FSINFO probe to reset to the baseline
capabilities before probing.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049200
commit 7e134205f62955369619021a695cd78fefd32451
Author: Olga Kornievskaia <kolga@netapp.com>
Date: Fri Aug 27 14:37:17 2021 -0400
NFSv4 introduce max_connect mount options
This option will control up to how many xprts can the client
establish to the server with a distinct address (that means
nconnect connections are not counted towards this new limit).
This patch is setting up nfs structures to keeep track of the
max_connect limit (does not enforce it).
The default value is kept at 1 so that no current mounts that
don't want any additional connections would be effected. The
maximum value is set at 16.
Mounts to DS are not limited to default value of 1 but instead
set to the maximum default value of 16 (NFS_MAX_TRANSPORTS).
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Highlights include:
Stable fixes:
- Add validation of the UDP retrans parameter to prevent shift out-of-bounds
- Don't discard pNFS layout segments that are marked for return
Bugfixes:
- Fix a NULL dereference crash in xprt_complete_bc_request() when the
NFSv4.1 server misbehaves.
- Fix the handling of NFS READDIR cookie verifiers
- Sundry fixes to ensure attribute revalidation works correctly when the
server does not return post-op attributes.
- nfs4_bitmask_adjust() must not change the server global bitmasks
- Fix major timeout handling in the RPC code.
- NFSv4.2 fallocate() fixes.
- Fix the NFSv4.2 SEEK_HOLE/SEEK_DATA end-of-file handling
- Copy offload attribute revalidation fixes
- Fix an incorrect filehandle size check in the pNFS flexfiles driver
- Fix several RDMA transport setup/teardown races
- Fix several RDMA queue wrapping issues
- Fix a misplaced memory read barrier in sunrpc's call_decode()
Features:
- Micro optimisation of the TCP transmission queue using TCP_CORK
- statx() performance improvements by further splitting up the tracking
of invalid cached file metadata.
- Support the NFSv4.2 "change_attr_type" attribute and use it to
optimise handling of change attribute updates.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmCVLooACgkQZwvnipYK
APJB5BAAtIJyhx40ooMBzcucDmXd1qovlKsb8ZlvnSI6c7wvHhFPNk9z4zwThnjL
FpVYzJzK6XzAQY/PtgbrPwnSUmW925ngPWYR/hiYe+OGPBnYV+tXP8izCyEkNgMg
45goDOxojGWl7AGTuAJiKcDSdH9PyIrbvt28iwcNSGjslasGSbAoL/836l4OIGr1
Ymxs/NDML11dPco8GIKLGtHd8leFGleDx089VeNsgud8MdaFErp16O5Iz8DdzRKd
W1l2zDMb05j8eDZIfy3w3FyrLkDXA+KgLSADiC8TcpxoadPaQJMeCvoIq8oqVndn
bZBoxduXdLgf54Aec0WnNKFAOyc7pGvZoSNmFouT7EGV73g+g1LQ+ZbEE1bb8fCQ
XHqCVaBt2+47NiTUgdxjXlZRfcn8fYKx0tVxfG3mQVMXUAWfsjmMyQMNgijDRJI2
8Wz3lZMRGMILbR9j4QpP1biVy/2zGNWG/TB5ZZyZMSY4uT+aOpzlqdknb4UsRaSp
f7MfmB7xEWpS4DJr9RIBrJ/hIdnMu1mNInxDPFo5Kl5HNp4TaPm2dPir2ZD2wMZI
daURTX7giUhpE15ZebQDBqWD+mTR0bVDqLLeo131JRmMfMEHugNrr49xe+NkBu/R
QWnFzgkGdQsOeiKRRwEUuhsi74JspqfwzdZzHqcRM5WuXVvBLcA=
=h01b
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- Add validation of the UDP retrans parameter to prevent shift
out-of-bounds
- Don't discard pNFS layout segments that are marked for return
Bugfixes:
- Fix a NULL dereference crash in xprt_complete_bc_request() when the
NFSv4.1 server misbehaves.
- Fix the handling of NFS READDIR cookie verifiers
- Sundry fixes to ensure attribute revalidation works correctly when
the server does not return post-op attributes.
- nfs4_bitmask_adjust() must not change the server global bitmasks
- Fix major timeout handling in the RPC code.
- NFSv4.2 fallocate() fixes.
- Fix the NFSv4.2 SEEK_HOLE/SEEK_DATA end-of-file handling
- Copy offload attribute revalidation fixes
- Fix an incorrect filehandle size check in the pNFS flexfiles driver
- Fix several RDMA transport setup/teardown races
- Fix several RDMA queue wrapping issues
- Fix a misplaced memory read barrier in sunrpc's call_decode()
Features:
- Micro optimisation of the TCP transmission queue using TCP_CORK
- statx() performance improvements by further splitting up the
tracking of invalid cached file metadata.
- Support the NFSv4.2 'change_attr_type' attribute and use it to
optimise handling of change attribute updates"
* tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (85 commits)
xprtrdma: Fix a NULL dereference in frwr_unmap_sync()
sunrpc: Fix misplaced barrier in call_decode
NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code.
xprtrdma: Move fr_mr field to struct rpcrdma_mr
xprtrdma: Move the Work Request union to struct rpcrdma_mr
xprtrdma: Move fr_linv_done field to struct rpcrdma_mr
xprtrdma: Move cqe to struct rpcrdma_mr
xprtrdma: Move fr_cid to struct rpcrdma_mr
xprtrdma: Remove the RPC/RDMA QP event handler
xprtrdma: Don't display r_xprt memory addresses in tracepoints
xprtrdma: Add an rpcrdma_mr_completion_class
xprtrdma: Add tracepoints showing FastReg WRs and remote invalidation
xprtrdma: Avoid Send Queue wrapping
xprtrdma: Do not wake RPC consumer on a failed LocalInv
xprtrdma: Do not recycle MR after FastReg/LocalInv flushes
xprtrdma: Clarify use of barrier in frwr_wc_localinv_done()
xprtrdma: Rename frwr_release_mr()
xprtrdma: rpcrdma_mr_pop() already does list_del_init()
xprtrdma: Delete rpcrdma_recv_buffer_put()
xprtrdma: Fix cwnd update ordering
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmCHM2sUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNfCg/9GmoCyCh+ZRj5RGQ6M+yJas1+yyJQ
uEfTNde54yfATUTaaWYnZG59yqzM3I2uaV11U7tqg8ajiFPxJKqbs5R9jl3lnSjH
0Dg22nXPSCOTKcU0x/DeLoKRr+M9jO1K/nQ8NEZvYX4nC/OgtCvJqb/oEQZIKAk5
2a7OEmNNQyFGd274p9dELaDHxN9UIaJ2PzQFXtq7ROHgBXQO4ONb2ajOf6mDSFQb
vP/CDHwaH+pcE28w44oRy0/YBkO1SrdqoFQchg5yFagM5tQRLGkXK4OFSs5KHi5Q
YMtmaOzMPIv1e5eaC1HuuMJYA4pPb30T9hFHP7tmBVZfmZaFaDeUs+BhMm98WTiS
o0iTP7tfs36/poOR1Q0/sB06uvF9hUAAX1ZuE95YySifbXU9hsUc9b0uQSwCdg9P
/J9rcdHLTpWqjw9n02mezWmAvo5U8ZvbDs+0xPIwI+3RTUP5t6mp+Hd5Tc7bPTq1
0rpWXx+FQoSytFap5qiUSiwBp+HF6HQnNIXB0Muf6wctChoTjvo7TwoxH//z4kEm
+SddhOCNkB7VC/X7hOxhl0F/rdHuXvb1AFIWjpTLJH2CR1PvMtF+sGey+uPT6hKZ
/gvhmQGjFdph99eGlfVbCNvx1pM61O25IscaYD1T2wGImw+z7dX4WkG3WoOdDSkR
bRjrBkcHh0gLhWk=
=HTEy
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add support for measuring the SELinux state and policy capabilities
using IMA.
- A handful of SELinux/NFS patches to compare the SELinux state of one
mount with a set of mount options. Olga goes into more detail in the
patch descriptions, but this is important as it allows more
flexibility when using NFS and SELinux context mounts.
- Properly differentiate between the subjective and objective LSM
credentials; including support for the SELinux and Smack. My clumsy
attempt at a proper fix for AppArmor didn't quite pass muster so John
is working on a proper AppArmor patch, in the meantime this set of
patches shouldn't change the behavior of AppArmor in any way. This
change explains the bulk of the diffstat beyond security/.
- Fix a problem where we were not properly terminating the permission
list for two SELinux object classes.
* tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: add proper NULL termination to the secclass_map permissions
smack: differentiate between subjective and objective task credentials
selinux: clarify task subjective and objective credentials
lsm: separate security_task_getsecid() into subjective and objective variants
nfs: account for selinux security context when deciding to share superblock
nfs: remove unneeded null check in nfs_fill_super()
lsm,selinux: add new hook to compare new mount to an existing mount
selinux: fix misspellings using codespell tool
selinux: fix misspellings using codespell tool
selinux: measure state and policy capabilities
selinux: Allow context mounts for unpriviliged overlayfs
Mounting NFSv3 uses default timeout parameters specified by underlying
sunrpc transport, and mount options like 'timeo' and 'retrans', unlike
NFSv4, are not honored.
But sometimes we want to set non-default timeout value when mounting
NFSv3, so pass 'timeo' and 'retrans' to nfs_mount() and fill the
'timeout' field of struct rpc_create_args before creating RPC
connection. This is also consistent with NFSv4 behavior.
Note that this only sets the timeout value of rpc connection to mountd,
but the timeout of rpcbind connection should be set as well. A later
patch will fix the rpcbind part.
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Keep track of whether or not there were LSM security context
options passed during mount (ie creation of the superblock).
Then, while deciding if the superblock can be shared for the new
mount, check if the newly passed in LSM security context options
are compatible with the existing superblock's ones by calling
security_sb_mnt_opts_compat().
Previously, with selinux enabled, NFS wasn't able to do the
following 2mounts:
mount -o vers=4.2,sec=sys,context=system_u:object_r:root_t:s0
<serverip>:/ /mnt
mount -o vers=4.2,sec=sys,context=system_u:object_r:swapfile_t:s0
<serverip>:/scratch /scratch
2nd mount would fail with "mount.nfs: an incorrect mount option was
specified" and var log messages would have:
"SElinux: mount invalid. Same superblock, different security
settings for.."
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
[PM: tweak subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Extend some inode methods with an additional user namespace argument. A
filesystem that is aware of idmapped mounts will receive the user
namespace the mount has been marked with. This can be used for
additional permission checking and also to enable filesystems to
translate between uids and gids if they need to. We have implemented all
relevant helpers in earlier patches.
As requested we simply extend the exisiting inode method instead of
introducing new ones. This is a little more code churn but it's mostly
mechanical and doesnt't leave us with additional inode methods.
Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Before referencing the inode, we must ensure that the superblock can be
referenced. Otherwise, we can end up with iput() calling superblock
operations that are no longer valid or accessible.
Fixes: ea7c38fef0 ("NFSv4: Ensure we reference the inode for return-on-close in delegreturn")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Several existing dprink()/dfprintk() calls were converted to use the new
mount API logging macros by commit ce8866f091 ("NFS: Attach
supplementary error information to fs_context"). If the fs_context was
not created using fsopen() then it will not have had a log buffer
allocated for it, and the new mount API logging macros will wind up
calling printk().
This can result in syslog messages being logged where previously there
were none... most notably "NFS4: Couldn't follow remote path", which can
happen if the client is auto-negotiating a protocol version with an NFS
server that doesn't support the higher v4.x versions.
Convert the nfs_errorf(), nfs_invalf(), and nfs_warnf() macros to check
for the existence of the fs_context's log buffer and call dprintk() if
it doesn't exist. Add nfs_ferrorf(), nfs_finvalf(), and nfs_warnf(),
which do the same thing but take an NFS debug flag as an argument and
call dfprintk(). Finally, modify the "NFS4: Couldn't follow remote
path" message to use nfs_ferrorf().
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207385
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: ce8866f091 ("NFS: Attach supplementary error information to fs_context.")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Support readdir buffers of up to 1MB in size so that we can read
large directories using few RPC calls.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
After an NFS page has been written it is considered "unstable" until a
COMMIT request succeeds. If the COMMIT fails, the page will be
re-written.
These "unstable" pages are currently accounted as "reclaimable", either
in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a
'reclaimable' count. This might have made sense when sending the COMMIT
required a separate action by the VFS/MM (e.g. releasepage() used to
send a COMMIT). However now that all writes generated by ->writepages()
will automatically be followed by a COMMIT (since commit 919e3bd9a8
("NFS: Ensure we commit after writeback is complete")) it makes more
sense to treat them as writeback pages.
So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in
NR_WRITEBACK and WB_WRITEBACK.
A particular effect of this change is that when
wb_check_background_flush() calls wb_over_bg_threshold(), the latter
will report 'true' a lot less often as the 'unstable' pages are no
longer considered 'dirty' (as there is nothing that writeback can do
about them anyway).
Currently wb_check_background_flush() will trigger writeback to NFS even
when there are relatively few dirty pages (if there are lots of unstable
pages), this can result in small writes going to the server (10s of
Kilobytes rather than a Megabyte) which hurts throughput. With this
patch, there are fewer writes which are each larger on average.
Where the NR_UNSTABLE_NFS count was included in statistics
virtual-files, the entry is retained, but the value is hard-coded as
zero. static trace points and warning printks which mentioned this
counter no longer report it.
[akpm@linux-foundation.org: re-layout comment]
[akpm@linux-foundation.org: fix printk warning]
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Acked-by: Michal Hocko <mhocko@suse.com> [mm]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Link: http://lkml.kernel.org/r/87d06j7gqa.fsf@notabene.neil.brown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to trust that desc->pg_mirror_idx is set correctly, whether
or not mirroring is enabled.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>