389 Commits

Author SHA1 Message Date
Ian Kent db8603ce12 fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: Dropped hunks for ksmbd because the source is not present in the
	CentOS Stream source tree.
	CentOS Stream has commit bb901646d2 ("ovl: let helper
	ovl_i_path_real() return the realinode") which wasn't present
	upstream when this patch was applied, correct manually.
	CentOS Stream does not have upstream commit c7423dbdbc9ec
	("ima: Handle -ESTALE returned by ima_filter_rule_match()")
	which results in a reject of hunk #3 against
	security/integrity/ima/ima_policy.c, so manually apply hunk.
	Upstream merge commit 05e6295f7b5e0 ("Merge tag 'fs.idmapped.v6.3'
	of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping")
	together with Upstream commit facd61053cff1 ("fuse: fixes after
	adapting to new posix acl api") results in a conflict in
	fs/fuse/acl.c, adjust to suit.
	Update the call to i_uid_into_vfsuid() from 2740f64cb7f00
	("filelocks: use mount idmapping for setlease permission check")
	to pass an idmap instead of a user namespace.
	It looks like Linus made a change in the merge commit 8834147f95056
	("Merge tag 'fscache-rewrite-20220111'") to account for idmap
	changes (probably the ones in this commit), so add the change here.

commit e67fe63341b8117d7e0d9acf0f1222d5138b9266
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:30 2023 +0100

    fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap

    Convert to struct mnt_idmap.
    Remove legacy file_mnt_user_ns() and mnt_user_ns().

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 11:02:01 +08:00
Ian Kent 45d3f27dc5 filelocks: use mount idmapping for setlease permission check
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: CentOS Stream commit 74af350d0f ("filelocks: use mount
	idmapping for setlease permission check") introduced a backport
	of this change early in CentOS Stream 9; update it to the v6.3
	commit now that we have the idmapping dependencies.

commit 42d0c4bdf753063b6eec55415003184d3ca24f6e
Author: Seth Forshee <sforshee@kernel.org>
Date:   Thu Mar 9 14:39:09 2023 -0600

    filelocks: use mount idmapping for setlease permission check

    A user should be allowed to take out a lease via an idmapped mount if
    the fsuid matches the mapped uid of the inode. generic_setlease() is
    checking the unmapped inode uid, causing these operations to be denied.

    Fix this by comparing against the mapped inode uid instead of the
    unmapped uid.

    Fixes: 9caccd4154 ("fs: introduce MOUNT_ATTR_IDMAP")
    Cc: stable@vger.kernel.org
    Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 08:29:41 +08:00
Lucas Zampieri a770863ad5 Merge: filelock: Fix fcntl/close race recovery compat path
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4912

JIRA: https://issues.redhat.com/browse/RHEL-50898

CVE: CVE-2024-41020

Conflicts: Conflict is caused by absence of upstream commit 4ca52f539865 (filelock: have fs/locks.c deal with file_lock_core directly)

Signed-off-by: Pavel Reichl <preichl@redhat.com>

Approved-by: Andrey Albershteyn <aalbersh@redhat.com>
Approved-by: Brian Foster <bfoster@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-08-07 17:05:00 +00:00
Lucas Zampieri afdd51456c Merge: CVE-2024-41049: filelock: fix potential use-after-free in posix_lock_inode
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4851

JIRA: https://issues.redhat.com/browse/RHEL-51103  
CVE: CVE-2024-41049

```
filelock: fix potential use-after-free in posix_lock_inode

Light Hsieh reported a KASAN UAF warning in trace_posix_lock_inode().
The request pointer had been changed earlier to point to a lock entry
that was added to the inode's list. However, before the tracepoint could
fire, another task raced in and freed that lock.

Fix this by moving the tracepoint inside the spinlock, which should
ensure that this doesn't happen.

Fixes: 74f6f5912693 ("locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock")
Link: https://lore.kernel.org/linux-fsdevel/724ffb0a2962e912ea62bb0515deadf39c325112.camel@kernel.org/
Reported-by: Light Hsieh (謝明燈) <Light.Hsieh@mediatek.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20240702-filelock-6-10-v1-1-96e766aadc98@kernel.org
Reviewed-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
(cherry picked from commit 1b3ec4f7c03d4b07bad70697d7e2f4088d2cfe92)
```

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>

Approved-by: Pavel Reichl <preichl@redhat.com>
Approved-by: Andrey Albershteyn <aalbersh@redhat.com>
Approved-by: Brian Foster <bfoster@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-08-07 17:02:38 +00:00
Pavel Reichl 37f2f2a964 filelock: Fix fcntl/close race recovery compat path
JIRA: https://issues.redhat.com/browse/RHEL-50898
CVE: CVE-2024-41020
Conflicts: Conflict is caused by absence of upstream commit 4ca52f539865
	(filelock: have fs/locks.c deal with file_lock_core directly)

When I wrote commit 3cad1bc01041 ("filelock: Remove locks reliably when
fcntl/close race is detected"), I missed that there are two copies of the
code I was patching: The normal version, and the version for 64-bit offsets
on 32-bit kernels.
Thanks to Greg KH for stumbling over this while doing the stable
backport...

Apply exactly the same fix to the compat path for 32-bit kernels.

Fixes: c293621bbf ("[PATCH] stale POSIX lock handling")
Cc: stable@kernel.org
Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2563
Signed-off-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20240723-fs-lock-recover-compatfix-v1-1-148096719529@google.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
(cherry picked from commit f8138f2ad2f745b9a1c696a05b749eabe44337ea)
Signed-off-by: Pavel Reichl <preichl@redhat.com>
2024-07-31 17:19:53 +02:00
CKI Backport Bot 7fc54bd48c filelock: fix potential use-after-free in posix_lock_inode
JIRA: https://issues.redhat.com/browse/RHEL-51103
CVE: CVE-2024-41049

commit 1b3ec4f7c03d4b07bad70697d7e2f4088d2cfe92
Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Jul 2 18:44:48 2024 -0400

    filelock: fix potential use-after-free in posix_lock_inode

    Light Hsieh reported a KASAN UAF warning in trace_posix_lock_inode().
    The request pointer had been changed earlier to point to a lock entry
    that was added to the inode's list. However, before the tracepoint could
    fire, another task raced in and freed that lock.

    Fix this by moving the tracepoint inside the spinlock, which should
    ensure that this doesn't happen.

    Fixes: 74f6f5912693 ("locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock")
    Link: https://lore.kernel.org/linux-fsdevel/724ffb0a2962e912ea62bb0515deadf39c325112.camel@kernel.org/
    Reported-by: Light Hsieh (謝明燈) <Light.Hsieh@mediatek.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Link: https://lore.kernel.org/r/20240702-filelock-6-10-v1-1-96e766aadc98@kernel.org
    Reviewed-by: Alexander Aring <aahringo@redhat.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>

Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>
2024-07-30 10:21:53 +00:00
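
The pattern behind this fix, reading an object for tracing while the lock that protects it is still held, can be sketched in userspace. This is a simplified model and not the kernel code: the `atomic_flag` spinlock, the one-slot "list", and all `_model` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct lock_entry_model { int start, end; };

struct inode_model {
    atomic_flag flc_lock;               /* models the context spinlock */
    struct lock_entry_model *held;      /* one-slot stand-in for the lock list */
};

static void spin_lock_model(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                               /* busy-wait until the flag is clear */
}

static void spin_unlock_model(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Publish req and record its "trace" data while the lock is still held.
 * Once the lock is dropped another task may free req, so it is not
 * touched after the unlock. Returns the traced value. */
int insert_and_trace_model(struct inode_model *ino, struct lock_entry_model *req)
{
    int traced_start;

    assert(ino != NULL && req != NULL);
    spin_lock_model(&ino->flc_lock);
    ino->held = req;
    traced_start = req->start;          /* tracepoint inside the critical section */
    spin_unlock_model(&ino->flc_lock);
    return traced_start;
}

/* What a racing task may do as soon as the lock is released. */
void remove_and_free_model(struct inode_model *ino)
{
    struct lock_entry_model *victim;

    spin_lock_model(&ino->flc_lock);
    victim = ino->held;
    ino->held = NULL;
    spin_unlock_model(&ino->flc_lock);
    free(victim);
}
```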
Bill O'Donnell 716e22b1f4 filelock: Remove locks reliably when fcntl/close race is detected
JIRA: https://issues.redhat.com/browse/RHEL-50176

CVE: CVE-2024-41012

Conflicts: Minor differences due to missing upstream patches, including
4ca52f539 ("filelock: have fs/locks.c deal with file_lock_core directly").
Those patches (which, e.g., replace fl_type with flc_type) are not required here.

commit 3cad1bc010416c6dd780643476bc59ed742436b9
Author: Jann Horn <jannh@google.com>
Date:   Tue Jul 2 18:26:52 2024 +0200

    filelock: Remove locks reliably when fcntl/close race is detected

    When fcntl_setlk() races with close(), it removes the created lock with
    do_lock_file_wait().
    However, LSMs can allow the first do_lock_file_wait() that created the lock
    while denying the second do_lock_file_wait() that tries to remove the lock.
    In theory (but AFAIK not in practice), posix_lock_file() could also fail to
    remove a lock due to GFP_KERNEL allocation failure (when splitting a range
    in the middle).

    After the bug has been triggered, use-after-free reads will occur in
    lock_get_status() when userspace reads /proc/locks. This can likely be used
    to read arbitrary kernel memory, but can't corrupt kernel memory.
    This only affects systems with SELinux / Smack / AppArmor / BPF-LSM in
    enforcing mode and only works from some security contexts.

    Fix it by calling locks_remove_posix() instead, which is designed to
    reliably get rid of POSIX locks associated with the given file and
    files_struct and is also used by filp_flush().

    Fixes: c293621bbf ("[PATCH] stale POSIX lock handling")
    Cc: stable@kernel.org
    Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2563
    Signed-off-by: Jann Horn <jannh@google.com>
    Link: https://lore.kernel.org/r/20240702-fs-lock-recover-2-v1-1-edd456f63789@google.com
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Christian Brauner <brauner@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-07-25 14:29:51 -05:00
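
For context on why a lock left behind after close() is a problem: POSIX record locks are owned by the process and are dropped when any descriptor for the file is closed, so nothing removes a lock that is installed after that point. The userspace sketch below demonstrates only this documented fcntl(2) behavior, not the kernel race itself. The helper names and the `/tmp` path template are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask from a child process (a different lock owner) whether a write lock
 * on the whole file would conflict: 1 = yes, 0 = no, -1 = error. */
static int lock_seen_by_other_process(const char *path)
{
    int status;
    pid_t pid;

    assert(path != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        int fd = open(path, O_RDWR);

        if (fd < 0 || fcntl(fd, F_GETLK, &probe) < 0)
            _exit(2);
        _exit(probe.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}

/* Returns 1 if the lock was visible before close() and gone afterwards,
 * 0 if not, -1 if the setup failed. */
int close_drops_posix_locks(void)
{
    char path[] = "/tmp/posix-lock-demo-XXXXXX";
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int before, after, fd2;
    int fd1 = mkstemp(path);

    if (fd1 < 0)
        return -1;
    fd2 = open(path, O_RDWR);
    if (fd2 < 0 || fcntl(fd1, F_SETLK, &fl) < 0) {
        if (fd2 >= 0)
            close(fd2);
        close(fd1);
        unlink(path);
        return -1;
    }
    before = lock_seen_by_other_process(path);
    close(fd2);                 /* closing any descriptor for the file drops the lock */
    after = lock_seen_by_other_process(path);
    close(fd1);
    unlink(path);
    return before == 1 && after == 0;
}
```

The lock was taken through `fd1`, yet closing the unrelated `fd2` releases it. The result depends on the filesystem behind `/tmp` supporting POSIX locks.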
Wander Lairson Costa 7f403537c0 Reapply "memcg: enable accounting for file lock caches"
This reverts commit 3754707bcc3e190e5dadc978d172b61e809cb3bd.

There was a long debate about the performance regression this patch
caused. In artificial performance tests that constantly lock and unlock
a file, upstream saw a 35% regression. However, Performance QE
conducted more realistic performance tests with popular databases and
the Phoronix test suite, and observed no relevant performance impact.

JIRA: https://issues.redhat.com/browse/RHEL-8487

CVE: CVE-2022-0480

Upstream Status: RHEL-only

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
2024-01-19 14:50:50 -03:00
Jeffrey Layton f8352f0778 locks: allow support for write delegation
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit d67cd907cf8ae2cd42e4f3859ad4de4c16d0c2a3
Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Thu Jun 29 18:52:37 2023 -0700

    locks: allow support for write delegation

    Remove the check for F_WRLCK in generic_add_lease to allow file_lock
    to be used for write delegation.

    First consumer is NFSD.

    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:26 -05:00
Jeffrey Layton 6937630521 locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit 74f6f5912693ce454384eaeec48705646a21c74f
Author: Will Shiu <Will.Shiu@mediatek.com>
Date:   Fri Jul 21 13:19:04 2023 +0800

    locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock

    As the following backtrace shows, the struct file_lock request in
    posix_lock_inode is freed before the ftrace function uses it.
    Moving the ftrace call ahead of the free flow fixes the use-after-free
    issue.

    [name:report&]===============================================
    BUG:KASAN: use-after-free in trace_event_raw_event_filelock_lock+0x80/0x12c
    [name:report&]Read at addr f6ffff8025622620 by task NativeThread/16753
    [name:report_hw_tags&]Pointer tag: [f6], memory tag: [fe]
    [name:report&]
    BT:
    Hardware name: MT6897 (DT)
    Call trace:
     dump_backtrace+0xf8/0x148
     show_stack+0x18/0x24
     dump_stack_lvl+0x60/0x7c
     print_report+0x2c8/0xa08
     kasan_report+0xb0/0x120
     __do_kernel_fault+0xc8/0x248
     do_bad_area+0x30/0xdc
     do_tag_check_fault+0x1c/0x30
     do_mem_abort+0x58/0xbc
     el1_abort+0x3c/0x5c
     el1h_64_sync_handler+0x54/0x90
     el1h_64_sync+0x68/0x6c
     trace_event_raw_event_filelock_lock+0x80/0x12c
     posix_lock_inode+0xd0c/0xd60
     do_lock_file_wait+0xb8/0x190
     fcntl_setlk+0x2d8/0x440
    ...
    [name:report&]
    [name:report&]Allocated by task 16752:
    ...
     slab_post_alloc_hook+0x74/0x340
     kmem_cache_alloc+0x1b0/0x2f0
     posix_lock_inode+0xb0/0xd60
    ...
     [name:report&]
     [name:report&]Freed by task 16752:
    ...
      kmem_cache_free+0x274/0x5b0
      locks_dispose_list+0x3c/0x148
      posix_lock_inode+0xc40/0xd60
      do_lock_file_wait+0xb8/0x190
      fcntl_setlk+0x2d8/0x440
      do_fcntl+0x150/0xc18
    ...

    Signed-off-by: Will Shiu <Will.Shiu@mediatek.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:26 -05:00
Jeffrey Layton 2aa7d3786b fs/locks: Remove redundant assignment to cmd
JIRA: https://issues.redhat.com/browse/RHEL-7936

commit dc592190a5543c559010e09e8130a1af3f9068d3
Author: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Date:   Wed Mar 8 15:13:16 2023 +0800

    fs/locks: Remove redundant assignment to cmd

    Variable 'cmd' set but not used.

    fs/locks.c:2428:3: warning: Value stored to 'cmd' is never read.

    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4439
    Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
    Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-12-02 05:12:26 -05:00
Waiman Long 3ec7d5a842 fs: Remove CONFIG_SRCU
JIRA: https://issues.redhat.com/browse/RHEL-5228

commit 7b3a0473d10c64be7b2b4b4d69fa87128ebb6dd0
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue, 22 Nov 2022 18:20:28 -0800

    fs: Remove CONFIG_SRCU

    Now that the SRCU Kconfig option is unconditionally selected, there is
    no longer any point in conditional compilation based on CONFIG_SRCU.
    Therefore, remove the #ifdef and throw away the #else clause.

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Jeff Layton <jlayton@kernel.org>
    Cc: Chuck Lever <chuck.lever@oracle.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: <linux-fsdevel@vger.kernel.org>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: John Ogness <john.ogness@linutronix.de>

Signed-off-by: Waiman Long <longman@redhat.com>
2023-09-22 13:21:41 -04:00
Jeffrey Layton 9c5d019ca0 vfs: remove the FL_EXT_LMOPS flag
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2185616
Upstream status: RHEL-only

We recently backported a set of file locking updates for RHEL9.3, and
as part of that, I added a RHEL-only FL_EXT_LMOPS flag to allow kernel
modules built against 9.2 and earlier headers to still work.

It was recently pointed out to me that this sort of hack is no longer
necessary in RHEL9.y releases. Remove the flag and move the lm_mod_owner
field to the same place it is in upstream code.

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-04-10 11:20:22 -04:00
Jan Stancek d31aebc906 Merge: vfs: file locking fixes and cleanups for 9.3
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2075

Bugzilla: https://bugzilla.redhat.com/2172087

The main change here is fixing accesses to inode->i_flctx to use a new accessor function. There are also a couple of other minor cleanups and bugfixes to the file locking code. This also merges some of the file locking infrastructure that is required for nfsd's courteous server support, even though there are no callers of these functions yet. Pulling them in allowed things to merge more cleanly, and we plan to eventually take that feature anyway.

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>

Approved-by: Benjamin Coddington <bcodding@redhat.com>
Approved-by: Xiubo Li <xiubli@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-04-06 14:03:53 +02:00
Jan Stancek bb21a71e46 Merge: fs: backport idmapped mounts fixes
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2207

Bugzilla: https://bugzilla.redhat.com/2179877

Merge request contains a backport of fixes related to idmapped mounts. They are required to enable idmapped mounts in RHEL.

Signed-off-by: Alex Gladkov <agladkov@redhat.com>

Approved-by: Adrian Reber <areber@redhat.com>
Approved-by: Brian Foster <bfoster@redhat.com>

Signed-off-by: Jan Stancek <jstancek@redhat.com>
2023-04-04 11:53:03 +02:00
Alex Gladkov 74af350d0f filelocks: use mount idmapping for setlease permission check
Bugzilla: https://bugzilla.redhat.com/2179877
Conflicts: Context diff due to missing commits:
           c65454a94726 ("fs: remove locks_inode")
           42d0c4bdf753 ("filelocks: use mount idmapping for setlease permission check")

           Functionally, the commit is backported to 5.14 by the author
           of the original changes:

           c5ed39d394

commit 42d0c4bdf753063b6eec55415003184d3ca24f6e
Author: Seth Forshee <sforshee@kernel.org>
Date:   Thu Mar 9 14:39:09 2023 -0600

    filelocks: use mount idmapping for setlease permission check

    A user should be allowed to take out a lease via an idmapped mount if
    the fsuid matches the mapped uid of the inode. generic_setlease() is
    checking the unmapped inode uid, causing these operations to be denied.

    Fix this by comparing against the mapped inode uid instead of the
    unmapped uid.

    Fixes: 9caccd4154 ("fs: introduce MOUNT_ATTR_IDMAP")
    Cc: stable@vger.kernel.org
    Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Alex Gladkov <agladkov@redhat.com>
2023-03-27 19:02:08 +02:00
Jeffrey Layton 6699a8b93e Add process name and pid to locks warning
Bugzilla: https://bugzilla.redhat.com/2172087

commit f2f2494c8aa3cc317572c4674ef256005ebc092b
Author: Andi Kleen <ak@linux.intel.com>
Date:   Fri Nov 18 15:43:57 2022 -0800

    Add process name and pid to locks warning

    It's fairly useless to complain about using an obsolete feature without
    telling the user which process used it. My Fedora desktop randomly drops
    this message, but I would really need this patch to figure out what
    triggers it.

    [ jlayton: print pid as well as process name ]

    Signed-off-by: Andi Kleen <ak@linux.intel.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:10 -04:00
Jeffrey Layton 32ba8c1d21 filelock: add a new locks_inode_context accessor function
Bugzilla: https://bugzilla.redhat.com/2172087

commit 401a8b8fd5acd51582b15238d72a8d0edd580e9f
Author: Jeff Layton <jlayton@kernel.org>
Date:   Wed Nov 16 09:02:30 2022 -0500

    filelock: add a new locks_inode_context accessor function

    There are a number of places in the kernel that are accessing the
    inode->i_flctx field without smp_load_acquire. This is required to
    ensure that the caller doesn't see a partially-initialized structure.

    Add a new accessor function for it to make this clear and convert all of
    the relevant accesses in locks.c to use it. Also, convert
    locks_free_lock_context to use the helper as well instead of just doing
    a "bare" assignment.

    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:10 -04:00
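
The accessor idea can be modeled with C11 atomics: the context is fully initialized before it is published, and readers use an acquire load so they never observe a half-built structure. This is a userspace sketch only. The `_model` types and functions are hypothetical, and C11 acquire/release is used here as an analogue of the kernel's `smp_load_acquire`, not as its definition.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct flctx_model { int initialized; };

struct inode_model {
    _Atomic(struct flctx_model *) i_flctx;
};

/* Accessor: the only way readers are supposed to look at i_flctx. */
struct flctx_model *locks_inode_context_model(struct inode_model *inode)
{
    assert(inode != NULL);
    return atomic_load_explicit(&inode->i_flctx, memory_order_acquire);
}

/* Allocate and publish a context if the inode has none yet. */
struct flctx_model *locks_get_context_model(struct inode_model *inode)
{
    struct flctx_model *expected = NULL;
    struct flctx_model *fresh;
    struct flctx_model *ctx = locks_inode_context_model(inode);

    if (ctx)
        return ctx;
    fresh = malloc(sizeof *fresh);
    if (!fresh)
        return NULL;
    fresh->initialized = 1;             /* finish initialization before publishing */
    if (!atomic_compare_exchange_strong_explicit(&inode->i_flctx, &expected,
                                                 fresh, memory_order_acq_rel,
                                                 memory_order_acquire)) {
        free(fresh);                    /* lost the race, use the winner's context */
        return expected;
    }
    return fresh;
}
```

Routing every read through one helper makes the ordering requirement visible at each call site, which is the stated purpose of the new accessor.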
Jeffrey Layton 5d6a42d166 filelock: new helper: vfs_inode_has_locks
Bugzilla: https://bugzilla.redhat.com/2172087

commit ab1ddef98a715eddb65309ffa83267e4e84a571e
Author: Jeff Layton <jlayton@kernel.org>
Date:   Mon Nov 14 08:33:09 2022 -0500

    filelock: new helper: vfs_inode_has_locks

    Ceph has a need to know whether a particular inode has any locks set on
    it. It's currently tracking that by a num_locks field in its
    filp->private_data, but that's problematic as it tries to decrement this
    field when releasing locks and that can race with the file being torn
    down.

    Add a new vfs_inode_has_locks helper that just returns whether any locks
    are currently held on the inode.

    Reviewed-by: Xiubo Li <xiubli@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@infradead.org>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:10 -04:00
Jeffrey Layton 48154660fb filelock: WARN_ON_ONCE when ->fl_file and filp don't match
Bugzilla: https://bugzilla.redhat.com/2172087

commit 7e8e5cc818bd93ee7f2699676f2e5b30d26d83f8
Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Nov 11 06:46:52 2022 -0500

    filelock: WARN_ON_ONCE when ->fl_file and filp don't match

    vfs_lock_file, vfs_test_lock and vfs_cancel_lock all take both a struct
    file argument and a file_lock. The file_lock has a fl_file field in it,
    however, and it _must_ match the file passed in.

    While most of the locks.c routines use the separately-passed file
    argument, some filesystems rely on fl_file being filled out correctly.

    I'm working on a patch series to remove the redundant argument from
    these routines, but for now, let's ensure that the callers always set
    this properly by issuing a WARN_ON_ONCE if they ever don't match.

    Cc: Chuck Lever <chuck.lever@oracle.com>
    Cc: Trond Myklebust <trondmy@hammerspace.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:10 -04:00
Jeffrey Layton 9424ff3537 locks: Fix dropped call to ->fl_release_private()
Bugzilla: https://bugzilla.redhat.com/2172087

commit 932c29a10d5d0bba63b9f505a8ec1e3ce8c02542
Author: David Howells <dhowells@redhat.com>
Date:   Wed Aug 17 19:41:27 2022 +0100

    locks: Fix dropped call to ->fl_release_private()

    Prior to commit 4149be7bda7e, sys_flock() would allocate the file_lock
    struct it was going to use to pass parameters, call ->flock() and then call
    locks_free_lock() to get rid of it - which had the side effect of calling
    locks_release_private() and thus ->fl_release_private().

    With commit 4149be7bda7e, however, this is no longer the case: the struct
    is now allocated on the stack, and locks_free_lock() is no longer called -
    and thus any remaining private data doesn't get cleaned up either.

    This causes afs flock to cause oops.  Kasan catches this as a UAF by the
    list_del_init() in afs_fl_release_private() for the file_lock record
    produced by afs_fl_copy_lock() as the original record didn't get delisted.
    It can be reproduced using the generic/504 xfstest.

    Fix this by reinstating the locks_release_private() call in sys_flock().
    I'm not sure if this would affect any other filesystems.  If not, then the
    release could be done in afs_flock() instead.

    Changes
    =======
    ver #2)
     - Don't need to call ->fl_release_private() after calling the security
       hook, only after calling ->flock().

    Fixes: 4149be7bda7e ("fs/lock: Don't allocate file_lock in flock_make_lock().")
    cc: Chuck Lever <chuck.lever@oracle.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org
    cc: linux-fsdevel@vger.kernel.org
    Link: https://lore.kernel.org/r/166075758809.3532462.13307935588777587536.stgit@warthog.procyon.org.uk/ # v1
    Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:10 -04:00
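
The bug class described above, a stack-allocated structure whose attached private data is no longer cleaned up because the freeing helper is gone, can be sketched as follows. This is an illustrative userspace model: all `_model` names, the ops table, and the 16-byte private allocation are assumptions, not the afs or VFS code.

```c
#include <assert.h>
#include <stdlib.h>

struct file_lock_model;

struct lock_ops_model {
    void (*release_private)(struct file_lock_model *fl);
};

struct file_lock_model {
    const struct lock_ops_model *ops;
    void *private_data;
};

static int released_count;

static void demo_release_private(struct file_lock_model *fl)
{
    free(fl->private_data);
    fl->private_data = NULL;
    released_count++;
}

static const struct lock_ops_model demo_ops = {
    .release_private = demo_release_private,
};

/* Models a filesystem ->flock() that attaches private state to the lock. */
static int fs_flock_model(struct file_lock_model *fl)
{
    fl->ops = &demo_ops;
    fl->private_data = malloc(16);
    return fl->private_data ? 0 : -1;
}

static void locks_release_private_model(struct file_lock_model *fl)
{
    assert(fl != NULL);
    if (fl->ops && fl->ops->release_private)
        fl->ops->release_private(fl);
    fl->ops = NULL;
}

/* The lock lives on the stack, so no free helper runs for it; the
 * private data must therefore be released explicitly. */
int flock_syscall_model(void)
{
    struct file_lock_model fl = { 0 };
    int err = fs_flock_model(&fl);

    locks_release_private_model(&fl);   /* the call the fix reinstates */
    return err;
}

int released_private_count(void)
{
    return released_count;
}
```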
Jeffrey Layton 9000150501 fs/lock: Rearrange ops in flock syscall.
Bugzilla: https://bugzilla.redhat.com/2172087

commit db4abb4a32ec979ea5deea4d0095fa22ec99a623
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Sat Jul 16 21:35:32 2022 -0700

    fs/lock: Rearrange ops in flock syscall.

    The previous patch added flock_translate_cmd() in flock syscall.
    The test and the other one for LOCK_MAND do not depend on struct
    fd and are cheaper, so we can put them at the top and defer
    fdget() after that.

    Also, we can remove the unlock variable and use type instead.
    While at it, we fix this checkpatch error.

      CHECK: spaces preferred around that '|' (ctx:VxV)
      #45: FILE: fs/locks.c:2099:
      +     if (type != F_UNLCK && !(f.file->f_mode & (FMODE_READ|FMODE_WRITE)))
                                                                 ^

    Finally, we can move the can_sleep part just before we use it.

    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:10 -04:00
Jeffrey Layton 308465efa3 fs/lock: Don't allocate file_lock in flock_make_lock().
Bugzilla: https://bugzilla.redhat.com/2172087

commit 4149be7bda7e1b922896599dd9cee7a3ed8cf38b
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Sat Jul 16 21:35:31 2022 -0700

    fs/lock: Don't allocate file_lock in flock_make_lock().

    Two functions, flock syscall and locks_remove_flock(), call
    flock_make_lock().  It allocates struct file_lock from slab
    cache if its argument fl is NULL.

    When we call flock syscall, we pass NULL to allocate memory
    for struct file_lock.  However, we always free it at the end
    by locks_free_lock().  We need not allocate it and instead
    should use a local variable as locks_remove_flock() does.

    Also, the validation for flock_translate_cmd() is not necessary
    for locks_remove_flock().  So we move the part to flock syscall
    and make flock_make_lock() return nothing.

    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:09 -04:00
Jeffrey Layton 6f597cadf4 fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict
Bugzilla: https://bugzilla.redhat.com/2172087
Conflicts: It's possible for out of tree modules to declare their own
	   lock_manager_operations and set the fl_lmops field to that.
	   Add a RHEL-only FL_EXT_LMOPS flag as well that we can use
	   to determine whether it's safe to access the new fields.

commit 2443da2259e97688f93d64d17ab69b15f466078a
Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:25 2022 -0700

    fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict

    Add 2 new callbacks, lm_lock_expirable and lm_expire_lock, to
    lock_manager_operations to allow the lock manager to take appropriate
    action to resolve the lock conflict if possible.

    A new field, lm_mod_owner, is also added to lock_manager_operations.
    The lm_mod_owner is used by the fs/lock code to make sure the lock
    manager module such as nfsd, is not freed while lock conflict is being
    resolved.

    lm_lock_expirable returns true if the lock conflict can be resolved
    and false otherwise. This callback must be called with the flc_lock
    held, so it cannot block.

    lm_expire_lock is called to resolve the lock conflict if the returned
    value from lm_lock_expirable is true. This callback is called without
    the flc_lock held since it's allowed to block. Upon returning from
    this callback, the lock conflict should be resolved and the caller is
    expected to restart the conflict check from the beginning of the list.

    A lock manager, such as the NFSv4 courteous server, uses this callback
    to resolve the conflict by destroying the lock owner, or the NFSv4
    courtesy client (a client that has expired but is allowed to maintain
    its state) that owns the lock.

    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:08:00 -04:00
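
The calling convention of the two callbacks (check under the lock, drop the lock, expire, restart the scan from the beginning) can be sketched with a small userspace model. Everything here is hypothetical: the fixed-size table, whole-file conflicts, and a boolean standing in for flc_lock are simplifications chosen for illustration, not the fs/locks.c implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOCKS_MODEL 8

struct held_lock_model { int owner; bool active; bool owner_expired; };

struct lock_table_model {
    struct held_lock_model locks[MAX_LOCKS_MODEL];
    bool flc_lock_held;         /* stands in for the flc_lock spinlock */
    int restarts;               /* how often the scan was restarted */
};

/* Called with the lock held, so it must not block. */
static bool lm_lock_expirable_model(const struct lock_table_model *t,
                                    const struct held_lock_model *l)
{
    assert(t->flc_lock_held);
    return l->owner_expired;
}

/* Called without the lock, so it may block; drops locks of expired owners. */
static void lm_expire_lock_model(struct lock_table_model *t)
{
    assert(!t->flc_lock_held);
    for (size_t i = 0; i < MAX_LOCKS_MODEL; i++)
        if (t->locks[i].active && t->locks[i].owner_expired)
            t->locks[i].active = false;
}

/* Try to take a whole-file lock for new_owner; true if it was granted. */
bool try_lock_model(struct lock_table_model *t, int new_owner)
{
    size_t i;

retry:
    t->flc_lock_held = true;
    for (i = 0; i < MAX_LOCKS_MODEL; i++) {
        struct held_lock_model *l = &t->locks[i];

        if (!l->active || l->owner == new_owner)
            continue;                   /* no conflict with this entry */
        if (lm_lock_expirable_model(t, l)) {
            t->flc_lock_held = false;   /* drop the lock before blocking */
            lm_expire_lock_model(t);
            t->restarts++;
            goto retry;                 /* restart from the beginning */
        }
        t->flc_lock_held = false;
        return false;                   /* genuine conflict */
    }
    for (i = 0; i < MAX_LOCKS_MODEL; i++) {
        if (!t->locks[i].active) {
            t->locks[i] = (struct held_lock_model){ new_owner, true, false };
            t->flc_lock_held = false;
            return true;
        }
    }
    t->flc_lock_held = false;
    return false;                       /* table full */
}
```

Restarting the scan is needed because the table may have changed arbitrarily while the lock was dropped for the blocking callback.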
Jeffrey Layton 2cdf0a629d fs/lock: add helper locks_owner_has_blockers to check for blockers
Bugzilla: https://bugzilla.redhat.com/2172087

commit 591502c5cb325b1c6ec59ab161927d606b918aa0
Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon May 2 14:19:24 2022 -0700

    fs/lock: add helper locks_owner_has_blockers to check for blockers

    Add helper locks_owner_has_blockers to check whether there are any
    blockers for a given lock owner.

    Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2023-03-27 07:07:51 -04:00
Chris von Recklinghausen c1f51256f0 fs: move locking sysctls where they are used
Bugzilla: https://bugzilla.redhat.com/2160210

commit dd81faa88340a1fe8cd81c8ecbadd8e95c58549c
Author: Luis Chamberlain <mcgrof@kernel.org>
Date:   Fri Jan 21 22:13:10 2022 -0800

    fs: move locking sysctls where they are used

    kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
    dishes, this makes it very difficult to maintain.

    To help with this maintenance let's start by moving sysctls to places
    where they actually belong.  The proc sysctl maintainers do not want to
    know what sysctl knobs you wish to add for your own piece of code, we
    just care about the core logic.

    The locking fs sysctls are only used on fs/locks.c, so move them there.

    Link: https://lkml.kernel.org/r/20211129205548.605569-7-mcgrof@kernel.org
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: Antti Palosaari <crope@iki.fi>
    Cc: Eric Biederman <ebiederm@xmission.com>
    Cc: Iurii Zaikin <yzaikin@google.com>
    Cc: "J. Bruce Fields" <bfields@fieldses.org>
    Cc: Jeff Layton <jlayton@kernel.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Lukas Middendorf <kernel@tuxforce.de>
    Cc: Stephen Kitt <steve@sk2.org>
    Cc: Xiaoming Ni <nixiaoming@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-03-24 11:18:32 -04:00
Jeffrey Layton e07e741814 locks: remove changelog comments
Bugzilla: http://bugzilla.redhat.com/2017438

commit e9728cc72d915966dcc288d2e217af48e8fa2362
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Tue Oct 19 13:38:35 2021 -0400

    locks: remove changelog comments

    This is only of historical interest, and anyone interested in the
    history can dig out an old version of locks.c from git.

    Triggered by the observation that it references the now-removed
    Documentation/filesystems/mandatory-locking.rst.

    Reported-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2021-11-01 13:56:13 -04:00
Jeffrey Layton 23e989128b locks: remove LOCK_MAND flock lock support
Bugzilla: http://bugzilla.redhat.com/2017438

commit 90f7d7a0d0d68623b5f7df5621a8d54d9518fcc4
Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Sep 10 15:36:29 2021 -0400

    locks: remove LOCK_MAND flock lock support

    As best I can tell, the logic for these has been broken for a long time
    (at least before the move to git), such that they never conflict with
    anything. Also, nothing checks for these flags and prevents opens or
    read/write activity on the files. They don't seem to do anything.

    Given that, we can rip these symbols out of the kernel, and just make
    flock(2) return 0 when LOCK_MAND is set in order to preserve existing
    behavior.

    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2021-11-01 13:56:12 -04:00
Jeffrey Layton 5b7ac1214f fs: remove mandatory file locking support
Bugzilla: http://bugzilla.redhat.com/2017438

commit f7e33bdbd6d1bdf9c3df8bba5abcf3399f957ac3
Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Aug 19 14:56:38 2021 -0400

    fs: remove mandatory file locking support

    We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it
    off in Fedora and RHEL8. Several other distros have followed suit.

    I've heard of one problem in all that time: Someone migrated from an
    older distro that supported "-o mand" to one that didn't, and the host
    had a fstab entry with "mand" in it which broke on reboot. They didn't
    actually _use_ mandatory locking so they just removed the mount option
    and moved on.

    This patch rips out mandatory locking support wholesale from the kernel,
    along with the Kconfig option and the Documentation file. It also
    changes the mount code to ignore the "mand" mount option instead of
    erroring out, and to throw a big, ugly warning.

    Signed-off-by: Jeff Layton <jlayton@kernel.org>

Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
2021-11-01 13:56:12 -04:00
Linus Torvalds a79cdfba68 Additional fixes and clean-ups for NFSD since tags/nfsd-5.13,
including a fix to grant read delegations for files open for
 writing.

Merge tag 'nfsd-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull more nfsd updates from Chuck Lever:
 "Additional fixes and clean-ups for NFSD since tags/nfsd-5.13,
  including a fix to grant read delegations for files open for writing"

* tag 'nfsd-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix null pointer dereference in svc_rqst_free()
  SUNRPC: fix ternary sign expansion bug in tracing
  nfsd: Fix fall-through warnings for Clang
  nfsd: grant read delegations to clients holding writes
  nfsd: reshuffle some code
  nfsd: track filehandle aliasing in nfs4_files
  nfsd: hash nfs4_files by inode number
  nfsd: ensure new clients break delegations
  nfsd: removed unused argument in nfsd_startup_generic()
  nfsd: remove unused function
  svcrdma: Pass a useful error code to the send_err tracepoint
  svcrdma: Rename goto labels in svc_rdma_sendto()
  svcrdma: Don't leak send_ctxt on Send errors
2021-05-05 13:44:19 -07:00
Linus Torvalds befbfe07e6 File locking fixes for v5.13

Merge tag 'locks-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking updates from Jeff Layton:
 "When we reworked the blocked locks into a tree structure instead of a
  flat list a few releases ago, we lost the ability to see all of the
  file locks in /proc/locks. Luo's patch fixes it to dump out all of the
  blocked locks instead, which restores the full output.

  This changes the format of /proc/locks as the blocked locks are shown
  at multiple levels of indentation now, but lslocks (the only common
  program I've ID'ed that scrapes this info) seems to be OK with that.

  Tian also contributed a small patch to remove a useless assignment"

* tag 'locks-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fs/locks: remove useless assignment in fcntl_getlk
  fs/locks: print full locks information
2021-04-26 13:24:39 -07:00
J. Bruce Fields aba2072f45 nfsd: grant read delegations to clients holding writes
It's OK to grant a read delegation to a client that holds a write,
as long as it's the only client holding the write.

We originally tried to do this in commit 94415b06eb ("nfsd4: a
client's own opens needn't prevent delegations"), which had to be
reverted in commit 6ee65a7730 ("Revert "nfsd4: a client's own
opens needn't prevent delegations"").

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-04-19 16:41:36 -04:00
Tian Tao cbe6fc4e01 fs/locks: remove useless assignment in fcntl_getlk
The function parameter 'cmd' is overwritten in locks.c with a value that is never used.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-04-13 07:26:38 -04:00
Luo Longjun b8da9b10e2 fs/locks: print full locks information
Commit fd7732e033 ("fs/locks: create a tree of dependent requests.")
put blocked locks into a tree.

As a result, a single for loop can no longer reach every lock's information.

To solve this problem, traverse the tree instead.
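
The difference between a flat loop and a tree walk can be sketched in userspace C. This is only an illustration under stated assumptions: struct lock_req, dump_lock_tree() and count_direct_waiters() are hypothetical names, and the kernel links waiters with list heads rather than the plain pointers used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical simplified model: each request records the requests that
 * are blocked directly on it. */
struct lock_req {
    int id;
    struct lock_req *first_blocked;  /* first request waiting on this one */
    struct lock_req *next_sibling;   /* next request waiting on the same parent */
};

/* Depth-first walk: writes one line per request, indented by its depth,
 * and returns the number of requests visited. */
size_t dump_lock_tree(const struct lock_req *req, int depth, FILE *out)
{
    size_t visited = 0;

    assert(out != NULL);
    for (; req != NULL; req = req->next_sibling) {
        fprintf(out, "%*s%d\n", depth * 2, "", req->id);
        visited += 1 + dump_lock_tree(req->first_blocked, depth + 1, out);
    }
    return visited;
}

/* Flat loop for comparison: it only sees the direct waiters of 'holder'
 * and misses anything blocked further down the tree. */
size_t count_direct_waiters(const struct lock_req *holder)
{
    size_t n = 0;

    for (const struct lock_req *w = holder->first_blocked; w != NULL;
         w = w->next_sibling)
        n++;
    return n;
}
```

With one holder, two direct waiters, and a third request blocked on the first waiter, the flat loop counts two requests while the tree walk visits all four.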

Signed-off-by: Luo Longjun <luolongjun@huawei.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-03-11 07:48:11 -05:00
J. Bruce Fields 6ee65a7730 Revert "nfsd4: a client's own opens needn't prevent delegations"
This reverts commit 94415b06eb.

That commit claimed to allow a client to get a read delegation when it
was the only writer.  Actually it allowed a client to get a read
delegation when *any* client has a write open!

The main problem is that it's depending on nfs4_clnt_odstate structures
that are actually only maintained for pnfs exports.

This causes clients to miss writes performed by other clients, even when
there have been intervening closes and opens, violating close-to-open
cache consistency.

We can do this a different way, but first we should just revert this.

I've added pynfs 4.1 test DELEG19 to test for this, as I should have
done originally!

Cc: stable@vger.kernel.org
Reported-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-09 10:37:34 -05:00
Linus Torvalds faf145d6f3 Merge branch 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve updates from Eric Biederman:
 "This set of changes ultimately fixes the interaction of posix file
  lock and exec. Fundamentally most of the change is just moving where
  unshare_files is called during exec, and tweaking the users of
  files_struct so that the count of files_struct is not unnecessarily
  played with.

  Along the way fcheck and related helpers were renamed to more
  accurately reflect what they do.

  There were also many other small changes that fell out, as this is the
  first time in a long time much of this code has been touched.

  Benchmarks haven't turned up any practical issues but Al Viro has
  observed a possibility for a lot of pounding on task_lock. So I have
  some changes in progress to convert put_files_struct to always rcu
  free files_struct. That wasn't ready for the merge window so that will
  have to wait until next time"

* 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
  exec: Move io_uring_task_cancel after the point of no return
  coredump: Document coredump code exclusively used by cell spufs
  file: Remove get_files_struct
  file: Rename __close_fd_get_file close_fd_get_file
  file: Replace ksys_close with close_fd
  file: Rename __close_fd to close_fd and remove the files parameter
  file: Merge __alloc_fd into alloc_fd
  file: In f_dupfd read RLIMIT_NOFILE once.
  file: Merge __fd_install into fd_install
  proc/fd: In fdinfo seq_show don't use get_files_struct
  bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu
  proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu
  file: Implement task_lookup_next_fd_rcu
  kcmp: In get_file_raw_ptr use task_lookup_fd_rcu
  proc/fd: In tid_fd_mode use task_lookup_fd_rcu
  file: Implement task_lookup_fd_rcu
  file: Rename fcheck lookup_fd_rcu
  file: Replace fcheck_files with files_lookup_fd_rcu
  file: Factor files_lookup_fd_locked out of fcheck_files
  file: Rename __fcheck_files to files_lookup_fd_raw
  ...
2020-12-15 19:29:43 -08:00
Eric W. Biederman 120ce2b0cd file: Factor files_lookup_fd_locked out of fcheck_files
To make it easy to tell where files->file_lock protection is being
used when looking up a file, create files_lookup_fd_locked.  Only allow
this function to be called with the file_lock held.

Update the callers of fcheck and fcheck_files that are called with the
files->file_lock held to call files_lookup_fd_locked instead.

Hopefully this makes it easier to quickly understand what is going on.

The need for better names became apparent in the last round of
discussion of this set of changes[1].

[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-8-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-12-10 12:39:59 -06:00
Mauro Carvalho Chehab 529adfe8f1 locks: fix a typo at a kernel-doc markup
locks_delete_lock -> locks_delete_block

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-10-26 08:00:39 -04:00
Luo Meng 16238415eb locks: Fix UBSAN undefined behaviour in flock64_to_posix_lock
When the sum of fl->fl_start and l->l_len overflows,
UBSAN shows the following warning:

UBSAN: Undefined behaviour in fs/locks.c:482:29
signed integer overflow: 2 + 9223372036854775806
cannot be represented in type 'long long int'
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xe4/0x14e lib/dump_stack.c:118
 ubsan_epilogue+0xe/0x81 lib/ubsan.c:161
 handle_overflow+0x193/0x1e2 lib/ubsan.c:192
 flock64_to_posix_lock fs/locks.c:482 [inline]
 flock_to_posix_lock+0x595/0x690 fs/locks.c:515
 fcntl_setlk+0xf3/0xa90 fs/locks.c:2262
 do_fcntl+0x456/0xf60 fs/fcntl.c:387
 __do_sys_fcntl fs/fcntl.c:483 [inline]
 __se_sys_fcntl fs/fcntl.c:468 [inline]
 __x64_sys_fcntl+0x12d/0x180 fs/fcntl.c:468
 do_syscall_64+0xc8/0x5a0 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fix it by parenthesizing 'l->l_len - 1'.
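
Why the parentheses matter can be shown with a small userspace sketch. It is an illustration, not the kernel code: range_end() and RANGE_OFFSET_MAX are hypothetical names, with RANGE_OFFSET_MAX standing in for the largest representable file offset.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the largest representable file offset. */
#define RANGE_OFFSET_MAX LLONG_MAX

/*
 * Compute the inclusive end offset of a range that starts at 'start'
 * (>= 0) and covers 'len' (> 0) bytes. Returns 0 and stores the result
 * in *end, -EINVAL for bad arguments, or -EOVERFLOW if the end offset
 * does not fit.
 */
int range_end(long long start, long long len, long long *end)
{
    assert(end != NULL);
    if (start < 0 || len <= 0)
        return -EINVAL;
    if (len - 1 > RANGE_OFFSET_MAX - start)
        return -EOVERFLOW;
    /*
     * 'start + len - 1' is evaluated as '(start + len) - 1', so the
     * intermediate sum can overflow even when the final result fits.
     * 'start + (len - 1)' cannot overflow after the check above.
     */
    *end = start + (len - 1);
    return 0;
}
```

With the values from the report, start = 2 and len = 9223372036854775806, the unparenthesized form overflows in the intermediate sum, while the parenthesized form yields LLONG_MAX.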

Signed-off-by: Luo Meng <luomeng12@huawei.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-10-26 07:59:29 -04:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings where applicable.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds 7a6b60441f Highlights:
- Support for user extended attributes on NFS (RFC 8276)
 - Further reduce unnecessary NFSv4 delegation recalls
 
 Notable fixes:
 
 - Fix recent krb5p regression
 - Address a few resource leaks and a rare NULL dereference
 
 Other:
 
 - De-duplicate RPC/RDMA error handling and other utility functions
 - Replace storage and display of kernel memory addresses by tracepoints

Merge tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6

Pull NFS server updates from Chuck Lever:
 "Highlights:
   - Support for user extended attributes on NFS (RFC 8276)
   - Further reduce unnecessary NFSv4 delegation recalls

  Notable fixes:
   - Fix recent krb5p regression
   - Address a few resource leaks and a rare NULL dereference

  Other:
   - De-duplicate RPC/RDMA error handling and other utility functions
   - Replace storage and display of kernel memory addresses by tracepoints"

* tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits)
  svcrdma: CM event handler clean up
  svcrdma: Remove transport reference counting
  svcrdma: Fix another Receive buffer leak
  SUNRPC: Refresh the show_rqstp_flags() macro
  nfsd: netns.h: delete a duplicated word
  SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
  nfsd: avoid a NULL dereference in __cld_pipe_upcall()
  nfsd4: a client's own opens needn't prevent delegations
  nfsd: Use seq_putc() in two functions
  svcrdma: Display chunk completion ID when posting a rw_ctxt
  svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()
  svcrdma: Introduce Send completion IDs
  svcrdma: Record Receive completion ID in svc_rdma_decode_rqst
  svcrdma: Introduce Receive completion IDs
  svcrdma: Introduce infrastructure to support completion IDs
  svcrdma: Add common XDR encoders for RDMA and Read segments
  svcrdma: Add common XDR decoders for RDMA and Read segments
  SUNRPC: Add helpers for decoding list discriminators symbolically
  svcrdma: Remove declarations for functions long removed
  svcrdma: Clean up trace_svcrdma_send_failed() tracepoint
  ...
2020-08-09 13:58:04 -07:00
Linus Torvalds 3208167a86 File locking fix for v5.9.

Merge tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking fix from Jeff Layton:
 "Just a single, one-line patch to fix an inefficiency in the posix
  locking code that can lead to it doing more wakeups than necessary"

* tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  locks: add locks_move_blocks in posix_lock_inode
2020-08-03 10:46:41 -07:00
J. Bruce Fields 94415b06eb nfsd4: a client's own opens needn't prevent delegations
We recently fixed lease breaking so that a client's actions won't break
its own delegations.

But we still have an unnecessary self-conflict when granting
delegations: a client's own write opens will prevent us from handing out
a read delegation even when no other client has the file open for write.

Fix that by turning off the checks for conflicting opens under
vfs_setlease, and instead performing those checks in the nfsd code.

We don't depend much on locks here: instead we acquire the delegation,
then check for conflicts, and drop the delegation again if we find any.

The check beforehand is an optimization of sorts, just to avoid
acquiring the delegation unnecessarily.  There's a race where the first
check could cause us to deny the delegation when we could have granted
it.  But, that's OK, delegation grants are optional (and probably not
even a good idea in that case).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:28:46 -04:00
Linus Torvalds c742b63473 Highlights:
- Keep nfsd clients from unnecessarily breaking their own delegations:
   Note this requires a small kthreadd addition, discussed at:
   https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com
   The result is Tejun Heo's suggestion, and he was OK with this going
   through my tree.
 - Patch nfsd/clients/ to display filenames, and to fix byte-order when
   displaying stateid's.
 - fix a module loading/unloading bug, from Neil Brown.
 - A big series from Chuck Lever with RPC/RDMA and tracing improvements,
   and lay some groundwork for RPC-over-TLS.
 
 Note Stephen Rothwell spotted two conflicts in linux-next.  Both should
 be straightforward:
 	include/trace/events/sunrpc.h
 		https://lore.kernel.org/r/20200529105917.50dfc40f@canb.auug.org.au
 	net/sunrpc/svcsock.c
 		https://lore.kernel.org/r/20200529131955.26c421db@canb.auug.org.au

Merge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Highlights:

   - Keep nfsd clients from unnecessarily breaking their own
     delegations.

     Note this requires a small kthreadd addition. The result is Tejun
     Heo's suggestion (see link), and he was OK with this going through
     my tree.

   - Patch nfsd/clients/ to display filenames, and to fix byte-order
     when displaying stateid's.

   - fix a module loading/unloading bug, from Neil Brown.

   - A big series from Chuck Lever with RPC/RDMA and tracing
     improvements, and lay some groundwork for RPC-over-TLS"

Link: https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com

* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
  sunrpc: use kmemdup_nul() in gssp_stringify()
  nfsd: safer handling of corrupted c_type
  nfsd4: make drc_slab global, not per-net
  SUNRPC: Remove unreachable error condition in rpcb_getport_async()
  nfsd: Fix svc_xprt refcnt leak when setup callback client failed
  sunrpc: clean up properly in gss_mech_unregister()
  sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
  sunrpc: check that domain table is empty at module unload.
  NFSD: Fix improperly-formatted Doxygen comments
  NFSD: Squash an annoying compiler warning
  SUNRPC: Clean up request deferral tracepoints
  NFSD: Add tracepoints for monitoring NFSD callbacks
  NFSD: Add tracepoints to the NFSD state management code
  NFSD: Add tracepoints to NFSD's duplicate reply cache
  SUNRPC: svc_show_status() macro should have enum definitions
  SUNRPC: Restructure svc_udp_recvfrom()
  SUNRPC: Refactor svc_recvfrom()
  SUNRPC: Clean up svc_release_skb() functions
  SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
  SUNRPC: Replace dprintk() call sites in TCP receive path
  ...
2020-06-11 10:33:13 -07:00
Linus Torvalds 9ff7258575 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull proc updates from Eric Biederman:
 "This has four sets of changes:

   - modernize proc to support multiple private instances

   - ensure we see the exit of each process tid exactly once

   - remove has_group_leader_pid

   - use pids not tasks in posix-cpu-timers lookup

  Alexey updated proc so each mount of proc uses a new superblock. This
  allows people to actually use mount options with proc with no fear of
  messing up another mount of proc. Given the kernel's internal mounts
  of proc for things like uml this was a real problem, and resulted in
  Android's hidepid mount options being ignored and introducing security
  issues.

  The rest of the changes are small cleanups and fixes that came out of
  my work to allow this change to proc. In essence it is swapping the
  pids in de_thread during exec which removes a special case the code
  had to handle. Then updating the code to stop handling that special
  case"

* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc: proc_pid_ns takes super_block as an argument
  remove the no longer needed pid_alive() check in __task_pid_nr_ns()
  posix-cpu-timers: Replace __get_task_for_clock with pid_for_clock
  posix-cpu-timers: Replace cpu_timer_pid_type with clock_pid_type
  posix-cpu-timers: Extend rcu_read_lock removing task_struct references
  signal: Remove has_group_leader_pid
  exec: Remove BUG_ON(has_group_leader_pid)
  posix-cpu-timer:  Unify the now redundant code in lookup_task
  posix-cpu-timer: Tidy up group_leader logic in lookup_task
  proc: Ensure we see the exit of each process tid exactly once
  rculist: Add hlists_swap_heads_rcu
  proc: Use PIDTYPE_TGID in next_tgid
  Use proc_pid_ns() to get pid_namespace from the proc superblock
  proc: use named enums for better readability
  proc: use human-readable values for hidepid
  docs: proc: add documentation for "hidepid=4" and "subset=pid" options and new mount behavior
  proc: add option to mount only a pids subset
  proc: instantiate only pids that we can ptrace on 'hidepid=4' mount option
  proc: allow to mount many instances of proc in one pid namespace
  proc: rename struct proc_fs_info to proc_fs_opts
2020-06-04 13:54:34 -07:00
yangerkun 5ef1596813 locks: add locks_move_blocks in posix_lock_inode
We forget to call locks_move_blocks in posix_lock_inode when processing
locks with the same owner but different types.

Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-06-02 12:08:25 -04:00
Alexey Gladkov 9d78edeaec proc: proc_pid_ns takes super_block as an argument
syzbot found that

  touch /proc/testfile

causes a NULL pointer dereference in tomoyo_get_local_path()
because the inode of the dentry is NULL.

Before c59f415a7c, Tomoyo received pid_ns from proc's s_fs_info
directly. Since proc_pid_ns() only works with an inode, using it in
tomoyo_get_local_path() was wrong.

To avoid creating more functions for getting proc_ns, change the
argument type of the proc_pid_ns() function. Then, Tomoyo can use
the existing super_block to get pid_ns.

Link: https://lkml.kernel.org/r/0000000000002f0c7505a5b0e04c@google.com
Link: https://lkml.kernel.org/r/20200518180738.2939611-1-gladkov.alexey@gmail.com
Reported-by: syzbot+c1af344512918c61362c@syzkaller.appspotmail.com
Fixes: c59f415a7c ("Use proc_pid_ns() to get pid_namespace from the proc superblock")
Signed-off-by: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-05-19 07:07:50 -05:00
J. Bruce Fields 28df3d1539 nfsd: clients don't need to break their own delegations
We currently revoke read delegations on any write open or any operation
that modifies file data or metadata (including rename, link, and
unlink).  But if the delegation in question is the only read delegation
and is held by the client performing the operation, that's not really
necessary.

It's not always possible to prevent this in the NFSv4.0 case, because
there's not always a way to determine which client an NFSv4.0 delegation
came from.  (In theory we could try to guess this from the transport
layer, e.g., by assuming all traffic on a given TCP connection comes
from the same client.  But that's not really correct.)

In the NFSv4.1 case the session layer always tells us the client.

This patch should remove such self-conflicts in all cases where we can
reliably determine the client from the compound.

To do that we need to track "who" is performing a given (possibly
lease-breaking) file operation.  We're doing that by storing the
information in the svc_rqst and using kthread_data() to map the current
task back to a svc_rqst.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-08 21:23:10 -04:00
Mauro Carvalho Chehab a02dcdf65b docs: filesystems: convert mandatory-locking.txt to ReST
- Add a SPDX header;
- Adjust document title;
- Some whitespace fixes and new line breaks;
- Use notes markups;
- Add it to filesystems/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/aecd6259fe9f99b2c2b3440eab6a2b989125e00d.1588021877.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-05-05 09:22:22 -06:00
Alexey Gladkov c59f415a7c Use proc_pid_ns() to get pid_namespace from the proc superblock
A special helper should be used to get the pid_namespace from the procfs
superblock. This avoids errors when the type of s_fs_info changes.

Link: https://lore.kernel.org/lkml/20200423200316.164518-3-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/20200423112858.95820-1-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/06B50A1C-406F-4057-BFA8-3A7729EA7469@lca.pw/
Signed-off-by: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-04-24 16:38:30 -05:00