Commit Graph

2245 Commits

Author SHA1 Message Date
Andreas Gruenbacher f8ab09f886 gfs2: Remove LM_FLAG_PRIORITY flag
JIRA: https://issues.redhat.com/browse/RHEL-77720

The last user of this flag was removed in commit b77b4a4815a9 ("gfs2:
Rework freeze / thaw logic").

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 0b93bac2271e11beb980fca037a34a9819c7dc37)
2025-03-07 15:45:14 +01:00
Ian Kent 372c5a349d erofs: set block size to the on-disk block size
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit d3c4bdcc756e60b95365c66ff58844ce75d1c8f8
Author: Jingbo Xu <jefflexu@linux.alibaba.com>
Date:   Mon Mar 13 21:53:09 2023 +0800

    erofs: set block size to the on-disk block size

    Set the block size to that specified in on-disk superblock.

    Also remove the hard constraint of PAGE_SIZE block size for the
    uncompressed device backend.  This constraint temporarily remains
    for the compressed device and fscache backends, as more work is needed
    to handle the case where the block size is not equal to PAGE_SIZE.

    It is worth noting that the on-disk block size is read prior to
    erofs_superblock_csum_verify(), as the read block size is needed in the
    latter.

    Besides, later we are going to make erofs refer to tar data blobs (which
    is 512-byte aligned) for OCI containers, where the block size is 512
    bytes.  In this case, the 512-byte block size may not be adequate for a
    directory to contain enough dirents.  To fix this, we are also going to
    introduce a directory block size independent of the block size.

    Since block sizes smaller than PAGE_SIZE are now supported, disable all
    images with such a separate directory block size until this feature is
    supported.

    Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Yue Hu <huyue2@coolpad.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20230313135309.75269-3-jefflexu@linux.alibaba.com
    [ Gao Xiang: update documentation. ]
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 19:09:19 +08:00
Ian Kent 0fd7363a2a erofs: add documentation for 'domain_id' mount option
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit b22c7b97189d461d7143052da83b36390c623b54
Author: Jingbo Xu <jefflexu@linux.alibaba.com>
Date:   Thu Jan 12 14:54:30 2023 +0800

    erofs: add documentation for 'domain_id' mount option

    Since the EROFS share domain feature for fscache mode has been available
    since Linux v6.1, let's add documentation for 'domain_id' mount option.

    Cc: linux-doc@vger.kernel.org
    Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
    Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20230112065431.124926-2-jefflexu@linux.alibaba.com
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 19:08:52 +08:00
Ian Kent e0d670a075 erofs: enable large folios for fscache mode
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit e6687b89225ee9c817e6dcbadc873f6a4691e5c2
Author: Jingbo Xu <jefflexu@linux.alibaba.com>
Date:   Thu Dec 1 15:42:56 2022 +0800

    erofs: enable large folios for fscache mode

    Enable large folios for fscache mode.  Enable this feature for
    non-compressed format for now, until the compression part supports large
    folios later.

    One thing worth noting is that, the feature is not enabled for the meta
    data routine since meta inodes don't need large folios for now, nor do
    they support readahead yet.

    Also document this new feature.

    Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
    Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
    Link: https://lore.kernel.org/r/20221201074256.16639-3-jefflexu@linux.alibaba.com
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 19:08:47 +08:00
Ian Kent a240ef364f erofs: update documentation
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit 2109901d49c5876cb424c55a520c750982d68593
Author: Gao Xiang <xiang@kernel.org>
Date:   Wed Nov 30 17:56:05 2022 +0800

    erofs: update documentation

    - Refine highlights for main features;

    - Add multi-reference pclusters and fragment description.

    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Yue Hu <huyue2@coolpad.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20221130095605.4656-1-hsiangkao@linux.alibaba.com

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 19:08:42 +08:00
Ian Kent 785528b7d8 erofs: update documentation
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit 6e95d0a01899ed176b3450db057c3c0a9609cf47
Author: Gao Xiang <xiang@kernel.org>
Date:   Fri May 27 15:01:33 2022 +0800

    erofs: update documentation

     - refine the filesystem overview for better description of recent
       new features like FSDAX and Fscache;

     - add the new `fsid' mount option;

     - fix some typos.

    Link: https://lore.kernel.org/r/20220527070133.77962-1-hsiangkao@linux.alibaba.com
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 19:04:18 +08:00
Ian Kent 17a352cc70 cachefiles: document on-demand read mode
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit 99302ebd3af7895cb5312e80e65d4db5aed5a72d
Author: Jeffle Xu <jefflexu@linux.alibaba.com>
Date:   Mon Apr 25 20:21:30 2022 +0800

    cachefiles: document on-demand read mode

    Document new user interface introduced by on-demand read mode.

    Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
    Link: https://lore.kernel.org/r/20220509074028.74954-9-jefflexu@linux.alibaba.com
    Acked-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 18:59:13 +08:00
Ian Kent 419ce68ff4 erofs: rename ctime to mtime
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit a1108dcd9373a98f7018aa4310076260b8ecfc0b
Author: David Anderson <dvander@google.com>
Date:   Thu Mar 17 19:49:59 2022 +0800

    erofs: rename ctime to mtime

    EROFS images should inherit modification time rather than change time,
    since users and host tooling have no easy way to control change time.

    To reflect the new timestamp meaning, i_ctime and i_ctime_nsec are
    renamed to i_mtime and i_mtime_nsec.

    Link: https://lore.kernel.org/r/20220311041829.3109511-1-dvander@google.com # v1
    Signed-off-by: David Anderson <dvander@google.com>
    [ Gao Xiang: update document as well. ]
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20220317114959.106787-1-hsiangkao@linux.alibaba.com # v2
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 18:55:16 +08:00
Ian Kent 3aabb7ff30 erofs: add sysfs interface
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

commit 168e9a76200c54c584a23aa88c62c53c4b0edd66
Author: Huang Jianan <huangjianan@oppo.com>
Date:   Wed Dec 1 22:54:36 2021 +0800

    erofs: add sysfs interface

    Add sysfs interface to configure erofs related parameters later.

    Link: https://lore.kernel.org/r/20211201145436.4357-1-huangjianan@oppo.com
    Reviewed-by: Chao Yu <chao@kernel.org>
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Signed-off-by: Huang Jianan <huangjianan@oppo.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 18:48:37 +08:00
Ian Kent d4a7eb6d40 erofs: add multiple device support
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

Conflict: Include hunks dropped by CentOS Stream commits 7f25e45b02
    and 20ce3bbee1. (upstream commits cd913c76f489 and de2051147771).
    CentOS Stream commit 794cb59448 alters z_erofs_submit_queue() and
    has been applied later upstream, make a best effort to follow
    the requirements of upstream commit 07888c665b405 ("block: pass
    a block_device and opf to bio_alloc") at around line 1305 in
    fs/erofs/zdata.c

commit dfeab2e95a75a424adf39992ac62dcb9e9517d4a
Author: Gao Xiang <xiang@kernel.org>
Date:   Thu Oct 14 16:10:10 2021 +0800

    erofs: add multiple device support

    In order to support multi-layer container images, add multiple
    device feature to EROFS. Two ways are available to use for now:

     - Devices can be mapped into 32-bit global block address space;
     - Device ID can be specified with the chunk indexes format.

    Note that it assumes no extent would cross device boundary and mkfs
    should take care of it seriously.

    In the future, a dedicated device manager could be introduced so that
    extra devices can be automatically scanned by UUID as well.

    Link: https://lore.kernel.org/r/20211014081010.43485-1-hsiangkao@linux.alibaba.com
    Reviewed-by: Chao Yu <chao@kernel.org>
    Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 18:30:48 +08:00
Ian Kent 8a79c05fb2 erofs: dax support for non-tailpacking regular file
JIRA: https://issues.redhat.com/browse/RHEL-31991
Upstream status: Linus

Conflicts: The hunk that includes <linux/dax.h> has been removed because
    it was added in CentOS Stream backport commit c70fa990f3.
    Include hunks dropped by CentOS Stream commits 7f25e45b02,
    f4673910d2 and 687ee3abd4 (upstream commits cd913c76f489d,
    8012b86608552 and 7b0800d00dae8).
    Update to accommodate existing CentOS Stream commit 930d4bbabf
    ("mm: remove enum page_entry_size").

commit 06252e9ce05b94b587e522667b85848a30197b15
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   2021-08-05 08:36:00 +0800

    erofs: dax support for non-tailpacking regular file

    DAX is quite useful for some VM use cases, as it can save a large
    amount of guest memory with a minimal, lightweight EROFS.

    In order to prepare for such use cases, add preliminary dax support
    for non-tailpacking regular files for now.

    Tested with the DRAM-emulated PMEM and the EROFS image generated by
    "mkfs.erofs -Enoinline_data enwik9.fsdax.img enwik9"

    Link: https://lore.kernel.org/r/20210805003601.183063-3-hsiangkao@linux.alibaba.com
    Cc: nvdimm@lists.linux.dev
    Cc: linux-fsdevel@vger.kernel.org
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2025-01-17 18:30:48 +08:00
Rado Vrbovsky b68b07aa18 Merge: gfs2: Revise glock reference counting model
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5784

JIRA: https://issues.redhat.com/browse/RHEL-67675
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Approved-by: Abhi Das <adas@redhat.com>
Approved-by: Andrew Price <anprice@redhat.com>
Approved-by: Paul Evans <pevans@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-28 20:19:11 +00:00
Andreas Gruenbacher acdf764471 gfs2: Get rid of demote_ok checks
JIRA: https://issues.redhat.com/browse/RHEL-67675

The demote_ok glock operation is only still used to prevent the inode
glocks of the "jindex" and "rindex" directories from getting recycled
while they are still referenced by sdp->sd_jindex and sdp->sd_rindex.
However, the LRU walking code will no longer recycle glocks which are
referenced, so the demote_ok glock operation is obsolete and can be
removed.

Each of a glock's holders in the gl_holders list is holding a reference
on the glock, so when the list of holders isn't empty in demote_ok(),
the existing reference count check will already prevent the glock from
getting released.  This means that demote_ok() is obsolete as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 713f8834389f4b34bc8b449412202543c8b32214)
2024-11-14 21:01:15 +01:00
Andreas Gruenbacher 91c928f44f gfs2: Update glocks documentation
JIRA: https://issues.redhat.com/browse/RHEL-67675

Rearrange the table of locking modes and associated caching capability
to be in order of increasing caching capability.

Update the description of the glock operations.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 97d6fdcd79752af8686ab58a0b9389ba80ae0fae)
2024-11-14 17:58:11 +01:00
Rado Vrbovsky c154c6dc53 Merge: fs: backport mnt_idmap type
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4324

JIRA: https://issues.redhat.com/browse/RHEL-33888

This MR backports idmapping changes to sync our RHEL-9 kernel with the
upstream kernel to version 6.3.

Our current kernel has idmapped mounts support but there have been many
changes since this initial implementation in the base kernel. In
particular we need the type safety changes and we have seen difficulty
backporting other requested changes on more than one occasion.

The Jira this MR has been raised for is another example of such a request.

It is needed for a back port of a BPF feature to RHEL 9 which allows BPF
programs to do file verification with LSM and fsverity. To satisfy this
request changes made in the upstream 6.3 kernel are needed which is the
reason we have chosen upstream 6.3 as the target release for the MR.

The first fix has been omitted because it appears to be the same as
24b5308cf5ee ("selftests/filesystems: grant executable permission to
run_fat_tests.sh"). In any case the requirement is to make the path
tools/testing/selftests/filesystems/fat/run_fat_tests.sh executable which
is done.

The second and third omitted patches are a straight apply and revert, leaving
the source unchanged.

Omitted-Fix: 1d4beeb4edc7 ("selftests/filesystems: grant executable permission to run_fat_tests.sh")

Omitted-Fix: 4a47c6385bb4 ovl: turn of SB_POSIXACL with idmapped layers temporarily

Omitted-Fix: 7c4d37c269ac Revert "ovl: turn of SB_POSIXACL with idmapped layers temporarily"

Signed-off-by: Ian Kent <ikent@redhat.com>

Approved-by: Scott Mayhew <smayhew@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Xin Long <lxin@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-11 08:26:30 +00:00
Rado Vrbovsky 570a71d7db Merge: mm: update core code to v6.6 upstream
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5252

JIRA: https://issues.redhat.com/browse/RHEL-27743  
JIRA: https://issues.redhat.com/browse/RHEL-59459    
CVE: CVE-2024-46787    
Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4961  
  
This MR brings RHEL9 core MM code up to upstream's v6.6 LTS level.    
This work follows up on the previous v6.5 update (RHEL-27742) and as such,    
the bulk of this changeset is comprised of refactoring and clean-ups of     
the internal implementation of several APIs as it further advances the     
conversion to FOLIOS, and follow up on the per-VMA locking changes.

Also, with the rebase to v6.6 LTS, we complete the infrastructure to allow    
Control-flow Enforcement Technology, a.k.a. Shadow Stacks, for x86 builds,    
and we add a potential extra level of protection (assessment pending) to help    
on mitigating kernel heap exploits dubbed as "SlubStick".     
    
Follow-up fixes are omitted from this series either because they are irrelevant to     
the bits we support on RHEL or because they depend on bigger changesets introduced     
upstream more recently. A follow-up ticket (RHEL-27745) will deal with these and other cases separately.    

Omitted-fix: e540b8c5da04 ("mips: mm: add slab availability checking in ioremap_prot")    
Omitted-fix: f7875966dc0c ("tools headers UAPI: Sync files changed by new fchmodat2 and map_shadow_stack syscalls with the kernel sources")   
Omitted-fix: df39038cd895 ("s390/mm: Fix VM_FAULT_HWPOISON handling in do_exception()")    
Omitted-fix: 12bbaae7635a ("mm: create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macros")    
Omitted-fix: fd1a745ce03e ("mm: support page_mapcount() on page_has_type() pages")    
Omitted-fix: d99e3140a4d3 ("mm: turn folio_test_hugetlb into a PageType")    
Omitted-fix: fa2690af573d ("mm: page_ref: remove folio_try_get_rcu()")    
Omitted-fix: f442fa614137 ("mm: gup: stop abusing try_grab_folio")    
Omitted-fix: cb0f01beb166 ("mm/mprotect: fix dax pud handling")    
    
Signed-off-by: Rafael Aquini <raquini@redhat.com>

Approved-by: John W. Linville <linville@redhat.com>
Approved-by: Mark Salter <msalter@redhat.com>
Approved-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Steve Best <sbest@redhat.com>
Approved-by: David Airlie <airlied@redhat.com>
Approved-by: Michal Schmidt <mschmidt@redhat.com>
Approved-by: Baoquan He <5820488-baoquan_he@users.noreply.gitlab.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-10-30 07:22:28 +00:00
Ian Kent 1225837915 fscrypt: remove mention of symlink st_size quirk from documentation
JIRA: https://issues.redhat.com/browse/RHEL-33888
Upstream status: Linus

commit e538b0985a05cfe245ada0bb92f177efec6b8a88
Author: Eric Biggers <ebiggers@google.com>
Date:   Thu Jul 1 23:53:50 2021 -0700

    fscrypt: remove mention of symlink st_size quirk from documentation

    Now that the correct st_size is reported for encrypted symlinks on all
    filesystems, update the documentation accordingly.

    Link: https://lore.kernel.org/r/20210702065350.209646-6-ebiggers@kernel.org
    Signed-off-by: Eric Biggers <ebiggers@google.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 14:20:06 +08:00
Ian Kent 92d69b838d fs: port xattr to mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: The cifs source has been moved in CentOS Stream so manually
	apply rejected hunk to fs/smb/client/xattr.c.
        Dropped hunks for ntfs3 because the source is not present in
        the CentOS Stream source tree.
	CentOS Stream commit 98ba731fc7 ("ovl: Move xattr support
	to new xattrs.c file") moved ovl_own_xattr_set(), manually apply
	changes.
	CentOS Stream commit 67e2fcb2f3 ("evm: don't copy up
	'security.evm' xattr") is present causing hunk #1 against
	include/linux/evm.h to be rejected, manually apply.
	Upstream commit 5d1ef2ce13a90 ("ima: Introduce
	ima_get_current_hash_algo()") is not present in CentOS Stream
	which causes fuzz 1 for hunk #1 against include/linux/ima.h.
	There's a reject of hunk #1 for include/linux/lsm_hooks.h but
	I can't see any reason for it, manually applied the hunk.
	CentOS Stream does not have upstream commit ce5bb5a86e5eb
	("ima: Return int in the functions to measure a buffer") which
	results in a reject of hunk #2 against security/integrity/ima/ima.h
	and hunks #8 and #11 against security/integrity/ima/ima_main.c, so
	manually apply hunks. There also appears to be a whitespace
	mismatch causing hunk #7 to report fuzz 2 on application.
	CentOS Stream does not have upstream commit c7423dbdbc9ec
	("ima: Handle -ESTALE returned by ima_filter_rule_match()")
	which results in a reject of hunk #3 against
	security/integrity/ima/ima_policy.c, so manually apply hunk.

commit 39f60c1ccee72caa0104145b5dbf5d37cce1ea39
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:23 2023 +0100

    fs: port xattr to mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:21 +08:00
Ian Kent 304ec491ee fs: port ->permission() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	CentOS Stream commit 48fa94aacd ("ceph: fscrypt_auth handling
	for ceph") is present which causes fuzz 2 in hunk #1 in
	fs/ceph/super.h.
	Upstream commit 427505ffeaa46 ("exportfs: use pr_debug for
	unreachable debug statements") is not present causing fuzz 2
	in hunk #1 against fs/exportfs/expfs.c.
	Dropped hunks for ksmbd because the source is not present in the
	CentOS Stream source tree.
	Upstream commit 03fa86e9f79d8 ("namei: stash the sampled ->d_seq
	into nameidata") is not present causing a fuzz 1 for hunk #14
	against fs/namei.c.
	CentOS Stream c4f3dd0731 ("nfsd: handle failure to collect
	pre/post-op attrs more sanely") is present and causes rejects
	for hunks #4 and #5 against fs/nfsd/vfs.c, apply manually.
	Dropped hunks for ntfs3 because the source is not present in the
	CentOS Stream source tree.
	CentOS Stream commit 98ba731fc7 ("ovl: Move xattr support to
	new xattrs.c file") moves ovl_xattr_set() and ovl_xattr_get()
	from fs/overlayfs/inode.c to fs/overlayfs/xattrs.c which causes
	hunks #4 and #5 to fail, manually apply to fs/overlayfs/xattrs.c.
	CentOS Stream commit 55177e4b83 ("ovl: mark xwhiteouts directory
	with overlay.opaque='x'") and commit d17b324bb6 ("ovl: use
	ovl_numlower() and ovl_lowerstack() accessors") change the first
	and third hunks of fs/overlayfs/namei.c causing them to fail,
	manually apply.
	CentOS Stream commit 98ba731fc7 ("ovl: Move xattr support to
	new xattrs.c file") causes fuzz 2 in hunk #5 of
	fs/overlayfs/overlayfs.h
	CentOS Stream commit 355a9c490a ("ovl: Add an alternative
	type of whiteout") changes ovl_cache_update_ino() to
	ovl_cache_update() in fs/overlayfs/readdir.c, make the change
	manually.
	Upstream commit 217af7e2f4deb ("apparmor: refactor profile
	rules and attachments") is not in CentOS Stream causing hunk #1
	to fail to apply so manually apply the change.

commit 4609e1f18e19c3b302e1eb4858334bca1532f780
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:22 2023 +0100

    fs: port ->permission() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:20 +08:00
Ian Kent 060dc0b240 fs: port ->fileattr_set() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.

commit 8782a9aea3ab4d697ad67d1f8ebca38a4e1c24ab
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:21 2023 +0100

    fs: port ->fileattr_set() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:18 +08:00
Ian Kent be97228574 fs: port ->set_acl() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsacl.c and
	fs/smb/client/cifsproto.h.
	Dropped hunks for ntfs3 and ksmbd because the source is not
	present in the CentOS Stream source tree.
	CentOS Stream commit 892da692fa ("shmem: support idmapped
	mounts for tmpfs") is present, which causes hunk #1 against
	mm/shmem.c to be rejected, manually apply the hunk.
	CentOS Stream commit 48fa94aacd ("ceph: fscrypt_auth handling
	for ceph") is present which causes fuzz 1 of hunk #1 against
	fs/ceph/inode.c.

commit 13e83a4923bea7c4f2f6714030cb7e56d20ef7e5
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:20 2023 +0100

    fs: port ->set_acl() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:12 +08:00
Ian Kent 1176258599 fs: port ->get_acl() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsacl.c and
	fs/smb/client/cifsproto.h.
	Upstream merge commit 05e6295f7b5e0 ("Merge tag 'fs.idmapped.v6.3'
	of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping")
	has changes already applied in Upstream commit facd61053cff1 (
	"fuse: fixes after adapting to new posix acl api") so just apply
	the additional changes (those that relate to the ->get_acl() port).
	CentOS Stream commit 98ba731fc7 ("ovl: Move xattr support
	to new xattrs.c file") resulted in fuzz for its hunk against
	fs/overlayfs/overlayfs.h.

commit 77435322777d8a8a08264a39111bef94e32b871b
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:19 2023 +0100

    fs: port ->get_acl() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:11 +08:00
Ian Kent 0dcf7b37eb fs: port ->tmpfile() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	Upstream commit 863f144f12add ("vfs: open inside ->tmpfile()") is
	not present which caused a reject in fs/f2fs/namei.c for hunk #1,
	applied manually.
	The hunk of the patch against fs/minix/namei.c was rejected but I
	can't see any reason for it, applied manually.
	CentOS Stream has commit 9e0a1fff8d ("ubifs: Implement
	RENAME_WHITEOUT") which caused a reject in the hunk against
	fs/ubifs/dir.c, manually applied.

commit 011e2b717b1b921d3706a9d48ff83a025563e826
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:18 2023 +0100

    fs: port ->tmpfile() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.
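
The type-safety argument above can be sketched in userspace C. This is only an illustration under stated assumptions: the struct layouts, the single-offset mapping, and the helper names map_id_old() and map_id_new() are invented stand-ins, not the kernel's definitions.

```c
/* Simplified stand-ins for illustration, NOT the kernel definitions. */
struct user_namespace {
	unsigned int id_offset;	/* toy mapping: one contiguous range */
};

/* A distinct wrapper type for the idmapping attached to a mount. */
struct mnt_idmap {
	struct user_namespace *owner;
};

/*
 * Old style: both namespaces have the same type, so swapping the two
 * arguments compiles without any diagnostic and yields a wrong id.
 */
static unsigned int map_id_old(struct user_namespace *mnt_userns,
			       struct user_namespace *fs_userns,
			       unsigned int kid)
{
	return kid - fs_userns->id_offset + mnt_userns->id_offset;
}

/*
 * New style: the mount side is a struct mnt_idmap. Passing the
 * arguments in the wrong order is now an incompatible-pointer-type
 * diagnostic at compile time instead of a silent runtime bug.
 */
static unsigned int map_id_new(struct mnt_idmap *idmap,
			       struct user_namespace *fs_userns,
			       unsigned int kid)
{
	return kid - fs_userns->id_offset + idmap->owner->id_offset;
}
```

With a filesystem namespace at offset 0 and a mount namespace at offset 100000, both helpers map id 1000 to 101000 when called correctly. Only map_id_old() still accepts the two namespaces in swapped order.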

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:10 +08:00
Ian Kent 956e3ad810 fs: port ->mknod() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsfs.h and
	fs/smb/client/dir.c.
	Dropped hunks for ntfs3 because the source is not present in the
	CentOS Stream source tree.
	CentOS Stream commit 892da692fa ("shmem: support idmapped
	mounts for tmpfs") is present, which causes hunks #2-#4 to be
	rejected, manually apply the hunks.
	CentOS Stream commit f0f830cd7e ("ceph: create symlinks with
	encrypted and base64-encoded targets") is present and resulted
	in fuzz against fs/ceph/dir.c hunk #2.
	Upstream commit 863f144f12add ("vfs: open inside ->tmpfile()")
	is missing causing fuzz against fs/ext2/namei.c.
	Upstream commit 7d37539037c2f ("fuse: implement ->tmpfile()")
	is missing causing fuzz in hunk #4 against fs/fuse/dir.c.
	CentOS Stream commit 892da692fa ("shmem: support idmapped
	mounts for tmpfs") is present, so a patch reorder was needed
	with appropriate adjustments.

commit 5ebb29bee8d5fc173b774e0755be8cb335503ee3
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:16 2023 +0100

    fs: port ->mknod() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:08 +08:00
Ian Kent 19f3b4f1ba fs: port ->rename() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsfs.h and
	fs/smb/client/inode.c.
	Dropped hunks for ntfs3 because the source is not present in the
	CentOS Stream source tree.
	Upstream commit cc14d24026704 ("hpfs: Convert symlinks to
	read_folio") is not present which causes fuzz 1 for hunk #1.
	CentOS Stream commit 892da692fa ("shmem: support idmapped
	mounts for tmpfs") is present, so a patch reorder was needed
	with appropriate adjustments.

commit e18275ae55e07a2937e48134589c2f4c1d99a369
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:17 2023 +0100

    fs: port ->rename() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:07 +08:00
Ian Kent a7750be4f4 fs: port ->mkdir() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsfs.h and
	fs/smb/client/inode.c.
	Dropped hunks for ntfs3 because the source is not present in the
	CentOS Stream source tree.

commit c54bd91e9eaba43f09aadc25b52ea869ff3b5587
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:15 2023 +0100

    fs: port ->mkdir() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:00 +08:00
Ian Kent 5744ba0ee3 fs: port ->symlink() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsfs.h and
	fs/smb/client/link.c.
	Dropped hunks for ntfs3 because the source is not present in the
	CentOS Stream source tree.
	CentOS Stream commit f0f830cd7e ("ceph: create symlinks with
	encrypted and base64-encoded targets") is present and resulted
	in fuzz against fs/ceph/dir.c.

commit 7a77db95511c39be4b2db2ceca152ef589adc2dc
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:14 2023 +0100

    fs: port ->symlink() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:45:00 +08:00
Ian Kent a56d1daadf fs: port ->create() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsfs.h and
	fs/smb/client/dir.c.
	Dropped hunks for ntfs3 because the source is not present in the
	CentOS Stream source tree.
	CentOS Stream commit 892da692fa ("shmem: support idmapped
	mounts for tmpfs") is present, which causes fuzz in mm/shmem.c.

commit 6c960e68aaed335a0040f16654f3c5e5bfcf9249
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:13 2023 +0100

    fs: port ->create() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 10:44:53 +08:00
Ian Kent 6ad3fa5fce fs: port ->getattr() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: CentOS Stream has commit 3e0b6f1fa9 ("afs: use
	read_seqbegin() in afs_check_validity() and afs_getattr()"),
	manually apply hunk #2 to fs/afs/inode.c.
	CentOS Stream commit 3b06927229 ("afs: split
	afs_pagecache_valid() out of afs_validate()") is present which
	causes a reject in fs/afs/internal.h, manually apply hunk to
	fs/afs/internal.h.
	For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	CentOS Stream commit 48fa94aacd ("ceph: fscrypt_auth handling
	for ceph") alters the definition of _ceph_setattr() causing fuzz.
	The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsfs.h and
	fs/smb/client/inode.c.
	Upstream commit 2e1d66379e ("staging: erofs: drop the extern
	prefix for function definitions") caused strange behaviour when
	applying this patch, there was a conflict in fs/erofs/internal.h but
	after a refresh the hunk and context looked ok. The hunk had to be
	manually applied.
	Upstream commit 2db0487faa211 ("f2fs: move f2fs_force_buffered_io()
	into file.c") is not present in CentOS Stream which causes fuzz
	when applying the first hunk to fs/f2fs/file.c.
	Upstream commit 30abce053f811 ("fat: report creation time in statx")
	is not present in CentOS Stream which caused a reject so apply change
	manually.
	Dropped hunks for ksmbd because the source is not present in the
	CentOS Stream source tree.
	Dropped hunks for ntfs3 because the source is not present in the
	CentOS Stream source tree.
	There was fuzz with hunk #2 against fs/nfs/inode.c but I was
	unable to see any difference.
	CentOS Stream commit 98ba731fc7 ("ovl: Move xattr support
	to new xattrs.c file") is present which caused fuzz in
	fs/overlayfs/overlayfs.h.
	Upstream commit d919a1e79bac8 ("proc: fix a dentry lock race
	between release_task and lookup") is not present in CentOS
	Stream causing fuzz applying hunk #1 against fs/proc/base.c.
	CentOS Stream commit 20c470188c ("vfs: plumb i_version
	handling into struct kstat") is present causing fuzz in hunk
	#2 against fs/stat.c.
	Upstream commit e0c49bd2b4d3c ("fs: sysv: Fix sysv_nblocks()
	returns wrong value") is not present in CentOS Stream causing
	fuzz applying hunk #1 against fs/sysv/itree.c.
	CentOS Stream commit 892da692fa ("shmem: support idmapped
	mounts for tmpfs") is present so it's ok to pass idmap to
	generic_fillattr().
	CentOS Stream commit f0f830cd7e ("ceph: create symlinks
	with encrypted and base64-encoded targets") uses the old
	struct user_namespace and so leaves those changes out, make
	those getattr() changes here.
	Allow for CentOS Stream commit 6c3396a0d8 ("kernfs: Introduce
	separate rwsem to protect inode attributes") which is already
	present.
	CentOS Stream commit f5219db0c0 ("KVM: fix Add KVM_CREATE_GUEST_MEMFD
	ioctl() for guest-specific backing memory") updated the upstream commit
	a7800aa80ea4d ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific
	backing memory") to account for missing idmapping commits. Now that we
	have updated the second and final place these changes were made, make
	the final needed adjustment to match the original upstream patch.

commit b74d24f7a74ffd2d42ca883d84b7422b8d545901
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:12 2023 +0100

    fs: port ->getattr() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 09:37:45 +08:00
Ian Kent 0001830c3d proc: report open files as size in stat() for /proc/pid/fd
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

commit f1f1f2569901ec5b9d425f2e91c09a0e320768f3
Author: Ivan Babrou <ivan@cloudflare.com>
Date:   Thu Sep 22 15:40:26 2022 -0700

    proc: report open files as size in stat() for /proc/pid/fd

    Many monitoring tools include open file count as a metric.  Currently the
    only way to get this number is to enumerate the files in /proc/pid/fd.

    The problem with the current approach is that it does many things people
    generally don't care about when they need one number for a metric.  In our
    tests for cadvisor, which reports open file counts per cgroup, we observed
    that reading the number of open files is slow.  Out of 35.23% of CPU time
    spent in `proc_readfd_common`, we see 29.43% spent in `proc_fill_cache`,
    which is responsible for filling dentry info.  Some of this extra time is
    spinlock contention, but it's a contention for the lock we don't want to
    take to begin with.

    We considered putting the number of open files in /proc/pid/status.
    Unfortunately, counting the number of fds involves iterating the
    open_files bitmap, which has a linear complexity in proportion with the
    number of open files (bitmap slots really, but it's close).  We don't want
    to make /proc/pid/status any slower, so instead we put this info in
    /proc/pid/fd as a size member of the stat syscall result.  Previously the
    reported number was zero, so there's very little risk of breaking
    anything, while still providing a somewhat logical way to count the open
    files with a fallback if it's zero.

    RFC for this patch included iterating open fds under RCU.  Thanks to Frank
    Hofmann for the suggestion to use the bitmap instead.

    Previously:

    ```
    $ sudo stat /proc/1/fd | head -n2
      File: /proc/1/fd
      Size: 0               Blocks: 0          IO Block: 1024   directory
    ```

    With this patch:

    ```
    $ sudo stat /proc/1/fd | head -n2
      File: /proc/1/fd
      Size: 65              Blocks: 0          IO Block: 1024   directory
    ```

    Correctness check:

    ```
    $ sudo ls /proc/1/fd | wc -l
    65
    ```

    I added the docs for /proc/<pid>/fd while I'm at it.

    [ivan@cloudflare.com: use bitmap_weight() to count the bits]
      Link: https://lkml.kernel.org/r/20221018045844.37697-1-ivan@cloudflare.com
    [akpm@linux-foundation.org: include linux/bitmap.h for bitmap_weight()]
    [ivan@cloudflare.com: return errno from proc_fd_getattr() instead of setting negative size]
      Link: https://lkml.kernel.org/r/20221024173140.30673-1-ivan@cloudflare.com
    Link: https://lkml.kernel.org/r/20220922224027.59266-1-ivan@cloudflare.com
    Signed-off-by: Ivan Babrou <ivan@cloudflare.com>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: David Laight <David.Laight@ACULAB.COM>
    Cc: Ivan Babrou <ivan@cloudflare.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Kalesh Singh <kaleshsingh@google.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 09:28:48 +08:00
Ian Kent 43ca440cdf fs: port ->setattr() to pass mnt_idmap
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: CentOS Stream commit 3c29fadfb1 ("afs: split
	afs_pagecache_valid() out of afs_validate()") is present, manually
	adjust hunk #1 of fs/afs/internal.h.
	For consistency drop btrfs hunks because it isn't supported in
	CentOS Stream and other backports also drop such hunks.
	CentOS Stream commit 48fa94aacd ("ceph: fscrypt_auth handling
	for ceph") alters the definition of _ceph_setattr(), adjust
	manually.
	CentOS Stream commit 34b2a2b5a3 ("ceph: add some fscrypt
	guardrails") introduces a call to fscrypt_prepare_setattr() which
	causes fuzz when applying.
	The cifs source has been moved in CentOS Stream so manually
	apply rejected hunks to fs/smb/client/cifsfs.h and
	fs/smb/client/inode.c.
	Upstream commit 5a646fb3a3e2d ("coda: avoid doing bad things on
	inode type changes during revalidation") is not present which
	causes fuzz in fs/coda/coda_linux.h.
	Dropped hunks for ntfs3 because the source is not present in
	the CentOS Stream source tree.
	CentOS Stream commit 98ba731fc7 ("ovl: Move xattr support
	to new xattrs.c file") is present so manually apply hunk.
	CentOS Stream commit 892da692fa ("shmem: support idmapped
	mounts for tmpfs") is present so it's ok to pass idmap to
	setattr_prepare() and setattr_copy().
	Update to add incremental changes needed due to CentOS Stream
	commit 469e1d13f6 ("shmem: quota support").
	Allow for CentOS Stream commit 6c3396a0d8 ("kernfs: Introduce
	separate rwsem to protect inode attributes") which is already
	present.
	CentOS Stream commit f5219db0c0 ("KVM: fix Add KVM_CREATE_GUEST_MEMFD
	ioctl() for guest-specific backing memory") updated the upstream commit
	a7800aa80ea4d ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific
	backing memory") to account for missing idmapping commits. Now that we
	have updated one of the two places these changes were made, make one
	of the needed adjustments to match the original upstream patch.

commit c1632a0f11209338fc300c66252bcc4686e609e8
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Jan 13 12:49:11 2023 +0100

    fs: port ->setattr() to pass mnt_idmap

    Convert to struct mnt_idmap.

    Last cycle we merged the necessary infrastructure in
    256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
    This is just the conversion to struct mnt_idmap.

    Currently we still pass around the plain namespace that was attached to a
    mount. This is in general pretty convenient but it makes it easy to
    conflate namespaces that are relevant on the filesystem with namespaces
    that are relevant on the mount level. Especially for non-vfs developers
    without detailed knowledge in this area this can be a potential source for
    bugs.

    Once the conversion to struct mnt_idmap is done all helpers down to the
    really low-level helpers will take a struct mnt_idmap argument instead of
    two namespace arguments. This way it becomes impossible to conflate the two
    eliminating the possibility of any bugs. All of the vfs and all filesystems
    only operate on struct mnt_idmap.

    Acked-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-16 09:07:05 +08:00
Ian Kent e38fc575b4 fs: add new get acl method
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

commit 7420332a6ff407ba2d3d25f5e8430bf426131d1d
Author: Christian Brauner <brauner@kernel.org>
Date:   Thu Sep 22 17:17:01 2022 +0200

    fs: add new get acl method

    The current way of setting and getting posix acls through the generic
    xattr interface is error prone and type unsafe. The vfs needs to
    interpret and fixup posix acls before storing or reporting it to
    userspace. Various hacks exist to make this work. The code is hard to
    understand and difficult to maintain in its current form. Instead of
    making this work by hacking posix acls through xattr handlers we are
    building a dedicated posix acl api around the get and set inode
    operations. This removes a lot of hackiness and makes the codepaths
    easier to maintain. A lot of background can be found in [1].

    Since some filesystems rely on the dentry being available to them when
    setting posix acls (e.g., 9p and cifs) they cannot rely on the old get
    acl inode operation to retrieve posix acl and need to implement their
    own custom handlers because of that.

    In a previous patch we renamed the old get acl inode operation to
    ->get_inode_acl(). We decided to rename it and implement a new one since
    ->get_inode_acl() is called from generic_permission() and inode_permission(),
    both of which can be called during a filesystem's ->permission()
    handler. So simply passing a dentry argument to ->get_acl() would have
    amounted to also having to pass a dentry argument to ->permission(). We
    avoided that change.

    This adds a new ->get_acl() inode operation which takes a dentry
    argument which filesystems such as 9p, cifs, and overlayfs can implement
    to get posix acls.

    Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-15 16:11:37 +08:00
Ian Kent 4f5c324efc fs: rename current get acl method
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: Upstream commit eadcd6b5a1eb3 ("erofs: add fiemap support
    with iomap") is not (yet) present in CentOS Stream.
    The changes for fs/ksmbd/* were dropped as the directory doesn't
    exist in CentOS Stream.
    The changes for fs/ntfs3/* were dropped as the directory doesn't
    exist in CentOS Stream.

commit cac2f8b8d8b50ef32b3e34f6dcbbf08937e4f616
Author: Christian Brauner <brauner@kernel.org>
Date:   Thu Sep 22 17:17:00 2022 +0200

    fs: rename current get acl method

    The current way of setting and getting posix acls through the generic
    xattr interface is error prone and type unsafe. The vfs needs to
    interpret and fixup posix acls before storing or reporting it to
    userspace. Various hacks exist to make this work. The code is hard to
    understand and difficult to maintain in its current form. Instead of
    making this work by hacking posix acls through xattr handlers we are
    building a dedicated posix acl api around the get and set inode
    operations. This removes a lot of hackiness and makes the codepaths
    easier to maintain. A lot of background can be found in [1].

    The current inode operation for getting posix acls takes an inode
    argument but various filesystems (e.g., 9p, cifs, overlayfs) need access
    to the dentry. In contrast to the ->set_acl() inode operation we cannot
    simply extend ->get_acl() to take a dentry argument. The ->get_acl()
    inode operation is called from:

    acl_permission_check()
    -> check_acl()
       -> get_acl()

    which is part of generic_permission() which in turn is part of
    inode_permission(). Both generic_permission() and inode_permission() are
    called in the ->permission() handler of various filesystems (e.g.,
    overlayfs). So simply passing a dentry argument to ->get_acl() would
    amount to also having to pass a dentry argument to ->permission(). We
    should avoid this unnecessary change.

    So instead of extending the existing inode operation rename it from
    ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that
    passes a dentry argument and which filesystems that need access to the
    dentry can implement instead of ->get_inode_acl(). Filesystems like cifs
    which allow setting and getting posix acls but not using them for
    permission checking during lookup can simply not implement
    ->get_inode_acl().

    This is intended to be a non-functional change.
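
The split between the two operations can be sketched with a userspace mock. Everything here is a simplified assumption for illustration: the struct layouts omit the namespace and rcu arguments, and acl_for_permission_check() and acl_for_explicit_get() are hypothetical names, not kernel functions.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration, NOT the kernel definitions. */
struct posix_acl { int dummy; };
struct inode;
struct dentry { struct inode *d_inode; };

struct inode_operations {
	/* inode-only variant, reachable from permission checking */
	struct posix_acl *(*get_inode_acl)(struct inode *inode, int type);
	/* dentry-based variant for filesystems that need the dentry */
	struct posix_acl *(*get_acl)(struct dentry *dentry, int type);
};

struct inode { const struct inode_operations *i_op; };

static struct posix_acl demo_acl;

/* A filesystem that needs the dentry implements only ->get_acl(). */
static struct posix_acl *dentry_fs_get_acl(struct dentry *dentry, int type)
{
	(void)dentry;
	(void)type;
	return &demo_acl;
}

static const struct inode_operations dentry_only_iops = {
	.get_acl = dentry_fs_get_acl,
};

/* Permission checking has no dentry, so it can only use the inode op. */
static struct posix_acl *acl_for_permission_check(struct inode *inode, int type)
{
	if (inode->i_op->get_inode_acl)
		return inode->i_op->get_inode_acl(inode, type);
	return NULL;	/* no ACL takes part in permission checks */
}

/* An explicit "get ACL" request has a dentry and prefers ->get_acl(). */
static struct posix_acl *acl_for_explicit_get(struct dentry *dentry, int type)
{
	struct inode *inode = dentry->d_inode;

	if (inode->i_op->get_acl)
		return inode->i_op->get_acl(dentry, type);
	return acl_for_permission_check(inode, type);
}
```

For an inode using dentry_only_iops, the permission-check path finds no ACL while an explicit get still returns one. That matches the cifs-style behaviour described above of storing ACLs without using them for permission checking.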

    Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
    Suggested-by/Inspired-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-15 16:11:26 +08:00
Ian Kent 310906db16 fs: pass dentry to set acl method
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: I didn't want to just drop the btrfs hunks so I made the
    change to btrfs_setattr() init_user_ns instead of the expected
    mnt_userns. That should at least cause a conflict if btrfs changes
    to a supported fs in the future.
    CentOS Stream commit 48fa94aacd ("ceph: fscrypt_auth handling for
    ceph") is present, make necessary adjustment.
    CentOS Stream commit 892da692fa ("shmem: support idmapped mounts
    for tmpfs") is present, make necessary adjustment.
    The changes for fs/ksmbd/* were dropped as the directory doesn't
    exist in CentOS Stream.
    The changes for fs/ntfs3/* were dropped as the directory doesn't
    exist in CentOS Stream.

commit 138060ba92b3b0d77c8e6818d0f33398b23ea42e
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Sep 23 10:29:39 2022 +0200

    fs: pass dentry to set acl method

    The current way of setting and getting posix acls through the generic
    xattr interface is error prone and type unsafe. The vfs needs to
    interpret and fixup posix acls before storing or reporting it to
    userspace. Various hacks exist to make this work. The code is hard to
    understand and difficult to maintain in its current form. Instead of
    making this work by hacking posix acls through xattr handlers we are
    building a dedicated posix acl api around the get and set inode
    operations. This removes a lot of hackiness and makes the codepaths
    easier to maintain. A lot of background can be found in [1].

    Since some filesystems rely on the dentry being available to them when
    setting posix acls (e.g., 9p and cifs) they cannot rely on set acl inode
    operation. But since ->set_acl() is required in order to use the generic
    posix acl xattr handlers filesystems that do not implement this inode
    operation cannot use the handler and need to implement their own
    dedicated posix acl handlers.

    Update the ->set_acl() inode method to take a dentry argument. This
    allows all filesystems to rely on ->set_acl().

    As far as I can tell all codepaths can be switched to rely on the dentry
    instead of just the inode. Note that the original motivation for passing
    the dentry separate from the inode instead of just the dentry in the
    xattr handlers was because of security modules that call
    security_d_instantiate(). This hook is called during
    d_instantiate_new(), d_add(), __d_instantiate_anon(), and
    d_splice_alias() to initialize the inode's security context and possibly
    to set security.* xattrs. Since this only affects security.* xattrs this
    is completely irrelevant for posix acls.

    Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-15 16:11:25 +08:00
Ian Kent 07a3bda2ba vfs: add rcu argument to ->get_acl() callback
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

Conflicts: CentOS Stream commit d592b7f96f ("9p: fix a bunch of
    checkpatch warnings") removes extern from declarations in
    fs/9p/acl.h.
    CentOS Stream commit 98ba731fc7 ("ovl: Move xattr support to
    new xattrs.c file") moved the declaration of ovl_xattr_get() and
    ovl_listxattr() to fs/overlayfs/xattrs.c.
    CentOS Stream commit fdb679f7a3 ("xfs: improve __xfs_set_acl")
    changes the declarations in fs/xfs/xfs_acl.h.

commit 0cad6246621b5887d5b33fea84219d2a71f2f99a
Author: Miklos Szeredi <mszeredi@redhat.com>
Date:   Wed Aug 18 22:08:24 2021 +0200

    vfs: add rcu argument to ->get_acl() callback

    Add a rcu argument to the ->get_acl() callback to allow
    get_cached_acl_rcu() to call the ->get_acl() method in the next patch.
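
How a callback might honour such an rcu flag can be sketched in userspace C. The assumptions here go beyond the message above: a callback that may need to block is assumed to refuse with -ECHILD when rcu is true, and struct acl_result, blocking_fs_get_acl() and get_acl_with_retry() are hypothetical names standing in for the kernel's ERR_PTR() convention and its callers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for illustration, NOT the kernel definition. */
struct posix_acl { int dummy; };

/* Userspace has no ERR_PTR(), so carry the error code separately. */
struct acl_result {
	struct posix_acl *acl;
	int err;
};

typedef struct acl_result (*get_acl_fn)(int type, bool rcu);

static struct posix_acl stored_acl;

/* A callback that would have to block (e.g. on I/O) to fetch the ACL. */
static struct acl_result blocking_fs_get_acl(int type, bool rcu)
{
	(void)type;

	/* In RCU mode sleeping is not allowed: ask the caller to retry. */
	if (rcu)
		return (struct acl_result){ NULL, -ECHILD };

	return (struct acl_result){ &stored_acl, 0 };
}

/* Try the non-blocking mode first, then fall back to blocking mode. */
static struct acl_result get_acl_with_retry(get_acl_fn fn, int type)
{
	struct acl_result res = fn(type, true);

	if (res.err == -ECHILD)
		res = fn(type, false);
	return res;
}
```

The caller attempts the cheap RCU-mode lookup first and repeats the call in blocking mode only when the callback reports that it could not answer without sleeping.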

    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-15 16:11:09 +08:00
Ian Kent 306e866d5f docs: Add small intro to idmap examples
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

commit ccbd0c991985afc53670e2b01840517922fc30e4
Author: Rodrigo Campos <rodrigo@sdfg.com.ar>
Date:   Fri Apr 29 15:57:48 2022 +0200

    docs: Add small intro to idmap examples

    When reading the documentation, I didn't understand why it lists
    examples of things that fail without using the mount idmap feature.
    It seems pretty pointless and I doubted if I was missing something,
    until I finished the examples, the next section and saw the examples
    revisited.  After that, it all made sense.

    Let's add one small sentence before, so the reader knows where this is
    going and why examples that might not seem relevant are used.

    Link: https://lore.kernel.org/r/20220429135748.481301-1-rodrigo@sdfg.com.ar
    Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-15 16:10:55 +08:00
Ian Kent c6e872114c docs: update mapping documentation
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

commit 8cc5c54de44c5e8e104d364a627ac4296845fc7f
Author: Christian Brauner <brauner@kernel.org>
Date:   Fri Dec 3 12:17:02 2021 +0100

    docs: update mapping documentation

    Now that we implement the full remapping algorithms described in our
    documentation remove the section about short-circuiting them.

    Link: https://lore.kernel.org/r/20211123114227.3124056-6-brauner@kernel.org (v1)
    Link: https://lore.kernel.org/r/20211130121032.3753852-6-brauner@kernel.org (v2)
    Link: https://lore.kernel.org/r/20211203111707.3901969-6-brauner@kernel.org
    Cc: Seth Forshee <sforshee@digitalocean.com>
    Cc: Amir Goldstein <amir73il@gmail.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    CC: linux-fsdevel@vger.kernel.org
    Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
    Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-15 16:10:53 +08:00
Ian Kent 0ba5d80a23 doc: give a more thorough id handling explanation
JIRA: https://issues.redhat.com/browse/RHEL-33888
Status: Linus

commit ad19607a90b29eef044660aba92a2a2d63b1e977
Author: Christian Brauner <brauner@kernel.org>
Date:   Tue Jul 27 12:44:16 2021 +0200

    doc: give a more thorough id handling explanation

    Currently there's no document explaining how idmappings work at all.
    Add a document that gives an introduction and also goes into a bit more
    detail for more advanced use-cases.

    Link: https://lore.kernel.org/r/20210727104416.828293-1-brauner@kernel.org
    Cc: Seth Forshee <seth.forshee@digitalocean.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Aleksa Sarai <cyphar@cyphar.com>
    Cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

Signed-off-by: Ian Kent <ikent@redhat.com>
2024-10-15 16:10:48 +08:00
Rafael Aquini a2d8b7832f mm: allow ->huge_fault() to be called without the mmap_lock held
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 40d49a3c9e4a0e5cf7a6fcebc8d4d7d63d1f3f1b
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Fri Aug 18 21:23:34 2023 +0100

    mm: allow ->huge_fault() to be called without the mmap_lock held

    Remove the checks for the VMA lock being held, allowing the page fault
    path to call into the filesystem instead of retrying with the mmap_lock
    held.  This will improve scalability for DAX page faults.  Also update the
    documentation to match (and fix some other changes that have happened
    recently).

    Link: https://lkml.kernel.org/r/20230818202335.2739663-3-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:22:00 -04:00
Rafael Aquini af8796d9b7 mm: convert do_set_pte() to set_pte_range()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 3bd786f76de2e01745f462844fd1a206052ee8b8
Author: Yin Fengwei <fengwei.yin@intel.com>
Date:   Wed Aug 2 16:14:04 2023 +0100

    mm: convert do_set_pte() to set_pte_range()

    set_pte_range() allows setting up page table entries for a specific
    range.  It takes advantage of batched rmap updates for large folios.
    It now takes care of calling update_mmu_cache_range().

    Link: https://lkml.kernel.org/r/20230802151406.3735276-37-willy@infradead.org
    Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:20:33 -04:00
Rafael Aquini 057fcc1895 mm: Introduce VM_SHADOW_STACK for shadow stack memory
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 54007f818206dc27309ca423df4c87dd160a7208
Author: Yu-cheng Yu <yu-cheng.yu@intel.com>
Date:   Mon Jun 12 17:10:40 2023 -0700

    mm: Introduce VM_SHADOW_STACK for shadow stack memory

    New hardware extensions implement support for shadow stack memory, such
    as x86 Control-flow Enforcement Technology (CET). Add a new VM flag to
    identify these areas, for example, to be used to properly indicate shadow
    stack PTEs to the hardware.

    Shadow stack VMA creation will be tightly controlled and limited to
    anonymous memory to make the implementation simpler and since that is all
    that is required. The solution will rely on pte_mkwrite() to create the
    shadow stack PTEs, so it will not be required for vm_get_page_prot() to
    learn how to create shadow stack memory. For this reason document that
    VM_SHADOW_STACK should not be mixed with VM_SHARED.

    Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
    Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
    Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reviewed-by: Mark Brown <broonie@kernel.org>
    Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Acked-by: David Hildenbrand <david@redhat.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: Pengfei Xu <pengfei.xu@intel.com>
    Tested-by: John Allen <john.allen@amd.com>
    Tested-by: Kees Cook <keescook@chromium.org>
    Link: https://lore.kernel.org/all/20230613001108.3040476-15-rick.p.edgecombe%40intel.com

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:17:02 -04:00
Jerry Snitselaar 10c44eec32 iommu: account IOMMU allocated memory
JIRA: https://issues.redhat.com/browse/RHEL-54186
Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 212c5c078d83d780cf2873ca931df135771e8bb7
Author: Pasha Tatashin <pasha.tatashin@soleen.com>
Date:   Sat Apr 13 00:25:22 2024 +0000

    iommu: account IOMMU allocated memory

    In order to be able to limit the amount of memory that is allocated
    by IOMMU subsystem, the memory must be accounted.

    Account IOMMU as part of the secondary pagetables as it was discussed
    at LPC.

    The value of SecPageTables now contains memory allocated by IOMMU
    and KVM.

    There is a difference between GFP_ACCOUNT and what NR_IOMMU_PAGES shows.
    GFP_ACCOUNT is set only where it makes sense to charge to user
    processes, i.e. IOMMU Page Tables, but there is more IOMMU shared data
    that should not really be charged to a specific process.

    Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Link: https://lore.kernel.org/r/20240413002522.1101315-12-pasha.tatashin@soleen.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>

(cherry picked from commit 212c5c078d83d780cf2873ca931df135771e8bb7)
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
2024-09-20 12:29:01 -07:00
Lucas Zampieri 2424e8e040 Merge: mm: follow up work for the MM v6.4 update and disable CONFIG_PER_VMA_LOCK until it is fixed
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4749

JIRA: https://issues.redhat.com/browse/RHEL-48221

It was identified that our process to bring in code-base updates
has been unwittingly missing some of the peripheral commits that do
not directly touch the core code under the mm/ directory.
While most of these identified peripheral commits are simple
and basic clean-ups, some are relevant changesets that might end
up causing real (and subtle) issues for RHEL deployments if they
remain missing.

The intent of this patchset is to close the aforementioned gap
by bringing in the missing peripheral commits from v5.14 up to
v6.4, which is the level we're parking our codebase for RHEL-9.5.

A secondary intent of this patchset is to bring in upstream's
v6.5 commit that disables the PER_VMA_LOCK feature which was
recently introduced (to RHEL-9.5) but was marked BROKEN upstream
circa release v6.5, in order to avoid the reported issues with
memory corruptions in the upstream builds.
  
Signed-off-by: Rafael Aquini <aquini@redhat.com>

Approved-by: Mark Langsdorf <mlangsdo@redhat.com>
Approved-by: Waiman Long <longman@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-08-06 14:21:52 +00:00
Lucas Zampieri 5fe59960ba Merge: tmpfs: Enable quotas for tmpfs filesystem
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4132

JIRA: https://issues.redhat.com/browse/RHEL-7768
Tested: with xfstests

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>

Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: Ian Kent <ikent@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-07-22 14:38:49 +00:00
Carlos Maiolino cc90a30c6c shmem: Add default quota limit mount options
JIRA: https://issues.redhat.com/browse/RHEL-7768
Tested: with xfstests

Allow system administrator to set default global quota limits at tmpfs
mount time.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230725144510.253763-7-cem@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
(cherry picked from commit de4c0e7ca8b526a82ff7e5ee5533787bb6d01724)
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
2024-07-17 07:49:46 +02:00
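
As a rough illustration of the mount-time limits described in the commit above, the sketch below builds an option string for a tmpfs mount. The option names (usrquota_block_hardlimit, usrquota_inode_hardlimit, grpquota_block_hardlimit, grpquota_inode_hardlimit) are an assumption taken from the upstream tmpfs documentation, not from this commit message, and the helper name tmpfs_quota_options is hypothetical; check the names against the documentation shipped with the kernel in use.

```python
# Assumed option names (from upstream tmpfs documentation, not from the
# commit message itself). Verify against the target kernel before use.
ALLOWED_LIMITS = (
    "usrquota_block_hardlimit",
    "usrquota_inode_hardlimit",
    "grpquota_block_hardlimit",
    "grpquota_inode_hardlimit",
)


def tmpfs_quota_options(limits):
    """Build a comma-separated mount option string from a dict of limits.

    `limits` maps an option name to its value (for example "1G" or 1000).
    Unknown option names are rejected rather than passed through.
    """
    unknown = sorted(set(limits) - set(ALLOWED_LIMITS))
    if unknown:
        raise ValueError("unknown quota option(s): " + ", ".join(unknown))
    # Emit options in a fixed order so the result is deterministic.
    return ",".join(
        f"{name}={limits[name]}" for name in ALLOWED_LIMITS if name in limits
    )


if __name__ == "__main__":
    # Hypothetical limits; the string would be passed to mount via -o.
    print(tmpfs_quota_options({
        "usrquota_block_hardlimit": "1G",
        "usrquota_inode_hardlimit": 10000,
    }))
```

The helper only formats the string; it does not mount anything, so it can be tried without root privileges.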
Carlos Maiolino 469e1d13f6 shmem: quota support
JIRA: https://issues.redhat.com/browse/RHEL-7768
Tested: with xfstests

Conflicts:
	- We have no support for iattr vfs{g,u}ids, so open-code calls
	  to i_{ug}id_needs_update()
	- We don't have support for struct idmap, so use user_namespace
	  struct wherever appropriate.

Now that the basic infrastructure is in place, enable quota support for tmpfs.

This offers user and group quotas to tmpfs (project quotas will be added
later). Also, as with other filesystems, the tmpfs quota is not supported
within user namespaces yet, so idmapping is not translated.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230725144510.253763-6-cem@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
(cherry picked from commit e09764cff44b5d31c2ca5477444565e3080637d2)
2024-07-17 07:49:46 +02:00
Rafael Aquini 02352a2014 mm/memtest: add results of early memtest to /proc/meminfo
JIRA: https://issues.redhat.com/browse/RHEL-48221

This patch is a backport of the following upstream commit:
commit bd23024b9774e681cbe6cc3afcb24244dfcb2390
Author: Tomas Mudrunka <tomas.mudrunka@gmail.com>
Date:   Tue Mar 21 11:34:30 2023 +0100

    mm/memtest: add results of early memtest to /proc/meminfo

    Currently the memtest results are only presented in dmesg.

    When running a large fleet of devices without ECC RAM it's currently not
    easy to do bulk monitoring for memory corruption.  You have to parse
    dmesg, but that's a ring buffer so the error might disappear after some
    time.  In general I do not consider dmesg to be a great API to query RAM
    status.

    In several companies I've seen such errors remain undetected and cause
    issues for way too long.  So I think it makes sense to provide a
    monitoring API, so that we can safely detect and act upon them.

    This adds /proc/meminfo entry which can be easily used by scripts.

    Link: https://lkml.kernel.org/r/20230321103430.7130-1-tomas.mudrunka@gmail.com
    Signed-off-by: Tomas Mudrunka <tomas.mudrunka@gmail.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Mike Rapoport (IBM) <rppt@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <aquini@redhat.com>
2024-07-16 09:30:02 -04:00
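
The commit above says the new /proc/meminfo entry can be easily used by scripts. A minimal sketch of such a script follows. The field name EarlyMemtestBad is an assumption taken from the upstream proc documentation (the commit message does not name it), and the helper names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict.

    Values are the leading number on each line (kB where a unit is shown).
    Lines without a "name: number" shape are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def early_memtest_bad_kb(text, field="EarlyMemtestBad"):
    """Return the kB reported for `field`, or None if the field is absent.

    The default field name is an assumption; the entry may be missing when
    the early memtest did not run or the kernel lacks this change.
    """
    return parse_meminfo(text).get(field)


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        bad_kb = early_memtest_bad_kb(f.read())
    if bad_kb is None:
        print("no early memtest result reported")
    elif bad_kb > 0:
        print(f"early memtest found {bad_kb} kB of bad memory")
    else:
        print("early memtest reported no bad memory")
```

Returning None for a missing field keeps "memtest did not run" distinct from "memtest ran and found nothing", which matters for fleet monitoring.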
Benjamin Coddington ccf6487e6e lockd: introduce safe async lock op
JIRA: https://issues.redhat.com/browse/RHEL-34875

commit 2dd10de8e6bcbacf85ad758b904543c294820c63
Author: Alexander Aring <aahringo@redhat.com>
Date:   Tue Sep 12 17:53:18 2023 -0400

    lockd: introduce safe async lock op

    This patch mostly reverts commit 40595cdc93ed ("nfs: block notification
    on fs with its own ->lock") and introduces an EXPORT_OP_ASYNC_LOCK
    export flag to signal that the "own ->lock" implementation supports
    async lock requests. The only main user is DLM, which is used by the
    GFS2 and OCFS2 filesystems. Those provide their own lock() implementation
    and return FILE_LOCK_DEFERRED as the return value. Since commit
    40595cdc93ed ("nfs: block notification on fs with its own ->lock") the
    DLM implementation was never updated. This patch should prepare for DLM
    to set the EXPORT_OP_ASYNC_LOCK export flag and update the DLM
    plock implementation accordingly.

    Acked-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Alexander Aring <aahringo@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
2024-06-27 08:14:02 -04:00
Benjamin Coddington 9740242c8c Documentation: Add missing documentation for EXPORT_OP flags
JIRA: https://issues.redhat.com/browse/RHEL-34875

commit b38a6023da6a12b561f0421c6a5a1f7624a1529c
Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Aug 25 15:04:23 2023 -0400

    Documentation: Add missing documentation for EXPORT_OP flags

    The commits that introduced these flags neglected to update the
    Documentation/filesystems/nfs/exporting.rst file.

    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
2024-06-27 08:14:02 -04:00
Nico Pache 5b6ce49452 tmpfs: fix Documentation of noswap and huge mount options
commit 253e5df8b8f0145adb090f57c6f4e6efa52d738e
Author: Hugh Dickins <hughd@google.com>
Date:   Sun Jul 23 13:55:00 2023 -0700

    tmpfs: fix Documentation of noswap and huge mount options

    The noswap mount option is surely not one of the three options for sizing:
    move its description down.

    The huge= mount option does not accept numeric values: those are just in
    an internal enum.  Delete those numbers, and follow the manpage text more
    closely (but there's not yet any fadvise() or fcntl() which applies here).

    /sys/kernel/mm/transparent_hugepage/shmem_enabled is hard to describe, and
    barely relevant to mounting a tmpfs: just refer to transhuge.rst (while
    still using the words deny and force, to help as informal reminders).

    [rdunlap@infradead.org: fixup Docs table for huge mount options]
      Link: https://lkml.kernel.org/r/20230725052333.26857-1-rdunlap@infradead.org
    Link: https://lkml.kernel.org/r/986cb0bf-9780-354-9bb-4bf57aadbab@google.com
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Fixes: d0f5a85442d1 ("shmem: update documentation")
    Fixes: 2c6efe9cf2d7 ("shmem: add support to ignore swap")
    Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
    Cc: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5619
Signed-off-by: Nico Pache <npache@redhat.com>
2024-04-30 17:51:33 -06:00