Commit Graph

123 Commits

Author SHA1 Message Date
Bill O'Donnell 74b7ce3c46 xfs: remove xfs_attr_sf_hdr_t
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 074aea4be1a4074be49a7ec41c674cc02b52fd60
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:35:02 2023 +0100

    xfs: remove xfs_attr_sf_hdr_t

    Remove the last two users of the typedef.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:19 -06:00
Bill O'Donnell e21282d525 xfs: remove struct xfs_attr_shortform
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 414147225400a0c4562ebfb0fdd40f065099ede4
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:35:01 2023 +0100

    xfs: remove struct xfs_attr_shortform

    sparse complains about struct xfs_attr_shortform because it embeds a
    structure with a variable sized array in a variable sized array.

    Given that xfs_attr_shortform is not a very useful structure, and the
    dir2 equivalent has been removed a long time ago, remove it as well.

    Provide a xfs_attr_sf_firstentry helper that returns the first
    xfs_attr_sf_entry behind a xfs_attr_sf_hdr to replace the structure
    dereference.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:19 -06:00
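
The helper described in the commit above can be pictured with a small userspace sketch. The struct layouts and the names sf_hdr, sf_entry and sf_firstentry are simplified stand-ins invented for illustration, not the kernel's definitions; the only point shown is that the first entry begins immediately after the header, so no wrapper structure is needed to reach it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified shortform header (host-endian for illustration). */
struct sf_hdr {
	uint16_t totsize;	/* total bytes used by header plus entries */
	uint8_t  count;		/* number of entries that follow */
	uint8_t  padding;
};

/* Hypothetical, simplified shortform entry with a flexible array member. */
struct sf_entry {
	uint8_t namelen;
	uint8_t valuelen;
	uint8_t flags;
	uint8_t nameval[];	/* name bytes followed by value bytes */
};

/* The sketch relies on the header having no trailing padding. */
static_assert(sizeof(struct sf_hdr) == 4, "unexpected header size");

/* Return the first entry, which starts right behind the header. */
static inline struct sf_entry *sf_firstentry(struct sf_hdr *hdr)
{
	return (struct sf_entry *)(hdr + 1);
}
```

Callers that previously dereferenced a list member of a wrapper struct would call the helper on the header pointer instead.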
Bill O'Donnell 9bf304b850 xfs: use xfs_attr_sf_findname in xfs_attr_shortform_getvalue
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 1fb4b0def7b5a5bf91ad62a112d8d3f6dc76585f
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:35:00 2023 +0100

    xfs: use xfs_attr_sf_findname in xfs_attr_shortform_getvalue

    xfs_attr_shortform_getvalue duplicates the logic in xfs_attr_sf_findname.
    Use the helper instead.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:19 -06:00
Bill O'Donnell 017b1c033f xfs: remove xfs_attr_shortform_lookup
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 22b7b1f597a6a21fb7b3791a55f3a7ae54d2dfe4
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:34:59 2023 +0100

    xfs: remove xfs_attr_shortform_lookup

    xfs_attr_shortform_lookup is only used by xfs_attr_shortform_addname,
    which is much better served by calling xfs_attr_sf_findname.  Switch
    it over and remove xfs_attr_shortform_lookup.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:18 -06:00
Bill O'Donnell ac3b5d5adb xfs: simplify xfs_attr_sf_findname
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 6c8d169bbd51fc10d1d0029d495962881315b4c2
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:34:58 2023 +0100

    xfs: simplify xfs_attr_sf_findname

    xfs_attr_sf_findname has the simple job of finding a xfs_attr_sf_entry in
    the attr fork, but the convoluted calling convention obfuscates that.

    Return the found entry as the return value instead of a pointer
    argument, as the -ENOATTR/-EEXIST can be trivially derived from that, and
    remove the basep argument, as it is equivalent to the offset of sfe in
    the data if an sfe was found, or an offset of totsize if none was
    found.  To simplify the totsize computation add a xfs_attr_sf_endptr
    helper that returns the imaginary xfs_attr_sf_entry at the end of
    the current attrs.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:18 -06:00
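
A minimal userspace sketch of the calling convention described in the commit above: the lookup returns the entry or NULL, and an end-pointer helper marks where iteration stops. All names and layouts here are hypothetical simplifications, and the sketch matches on name bytes only, whereas a real lookup may also compare namespace flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts (host-endian, illustration only). */
struct sf_hdr   { uint16_t totsize; uint8_t count; uint8_t padding; };
struct sf_entry { uint8_t namelen; uint8_t valuelen; uint8_t flags; uint8_t nameval[]; };

static struct sf_entry *sf_firstentry(struct sf_hdr *hdr)
{
	return (struct sf_entry *)(hdr + 1);
}

/* Step over one entry: fixed part, then name bytes, then value bytes. */
static struct sf_entry *sf_nextentry(struct sf_entry *sfe)
{
	return (struct sf_entry *)((uint8_t *)sfe + sizeof(*sfe) +
				   sfe->namelen + sfe->valuelen);
}

/* The "entry" one past the last real one, located totsize bytes from the header. */
static struct sf_entry *sf_endptr(struct sf_hdr *hdr)
{
	return (struct sf_entry *)((uint8_t *)hdr + hdr->totsize);
}

/* Return the matching entry, or NULL if the name is not present. */
static struct sf_entry *sf_findname(struct sf_hdr *hdr, const char *name,
				    size_t namelen)
{
	struct sf_entry *sfe = sf_firstentry(hdr);
	struct sf_entry *end = sf_endptr(hdr);

	assert(hdr->totsize >= sizeof(*hdr));
	for (; sfe < end; sfe = sf_nextentry(sfe)) {
		if (sfe->namelen == namelen &&
		    memcmp(sfe->nameval, name, namelen) == 0)
			return sfe;
	}
	return NULL;
}
```

A caller that wants an errno-style result can derive it from the pointer: NULL maps to "no such attribute" on a lookup, and non-NULL maps to "already exists" on a create.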
Bill O'Donnell 29cd7200e9 xfs: move the xfs_attr_sf_lookup tracepoint
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 14f2e4ab5d0310c2bb231941d9884fa5bae47fab
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:34:57 2023 +0100

    xfs: move the xfs_attr_sf_lookup tracepoint

    trace_xfs_attr_sf_lookup is currently only called by
    xfs_attr_shortform_lookup, which despite its name is a simple helper for
    xfs_attr_shortform_addname, which has its own tracing.  Move the
    callsite to xfs_attr_shortform_getvalue, which is the closest thing to
    a high level lookup we have for the Linux xattr API.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:18 -06:00
Bill O'Donnell 5718b6bb4d xfs: return if_data from xfs_idata_realloc
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 45c76a2add55b332d965c901e14004ae0134a67e
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:34:56 2023 +0100

    xfs: return if_data from xfs_idata_realloc

    Many of the xfs_idata_realloc callers need to set a local pointer to the
    just reallocated if_data memory.  Return the pointer to simplify them a
    bit and use the opportunity to re-use krealloc for freeing if_data if the
    size hits 0.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:18 -06:00
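
The idea in the commit above, a resize helper that hands back the new data pointer and frees the buffer when the size reaches zero, can be sketched in userspace as follows. The struct and function names are hypothetical, realloc and free stand in for the kernel allocator, and the sketch aborts on allocation failure rather than modelling the kernel's allocation flags.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inode fork holding inline data. */
struct fork_sketch {
	void    *data;	/* inline data buffer, or NULL when empty */
	int64_t  bytes;	/* current size of the buffer */
};

/*
 * Grow or shrink the buffer by byte_diff and return the (possibly moved)
 * buffer.  Shrinking to zero frees the buffer and returns NULL.
 */
static void *fork_data_realloc(struct fork_sketch *f, int64_t byte_diff)
{
	int64_t new_size = f->bytes + byte_diff;

	assert(new_size >= 0);
	if (new_size == 0) {
		/* Handled explicitly: realloc(ptr, 0) is not portable. */
		free(f->data);
		f->data = NULL;
	} else {
		void *p = realloc(f->data, (size_t)new_size);

		if (!p)
			abort();	/* sketch only: no error path modelled */
		f->data = p;
	}
	f->bytes = new_size;
	return f->data;
}
```

Returning the pointer lets a caller write `hdr = fork_data_realloc(f, diff);` in one step instead of resizing and then re-reading the fork's data pointer.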
Bill O'Donnell 86c0442471 xfs: make if_data a void pointer
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 6e145f943bd86be47e54101fa5939f9ed0cb73e5
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Dec 20 07:34:55 2023 +0100

    xfs: make if_data a void pointer

    The xfs_ifork structure currently has a union of the if_root void pointer
    and the if_data char pointer.  In either case it is an opaque pointer
    that depends on the fork format.  Replace the union with a single if_data
    void pointer as that is what almost all callers want.  Only the symlink
    NULL termination code in xfs_init_local_fork actually needs a new local
    variable now.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:17 -06:00
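
As a rough illustration of the commit above: once the fork's data pointer is an untyped void pointer, only code that needs byte-level access, such as zero-terminating a symlink target, has to introduce a typed local variable. The names below are hypothetical and malloc stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fork with a single opaque data pointer. */
struct fork_sketch {
	void   *data;	/* interpretation depends on the fork format */
	size_t  bytes;	/* payload size, excluding any terminator */
};

/* Copy inline data into the fork, optionally appending a NUL byte. */
static void fork_init_local(struct fork_sketch *f, const void *src,
			    size_t size, bool zero_terminate)
{
	size_t mem_size = size + (zero_terminate ? 1 : 0);
	char *new_data = NULL;	/* typed local needed only for the terminator */

	assert(src != NULL || size == 0);
	if (mem_size) {
		new_data = malloc(mem_size);
		if (!new_data)
			abort();	/* sketch only: no error path modelled */
		if (size)
			memcpy(new_data, src, size);
		if (zero_terminate)
			new_data[size] = '\0';
	}
	f->data = new_data;
	f->bytes = size;
}
```

Other callers can assign `f->data` straight to whatever header type the fork format implies, with no cast and no union member to pick.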
Bill O'Donnell 070bdf384b xfs: zap broken inode forks
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit e744cef206055954517648070d2b3aaa3d2515ba
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Fri Dec 15 10:03:37 2023 -0800

    xfs: zap broken inode forks

    Determine if inode fork damage is responsible for the inode being unable
    to pass the ifork verifiers in xfs_iget and zap the fork contents if
    this is true.  Once this is done the fork will be empty but we'll be
    able to construct an in-core inode, and a subsequent call to the inode
    fork repair ioctl will search the rmapbt to rebuild the records that
    were in the fork.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:06 -06:00
Bill O'Donnell 96930a46e4 xfs: extract xfs_da_buf_copy() helper function
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit fd45ddb9dd606b3eaddf26e13f64340636955986
Author: Zhang Tianci <zhangtianci.1997@bytedance.com>
Date:   Tue Dec 5 13:59:00 2023 +0800

    xfs: extract xfs_da_buf_copy() helper function

    This patch does not modify logic.

    xfs_da_buf_copy() will copy one block from src xfs_buf to
    dst xfs_buf, and update the block metadata in dst directly.

    Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com>
    Suggested-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:25:57 -06:00
Bill O'Donnell 4f3600dea0 xfs: remove redundant initializations of pointers drop_leaf and save_leaf
JIRA: https://issues.redhat.com/browse/RHEL-25419

commit 347eb95b27eb97bebdc3ea7de23558216f4e2c90
Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Wed Jun 28 10:59:37 2023 -0700

    xfs: remove redundant initializations of pointers drop_leaf and save_leaf

    Pointers drop_leaf and save_leaf are initialized with values that are never
    read; they are re-assigned later on just before they are used. Remove
    the redundant early initializations and keep the later assignments at the
    point where they are used. Cleans up two clang scan build warnings:

    fs/xfs/libxfs/xfs_attr_leaf.c:2288:29: warning: Value stored to 'drop_leaf'
    during its initialization is never read [deadcode.DeadStores]
    fs/xfs/libxfs/xfs_attr_leaf.c:2289:29: warning: Value stored to 'save_leaf'
    during its initialization is never read [deadcode.DeadStores]

    Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-06 10:32:51 -05:00
Bill O'Donnell ba8109db31 xfs: don't leak memory when attr fork loading fails
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit c78c2d0903183a41beb90c56a923e30f90fa91b9
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Tue Jul 19 09:14:55 2022 -0700

    xfs: don't leak memory when attr fork loading fails

    I observed the following evidence of a memory leak while running xfs/399
    from the xfs fsck test suite (edited for brevity):

    XFS (sde): Metadata corruption detected at xfs_attr_shortform_verify_struct.part.0+0x7b/0xb0 [xfs], inode 0x1172 attr fork
    XFS: Assertion failed: ip->i_af.if_u1.if_data == NULL, file: fs/xfs/libxfs/xfs_inode_fork.c, line: 315
    ------------[ cut here ]------------
    WARNING: CPU: 2 PID: 91635 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs]
    CPU: 2 PID: 91635 Comm: xfs_scrub Tainted: G        W         5.19.0-rc7-xfsx #rc7 6e6475eb29fd9dda3181f81b7ca7ff961d277a40
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
    RIP: 0010:assfail+0x46/0x4a [xfs]
    Call Trace:
     <TASK>
     xfs_ifork_zap_attr+0x7c/0xb0
     xfs_iformat_attr_fork+0x86/0x110
     xfs_inode_from_disk+0x41d/0x480
     xfs_iget+0x389/0xd70
     xfs_bulkstat_one_int+0x5b/0x540
     xfs_bulkstat_iwalk+0x1e/0x30
     xfs_iwalk_ag_recs+0xd1/0x160
     xfs_iwalk_run_callbacks+0xb9/0x180
     xfs_iwalk_ag+0x1d8/0x2e0
     xfs_iwalk+0x141/0x220
     xfs_bulkstat+0x105/0x180
     xfs_ioc_bulkstat.constprop.0.isra.0+0xc5/0x130
     xfs_file_ioctl+0xa5f/0xef0
     __x64_sys_ioctl+0x82/0xa0
     do_syscall_64+0x2b/0x80
     entry_SYSCALL_64_after_hwframe+0x46/0xb0

    This newly-added assertion checks that there aren't any incore data
    structures hanging off the incore fork when we're trying to reset its
    contents.  From the call trace, it is evident that iget was trying to
    construct an incore inode from the ondisk inode, but the attr fork
    verifier failed and we were trying to undo all the memory allocations
    that we had done earlier.

    The three assertions in xfs_ifork_zap_attr check that the caller has
    already called xfs_idestroy_fork, which clearly has not been done here.
    As the zap function then zeroes the pointers, we've effectively leaked
    the memory.

    The shortest change would have been to insert an extra call to
    xfs_idestroy_fork, but it makes more sense to bundle the _idestroy_fork
    call into _zap_attr, since all other callsites call _idestroy_fork
    immediately prior to calling _zap_attr.  IOWs, it eliminates one way to
    fail.

    Note: This change only applies cleanly to 2ed5b09b3e8f, since we just
    reworked the attr fork lifetime.  However, I think this memory leak has
    existed since 0f45a1b20c, since the chain xfs_iformat_attr_fork ->
    xfs_iformat_local -> xfs_init_local_fork will allocate
    ifp->if_u1.if_data, but if xfs_ifork_verify_local_attr fails,
    xfs_iformat_attr_fork will free i_afp without freeing any of the stuff
    hanging off i_afp.  The solution for older kernels I think is to add the
    missing call to xfs_idestroy_fork just prior to calling kmem_cache_free.

    Found by fuzzing a.sfattr.hdr.totsize = lastbit in xfs/399.

    Fixes: 2ed5b09b3e8f ("xfs: make inode attribute forks a permanent part of struct xfs_inode")
    Probably-Fixes: 0f45a1b20c ("xfs: improve local fork verification")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:48 -05:00
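
The design choice in the commit above, folding the destroy step into the zap step so that no caller can forget it, can be sketched like this. The names, the format constant and the use of free are hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical format value the fork is reset to. */
enum { FORK_FMT_EXTENTS = 2 };

/* Hypothetical in-memory fork with one heap allocation hanging off it. */
struct fork_sketch {
	void   *data;
	size_t  bytes;
	int     format;
};

/* Release everything hanging off the fork. */
static void fork_destroy(struct fork_sketch *f)
{
	free(f->data);
	f->data = NULL;
	f->bytes = 0;
}

/*
 * Reset the fork to an empty state.  Calling fork_destroy() here, rather
 * than requiring callers to do it first, removes the path on which the
 * pointer would be cleared while the allocation is still live.
 */
static void fork_zap(struct fork_sketch *f)
{
	fork_destroy(f);
	assert(f->data == NULL);
	f->format = FORK_FMT_EXTENTS;
}
```

Because free(NULL) is a no-op, zapping an already empty fork is harmless, so callers need no state check before calling it.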
Bill O'Donnell 3219617b1b xfs: replace inode fork size macros with functions
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit c01147d929899f02a0a8b15e406d12784768ca72
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Sat Jul 9 10:56:07 2022 -0700

    xfs: replace inode fork size macros with functions

    Replace the shouty macros here with typechecked helper functions.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:43 -05:00
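
To illustrate the kind of conversion the commit above describes, here is a hedged sketch that puts a macro next to typechecked inline functions. The struct, field names and size rules are simplified assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified inode carrying only what the size helpers need. */
struct inode_sketch {
	uint8_t  forkoff;	/* attr fork offset in 8-byte units; 0 = no attr fork */
	uint32_t litino;	/* bytes of literal area shared by both forks */
};

/* Macro form: accepts any type with matching fields and evaluates ip twice. */
#define IFORK_DSIZE_MACRO(ip) \
	((ip)->forkoff ? (uint32_t)(ip)->forkoff << 3 : (ip)->litino)

/* Function form: the compiler checks the argument type. */
static inline uint32_t inode_data_fork_size(const struct inode_sketch *ip)
{
	if (ip->forkoff)
		return (uint32_t)ip->forkoff << 3;
	return ip->litino;
}

static inline uint32_t inode_attr_fork_size(const struct inode_sketch *ip)
{
	if (ip->forkoff) {
		uint32_t dsize = (uint32_t)ip->forkoff << 3;

		assert(dsize <= ip->litino);
		return ip->litino - dsize;
	}
	return 0;
}
```

Passing the wrong pointer type to the function form produces a compiler diagnostic, while the macro form compiles for any type that happens to have fields of the same names.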
Bill O'Donnell a2d362f29a xfs: make inode attribute forks a permanent part of struct xfs_inode
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

Conflicts: previous out of order application of 5625ea0 requires minor adjust to xfs_iomap.c

commit 2ed5b09b3e8fc274ae8fecd6ab7c5106a364bed1
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Sat Jul 9 10:56:06 2022 -0700

    xfs: make inode attribute forks a permanent part of struct xfs_inode

    Syzkaller reported a UAF bug a while back:

    ==================================================================
    BUG: KASAN: use-after-free in xfs_ilock_attr_map_shared+0xe3/0xf6 fs/xfs/xfs_inode.c:127
    Read of size 4 at addr ffff88802cec919c by task syz-executor262/2958

    CPU: 2 PID: 2958 Comm: syz-executor262 Not tainted
    5.15.0-0.30.3-20220406_1406 #3
    Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29
    04/01/2014
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:88 [inline]
     dump_stack_lvl+0x82/0xa9 lib/dump_stack.c:106
     print_address_description.constprop.9+0x21/0x2d5 mm/kasan/report.c:256
     __kasan_report mm/kasan/report.c:442 [inline]
     kasan_report.cold.14+0x7f/0x11b mm/kasan/report.c:459
     xfs_ilock_attr_map_shared+0xe3/0xf6 fs/xfs/xfs_inode.c:127
     xfs_attr_get+0x378/0x4c2 fs/xfs/libxfs/xfs_attr.c:159
     xfs_xattr_get+0xe3/0x150 fs/xfs/xfs_xattr.c:36
     __vfs_getxattr+0xdf/0x13d fs/xattr.c:399
     cap_inode_need_killpriv+0x41/0x5d security/commoncap.c:300
     security_inode_need_killpriv+0x4c/0x97 security/security.c:1408
     dentry_needs_remove_privs.part.28+0x21/0x63 fs/inode.c:1912
     dentry_needs_remove_privs+0x80/0x9e fs/inode.c:1908
     do_truncate+0xc3/0x1e0 fs/open.c:56
     handle_truncate fs/namei.c:3084 [inline]
     do_open fs/namei.c:3432 [inline]
     path_openat+0x30ab/0x396d fs/namei.c:3561
     do_filp_open+0x1c4/0x290 fs/namei.c:3588
     do_sys_openat2+0x60d/0x98c fs/open.c:1212
     do_sys_open+0xcf/0x13c fs/open.c:1228
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3a/0x7e arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x44/0x0
    RIP: 0033:0x7f7ef4bb753d
    Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48
    89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
    01 c3 48 8b 0d 1b 79 2c 00 f7 d8 64 89 01 48
    RSP: 002b:00007f7ef52c2ed8 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
    RAX: ffffffffffffffda RBX: 0000000000404148 RCX: 00007f7ef4bb753d
    RDX: 00007f7ef4bb753d RSI: 0000000000000000 RDI: 0000000020004fc0
    RBP: 0000000000404140 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0030656c69662f2e
    R13: 00007ffd794db37f R14: 00007ffd794db470 R15: 00007f7ef52c2fc0
     </TASK>

    Allocated by task 2953:
     kasan_save_stack+0x19/0x38 mm/kasan/common.c:38
     kasan_set_track mm/kasan/common.c:46 [inline]
     set_alloc_info mm/kasan/common.c:434 [inline]
     __kasan_slab_alloc+0x68/0x7c mm/kasan/common.c:467
     kasan_slab_alloc include/linux/kasan.h:254 [inline]
     slab_post_alloc_hook mm/slab.h:519 [inline]
     slab_alloc_node mm/slub.c:3213 [inline]
     slab_alloc mm/slub.c:3221 [inline]
     kmem_cache_alloc+0x11b/0x3eb mm/slub.c:3226
     kmem_cache_zalloc include/linux/slab.h:711 [inline]
     xfs_ifork_alloc+0x25/0xa2 fs/xfs/libxfs/xfs_inode_fork.c:287
     xfs_bmap_add_attrfork+0x3f2/0x9b1 fs/xfs/libxfs/xfs_bmap.c:1098
     xfs_attr_set+0xe38/0x12a7 fs/xfs/libxfs/xfs_attr.c:746
     xfs_xattr_set+0xeb/0x1a9 fs/xfs/xfs_xattr.c:59
     __vfs_setxattr+0x11b/0x177 fs/xattr.c:180
     __vfs_setxattr_noperm+0x128/0x5e0 fs/xattr.c:214
     __vfs_setxattr_locked+0x1d4/0x258 fs/xattr.c:275
     vfs_setxattr+0x154/0x33d fs/xattr.c:301
     setxattr+0x216/0x29f fs/xattr.c:575
     __do_sys_fsetxattr fs/xattr.c:632 [inline]
     __se_sys_fsetxattr fs/xattr.c:621 [inline]
     __x64_sys_fsetxattr+0x243/0x2fe fs/xattr.c:621
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3a/0x7e arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x44/0x0

    Freed by task 2949:
     kasan_save_stack+0x19/0x38 mm/kasan/common.c:38
     kasan_set_track+0x1c/0x21 mm/kasan/common.c:46
     kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:360
     ____kasan_slab_free mm/kasan/common.c:366 [inline]
     ____kasan_slab_free mm/kasan/common.c:328 [inline]
     __kasan_slab_free+0xe2/0x10e mm/kasan/common.c:374
     kasan_slab_free include/linux/kasan.h:230 [inline]
     slab_free_hook mm/slub.c:1700 [inline]
     slab_free_freelist_hook mm/slub.c:1726 [inline]
     slab_free mm/slub.c:3492 [inline]
     kmem_cache_free+0xdc/0x3ce mm/slub.c:3508
     xfs_attr_fork_remove+0x8d/0x132 fs/xfs/libxfs/xfs_attr_leaf.c:773
     xfs_attr_sf_removename+0x5dd/0x6cb fs/xfs/libxfs/xfs_attr_leaf.c:822
     xfs_attr_remove_iter+0x68c/0x805 fs/xfs/libxfs/xfs_attr.c:1413
     xfs_attr_remove_args+0xb1/0x10d fs/xfs/libxfs/xfs_attr.c:684
     xfs_attr_set+0xf1e/0x12a7 fs/xfs/libxfs/xfs_attr.c:802
     xfs_xattr_set+0xeb/0x1a9 fs/xfs/xfs_xattr.c:59
     __vfs_removexattr+0x106/0x16a fs/xattr.c:468
     cap_inode_killpriv+0x24/0x47 security/commoncap.c:324
     security_inode_killpriv+0x54/0xa1 security/security.c:1414
     setattr_prepare+0x1a6/0x897 fs/attr.c:146
     xfs_vn_change_ok+0x111/0x15e fs/xfs/xfs_iops.c:682
     xfs_vn_setattr_size+0x5f/0x15a fs/xfs/xfs_iops.c:1065
     xfs_vn_setattr+0x125/0x2ad fs/xfs/xfs_iops.c:1093
     notify_change+0xae5/0x10a1 fs/attr.c:410
     do_truncate+0x134/0x1e0 fs/open.c:64
     handle_truncate fs/namei.c:3084 [inline]
     do_open fs/namei.c:3432 [inline]
     path_openat+0x30ab/0x396d fs/namei.c:3561
     do_filp_open+0x1c4/0x290 fs/namei.c:3588
     do_sys_openat2+0x60d/0x98c fs/open.c:1212
     do_sys_open+0xcf/0x13c fs/open.c:1228
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3a/0x7e arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x44/0x0

    The buggy address belongs to the object at ffff88802cec9188
     which belongs to the cache xfs_ifork of size 40
    The buggy address is located 20 bytes inside of
     40-byte region [ffff88802cec9188, ffff88802cec91b0)
    The buggy address belongs to the page:
    page:00000000c3af36a1 refcount:1 mapcount:0 mapping:0000000000000000
    index:0x0 pfn:0x2cec9
    flags: 0xfffffc0000200(slab|node=0|zone=1|lastcpupid=0x1fffff)
    raw: 000fffffc0000200 ffffea00009d2580 0000000600000006 ffff88801a9ffc80
    raw: 0000000000000000 0000000080490049 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
     ffff88802cec9080: fb fb fb fc fc fa fb fb fb fb fc fc fb fb fb fb
     ffff88802cec9100: fb fc fc fb fb fb fb fb fc fc fb fb fb fb fb fc
    >ffff88802cec9180: fc fa fb fb fb fb fc fc fa fb fb fb fb fc fc fb
                                ^
     ffff88802cec9200: fb fb fb fb fc fc fb fb fb fb fb fc fc fb fb fb
     ffff88802cec9280: fb fb fc fc fa fb fb fb fb fc fc fa fb fb fb fb
    ==================================================================

    The root cause of this bug is the unlocked access to xfs_inode.i_afp
    from the getxattr code paths while trying to determine which ILOCK mode
    to use to stabilize the xattr data.  Unfortunately, the VFS does not
    acquire i_rwsem when vfs_getxattr (or listxattr) call into the
    filesystem, which means that getxattr can race with a removexattr that's
    tearing down the attr fork and crash:

    xfs_attr_set:                          xfs_attr_get:
    xfs_attr_fork_remove:                  xfs_ilock_attr_map_shared:

    xfs_idestroy_fork(ip->i_afp);
    kmem_cache_free(xfs_ifork_cache, ip->i_afp);

                                           if (ip->i_afp &&

    ip->i_afp = NULL;

                                               xfs_need_iread_extents(ip->i_afp))
                                           <KABOOM>

    ip->i_forkoff = 0;

    Regrettably, the VFS is much more lax about i_rwsem and getxattr than
    is immediately obvious -- not only does it not guarantee that we hold
    i_rwsem, it actually doesn't guarantee that we *don't* hold it either.
    The getxattr system call won't acquire the lock before calling XFS, but
    the file capabilities code calls getxattr with and without i_rwsem held
    to determine if the "security.capabilities" xattr is set on the file.

    Fixing the VFS locking requires a treewide investigation into every code
    path that could touch an xattr and what i_rwsem state it expects or sets
    up.  That could take years or even prove impossible; fortunately, we
    can fix this UAF problem inside XFS.

    An earlier version of this patch used smp_wmb in xfs_attr_fork_remove to
    ensure that i_forkoff is always zeroed before i_afp is set to null and
    changed the read paths to use smp_rmb before accessing i_forkoff and
    i_afp, which avoided these UAF problems.  However, the patch author was
    too busy dealing with other problems in the meantime, and by the time he
    came back to this issue, the situation had changed a bit.

    On a modern system with selinux, each inode will always have at least
    one xattr for the selinux label, so it doesn't make much sense to keep
    incurring the extra pointer dereference.  Furthermore, Allison's
    upcoming parent pointer patchset will also cause nearly every inode in
    the filesystem to have extended attributes.  Therefore, make the inode
    attribute fork structure part of struct xfs_inode, at a cost of 40 more
    bytes.

    This patch adds a clunky if_present field where necessary to maintain
    the existing logic of xattr fork null pointer testing in the existing
    codebase.  The next patch switches the logic over to XFS_IFORK_Q and it
    all goes away.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:42 -05:00
Bill O'Donnell 08529f7680 xfs: convert XFS_IFORK_PTR to a static inline helper
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 732436ef916b4f338d672ea56accfdb11e8d0732
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Sat Jul 9 10:56:05 2022 -0700

    xfs: convert XFS_IFORK_PTR to a static inline helper

    We're about to make this logic do a bit more, so convert the macro to a
    static inline function for better typechecking and fewer shouty macros.
    No functional changes here.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:42 -05:00
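
A fork-selection macro rewritten as a static inline function might look roughly like the sketch below. The enum, struct and function names are hypothetical, and the layout (embedded data fork, pointer-based attr and CoW forks) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fork selectors. */
enum fork_which { FORK_DATA, FORK_ATTR, FORK_COW };

struct fork_sketch {
	int format;
};

/* Hypothetical inode: data fork embedded, the other forks optional. */
struct inode_sketch {
	struct fork_sketch  data_fork;
	struct fork_sketch *attr_fork;	/* NULL when absent */
	struct fork_sketch *cow_fork;	/* NULL when absent */
};

/* Return the requested fork, or NULL if the inode does not have one. */
static inline struct fork_sketch *
inode_fork_ptr(struct inode_sketch *ip, enum fork_which which)
{
	switch (which) {
	case FORK_DATA:
		return &ip->data_fork;
	case FORK_ATTR:
		return ip->attr_fork;
	case FORK_COW:
		return ip->cow_fork;
	default:
		assert(0);	/* unknown selector */
		return NULL;
	}
}
```

With the logic in a function, later changes such as adding an extra condition to one case touch a single, type-checked definition instead of a macro expanded at every call site.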
Bill O'Donnell 752ffec331 xfs: don't hold xattr leaf buffers across transaction rolls
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit e53bcffad0326c1ef4b4baec4262b5343e420c44
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Sat Jun 25 10:01:20 2022 -0700

    xfs: don't hold xattr leaf buffers across transaction rolls

    Now that we've established (again!) that empty xattr leaf buffers are
    ok, we no longer need to bhold them to transactions when we're creating
    new leaf blocks.  Get rid of the entire mechanism, which should simplify
    the xattr code quite a bit.

    The original justification for using bhold here was to prevent the AIL
    from trying to write the empty leaf block into the fs during the brief
    time that we release the buffer lock.  The reason for /that/ was to
    prevent recovery from tripping over the empty ondisk block.

    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:34 -05:00
Bill O'Donnell 6b5d7a9ede xfs: empty xattr leaf header blocks are not corruption
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 7be3bd8856fba99f8b25b9c223250e42292c312e
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Fri Jun 24 15:01:28 2022 -0700

    xfs: empty xattr leaf header blocks are not corruption

    TLDR: Revert commit 51e6104fdb95 ("xfs: detect empty attr leaf blocks in
    xfs_attr3_leaf_verify") because it was wrong.

    Every now and then we get a corruption report from the kernel or
    xfs_repair about empty leaf blocks in the extended attribute structure.
    We've long thought that these shouldn't be possible, but prior to 5.18
    one would shake loose in the recoveryloop fstests about once a month.

    A new addition to the xattr leaf block verifier in 5.19-rc1 makes this
    happen every 7 minutes on my testing cloud.  I added a ton of logging to
    detect any time we set the header count on an xattr leaf block to zero.
    This produced the following dmesg output on generic/388:

    XFS (sda4): ino 0x21fcbaf leaf 0x129bf78 hdcount==0!
    Call Trace:
     <TASK>
     dump_stack_lvl+0x34/0x44
     xfs_attr3_leaf_create+0x187/0x230
     xfs_attr_shortform_to_leaf+0xd1/0x2f0
     xfs_attr_set_iter+0x73e/0xa90
     xfs_xattri_finish_update+0x45/0x80
     xfs_attr_finish_item+0x1b/0xd0
     xfs_defer_finish_noroll+0x19c/0x770
     __xfs_trans_commit+0x153/0x3e0
     xfs_attr_set+0x36b/0x740
     xfs_xattr_set+0x89/0xd0
     __vfs_setxattr+0x67/0x80
     __vfs_setxattr_noperm+0x6e/0x120
     vfs_setxattr+0x97/0x180
     setxattr+0x88/0xa0
     path_setxattr+0xc3/0xe0
     __x64_sys_setxattr+0x27/0x30
     do_syscall_64+0x35/0x80
     entry_SYSCALL_64_after_hwframe+0x46/0xb0

    So now we know that someone is creating empty xattr leaf blocks as part
    of converting a sf xattr structure into a leaf xattr structure.  The
    conversion routine logs any existing sf attributes in the same
    transaction that creates the leaf block, so we know this is a setxattr
    to a file that has no attributes at all.

    Next, g/388 calls the shutdown ioctl and cycles the mount to trigger log
    recovery.  I also augmented buffer item recovery to call ->verify_struct
    on any attr leaf blocks and complain if it finds a failure:

    XFS (sda4): Unmounting Filesystem
    XFS (sda4): Mounting V5 Filesystem
    XFS (sda4): Starting recovery (logdev: internal)
    XFS (sda4): xattr leaf daddr 0x129bf78 hdrcount == 0!
    Call Trace:
     <TASK>
     dump_stack_lvl+0x34/0x44
     xfs_attr3_leaf_verify+0x3b8/0x420
     xlog_recover_buf_commit_pass2+0x60a/0x6c0
     xlog_recover_items_pass2+0x4e/0xc0
     xlog_recover_commit_trans+0x33c/0x350
     xlog_recovery_process_trans+0xa5/0xe0
     xlog_recover_process_data+0x8d/0x140
     xlog_do_recovery_pass+0x19b/0x720
     xlog_do_log_recovery+0x62/0xc0
     xlog_do_recover+0x33/0x1d0
     xlog_recover+0xda/0x190
     xfs_log_mount+0x14c/0x360
     xfs_mountfs+0x517/0xa60
     xfs_fs_fill_super+0x6bc/0x950
     get_tree_bdev+0x175/0x280
     vfs_get_tree+0x1a/0x80
     path_mount+0x6f5/0xaa0
     __x64_sys_mount+0x103/0x140
     do_syscall_64+0x35/0x80
     entry_SYSCALL_64_after_hwframe+0x46/0xb0
    RIP: 0033:0x7fc61e241eae

    And a moment later, the _delwri_submit of the recovered buffers trips
    the same verifier and recovery fails:

    XFS (sda4): Metadata corruption detected at xfs_attr3_leaf_verify+0x393/0x420 [xfs], xfs_attr3_leaf block 0x129bf78
    XFS (sda4): Unmount and run xfs_repair
    XFS (sda4): First 128 bytes of corrupted metadata buffer:
    00000000: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
    00000010: 00 00 00 00 01 29 bf 78 00 00 00 00 00 00 00 00  .....).x........
    00000020: a5 1b d0 02 b2 9a 49 df 8e 9c fb 8d f8 31 3e 9d  ......I......1>.
    00000030: 00 00 00 00 02 1f cb af 00 00 00 00 10 00 00 00  ................
    00000040: 00 50 0f b0 00 00 00 00 00 00 00 00 00 00 00 00  .P..............
    00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    XFS (sda4): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x37f/0x3b0 [xfs] (fs/xfs/xfs_buf.c:1518).  Shutting down filesystem.
    XFS (sda4): Please unmount the filesystem and rectify the problem(s)
    XFS (sda4): log mount/recovery failed: error -117
    XFS (sda4): log mount failed

    I think I see what's going on here -- setxattr is racing with something
    that shuts down the filesystem:

    Thread 1                                Thread 2
    --------                                --------
    xfs_attr_sf_addname
    xfs_attr_shortform_to_leaf
    <create empty leaf>
    xfs_trans_bhold(leaf)
    xattri_dela_state = XFS_DAS_LEAF_ADD
    <roll transaction>
                                            <flush log>
                                            <shut down filesystem>
    xfs_trans_bhold_release(leaf)
    <discover fs is dead, bail>

    Thread 3
    --------
    <cycle mount, start recovery>
    xlog_recover_buf_commit_pass2
    xlog_recover_do_reg_buffer
    <replay empty leaf buffer from recovered buf item>
    xfs_buf_delwri_queue(leaf)
    xfs_buf_delwri_submit
    _xfs_buf_ioapply(leaf)
    xfs_attr3_leaf_write_verify
    <trip over empty leaf buffer>
    <fail recovery>

    As you can see, the bhold keeps the leaf buffer locked and thus prevents
    the *AIL* from tripping over the ichdr.count==0 check in the write
    verifier.  Unfortunately, it doesn't prevent the log from getting
    flushed to disk, which sets up log recovery to fail.

    So.  It's clear that the kernel has always had the ability to persist
    attr leaf blocks with ichdr.count==0, which means that it's part of the
    ondisk format now.

    Unfortunately, this check has been added and removed multiple times
    throughout history.  It first appeared in[1] kernel 3.10 as part of the
    early V5 format patches.  The check was later discovered to break log
    recovery and hence disabled[2] during log recovery in kernel 4.10.
    Simultaneously, the check was added[3] to xfs_repair 4.9.0 to try to
    weed out the empty leaf blocks.  This was still not correct because log
    recovery would recover an empty attr leaf block successfully only for
    regular xattr operations to trip over the empty block during regular
    operation.  Therefore, the check was removed
    entirely[4] in kernel 5.7 but removal of the xfs_repair check was
    forgotten.  The continued complaints from xfs_repair led to us
    mistakenly re-adding[5] the verifier check for kernel 5.19.  Remove it
    once again.

    [1] 517c22207b ("xfs: add CRCs to attr leaf blocks")
    [2] 2e1d23370e ("xfs: ignore leaf attr ichdr.count in verifier
                       during log replay")
    [3] f7140161 ("xfs_repair: junk leaf attribute if count == 0")
    [4] f28cef9e4d ("xfs: don't fail verifier on empty attr3 leaf
                       block")
    [5] 51e6104fdb95 ("xfs: detect empty attr leaf blocks in
                       xfs_attr3_leaf_verify")

    Looking at the rest of the xattr code, it seems that files with empty
    leaf blocks behave as expected -- listxattr reports no attributes;
    getxattr on any xattr returns nothing as expected; removexattr does
    nothing; and setxattr can add attributes just fine.

    Original-bug: 517c22207b ("xfs: add CRCs to attr leaf blocks")
    Still-not-fixed-by: 2e1d23370e ("xfs: ignore leaf attr ichdr.count in verifier during log replay")
    Removed-in: f28cef9e4d ("xfs: don't fail verifier on empty attr3 leaf block")
    Fixes: 51e6104fdb95 ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:34 -05:00
Bill O'Donnell 510137b4dd xfs: fix TOCTOU race involving the new logged xattrs control knob
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit f4288f01820e2d57722d21874c1fda661003c9b9
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Sun Jun 5 18:51:22 2022 -0700

    xfs: fix TOCTOU race involving the new logged xattrs control knob

    I found a race involving the larp control knob, aka the debugging knob
    that lets developers enable logging of extended attribute updates:

    Thread 1                        Thread 2

    echo 0 > /sys/fs/xfs/debug/larp
                                    setxattr(REPLACE)
                                    xfs_has_larp (returns false)
                                    xfs_attr_set

    echo 1 > /sys/fs/xfs/debug/larp

                                    xfs_attr_defer_replace
                                    xfs_attr_init_replace_state
                                    xfs_has_larp (returns true)
                                    xfs_attr_init_remove_state

                                    <oops, wrong DAS state!>

    This isn't a particularly severe problem right now because xattr logging
    is only enabled when CONFIG_XFS_DEBUG=y, and developers *should* know
    what they're doing.

    However, the eventual intent is that callers should be able to ask for
    the assistance of the log in persisting xattr updates.  This capability
    might not be required for /all/ callers, which means that dynamic
    control must work correctly.  Once an xattr update has decided whether
    or not to use logged xattrs, it needs to stay in that mode until the end
    of the operation regardless of what subsequent parallel operations might
    do.

    Therefore, it is an error to continue sampling xfs_globals.larp once
    xfs_attr_change has made a decision about larp, and it was not correct
    for me to have told Allison that ->create_intent functions can sample
    the global log incompat feature bitfield to decide to elide a log item.

    Instead, create a new op flag for the xfs_da_args structure, and convert
    all other callers of xfs_has_larp and xfs_sb_version_haslogxattrs within
    the attr update state machine to look for the operations flag.
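
A minimal userspace sketch of the sample-once pattern described above. Every
name in it (demo_larp_knob, demo_da_args, DEMO_DA_OP_LOGGED and both
functions) is a hypothetical stand-in rather than a kernel identifier; it only
illustrates reading the knob once and carrying the decision with the operation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global debug knob (not the kernel symbol). */
static volatile bool demo_larp_knob;

/* Hypothetical op flag recording that this operation uses logged xattrs. */
#define DEMO_DA_OP_LOGGED	(1u << 0)

/* Hypothetical, heavily reduced model of the per-operation args structure. */
struct demo_da_args {
	unsigned int	op_flags;
};

/* Sample the knob exactly once, when the operation starts. */
static void demo_attr_change(struct demo_da_args *args)
{
	assert(args);
	args->op_flags = 0;
	if (demo_larp_knob)
		args->op_flags |= DEMO_DA_OP_LOGGED;
}

/* Later steps consult only the per-operation flag, never the global. */
static bool demo_attr_is_logged(const struct demo_da_args *args)
{
	assert(args);
	return (args->op_flags & DEMO_DA_OP_LOGGED) != 0;
}
```

If the knob flips after demo_attr_change() has run, demo_attr_is_logged()
still reports the mode chosen at the start, which is the property the race
diagram shows being violated.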

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:32 -05:00
Bill O'Donnell c16b68acb7 xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 51e6104fdb95f377c8741794778319bd413f4fff
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu May 12 15:12:57 2022 +1000

    xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify

    xfs_repair flags these as a corruption error, so the verifier should
    catch software bugs that result in empty leaf blocks being written
    to disk, too.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:24 -05:00
Bill O'Donnell 38b5dff00d xfs: ATTR_REPLACE algorithm with LARP enabled needs rework
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit fdaf1bb3cafcfee9ef05c4eaf6ee1193fd90cbd2
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu May 12 15:12:56 2022 +1000

    xfs: ATTR_REPLACE algorithm with LARP enabled needs rework

    We can't use the same algorithm for replacing an existing attribute
    when logging attributes. The existing algorithm is essentially:

    1. create new attr w/ INCOMPLETE
    2. atomically flip INCOMPLETE flags between old + new attribute
    3. remove old attr which is marked w/ INCOMPLETE

    This algorithm guarantees that we see either the old or new
    attribute, and if we fail after the atomic flag flip, we don't have
    to recover the removal of the old attr because we never see
    INCOMPLETE attributes in lookups.

    For logged attributes, however, this does not work. The logged
    attribute intents do not track the work that has been done as the
    transaction rolls, and hence the only recovery mechanism we have is
    "run the replace operation from scratch".

    This is further exacerbated by the attempt to avoid needing the
    INCOMPLETE flag to create an atomic swap. This means we can create
    a second active attribute of the same name before we remove the
    original. If we fail at any point after the create but before the
    removal has completed, we end up with duplicate attributes in
    the attr btree and recovery only tries to replace one of them.

    There are several other failure modes where we can leave partially
    allocated remote attributes that expose stale data, partially free
    remote attributes that enable UAF based stale data exposure, etc.

    To fix this, we need a different algorithm for replace operations
    when LARP is enabled. Luckily, it's not that complex if we take the
    right first step. That is, the first thing we log is the attri
    intent with the new name/value pair and mark the old attr as
    INCOMPLETE in the same transaction.

    From there, we then remove the old attr and keep relogging the
    new name/value in the intent, such that we always know that we have
    to create the new attr in recovery. Once the old attr is removed,
    we then run a normal ATTR_CREATE operation relogging the intent as
    we go. If the new attr is local, then it gets created in a single
    atomic transaction that also logs the final intent done. If the new
    attr is remote, then we set INCOMPLETE on the new attr while we
    allocate and set the remote value, and then we clear the INCOMPLETE
    flag in the last transaction that logs the final intent done.

    If we fail at any point in this algorithm, log recovery will always
    see the same state on disk: the new name/value in the intent, and
    either an INCOMPLETE attr or no attr in the attr btree. If we find
    an INCOMPLETE attr, we run the full replace starting with removing
    the INCOMPLETE attr. If we don't find it, then we simply create the
    new attr.

    Notably, recovery of a failed create that has an INCOMPLETE flag set
    is now the same - we start with the lookup of the INCOMPLETE attr,
    and if that exists then we do the full replace recovery process,
    otherwise we just create the new attr.

    Hence changing the way we do the replace operation when LARP is
    enabled allows us to use the same log recovery algorithm for both
    the ATTR_CREATE and ATTR_REPLACE operations. This is also the same
    algorithm we use for runtime ATTR_REPLACE operations (except for the
    step setting up the initial conditions).

    The result is that:

    - ATTR_CREATE uses the same algorithm regardless of whether LARP is
      enabled or not
    - ATTR_REPLACE with larp=0 is identical to the old algorithm
    - ATTR_REPLACE with larp=1 runs an unmodified attr removal algorithm
      from the larp=0 code and then runs the unmodified ATTR_CREATE
      code.
    - log recovery when larp=1 runs the same ATTR_REPLACE algorithm as
      it uses at runtime.

    Because the state machine is now quite clean, changing the algorithm
    is really just a case of changing the initial state and how the
    states link together for the ATTR_REPLACE case. Hence it's not a
    huge amount of code for what is a fairly substantial rework
    of the attr logging and recovery algorithm....
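
The recovery rule described above can be sketched as a two-function decision
table. This is an illustration only: the step names and both functions are
hypothetical, and the real state machine has many more states than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical step labels; the real state machine has many more states. */
enum demo_step {
	DEMO_REMOVE_INCOMPLETE,	/* remove the attr marked INCOMPLETE */
	DEMO_CREATE_NEW,	/* create the attr from the logged name/value */
	DEMO_DONE,
};

/* Recovery looks up an INCOMPLETE attr and picks its starting step. */
static enum demo_step demo_recovery_first_step(bool found_incomplete)
{
	return found_incomplete ? DEMO_REMOVE_INCOMPLETE : DEMO_CREATE_NEW;
}

/* Both paths converge: removal is always followed by a plain create. */
static enum demo_step demo_next_step(enum demo_step cur)
{
	assert(cur != DEMO_DONE);
	return cur == DEMO_REMOVE_INCOMPLETE ? DEMO_CREATE_NEW : DEMO_DONE;
}
```

Starting from either first step, repeated calls to demo_next_step() end in
DEMO_DONE after the create, mirroring the claim that create and replace
recovery share one algorithm.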

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:23 -05:00
Bill O'Donnell 159a808d8f xfs: use XFS_DA_OP flags in deferred attr ops
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit e7f358dee4e5cf1ce8b11ff2e65d5ccb1ced24db
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu May 12 15:12:56 2022 +1000

    xfs: use XFS_DA_OP flags in deferred attr ops

    We currently store the high level attr operation in
    args->attr_flags. This field contains what the VFS is telling us to
    do, but doesn't necessarily match what we are doing in the low level
    modification state machine. e.g. XATTR_REPLACE implies both
    XFS_DA_OP_ADDNAME and XFS_DA_OP_RENAME because it is doing both a
    remove and adding a new attr.

    However, deep in the individual state machine operations, we check
    errors against this high level VFS op flags, not the low level
    XFS_DA_OP flags. Indeed, we don't even have a low level flag for
    a REMOVE operation, so the only way we know we are doing a remove
    is the complete absence of XATTR_REPLACE, XATTR_CREATE,
    XFS_DA_OP_ADDNAME and XFS_DA_OP_RENAME. And because there are other
    flags in these fields, this is a pain to check if we need to.

    As the XFS_DA_OP flags are only needed once the deferred operations
    are set up, set these flags appropriately when we set the initial
    operation state. We also introduce a XFS_DA_OP_REMOVE flag to make
    it easy to know that we are doing a remove operation.

    With these, we can remove the use of XATTR_REPLACE and XATTR_CREATE
    in low level lookup operations, and manipulate the low level flags
    according to the low level context that is operating. e.g. log
    recovery does not have a VFS xattr operation state to copy into
    args->attr_flags, and the low level state machine ops we do for
    recovery do not match the high level VFS operations that were in
    progress when the system failed...

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:23 -05:00
Bill O'Donnell eae8298b48 xfs: add leaf to node error tag
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit c5218a7cd97349c53bc64e447778a07e49364d40
Author: Allison Henderson <allison.henderson@oracle.com>
Date:   Wed May 11 17:01:23 2022 +1000

    xfs: add leaf to node error tag

    Add an error tag on xfs_attr3_leaf_to_node to test log attribute
    recovery and replay.

    Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
    Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:18 -05:00
Bill O'Donnell 526f2aad48 xfs: Skip flip flags for delayed attrs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit f38dc503d366b589d98d5676a5b279d10b47bcb9
Author: Allison Henderson <allison.henderson@oracle.com>
Date:   Mon May 9 19:09:10 2022 +1000

    xfs: Skip flip flags for delayed attrs

    This is a clean up patch that skips the flip flag logic for delayed attr
    renames.  Since the log replay keeps the inode locked, we do not need to
    worry about race windows with attr lookups.  So we can skip over
    flipping the flag and the extra transaction roll for it.

    Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:16 -05:00
Carlos Maiolino 25a40d32f8 xfs: rename _zone variables to _cache
Bugzilla: https://bugzilla.redhat.com/2125724

Conflicts:
	Small conflict at xfs_inode_alloc() due to out of order
	backport. Inode alloc using kmem_cache_alloc() has been
	converted to use alloc_inode_sb() before this patch.

Now that we've gotten rid of the kmem_zone_t typedef, rename the
variables to _cache since that's what they are.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
(cherry picked from commit 182696fb021fc196e5cbe641565ca40fcf0f885a)
2022-10-21 12:50:46 +02:00
Brian Foster 5872597dac xfs: convert bp->b_bn references to xfs_buf_daddr()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit 9343ee76909e3f6466d85c9ebb0e343cdf54de71
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:47:05 2021 -0700

    xfs: convert bp->b_bn references to xfs_buf_daddr()

    Stop directly referencing b_bn in code outside the buffer cache, as
    b_bn is supposed to be used only as an internal cache index.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:36 -04:00
Brian Foster 6def1029c3 xfs: convert mount flags to features
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git
Conflicts: Work around out of order backport in xfs_fs_fill_super().

commit 0560f31a09e523090d1ab2bfe21c69d028c2bdf2
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:52 2021 -0700

    xfs: convert mount flags to features

    Replace m_flags feature checks with xfs_has_<feature>() calls and
    rework the setup code to set flags in m_features.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:34 -04:00
Brian Foster d54a790d1d xfs: replace xfs_sb_version checks with feature flag checks
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit 38c26bfd90e1999650d5ef40f90d721f05916643
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:37 2021 -0700

    xfs: replace xfs_sb_version checks with feature flag checks

    Convert the xfs_sb_version_hasfoo() to checks against
    mp->m_features. Checks of the superblock itself during disk
    operations (e.g. in the read/write verifiers and the to/from disk
    formatters) are not converted - they operate purely on the
    superblock state. Everything else should use the mount features.

    Large parts of this conversion were done with sed with commands like
    this:

    for f in `git grep -l xfs_sb_version_has fs/xfs/*.c`; do
            sed -i -e 's/xfs_sb_version_has\(.*\)(&\(.*\)->m_sb)/xfs_has_\1(\2)/' $f
    done

    With manual cleanups for things like "xfs_has_extflgbit" and other
    little inconsistencies in naming.

    The result is a lot less typing to check features and an XFS binary
    size reduced by a bit over 3kB:

    $ size -t fs/xfs/built-in.a
            text       data     bss     dec     hex filename
    before  1130866  311352     484 1442702  16038e (TOTALS)
    after   1127727  311352     484 1439563  15f74b (TOTALS)

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:34 -04:00
Brian Foster 8bf8cc906b xfs: replace kmem_alloc_large() with kvmalloc()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit d634525db63e9e946c3229fb93c8d9b763afbaf3
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Aug 9 10:10:01 2021 -0700

    xfs: replace kmem_alloc_large() with kvmalloc()

    There is no reason for this wrapper existing anymore. All the places
    that use KM_NOFS allocation are within transaction contexts and
    hence covered by memalloc_nofs_save/restore contexts. Hence we don't
    need any special handling of vmalloc for large IOs anymore and
    so special casing this code isn't necessary.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:24 -04:00
Brian Foster c908a48ac9 xfs: fix silly whitespace problems with kernel libxfs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit b7df7630cccd103671b14b946bcdb3b14be75d68
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Fri Aug 6 11:05:44 2021 -0700

    xfs: fix silly whitespace problems with kernel libxfs

    Fix a few whitespace errors such as spaces at the end of the line, etc.
    This gets us back to something more closely resembling parity.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:23 -04:00
Darrick J. Wong d1015e2ebd Merge tag 'xfs-delay-ready-attrs-v20.1' of https://github.com/allisonhenderson/xfs_work into xfs-5.14-merge4
xfs: Delay Ready Attributes

Hi all,

This set is a subset of a larger series for Delayed Attributes, which is a
subset of a yet larger series for parent pointers. Delayed attributes allow
attribute operations (set and remove) to be logged and committed in the same
way that other delayed operations do. This allows more complex operations (like
parent pointers) to be broken up into multiple smaller transactions. To do
this, the existing attr operations must be modified to operate as a delayed
operation.  This means that they cannot roll, commit, or finish transactions.
Instead, they return -EAGAIN to allow the calling function to handle the
transaction.  In this series, we focus on only the delayed attribute portion.
We will introduce parent pointers in a later set.

The set as a whole is a bit much to digest at once, so I usually send out the
smaller sub series to reduce reviewer burn out.  But the entire extended series
is visible through the included github links.

Updates since v19: Added Darrick's fix for the remote block accounting as well as
some minor nits about the default assert in xfs_attr_set_iter.  Spent quite
a bit of time testing this cycle to weed out any more unexpected bugs.  No new
test failures were observed with the addition of this set.

xfs: Fix default ASSERT in xfs_attr_set_iter
  Replaced the assert with ASSERT(0);

xfs: Add delay ready attr remove routines
  Added Darrick's fix for remote block accounting

This series can be viewed on github here:
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_v20

As well as the extended delayed attribute and parent pointer series:
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_v20_extended

And the test cases:
https://github.com/allisonhenderson/xfs_work/tree/pptr_xfstestsv3
In order to run the test cases, you will need to have the corresponding xfsprogs
changes as well, which can be found here:
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_xfsprogs_v20
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_xfsprogs_v20_extended

To run the xfs attributes tests run:
check -g attr

To run as delayed attributes run:
export MOUNT_OPTIONS="-o delattr"
check -g attr

To run parent pointer tests:
check -g parent

I've also made the corresponding updates to the user space side as well, and ported anything
they need to seat correctly.

Questions, comments and feedback appreciated!

Thanks all!
Allison

* tag 'xfs-delay-ready-attrs-v20.1' of https://github.com/allisonhenderson/xfs_work:
  xfs: Make attr name schemes consistent
  xfs: Fix default ASSERT in xfs_attr_set_iter
  xfs: Clean up xfs_attr_node_addname_clear_incomplete
  xfs: Remove xfs_attr_rmtval_set
  xfs: Add delay ready attr set routines
  xfs: Add delay ready attr remove routines
  xfs: Hoist node transaction handling
  xfs: Hoist xfs_attr_leaf_addname
  xfs: Hoist xfs_attr_node_addname
  xfs: Add helper xfs_attr_node_addname_find_attr
  xfs: Separate xfs_attr_node_addname and xfs_attr_node_addname_clear_incomplete
  xfs: Refactor xfs_attr_set_shortform
  xfs: Add xfs_attr_node_remove_name
  xfs: Reverse apply 72b97ea40d
2021-06-18 08:13:22 -07:00
Allison Henderson 816c8e39b7 xfs: Make attr name schemes consistent
This patch renames the following functions to make the naming scheme more consistent:
xfs_attr_shortform_remove -> xfs_attr_sf_removename
xfs_attr_node_remove_name -> xfs_attr_node_removename
xfs_attr_set_fmt -> xfs_attr_sf_addname

Suggested-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-09 09:34:05 -07:00
Dave Chinner 9bbafc7191 xfs: move xfs_perag_get/put to xfs_ag.[ch]
They are AG functions, not superblock functions, so move them to the
appropriate location.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Allison Henderson 2b74b03c13 xfs: Add delay ready attr remove routines
This patch modifies the attr remove routines to be delay ready. This
means they no longer roll or commit transactions, but instead return
-EAGAIN to have the calling routine roll and refresh the transaction. In
this series, xfs_attr_remove_args is merged with
xfs_attr_node_removename to become a new function, xfs_attr_remove_iter.
This new version uses a sort of state machine like switch to keep track
of where it was when EAGAIN was returned. A new version of
xfs_attr_remove_args consists of a simple loop to refresh the
transaction until the operation is completed. A new XFS_DAC_DEFER_FINISH
flag is used to finish the transaction wherever the existing code used
to.

Calls to xfs_attr_rmtval_remove are replaced with the delay ready
version __xfs_attr_rmtval_remove. We will rename
__xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
done.

xfs_attr_rmtval_remove itself is still in use by the set routines (used
during a rename).  For reasons of preserving existing function, we
modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
set, similar to how xfs_attr_remove_args does here.  Once we transition
the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
used and will be removed.

This patch also adds a new struct xfs_delattr_context, which we will use
to keep track of the current state of an attribute operation. The new
xfs_delattr_state enum is used to track various operations that are in
progress so that we know not to repeat them, and resume where we left
off before EAGAIN was returned to cycle out the transaction. Other
members take the place of local variables that need to retain their
values across multiple function calls.  See xfs_attr.h for a more
detailed diagram of the states.
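
As a rough userspace illustration of the pattern described above (not the
kernel code): the iterator does one step per call and returns -EAGAIN when
the caller should roll the transaction, while a context struct remembers
where to resume. The states, struct members and function names below are all
hypothetical; the real states are documented in xfs_attr.h.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states; the real enum lives in xfs_attr.h and differs. */
enum demo_delattr_state {
	DEMO_DAS_START,
	DEMO_DAS_RM_BLOCKS,
	DEMO_DAS_RM_NAME,
	DEMO_DAS_DONE,
};

/* Hypothetical context: values that must survive across calls. */
struct demo_delattr_context {
	enum demo_delattr_state	state;
	int			blocks_left;	/* stands in for remote blocks */
	int			rolls;		/* simulated transaction rolls */
};

/* One step of the operation; returns -EAGAIN when the caller must roll. */
static int demo_remove_iter(struct demo_delattr_context *dac)
{
	assert(dac->blocks_left >= 0);

	switch (dac->state) {
	case DEMO_DAS_START:
		dac->state = dac->blocks_left ? DEMO_DAS_RM_BLOCKS :
						DEMO_DAS_RM_NAME;
		return -EAGAIN;
	case DEMO_DAS_RM_BLOCKS:
		if (--dac->blocks_left == 0)
			dac->state = DEMO_DAS_RM_NAME;
		return -EAGAIN;
	case DEMO_DAS_RM_NAME:
		dac->state = DEMO_DAS_DONE;
		return 0;
	default:
		assert(0);	/* must not be called once done */
		return -EINVAL;
	}
}

/* Caller-side loop: keep rolling until the iterator stops asking. */
static int demo_remove_args(struct demo_delattr_context *dac)
{
	int error;

	while ((error = demo_remove_iter(dac)) == -EAGAIN)
		dac->rolls++;	/* a real caller would roll the transaction */
	return error;
}
```

The iterator never touches the transaction itself; only the loop in
demo_remove_args() does, which is what lets a different caller drive the same
steps later.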

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-01 10:49:47 -07:00
Christoph Hellwig b2197a36c0 xfs: remove XFS_IFEXTENTS
The in-memory XFS_IFEXTENTS is now only used to check if an inode with
extents still needs the extents to be read into memory before doing
operations that need the extent map.  Add a new xfs_need_iread_extents
helper that returns true for btree format forks that do not have any
entries in the in-memory extent btree, and use that instead of checking
the XFS_IFEXTENTS flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-15 09:35:51 -07:00
Christoph Hellwig 0779f4a68d xfs: remove XFS_IFINLINE
Just check for an inline format fork instead of using the equivalent
in-memory XFS_IFINLINE flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-15 09:35:51 -07:00
Christoph Hellwig 7821ea302d xfs: move the di_forkoff field to struct xfs_inode
In preparation of removing the historic icinode struct, move the
forkoff field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:05 -07:00
Gao Xiang ada49d64fb xfs: fix forkoff miscalculation related to XFS_LITINO(mp)
Currently, commit e9e2eae89d dropped a (int) decoration from
XFS_LITINO(mp), and since sizeof() expression is also involved,
the result of XFS_LITINO(mp) is simply of the size_t type
(commonly unsigned long).

Considering the expression in xfs_attr_shortform_bytesfit():
  offset = (XFS_LITINO(mp) - bytes) >> 3;
let "bytes" be (int)340, and
    "XFS_LITINO(mp)" be (unsigned long)336.

on 64-bit platform, the expression is
  offset = ((unsigned long)336 - (int)340) >> 3 =
           (int)(0xfffffffffffffffcUL >> 3) = -1

but on 32-bit platform, the expression is
  offset = ((unsigned long)336 - (int)340) >> 3 =
           (int)(0xfffffffcUL >> 3) = 0x1fffffff
instead.

so offset becomes a large positive number on 32-bit platform, and
causes xfs_attr_shortform_bytesfit() to return maxforkoff rather than 0.

Therefore, one result is
  "ASSERT(new_size <= XFS_IFORK_SIZE(ip, whichfork));"

assertion failure in xfs_idata_realloc(), which was also the root
cause of the original bugreport from Dennis, see:
   https://bugzilla.redhat.com/show_bug.cgi?id=1894177

And it can also be manually triggered with the following commands:
  $ touch a;
  $ setfattr -n user.0 -v "`seq 0 80`" a;
  $ setfattr -n user.1 -v "`seq 0 80`" a

on 32-bit platform.

Fix the case in xfs_attr_shortform_bytesfit() by bailing out early when
"XFS_LITINO(mp) < bytes", as suggested by Eric, and fix a misleading
comment together with this bugfix, as suggested by Darrick. It seems the
other users of XFS_LITINO(mp) are not impacted.
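
The arithmetic above can be reproduced in userspace with the unsigned type
pinned to a fixed width, so both platforms can be compared in one build. This
is only a sketch: the demo_* helpers are hypothetical, and demo_offset_fixed()
shows the general shape of an early bail-out, not the exact kernel change.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model "offset = (XFS_LITINO(mp) - bytes) >> 3" with a 64-bit size_t.
 * The final conversion to int relies on two's-complement truncation
 * (implementation-defined in ISO C, but what GCC and Clang do).
 */
static int demo_offset_64bit(uint64_t litino, int bytes)
{
	return (int)((litino - (uint64_t)bytes) >> 3);
}

/* Same expression with a 32-bit size_t: the wrapped value stays positive. */
static int demo_offset_32bit(uint32_t litino, int bytes)
{
	return (int)((litino - (uint32_t)bytes) >> 3);
}

/* Assumed shape of the fix: bail out before the subtraction can wrap. */
static int demo_offset_fixed(uint32_t litino, int bytes)
{
	assert(bytes >= 0);
	if (litino < (uint32_t)bytes)
		return 0;
	return (int)((litino - (uint32_t)bytes) >> 3);
}
```

With litino = 336 and bytes = 340, the 64-bit variant yields -1 and the 32-bit
variant yields 0x1fffffff, matching the two expressions above, while the
guarded variant returns 0.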

Fixes: e9e2eae89d ("xfs: only check the superblock version for dinode size calculation")
Cc: <stable@vger.kernel.org> # 5.7+
Reported-and-tested-by: Dennis Gilmore <dgilmore@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-11-18 09:23:51 -08:00
Carlos Maiolino e01b7eed5d xfs: Convert xfs_attr_sf macros to inline functions
xfs_attr_sf_totsize() requires access to xfs_inode structure, so, once
xfs_attr_shortform_addname() is its only user, move it to xfs_attr.c
instead of playing with more #includes.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:42 -07:00
Carlos Maiolino c418dbc980 xfs: Use variable-size array for nameval in xfs_attr_sf_entry
nameval is a variable-size array, so, define it as such, and remove all
the -1 magic number subtractions

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:42 -07:00
Carlos Maiolino 47e6cc1000 xfs: Remove typedef xfs_attr_shortform_t
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:42 -07:00
Carlos Maiolino 6337c84466 xfs: remove typedef xfs_attr_sf_entry_t
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-09-15 20:52:41 -07:00
Darrick J. Wong 125eac2438 xfs: initialize the shortform attr header padding entry
Don't leak kernel memory contents into the shortform attr fork.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-08-27 08:01:31 -07:00
Eric Sandeen f4020438fa xfs: fix boundary test in xfs_attr_shortform_verify
The boundary test for the fixed-offset parts of xfs_attr_sf_entry in
xfs_attr_shortform_verify is off by one, because the variable array
at the end is defined as nameval[1] not nameval[].
Hence we need to subtract 1 from the calculation.

This can be shown by:

# touch file
# setfattr -n root.a file

and verifications will fail when it's written to disk.

This only matters for a last attribute which has a single-byte name
and no value, otherwise the combination of namelen & valuelen will
push endp further out and this test won't fail.
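
A small sketch of why the one-element array matters for size calculations.
The struct below is a hypothetical mirror of the entry layout as this message
describes it (three one-byte fields followed by nameval[1]); it is not copied
from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the entry layout with the old one-element array. */
struct demo_sf_entry {
	uint8_t	namelen;
	uint8_t	valuelen;
	uint8_t	flags;
	uint8_t	nameval[1];	/* really variable length */
};

/* Bytes occupied by the fixed-offset fields only. */
static size_t demo_fixed_size(void)
{
	/* sizeof() counts one nameval byte that is not part of the header */
	assert(offsetof(struct demo_sf_entry, nameval) ==
	       sizeof(struct demo_sf_entry) - 1);
	return sizeof(struct demo_sf_entry) - 1;
}

/* Total bytes an entry occupies: fixed part plus name plus value. */
static size_t demo_entry_size(const struct demo_sf_entry *e)
{
	return demo_fixed_size() + e->namelen + e->valuelen;
}
```

For an entry with a one-byte name and no value, demo_entry_size() equals
sizeof(struct demo_sf_entry), which is the boundary case called out above: a
check that uses sizeof() for the fixed part counts that name byte twice.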

Fixes: 1e1bbd8e7e ("xfs: create structure verifier function for shortform xattrs")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-08-26 14:13:21 -07:00
Allison Collins 1fc618d762 xfs: Pull up trans roll in xfs_attr3_leaf_clearflag
New delayed allocation routines cannot handle transactions, so pull
the transaction roll up into the calling functions.

Signed-off-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
2020-07-28 20:28:11 -07:00
Allison Collins 0949d317ae xfs: Pull up trans roll from xfs_attr3_leaf_setflag
New delayed allocation routines cannot handle transactions, so pull
the transaction roll up into the calling functions.

Signed-off-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
2020-07-28 20:28:11 -07:00
Allison Collins e3be1272dd xfs: Pull up trans handling in xfs_attr3_leaf_flipflags
Since delayed operations cannot roll transactions, pull up the
transaction handling into the calling function

Signed-off-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
2020-07-28 20:28:11 -07:00
Allison Collins 07120f1abd xfs: Add xfs_has_attr and subroutines
This patch adds new functions to check for the existence of an
attribute. Subroutines are also added to handle the cases of leaf
blocks, nodes or shortform. Common code that appears in the existing attr
add and remove functions has been factored out to reduce duplication.
We will need these routines later for delayed attributes, since delayed
operations cannot return error codes.

Signed-off-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix a leak-on-error bug reported by Dan Carpenter]
[darrick: fix unused variable warning reported by 0day]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
Reported-by: dan.carpenter@oracle.com
Reported-by: kernel test robot <lkp@intel.com>
2020-07-28 20:24:14 -07:00
Darrick J. Wong 6dcde60efd xfs: more lockdep whackamole with kmem_alloc*
Dave Airlie reported the following lockdep complaint:

>  ======================================================
>  WARNING: possible circular locking dependency detected
>  5.7.0-0.rc5.20200515git1ae7efb38854.1.fc33.x86_64 #1 Not tainted
>  ------------------------------------------------------
>  kswapd0/159 is trying to acquire lock:
>  ffff9b38d01a4470 (&xfs_nondir_ilock_class){++++}-{3:3},
>  at: xfs_ilock+0xde/0x2c0 [xfs]
>
>  but task is already holding lock:
>  ffffffffbbb8bd00 (fs_reclaim){+.+.}-{0:0}, at:
>  __fs_reclaim_acquire+0x5/0x30
>
>  which lock already depends on the new lock.
>
>
>  the existing dependency chain (in reverse order) is:
>
>  -> #1 (fs_reclaim){+.+.}-{0:0}:
>         fs_reclaim_acquire+0x34/0x40
>         __kmalloc+0x4f/0x270
>         kmem_alloc+0x93/0x1d0 [xfs]
>         kmem_alloc_large+0x4c/0x130 [xfs]
>         xfs_attr_copy_value+0x74/0xa0 [xfs]
>         xfs_attr_get+0x9d/0xc0 [xfs]
>         xfs_get_acl+0xb6/0x200 [xfs]
>         get_acl+0x81/0x160
>         posix_acl_xattr_get+0x3f/0xd0
>         vfs_getxattr+0x148/0x170
>         getxattr+0xa7/0x240
>         path_getxattr+0x52/0x80
>         do_syscall_64+0x5c/0xa0
>         entry_SYSCALL_64_after_hwframe+0x49/0xb3
>
>  -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
>         __lock_acquire+0x1257/0x20d0
>         lock_acquire+0xb0/0x310
>         down_write_nested+0x49/0x120
>         xfs_ilock+0xde/0x2c0 [xfs]
>         xfs_reclaim_inode+0x3f/0x400 [xfs]
>         xfs_reclaim_inodes_ag+0x20b/0x410 [xfs]
>         xfs_reclaim_inodes_nr+0x31/0x40 [xfs]
>         super_cache_scan+0x190/0x1e0
>         do_shrink_slab+0x184/0x420
>         shrink_slab+0x182/0x290
>         shrink_node+0x174/0x680
>         balance_pgdat+0x2d0/0x5f0
>         kswapd+0x21f/0x510
>         kthread+0x131/0x150
>         ret_from_fork+0x3a/0x50
>
>  other info that might help us debug this:
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(fs_reclaim);
>                                 lock(&xfs_nondir_ilock_class);
>                                 lock(fs_reclaim);
>    lock(&xfs_nondir_ilock_class);
>
>   *** DEADLOCK ***
>
>  4 locks held by kswapd0/159:
>   #0: ffffffffbbb8bd00 (fs_reclaim){+.+.}-{0:0}, at:
>  __fs_reclaim_acquire+0x5/0x30
>   #1: ffffffffbbb7cef8 (shrinker_rwsem){++++}-{3:3}, at:
>  shrink_slab+0x115/0x290
>   #2: ffff9b39f07a50e8
>  (&type->s_umount_key#56){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0
>   #3: ffff9b39f077f258
>  (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at:
>  xfs_reclaim_inodes_ag+0x82/0x410 [xfs]

This is a known false positive because inodes cannot simultaneously be
getting reclaimed and the target of a getxattr operation, but lockdep
doesn't know that.  We can (selectively) shut up lockdep until either
it gets smarter or we change inode reclaim not to require the ILOCK by
applying a stupid GFP_NOLOCKDEP bandaid.

Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Dave Airlie <airlied@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-27 08:49:28 -07:00
Christoph Hellwig ef8385128d xfs: cleanup xfs_idestroy_fork
Move freeing the dynamically allocated attr and COW fork, as well
as zeroing the pointers where actually needed into the callers, and
just pass the xfs_ifork structure to xfs_idestroy_fork.  Also simplify
the kmem_free calls by not checking for NULL first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-19 09:40:59 -07:00
Christoph Hellwig f7e67b20ec xfs: move the fork format fields into struct xfs_ifork
Both the data and attr fork have a format that is stored in the legacy
idinode.  Move it into the xfs_ifork structure instead, where it uses
up padding.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-19 09:40:58 -07:00