Commit Graph

158 Commits

Author SHA1 Message Date
Patrick Talbert fdb3eab93b Merge: XFS: Update #3 for RHEL9.6 (upstream v6.7-6.8)
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5785

XFS: update #3 for RHEL9.6. Backport upstream v6.7-6.8, including fix patches that landed after v6.8.

JIRA: https://issues.redhat.com/browse/RHEL-65728

Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5633

Omitted-fix:  a18a69bbec083 ("xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock")

Missing several dependency patches that merged upstream in Aug 2024, well beyond this update
(e.g. 7996f10ce6c xfs: factor out a xfs_growfs_rt_bmblock helper).

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>

Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: Brian Foster <bfoster@redhat.com>
Approved-by: Eric Sandeen <esandeen@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Patrick Talbert <ptalbert@redhat.com>
2024-12-30 07:30:09 -05:00
Pavel Reichl 660de44a1b xfs: fix sparse inode limits on runt AG
JIRA: https://issues.redhat.com/browse/RHEL-68541

Upstream Status: https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git for-next

Conflicts: changed context due to unbackported upstream commits
	6abd82ab6ea4 xfs: add a xfs_agino_to_ino helper
	e9c4d8bfb26c xfs: factor out a generic xfs_group structure

The runt AG at the end of a filesystem is almost always smaller than
the mp->m_sb.sb_agblocks. Unfortunately, when setting the max_agbno
limit for the inode chunk allocation, we do not take this into
account. This means we can allocate a sparse inode chunk that
overlaps beyond the end of an AG. When we go to allocate an inode
from that sparse chunk, the irec fails validation because the
agbno of the start of the irec is beyond valid limits for the runt
AG.

Prevent this from happening by taking into account the size of the
runt AG when allocating inode chunks. Also convert the various
checks for valid inode chunk agbnos to use xfs_ag_block_count()
so that they will also catch such issues in the future.

Fixes: 56d1115c9b ("xfs: allocate sparse inode chunks on full chunk allocation failure")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

(cherry picked from commit 13325333582d4820d39b9e8f63d6a54e745585d9)
Signed-off-by: Pavel Reichl <preichl@redhat.com>
2024-11-26 07:53:15 +01:00
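
The runt AG problem described in 660de44a1b can be illustrated with a minimal sketch. It assumes a simplified, hypothetical geometry struct (`struct geom`) and helper names (`ag_block_count`, `max_chunk_start`); it is not the kernel's code and ignores alignment rules. The point is only that the limit must come from the AG's real size rather than the nominal sb_agblocks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified geometry; not the kernel's struct xfs_mount. */
struct geom {
    uint64_t dblocks;   /* total filesystem blocks */
    uint32_t agblocks;  /* nominal blocks per AG (sb_agblocks) */
    uint32_t agcount;   /* number of AGs */
};

/* Actual block count of an AG: the last (runt) AG may be shorter. */
static uint32_t ag_block_count(const struct geom *g, uint32_t agno)
{
    assert(agno < g->agcount);
    if (agno < g->agcount - 1)
        return g->agblocks;
    return (uint32_t)(g->dblocks - (uint64_t)agno * g->agblocks);
}

/*
 * Highest agbno at which a chunk of chunk_blocks can start and still
 * end inside the AG. Using g->agblocks here instead of the real AG
 * length is the kind of mistake the commit message describes.
 */
static uint32_t max_chunk_start(const struct geom *g, uint32_t agno,
                                uint32_t chunk_blocks)
{
    uint32_t len = ag_block_count(g, agno);

    assert(chunk_blocks <= len);
    return len - chunk_blocks;
}
```

With four AGs of nominally 1000 blocks and 3100 blocks in total, the last AG holds only 100 blocks, so an 8-block chunk may start no later than block 92 there, not 992.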
Bill O'Donnell 2dd6bd7b58 xfs: repair inode btrees
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit dbfbf3bdf639a20da7d5fb390cd2e197d25aa418
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Fri Dec 15 10:03:32 2023 -0800

    xfs: repair inode btrees

    Use the rmapbt to find inode chunks, query the chunks to compute hole
    and free masks, and with that information rebuild the inobt and finobt.
    Refer to the case study in
    Documentation/filesystems/xfs-online-fsck-design.rst for more details.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:26:03 -06:00
Bill O'Donnell 059e57c35e xfs: remove __xfs_free_extent_later
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 4c88fef3af4a51c2cdba6a28237e98da4873e8dc
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Wed Dec 6 18:40:57 2023 -0800

    xfs: remove __xfs_free_extent_later

    xfs_free_extent_later is a trivial helper, so remove it to reduce the
    amount of thinking required to understand the deferred freeing
    interface.  This will make it easier to introduce automatic reaping of
    speculative allocations in the next patch.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:25:54 -06:00
Bill O'Donnell 6f62276e77 xfs: AGI length should be bounds checked
JIRA: https://issues.redhat.com/browse/RHEL-25419

commit 2d7d1e7ea321b0b2810eb00183e21332ee9c4b6f
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Thu Jun 29 10:15:45 2023 -0700

    xfs: AGI length should be bounds checked

    Similar to the recent patch strengthening the AGF agf_length
    verification, the AGI verifier does not check that the AGI length field
    is within known good bounds.  This isn't currently checked by runtime
    kernel code, yet we assume in many places that it is correct and verify
    other metadata against it.

    Add length verification to the AGI verifier.  Just like the AGF length
    checking, the length of the AGI must be equal to the size of the AG
    specified in the superblock, unless it is the last AG in the filesystem.
    In that case, it must be less than or equal to sb->sb_agblocks and
    greater than XFS_MIN_AG_BLOCKS, which is the smallest AG a growfs
    operation will allow to exist.

    There's only one place in the filesystem that actually uses agi_length,
    but let's not leave it vulnerable to the same weird nonsense that
    generates syzbot bugs, eh?

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-06 10:32:54 -05:00
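
The length rule spelled out in 6f62276e77 can be sketched as a small predicate. The function name `ag_length_ok` and its parameters are hypothetical, and treating the lower bound as inclusive is an assumption of this sketch, not something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: every AG except the last must be exactly
 * sb_agblocks long; the last AG may be shorter, but not shorter than
 * min_ag_blocks (inclusive lower bound assumed here) and never longer
 * than sb_agblocks.
 */
static bool ag_length_ok(uint32_t length, uint32_t agno, uint32_t agcount,
                         uint32_t sb_agblocks, uint32_t min_ag_blocks)
{
    if (agno >= agcount)
        return false;
    if (agno != agcount - 1)
        return length == sb_agblocks;
    return length >= min_ag_blocks && length <= sb_agblocks;
}
```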
Bill O'Donnell 53919fd386 xfs: use deferred frees for btree block freeing
JIRA: https://issues.redhat.com/browse/RHEL-25419

Conflicts: context

commit b742d7b4f0e03df25c2a772adcded35044b625ca
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Jun 28 11:04:32 2023 -0700

    xfs: use deferred frees for btree block freeing

    Btrees that aren't freespace management trees use the normal extent
    allocation and freeing routines for their blocks. Hence when a btree
    block is freed, a direct call to xfs_free_extent() is made and the
    extent is immediately freed. This puts the entire free space
    management btrees under this path, so we are stacking btrees on
    btrees in the call stack. The inobt, finobt and refcount btrees
    all do this.

    However, the bmap btree does not do this - it calls
    xfs_free_extent_later() to defer the extent free operation via an
    XEFI and hence it gets processed in deferred operation processing
    during the commit of the primary transaction (i.e. via intent
    chaining).

    We need to change xfs_free_extent() to behave in a non-blocking
    manner so that we can avoid deadlocks with busy extents near ENOSPC
    in transactions that free multiple extents. Inserting or removing a
    record from a btree can cause a multi-level tree merge operation and
    that will free multiple blocks from the btree in a single
    transaction. i.e. we can call xfs_free_extent() multiple times, and
    hence the btree manipulation transaction is vulnerable to this busy
    extent deadlock vector.

    To fix this, convert all the remaining callers of xfs_free_extent()
    to use xfs_free_extent_later() to queue XEFIs and hence defer
    processing of the extent frees to a context that can be safely
    restarted if a deadlock condition is detected.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-06 10:32:52 -05:00
Bill O'Donnell 48acd7d8f8 xfs: convert xfs_ialloc_has_inodes_at_extent to return keyfill scan results
JIRA: https://issues.redhat.com/browse/RHEL-25419

commit efc0845f5d3e253f7f46a60b66a94c3164d76ee3
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Tue Apr 11 19:00:15 2023 -0700

    xfs: convert xfs_ialloc_has_inodes_at_extent to return keyfill scan results

    Convert the xfs_ialloc_has_inodes_at_extent function to return keyfill
    scan results because for a given range of inode numbers, we might have
    no indexed inodes at all; the entire region might be allocated ondisk
    inodes; or there might be a mix of the two.

    Unfortunately, sparse inodes add to the complexity, because each inode
    record can have holes, which means that we cannot use the generic btree
    _scan_keyfill function because we must look for holes in individual
    records to decide the result.  On the plus side, online fsck can now
    detect sub-chunk discrepancies in the inobt.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-05 16:56:23 -05:00
Bill O'Donnell 3ff63a6836 xfs: remove pointless shadow variable from xfs_difree_inobt
JIRA: https://issues.redhat.com/browse/RHEL-25419

commit cc1207662d1a08e253520654e956f5e699826caa
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Tue Apr 11 19:00:13 2023 -0700

    xfs: remove pointless shadow variable from xfs_difree_inobt

    In xfs_difree_inobt, the pag passed in was previously used to look up
    the AGI buffer.  There's no need to extract it again, so remove the
    shadow variable and shut up -Wshadow.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-05 16:56:22 -05:00
Bill O'Donnell 07572f330d xfs: hoist inode record alignment checks from scrub
JIRA: https://issues.redhat.com/browse/RHEL-25419

commit de1a9ce225e93b22d189f8ffbce20074bc803121
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Tue Apr 11 19:00:06 2023 -0700

    xfs: hoist inode record alignment checks from scrub

    Move the inobt record alignment checks from xchk_iallocbt_rec into
    xfs_inobt_check_irec so that they are applied everywhere.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-05 16:56:19 -05:00
Bill O'Donnell a5fbc725fb xfs: complain about bad records in query_range helpers
JIRA: https://issues.redhat.com/browse/RHEL-25419

commit ee12eaaa435a7be17152ac50943ee77249de624a
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Tue Apr 11 19:00:04 2023 -0700

    xfs: complain about bad records in query_range helpers

    For every btree type except for the bmbt, refactor the code that
    complains about bad records into a helper and make the ->query_range
    helpers call it so that corruptions found via that avenue are logged.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-05 16:56:18 -05:00
Bill O'Donnell ba7556be87 xfs: standardize ondisk to incore conversion for inode btrees
JIRA: https://issues.redhat.com/browse/RHEL-25419

commit 366a0b8d49c3a7edcb5331f254af195716ba4bdf
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Tue Apr 11 19:00:01 2023 -0700

    xfs: standardize ondisk to incore conversion for inode btrees

    Create a xfs_inobt_check_irec function to detect corruption in btree
    records.  Fix all xfs_inobt_btrec_to_irec callsites to call the new
    helper and bubble up corruption reports.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-05 16:56:17 -05:00
Bill O'Donnell f3a70babfe treewide: use get_random_u32_below() instead of deprecated function
JIRA: https://issues.redhat.com/browse/RHEL-36333

Conflicts: Use xfs, ext4 and ext2 hunks only. Diffs from upstream due
	   to previous out of order commits.

commit 8032bf1233a74627ce69b803608e650f3f35971c
Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date:   Sun Oct 9 20:44:02 2022 -0600

    treewide: use get_random_u32_below() instead of deprecated function

    This is a simple mechanical transformation done by:

    @@
    expression E;
    @@
    - prandom_u32_max
    + get_random_u32_below
      (E)

    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
    Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
    Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
    Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-05-22 13:04:03 -05:00
Bill O'Donnell 253e8028aa xfs: validate block number being freed before adding to xefi
JIRA: https://issues.redhat.com/browse/RHEL-2002

Conflicts: diffs in xfs_alloc.c due to out of order patch application

commit 7dfee17b13e5024c5c0ab1911859ded4182de3e5
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Jun 5 14:48:15 2023 +1000

    xfs: validate block number being freed before adding to xefi

    Bad things happen in deferred extent freeing operations if they are
    passed a bad block number in the xefi. This can come from a bogus
    agno/agbno pair from deferred agfl freeing, or just a bad fsbno
    being passed to __xfs_free_extent_later(). Either way, it's very
    difficult to diagnose where a null perag oops in EFI creation
    is coming from when the operation that queued the xefi has already
    been completed and there's no longer any trace of it around....

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:26 -06:00
Bill O'Donnell cc414a2f57 xfs: restore old agirotor behavior
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 6e2985c938e8b765b3de299c561d87f98330c546
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Fri Feb 17 15:44:25 2023 -0800

    xfs: restore old agirotor behavior

    Prior to the removal of xfs_ialloc_next_ag, we would increment the agi
    rotor and return the *old* value.  atomic_inc_return returns the new
    value, which causes mkfs to allocate the root directory in AG 1.  Put
    back the old behavior (at least for mkfs) by subtracting 1 here.

    Fixes: 20a5eab49d35 ("xfs: convert xfs_ialloc_next_ag() to an atomic")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:24 -06:00
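
The off-by-one that cc414a2f57 corrects can be reproduced in userspace with C11 atomics. This is a sketch only: `inc_return` is a hypothetical stand-in that mimics the "return the new value" behaviour the message attributes to atomic_inc_return, and `next_ag` is an illustrative name.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Increment and return the NEW value, as the commit describes. */
static uint32_t inc_return(atomic_uint *v)
{
    return atomic_fetch_add(v, 1) + 1;
}

/*
 * Pick the next AG from the rotor. Subtracting 1 recovers the OLD
 * rotor value, so the very first allocation lands in AG 0 again.
 */
static uint32_t next_ag(atomic_uint *rotor, uint32_t agcount)
{
    return (inc_return(rotor) - 1) % agcount;
}
```

Without the subtraction, the first call on a fresh rotor would return 1, which matches the "root directory in AG 1" symptom mentioned in the message.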
Bill O'Donnell c3a9573941 xfs: introduce xfs_alloc_vextent_exact_bno()
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 5f36b2ce79f254dd00cdc88374271df7ce843d56
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:54 2023 +1100

    xfs: introduce xfs_alloc_vextent_exact_bno()

    Two of the callers to xfs_alloc_vextent_this_ag() actually want
    exact block number allocation, not anywhere-in-ag allocation. Split
    this out from _this_ag() as a first class citizen so no external
    extent allocation code needs to care about args->type anymore.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:22 -06:00
Bill O'Donnell 8ed52fa327 xfs: introduce xfs_alloc_vextent_near_bno()
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit db4710fd12248e5d4c3842520cd13f034136576b
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:54 2023 +1100

    xfs: introduce xfs_alloc_vextent_near_bno()

    The remaining callers of xfs_alloc_vextent() are all doing NEAR_BNO
    allocations. We can replace that function with a new
    xfs_alloc_vextent_near_bno() function that does this explicitly.

    We also multiplex NEAR_BNO allocations through
    xfs_alloc_vextent_this_ag via args->type. Replace all of these with
    direct calls to xfs_alloc_vextent_near_bno(), too.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:22 -06:00
Bill O'Donnell f393aca22e xfs: use xfs_alloc_vextent_this_ag() where appropriate
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 74c36a8689d3d8ca9d9e96759c9bbf337e049097
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:53 2023 +1100

    xfs: use xfs_alloc_vextent_this_ag() where appropriate

    Change obvious callers of single AG allocation to use
    xfs_alloc_vextent_this_ag(). Drive the per-ag grabbing out to the
    callers, too, so that callers with active references don't need
    to do new lookups just for an allocation in a context that already
    has a perag reference.

    The only remaining caller that does single AG allocation through
    xfs_alloc_vextent() is xfs_bmap_btalloc() with
    XFS_ALLOCTYPE_NEAR_BNO. That is going to need more untangling before
    it can be converted cleanly.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:21 -06:00
Bill O'Donnell 52017e5a79 xfs: introduce xfs_for_each_perag_wrap()
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 76257a15873ccce817e0c4441f6bb66fb8f8201c
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:53 2023 +1100

    xfs: introduce xfs_for_each_perag_wrap()

    In several places we iterate every AG from a specific start agno and
    wrap back to the first AG when we reach the end of the filesystem to
    continue searching. We don't have a primitive for this iteration
    yet, so add one for conversion of these algorithms to per-ag based
    iteration.

    The filestream AG select code is a mess, and this initially makes it
    worse. The per-ag selection needs to be driven completely into the
    filestream code to clean this up and it will be done in a future
    patch that makes the filestream allocator use active per-ag
    references correctly.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:21 -06:00
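
The iteration pattern that 52017e5a79 turns into a primitive (start at a given agno, run to the last AG, wrap to AG 0, stop before revisiting the start) can be sketched as follows. `wrap_order` is a hypothetical helper that only records the visiting order; the real macro walks perag structures and manages references, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Record the order in which AGs are visited when starting at
 * start_agno and wrapping around once. order[] must have room for
 * agcount entries. Returns the number of AGs visited.
 */
static size_t wrap_order(uint32_t start_agno, uint32_t agcount,
                         uint32_t *order)
{
    size_t n = 0;

    assert(agcount > 0 && start_agno < agcount);
    for (uint32_t i = 0; i < agcount; i++)
        order[n++] = (uint32_t)(((uint64_t)start_agno + i) % agcount);
    return n;
}
```

Starting at AG 2 of 4, the order is 2, 3, 0, 1: every AG is visited exactly once.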
Bill O'Donnell 6ee6b421b0 xfs: perags need atomic operational state
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 7ac2ff8bb3713c7cb43564c04384af2ee7cc1f8d
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:52 2023 +1100

    xfs: perags need atomic operational state

    We currently don't have any flags or operational state in the
    xfs_perag except for the pagf_init and pagi_init flags. And the
    agflreset flag. Oh, there's also the pagf_metadata and pagi_inodeok
    flags, too.

    For controlling per-ag operations, we are going to need some atomic
    state flags. Hence add an opstate field similar to what we already
    have in the mount and log, and convert all these state flags across
    to atomic bit operations.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:21 -06:00
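
The idea in 6ee6b421b0 of folding separate flags into one atomically updated state word can be sketched with C11 atomics. The struct, the bit names, and the helpers below are all hypothetical; the kernel uses its own bit operations and its own flag names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers standing in for the per-AG state flags. */
enum { AG_AGF_INIT = 0, AG_AGI_INIT = 1, AG_AGFL_RESET = 2 };

struct ag_state {
    atomic_ulong opstate;   /* one word holding all state bits */
};

static bool ag_test(struct ag_state *s, int bit)
{
    return (atomic_load(&s->opstate) & (1UL << bit)) != 0;
}

static void ag_set(struct ag_state *s, int bit)
{
    atomic_fetch_or(&s->opstate, 1UL << bit);
}

/* Clear the bit and report whether it was set beforehand. */
static bool ag_test_and_clear(struct ag_state *s, int bit)
{
    unsigned long old = atomic_fetch_and(&s->opstate, ~(1UL << bit));

    return (old & (1UL << bit)) != 0;
}
```

Because each update is a single atomic read-modify-write, two threads changing different bits cannot lose each other's update, which plain bitfield or bool members do not guarantee.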
Bill O'Donnell fcd881af53 xfs: convert xfs_ialloc_next_ag() to an atomic
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 20a5eab49d354a2837e0af3f07f92a104de52804
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:52 2023 +1100

    xfs: convert xfs_ialloc_next_ag() to an atomic

    This is currently a spinlock protected rotor which can be
    implemented with a single atomic operation. Change it to be more
    efficient and get rid of the m_agirotor_lock. Noticed while
    converting the inode allocation AG selection loop to active perag
    references.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:21 -06:00
Bill O'Donnell e910090ed9 xfs: inobt can use perags in many more places than it does
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit bab8b795185bf37801a4f7ee5c321eee288c2f10
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:52 2023 +1100

    xfs: inobt can use perags in many more places than it does

    Lots of code in the inobt infrastructure is passed both xfs_mount
    and perags. We only need perags for the per-ag inode allocation
    code, so reduce the duplication by passing only the perags as the
    primary object.

    This ends up reducing the code size by a bit:

               text    data     bss     dec     hex filename
    orig    1138878  323979     548 1463405  16546d (TOTALS)
    patched 1138709  323979     548 1463236  1653c4 (TOTALS)

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:21 -06:00
Bill O'Donnell f427daad7a xfs: use active perag references for inode allocation
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit dedab3e4379d298ed60b6c52a15168807b48d57a
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:52 2023 +1100

    xfs: use active perag references for inode allocation

    Convert the inode allocation routines to use active perag references
    or references held by callers rather than grab their own. Also drive
    the perag further inwards to replace xfs_mounts when doing
    operations on a specific AG.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:21 -06:00
Bill O'Donnell e4f1581c61 xfs: convert xfs_imap() to take a perag
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 498f0adbcdb6a68403bfb9645a7555b789a7fee4
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:52 2023 +1100

    xfs: convert xfs_imap() to take a perag

    Callers have referenced perags but they don't pass it into
    xfs_imap() so it takes its own reference. Fix that so we can change
    inode allocation over to using active references.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:21 -06:00
Bill O'Donnell 059410c486 xfs: prefer free inodes at ENOSPC over chunk allocation
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit f08f984c63e9980614ae3a0a574b31eaaef284b2
Author: Dave Chinner <dchinner@redhat.com>
Date:   Sat Feb 11 04:08:06 2023 +1100

    xfs: prefer free inodes at ENOSPC over chunk allocation

    When an XFS filesystem has free inodes in chunks already allocated
    on disk, it will still allocate new inode chunks if the target AG
    has no free inodes in it. Normally, this is a good idea as it
    preserves locality of all the inodes in a given directory.

    However, at ENOSPC this can lead to using the last few remaining
    free filesystem blocks to allocate a new chunk when there are many,
    many free inodes that could be allocated without consuming free
    space. This results in speeding up the consumption of the last few
    blocks and inode create operations then returning ENOSPC when there
    are free inodes available because we don't have enough blocks left in
    the filesystem for directory creation reservations to proceed.

    Hence when we are near ENOSPC, we should be attempting to preserve
    the remaining blocks for directory block allocation rather than
    using them for unnecessary inode chunk creation.

    This particular behaviour is exposed by xfs/294, when it drives to
    ENOSPC on empty file creation whilst there are still thousands of
    free inodes available for allocation in other AGs in the filesystem.

    Hence, when we are within 1% of ENOSPC, change the inode allocation
    behaviour to prefer to use existing free inodes over allocating new
    inode chunks, even though it results in poorer locality of the data
    set. It is more important for the allocations to be space efficient
    near ENOSPC than to have optimal locality for performance, so let's
    modify the inode AG selection code to reflect that fact.

    This allows generic/294 to not only pass with this allocator rework
    patchset, but to increase the number of post-ENOSPC empty inode
    allocations from ~600 to ~9080 before we hit ENOSPC on the
    directory create transaction reservation.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:20 -06:00
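
The policy change in 059410c486 can be summarised by a small decision function. This is a rough sketch under stated assumptions: the names are hypothetical, "within 1% of ENOSPC" is modelled as free blocks being below one hundredth of the total, and the real AG selection code weighs more factors than shown here.

```c
#include <stdbool.h>
#include <stdint.h>

enum inode_source { FROM_TARGET_AG, FROM_OTHER_AG, NEW_CHUNK };

/* Assumed definition of "near ENOSPC": less than 1% of blocks free. */
static bool near_enospc(uint64_t free_blocks, uint64_t total_blocks)
{
    return free_blocks < total_blocks / 100;
}

/*
 * Normally a new chunk is allocated when the target AG has no free
 * inodes, which keeps locality. Near ENOSPC, free inodes in other AGs
 * are preferred so the remaining blocks are not spent on inode chunks.
 */
static enum inode_source pick_inode_source(uint64_t free_blocks,
                                           uint64_t total_blocks,
                                           uint64_t target_free_inodes,
                                           uint64_t other_free_inodes)
{
    if (target_free_inodes > 0)
        return FROM_TARGET_AG;
    if (near_enospc(free_blocks, total_blocks) && other_free_inodes > 0)
        return FROM_OTHER_AG;
    return NEW_CHUNK;
}
```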
Chris von Recklinghausen 1f619343f6 treewide: use get_random_u32() when possible
Conflicts:
	drivers/gpu/drm/tests/drm_buddy_test.c
	drivers/gpu/drm/tests/drm_mm_test.c - We already have
		ce28ab1380e8 ("drm/tests: Add back seed value information")
		so keep calls to kunit_info.
	drop changes to drivers/misc/habanalabs/gaudi2/gaudi2.c
		fs/ntfs3/fslog.c - files not in CS9
	net/sunrpc/auth_gss/gss_krb5_wrap.c - We already have
		7f675ca7757b ("SUNRPC: Improve Kerberos confounder generation")
		so code to change is gone.
	drivers/gpu/drm/i915/i915_gem_gtt.c
	drivers/gpu/drm/i915/selftests/i915_selftest.c
	drivers/gpu/drm/tests/drm_buddy_test.c
	drivers/gpu/drm/tests/drm_mm_test.c
		change added under
		4cb818386e ("Merge DRM changes from upstream v6.0.8..v6.1")

JIRA: https://issues.redhat.com/browse/RHEL-1848

commit a251c17aa558d8e3128a528af5cf8b9d7caae4fd
Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date:   Wed Oct 5 17:43:22 2022 +0200

    treewide: use get_random_u32() when possible

    The prandom_u32() function has been a deprecated inline wrapper around
    get_random_u32() for several releases now, and compiles down to the
    exact same code. Replace the deprecated wrapper with a direct call to
    the real function. The same also applies to get_random_int(), which is
    just a wrapper around get_random_u32(). This was done as a basic find
    and replace.

    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Yury Norov <yury.norov@gmail.com>
    Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
    Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
    Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
    Acked-by: Jakub Kicinski <kuba@kernel.org>
    Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
    Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
    Acked-by: Helge Deller <deller@gmx.de> # for parisc
    Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2023-10-20 06:15:03 -04:00
Bill O'Donnell f8121a9b84 xfs: make is_log_ag() a first class helper
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 36029dee382a20cf515494376ce9f0d5949944eb
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:13:21 2022 +1000

    xfs: make is_log_ag() a first class helper

    We check if an ag contains the log in many places, so make this
    a first class XFS helper by lifting it to fs/xfs/libxfs/xfs_ag.h and
    renaming it xfs_ag_contains_log(). Then convert all the places that
    check if the AG contains the log to use this helper.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:41 -05:00
Bill O'Donnell f8c627a70a xfs: Pre-calculate per-AG agino geometry
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 2d6ca8321c354e1cb6f6b1963c4f7bd053d2e272
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:13:10 2022 +1000

    xfs: Pre-calculate per-AG agino geometry

    There is a lot of overhead in functions like xfs_verify_agino() that
    repeatedly calculate the geometry limits of an AG. These can be
    pre-calculated as they are static and the verification context has
    a per-ag context it can quickly reference.

    In the case of xfs_verify_agino(), we now always have a perag
    context handy, so we can store the minimum and maximum agino values
    in the AG in the perag. This means we don't have to calculate
    it on every call and it can be inlined in callers if we move it
    to xfs_ag.h.

    xfs_verify_agino_or_null() gets the same perag treatment.

    xfs_agino_range() is moved to xfs_ag.c as it's not really a type
    function, and its use is largely restricted as the first and last
    aginos can be grabbed straight from the perag in most cases.

    Note that we leave the original xfs_verify_agino in place in
    xfs_types.c as a static function as other callers in that file do
    not have per-ag contexts so still need to go the long way. It's been
    renamed to xfs_verify_agno_agino() to indicate it takes both an agno
    and an agino to differentiate it from the new function.

    $ size --totals fs/xfs/built-in.a
               text    data     bss     dec     hex filename
    before  1482185  329588     572 1812345  1ba779 (TOTALS)
    after   1481937  329588     572 1812097  1ba681 (TOTALS)

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:41 -05:00
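
The approach in f8c627a70a, computing the valid agino range once and caching it per AG so verification becomes two comparisons, can be sketched as below. `struct perag_geom` and both helper names are hypothetical simplifications; the all-ones null value mirrors XFS's NULLAGINO convention, and how the cached limits get computed is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define NULL_AGINO ((uint32_t)-1)   /* "no inode" sentinel */

/* Hypothetical per-AG cache of the limits, filled in once at setup. */
struct perag_geom {
    uint32_t agino_min;   /* first valid AG-relative inode number */
    uint32_t agino_max;   /* last valid AG-relative inode number */
};

/* Cheap enough to inline: no geometry is recomputed per call. */
static inline bool verify_agino(const struct perag_geom *p, uint32_t agino)
{
    return agino >= p->agino_min && agino <= p->agino_max;
}

static inline bool verify_agino_or_null(const struct perag_geom *p,
                                        uint32_t agino)
{
    return agino == NULL_AGINO || verify_agino(p, agino);
}
```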
Bill O'Donnell 7048b0f4f2 xfs: pass perag to xfs_read_agi
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

Conflicts: fix one line error due to out of order rhel patch 13e2b274

commit 61021deb1faa5b2b913bf0ad76e2769276160b04
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:07:47 2022 +1000

    xfs: pass perag to xfs_read_agi

    We have the perag in most places we call xfs_read_agi, so pass the
    perag instead of a mount/agno pair.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:39 -05:00
Bill O'Donnell d1a077edc7 xfs: pass perag to xfs_alloc_read_agf()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 08d3e84feeb8cb8e20d54f659446b98fe17913aa
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:07:40 2022 +1000

    xfs: pass perag to xfs_alloc_read_agf()

    xfs_alloc_read_agf() initialises the perag if it hasn't been done
    yet, so it makes sense to pass it the perag rather than pull a
    reference from the buffer. This allows callers to be per-ag centric
    rather than passing mount/agno pairs everywhere.

    Whilst modifying the xfs_reflink_find_shared() function definition,
    declare it static and remove the extern declaration as it is an
    internal function only these days.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:39 -05:00
Bill O'Donnell 51f3b2d709 xfs: kill xfs_alloc_pagf_init()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 76b47e528e3a27a3bf3b3f9153aad9435e03be8c
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:07:32 2022 +1000

    xfs: kill xfs_alloc_pagf_init()

    Trivial wrapper around xfs_alloc_read_agf(), can be easily replaced
    by passing a NULL agfbp to xfs_alloc_read_agf().

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:39 -05:00
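The "pass a NULL buffer pointer instead of keeping a wrapper" idea from the commit above can be sketched as follows. All names (`toy_pag`, `toy_buf`, `toy_read_header()`) are hypothetical and the buffer handling is simplified; a real implementation would presumably also release the buffer when the caller does not want it.

```c
#include <stddef.h>

struct toy_buf { int data; };    /* hypothetical buffer */

struct toy_pag {
    int init;                    /* nonzero once cached state is valid */
    int cached;                  /* value copied out of the buffer */
    struct toy_buf buf;          /* stands in for the on-disk header */
};

/*
 * Read the (toy) header and fill in the cached per-AG state on first use.
 * If bpp is NULL the caller only wanted the initialisation side effect,
 * so no buffer is handed back and no separate "init" wrapper is needed.
 */
static int toy_read_header(struct toy_pag *pag, struct toy_buf **bpp)
{
    if (!pag->init) {
        pag->cached = pag->buf.data;
        pag->init = 1;
    }
    if (bpp)
        *bpp = &pag->buf;
    return 0;
}
```

A caller that previously used a dedicated init wrapper would call `toy_read_header(pag, NULL)` instead.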
Bill O'Donnell 75fbbfd667 xfs: pass perag to xfs_ialloc_read_agi()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 99b13c7f0bd35dd3cf2cacb61beb4557dc2b6f9b
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:07:24 2022 +1000

    xfs: pass perag to xfs_ialloc_read_agi()

    xfs_ialloc_read_agi() initialises the perag if it hasn't been done
    yet, so it makes sense to pass it the perag rather than pull a
    reference from the buffer. This allows callers to be per-ag centric
    rather than passing mount/agno pairs everywhere.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:38 -05:00
Bill O'Donnell 7dfdc345e3 xfs: kill xfs_ialloc_pagi_init()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit a95fee40e3d433d8fabff7c02e75f7c2c2e54400
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:07:16 2022 +1000

    xfs: kill xfs_ialloc_pagi_init()

    This is just a basic wrapper around xfs_ialloc_read_agi(), which can
    be entirely handled by xfs_ialloc_read_agi() by passing a NULL
    agibpp....

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:38 -05:00
Bill O'Donnell dce8e8ae28 xfs: convert AGI log flags to unsigned.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 0d1b97696696871dc42dfc59d527a0b68b1a1209
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Apr 21 10:46:24 2022 +1000

    xfs: convert AGI log flags to unsigned.

    5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
    fields to be unsigned.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:06 -05:00
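As a rough illustration of the signedness concern in the commit above, the sketch below defines flag bits as unsigned constants and stores them in an unsigned field. The flag names are invented for this example and are not the kernel's AGI log flags.

```c
#include <stdint.h>

/* Hypothetical flag names; writing the bits as 1u << n keeps them unsigned. */
#define TOY_LOG_MAGIC  (1u << 0)
#define TOY_LOG_COUNT  (1u << 1)
#define TOY_LOG_TOP    (1u << 31)  /* 1 << 31 is undefined for a 32-bit signed int */

/* Combine two flag sets held in unsigned fields. */
static uint32_t toy_merge_flags(uint32_t fields, uint32_t extra)
{
    return fields | extra;
}

/* Test whether a flag bit is set. */
static int toy_flag_set(uint32_t fields, uint32_t flag)
{
    return (fields & flag) != 0;
}
```

Keeping both the constants and the storage unsigned avoids implicit signed/unsigned conversions when the flags are combined or compared.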
Bill O'Donnell 4e0101a18b xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 9b7d16e34bbebc0398b1dd4f2d64ae6793fdc5ea
Author: Chandan Babu R <chandan.babu@oracle.com>
Date:   Tue Nov 16 09:04:43 2021 +0000

    xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers

    This commit adds the new per-inode flag XFS_DIFLAG2_NREXT64 to indicate that
    an inode supports 64-bit extent counters. This flag is also enabled by default
    on newly created inodes when the corresponding filesystem has the large extent
    counter feature bit (i.e. XFS_FEAT_NREXT64) set.

    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:10:57 -05:00
Carlos Maiolino 93fff0d397 xfs: rename xfs_bmap_add_free to xfs_free_extent_later
Bugzilla: https://bugzilla.redhat.com/2125724

xfs_bmap_add_free isn't a block mapping function; it schedules deferred
freeing operations for a later point in a compound transaction chain.
While it's primarily used by bunmapi, its use has expanded beyond that.
Move it to xfs_alloc.c and rename the function since it's now general
freeing functionality.  Bring the slab cache bits in line with the
way we handle the other intent items.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
(cherry picked from commit c201d9ca5392b20f04882848a071025b0e194c17)
2022-10-21 12:50:46 +02:00
Carlos Maiolino f164f431d6 xfs: compute absolute maximum nlevels for each btree type
Bugzilla: https://bugzilla.redhat.com/2125724

Add code for all five btree types so that we can compute the absolute
maximum possible btree height for each btree type.  This is a setup for
the next patch, which makes every btree type have its own cursor cache.

The functions are exported so that we can have xfs_db report the
absolute maximum btree heights for each btree type, rather than making
everyone run their own ad-hoc computations.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
(cherry picked from commit 0ed5f7356daee74244b02e100b3cc043e886e686)
2022-10-21 12:50:46 +02:00
Brian Foster dbdceb8f65 xfs: kill xfs_sb_version_has_v3inode()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit cf28e17c9186c83e7e8702f844bc40b6e782ce6c
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:57 2021 -0700

    xfs: kill xfs_sb_version_has_v3inode()

    All callers to xfs_dinode_good_version() and XFS_DINODE_SIZE() in
    both the kernel and userspace have a xfs_mount structure available
    which means they can use mount feature checks instead of looking
    directly at the superblock.

    Convert these functions to take a mount and use a xfs_has_v3inodes()
    check and move it out of the libxfs/xfs_format.h file as it really
    doesn't have anything to do with the definition of the on-disk
    format.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:36 -04:00
Brian Foster d3185fdb89 xfs: convert xfs_sb_version_has checks to use mount features
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit ebd9027d088b3a4e49d294f79e6cadb7b7a88b28
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:55 2021 -0700

    xfs: convert xfs_sb_version_has checks to use mount features

    This is a conversion of the remaining xfs_sb_version_has..(sbp)
    checks to use xfs_has_..(mp) feature checks.

    This was largely done with a vim replacement macro that did:

    :0,$s/xfs_sb_version_has\(.*\)&\(.*\)->m_sb/xfs_has_\1\2/g<CR>

    A couple of other variants were also used, and the rest touched up
    by hand.

    $ size -t fs/xfs/built-in.a
               text    data     bss     dec     hex filename
    before  1127533  311352     484 1439369  15f689 (TOTALS)
    after   1125360  311352     484 1437196  15ee0c (TOTALS)

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:35 -04:00
Brian Foster d179379de4 xfs: replace XFS_FORCED_SHUTDOWN with xfs_is_shutdown
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit 75c8c50fa16a23f8ac89ea74834ae8ddd1558d75
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:53 2021 -0700

    xfs: replace XFS_FORCED_SHUTDOWN with xfs_is_shutdown

    Remove the shouty macro and instead use the inline function that
    matches other state/feature check wrapper naming. This conversion
    was done with sed.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:34 -04:00
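The difference between a "shouty" macro and an inline state check, as described in the commit above, might look like this in a reduced form. `toy_mount`, `TOY_OPSTATE_SHUTDOWN`, `TOY_FORCED_SHUTDOWN()` and `toy_is_shutdown()` are hypothetical names chosen for the sketch, not the kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_OPSTATE_SHUTDOWN (1u << 0)   /* hypothetical state bit */

struct toy_mount {
    uint32_t opstate;                    /* operational state bits */
};

/* Old style: an upper-case macro that reads the field directly. */
#define TOY_FORCED_SHUTDOWN(mp) ((mp)->opstate & TOY_OPSTATE_SHUTDOWN)

/* New style: a type-checked inline helper with a lower-case name. */
static inline bool toy_is_shutdown(const struct toy_mount *mp)
{
    return (mp->opstate & TOY_OPSTATE_SHUTDOWN) != 0;
}
```

The inline function gives the compiler an argument type to check and returns a proper boolean, while call sites read the same as other state and feature helpers.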
Brian Foster 6def1029c3 xfs: convert mount flags to features
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git
Conflicts: Work around out of order backport in xfs_fs_fill_super().

commit 0560f31a09e523090d1ab2bfe21c69d028c2bdf2
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:52 2021 -0700

    xfs: convert mount flags to features

    Replace m_flags feature checks with xfs_has_<feature>() calls and
    rework the setup code to set flags in m_features.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:34 -04:00
Brian Foster d54a790d1d xfs: replace xfs_sb_version checks with feature flag checks
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit 38c26bfd90e1999650d5ef40f90d721f05916643
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:37 2021 -0700

    xfs: replace xfs_sb_version checks with feature flag checks

    Convert the xfs_sb_version_hasfoo() to checks against
    mp->m_features. Checks of the superblock itself during disk
    operations (e.g. in the read/write verifiers and the to/from disk
    formatters) are not converted - they operate purely on the
    superblock state. Everything else should use the mount features.

    Large parts of this conversion were done with sed with commands like
    this:

    for f in `git grep -l xfs_sb_version_has fs/xfs/*.c`; do
            sed -i -e 's/xfs_sb_version_has\(.*\)(&\(.*\)->m_sb)/xfs_has_\1(\2)/' $f
    done

    With manual cleanups for things like "xfs_has_extflgbit" and other
    little inconsistencies in naming.

    The result is a lot less typing to check features and an XFS binary
    size reduced by a bit over 3kB:

    $ size -t fs/xfs/built-in.a
               text    data     bss     dec     hex filename
    before  1130866  311352     484 1442702  16038e (TOTALS)
    after   1127727  311352     484 1439563  15f74b (TOTALS)

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:34 -04:00
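The conversion described in the commit above replaces checks against the superblock with checks against a feature mask held in the mount. A minimal sketch of that idea, assuming invented names throughout (`TOY_SB_*`, `TOY_FEAT_*`, `toy_setup_features()`, `toy_has_*()`), could be:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical on-disk superblock bits. */
#define TOY_SB_REFLINK     0x1u
#define TOY_SB_SPINODES    0x2u

/* Hypothetical in-memory feature bits kept in the mount. */
#define TOY_FEAT_REFLINK   (1ull << 0)
#define TOY_FEAT_SPINODES  (1ull << 1)

struct toy_mount {
    uint64_t features;       /* filled in once at mount time */
};

/* Generate toy_has_<name>() helpers that test a single feature bit. */
#define TOY_HAS_FEAT(name, NAME) \
static inline bool toy_has_ ## name(const struct toy_mount *mp) \
{ \
    return (mp->features & TOY_FEAT_ ## NAME) != 0; \
}

TOY_HAS_FEAT(reflink, REFLINK)
TOY_HAS_FEAT(spinodes, SPINODES)

/* Translate superblock bits into the feature mask once, during setup. */
static void toy_setup_features(struct toy_mount *mp, uint32_t sb_bits)
{
    if (sb_bits & TOY_SB_REFLINK)
        mp->features |= TOY_FEAT_REFLINK;
    if (sb_bits & TOY_SB_SPINODES)
        mp->features |= TOY_FEAT_SPINODES;
}
```

After setup, ordinary code only calls `toy_has_reflink(mp)` and similar helpers; code that parses or verifies the on-disk superblock would keep looking at the superblock bits themselves.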
Brian Foster af9b90b846 xfs: make the record pointer passed to query_range functions const
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit 159eb69dba8baf6d5b58b69936920fb311324c82
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Tue Aug 10 17:02:16 2021 -0700

    xfs: make the record pointer passed to query_range functions const

    The query_range functions are supposed to call a caller-supplied
    function on each record found in the dataset.  These functions don't
    own the memory storing the record, so don't let them change the record.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:29 -04:00
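The ownership rule in the commit above (the callback does not own the record, so it gets a const pointer) can be sketched with a toy range query. `toy_rec`, `toy_query_fn`, `toy_query_range()` and `toy_sum_len()` are hypothetical; a flat array stands in for the btree.

```c
#include <stddef.h>

struct toy_rec {
    unsigned start;
    unsigned len;
};

/* The callback receives a const record: it may read it but not modify it. */
typedef int (*toy_query_fn)(const struct toy_rec *rec, void *priv);

/*
 * Call fn on every record whose start lies in [low, high].
 * A nonzero return from fn stops the walk and is passed back to the caller.
 */
static int toy_query_range(const struct toy_rec *recs, size_t nr,
                           unsigned low, unsigned high,
                           toy_query_fn fn, void *priv)
{
    for (size_t i = 0; i < nr; i++) {
        if (recs[i].start < low || recs[i].start > high)
            continue;
        int error = fn(&recs[i], priv);
        if (error)
            return error;
    }
    return 0;
}

/* Example callback: accumulate record lengths into *priv. */
static int toy_sum_len(const struct toy_rec *rec, void *priv)
{
    *(unsigned *)priv += rec->len;
    return 0;
}
```

Any attempt to write through `rec` inside a callback now fails at compile time, which is the point of the const qualifier.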
Brian Foster c908a48ac9 xfs: fix silly whitespace problems with kernel libxfs
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit b7df7630cccd103671b14b946bcdb3b14be75d68
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Fri Aug 6 11:05:44 2021 -0700

    xfs: fix silly whitespace problems with kernel libxfs

    Fix a few whitespace errors such as spaces at the end of the line, etc.
    This gets us back to something more closely resembling parity.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:23 -04:00
Darrick J. Wong da062d16a8 xfs: check for sparse inode clusters that cross new EOAG when shrinking
While running xfs/168, I noticed occasional write verifier shutdowns
involving inodes at the very end of the filesystem.  Existing inode
btree validation code checks that all inode clusters are fully contained
within the filesystem.

However, due to inadequate checking in the fs shrink code, it's possible
that there could be a sparse inode cluster at the end of the filesystem
where the upper inodes of the cluster are marked as holes and the
corresponding blocks are free.  In this case, the last blocks in the AG
are listed in the bnobt.  This enables the shrink to proceed but results
in a filesystem that trips the inode verifiers.  Fix this by disallowing
the shrink.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-07-15 09:58:41 -07:00
Dave Chinner 90e2c1c20a xfs: perag may be null in xfs_imap()
Dan Carpenter's static checker reported:

The patch 7b13c5155182: "xfs: use perag for ialloc btree cursors"
from Jun 2, 2021, leads to the following Smatch complaint:

    fs/xfs/libxfs/xfs_ialloc.c:2403 xfs_imap()
    error: we previously assumed 'pag' could be null (see line 2294)

And it's right. Fix it.

Fixes: 7b13c51551 ("xfs: use perag for ialloc btree cursors")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2021-06-18 08:14:20 -07:00
Dave Chinner 9ba0889e22 xfs: drop the AGI being passed to xfs_check_agi_freecount
From: Dave Chinner <dchinner@redhat.com>

Stephen Rothwell reported this compiler warning from linux-next:

fs/xfs/libxfs/xfs_ialloc.c: In function 'xfs_difree_finobt':
fs/xfs/libxfs/xfs_ialloc.c:2032:20: warning: unused variable 'agi' [-Wunused-variable]
 2032 |  struct xfs_agi   *agi = agbp->b_addr;

Which is fallout from agno -> perag conversions that were done in
this function. xfs_check_agi_freecount() is the only user of "agi"
in xfs_difree_finobt() now, and it only uses the agi to get the
current free inode count. We hold that in the perag structure, so
there's no need to directly reference the raw AGI to get this
information.

The btree cursor being passed to xfs_check_agi_freecount() has a
reference to the perag being operated on, so use that directly in
xfs_check_agi_freecount() rather than passing an AGI.

Fixes: 7b13c51551 ("xfs: use perag for ialloc btree cursors")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-06-08 09:19:22 -07:00
Dave Chinner f40aadb2bb xfs: use perag through unlink processing
Unlinked lists are held in the perag, and freeing of inodes needs to
be passed a perag, too, so look up the perag early in the unlink
processing and use it throughout.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-06-02 10:48:51 +10:00
Dave Chinner 8237fbf53d xfs: clean up and simplify xfs_dialloc()
Because it's a mess.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner 309161f660 xfs: inode allocation can use a single perag instance
Now that we've internalised the two-phase inode allocation, we can
now easily make the AG selection and allocation atomic from the
perspective of a single perag context. This will ensure AGs going
offline/away cannot occur between the selection and allocation
steps.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner b652afd937 xfs: get rid of xfs_dir_ialloc()
This is just a simple wrapper around the per-ag inode allocation
that doesn't need to exist. The internal mechanism to select and
allocate within an AG does not need to be exposed outside
xfs_ialloc.c, and it being exposed simply makes it harder to follow
the code and simplify it.

This is simplified by internalising xfs_dialloc_select_ag() and
xfs_dialloc_ag() into a single xfs_dialloc() function and then
xfs_dir_ialloc() can go away.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00