209 Commits

Author SHA1 Message Date
Patrick Talbert fdb3eab93b Merge: XFS: Update #3 for RHEL9.6 (upstream v6.7-6.8)
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5785

XFS: update #3 for RHEL9.6. Backport upstream v6.7-6.8, including fix patches merged after v6.8.

JIRA: https://issues.redhat.com/browse/RHEL-65728

Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5633

Omitted-fix:  a18a69bbec083 ("xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock")

Missing several dependency patches that merged upstream in Aug 2024, well beyond this update
(e.g. 7996f10ce6c xfs: factor out a xfs_growfs_rt_bmblock helper).

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>

Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: Brian Foster <bfoster@redhat.com>
Approved-by: Eric Sandeen <esandeen@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Patrick Talbert <ptalbert@redhat.com>
2024-12-30 07:30:09 -05:00
Rado Vrbovsky a2a1cd0e2a Merge: xfs, iomap: unshare range fixes
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5731

JIRA: https://issues.redhat.com/browse/RHEL-64959
Tested: via fstests.

Signed-off-by: Brian Foster <bfoster@redhat.com>

Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Bill O'Donnell <bodonnel@redhat.com>
Approved-by: Andrey Albershteyn <aalbersh@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-28 20:18:58 +00:00
Bill O'Donnell 059e57c35e xfs: remove __xfs_free_extent_later
JIRA: https://issues.redhat.com/browse/RHEL-65728

commit 4c88fef3af4a51c2cdba6a28237e98da4873e8dc
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Wed Dec 6 18:40:57 2023 -0800

    xfs: remove __xfs_free_extent_later

    xfs_free_extent_later is a trivial helper, so remove it to reduce the
    amount of thinking required to understand the deferred freeing
    interface.  This will make it easier to introduce automatic reaping of
    speculative allocations in the next patch.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-20 11:25:54 -06:00
Brian Foster 5a43e91669 xfs: don't free cowblocks from under dirty pagecache on unshare
JIRA: https://issues.redhat.com/browse/RHEL-64959

commit 4390f019ad7866c3791c3d768d2ff185d89e8ebe
Author: Brian Foster <bfoster@redhat.com>
Date:   Fri Sep 6 07:40:51 2024 -0400

    xfs: don't free cowblocks from under dirty pagecache on unshare

    fallocate unshare mode explicitly breaks extent sharing. When a
    command completes, it checks the data fork for any remaining shared
    extents to determine whether the reflink inode flag and COW fork
    preallocation can be removed. This logic doesn't consider in-core
    pagecache and I/O state, however, which means we can unsafely remove
    COW fork blocks that are still needed under certain conditions.

    For example, consider the following command sequence:

    xfs_io -fc "pwrite 0 1k" -c "reflink <file> 0 256k 1k" \
            -c "pwrite 0 32k" -c "funshare 0 1k" <file>

    This allocates a data block at offset 0, shares it, and then
    overwrites it with a larger buffered write. The overwrite triggers
    COW fork preallocation, 32 blocks by default, which maps the entire
    32k write to delalloc in the COW fork. All blocks except the shared
    block at offset 0 remain hole mapped in the data fork. The unshare
    command redirties and flushes the folio at offset 0, removing the only
    shared extent from the inode. Since the inode no longer maps shared
    extents, unshare purges the COW fork before the remaining 28k has
    necessarily been written back.

    This leaves dirty pagecache backed by holes, which writeback quietly
    skips, thus leaving clean, non-zeroed pagecache over holes in the
    file. To verify, fiemap shows holes in the first 32k of the file and
    reads return different data across a remount:

    $ xfs_io -c "fiemap -v" <file>
    <file>:
     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
       ...
       1: [8..511]:        hole               504
       ...
    $ xfs_io -c "pread -v 4k 8" <file>
    00001000:  cd cd cd cd cd cd cd cd  ........
    $ umount <mnt>; mount <dev> <mnt>
    $ xfs_io -c "pread -v 4k 8" <file>
    00001000:  00 00 00 00 00 00 00 00  ........

    To avoid this problem, make unshare follow the same rules used for
    background cowblock scanning and never purge the COW fork for inodes
    with dirty pagecache or in-flight I/O.

    Fixes: 46afb0628b ("xfs: only flush the unshared range in xfs_reflink_unshare")
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Carlos Maiolino <cem@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2024-11-12 09:52:39 -05:00
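
The rule stated at the end of the commit message above (5a43e91669) can be modeled in a few lines. This is a user-space sketch only, not the kernel change: `InodeState` and `can_purge_cow_fork` are hypothetical names, and the three booleans stand in for state that the kernel derives from the data fork, the pagecache, and writeback tracking.

```python
from dataclasses import dataclass


@dataclass
class InodeState:
    """Hypothetical summary of the per-inode state the unshare path looks at."""
    has_shared_extents: bool    # data fork still maps shared extents
    has_dirty_pagecache: bool   # dirty folios not yet written back
    has_io_in_flight: bool      # writeback or direct I/O still running


def can_purge_cow_fork(inode: InodeState) -> bool:
    """Return True only when dropping COW fork blocks cannot lose data."""
    if inode.has_shared_extents:
        # The reflink flag and COW preallocation are still needed.
        return False
    # Dirty or in-flight data may still be backed by the COW fork.
    return not (inode.has_dirty_pagecache or inode.has_io_in_flight)
```

In the command sequence from the commit message, the inode has no shared extents left after `funshare`, but the 28k beyond the first block is still dirty, so this predicate would refuse the purge.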
Bill O'Donnell 33b2eacbef xfs: only remap the written blocks in xfs_reflink_end_cow_extent
JIRA: https://issues.redhat.com/browse/RHEL-62760

commit 55f669f34184ecb25b8353f29c7f6f1ae5b313d1
Author: Christoph Hellwig <hch@lst.de>
Date:   Mon Oct 16 17:28:52 2023 +0200

    xfs: only remap the written blocks in xfs_reflink_end_cow_extent

    xfs_reflink_end_cow_extent looks up the COW extent and the data fork
    extent at offset_fsb, and then proceeds to remap the common subset
    between the two.

    It does however not limit the remapped extent to the passed in
    [*offset_fsb, end_fsb] range and thus potentially remaps more blocks than
    the ones handled by the current I/O completion.  This means that with
    sufficiently large data and COW extents we could be remapping COW fork
    mappings that have not been written to, leading to a stale data exposure
    on a powerfail event.

    We used to have a xfs_trim_range to make the remap fit the I/O completion
    range, but that got (apparently accidentally) removed in commit
    df2fd88f8ac7 ("xfs: rewrite xfs_reflink_end_cow to use intents").

    Note that I've only found this by code inspection, and a test case would
    probably require very specific delay and error injection.

    Fixes: df2fd88f8ac7 ("xfs: rewrite xfs_reflink_end_cow to use intents")
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-09 10:06:45 -06:00
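
The fix described in the commit above (33b2eacbef) amounts to intersecting three ranges: the data fork extent, the COW fork extent, and the range covered by the current I/O completion. The sketch below illustrates that arithmetic in user space. `clamp_remap_range` is a hypothetical helper, extents are modeled as `(start, length)` tuples in filesystem blocks, and treating `end_fsb` as exclusive is an assumption.

```python
def clamp_remap_range(data_ext, cow_ext, offset_fsb, end_fsb):
    """Return the (start, length) that is safe to remap, or None.

    data_ext and cow_ext are (start, length) tuples in filesystem blocks.
    Only blocks inside [offset_fsb, end_fsb) were written by this I/O.
    """
    data_start, data_len = data_ext
    cow_start, cow_len = cow_ext

    # The common subset of both extents, limited to the completed I/O.
    start = max(data_start, cow_start, offset_fsb)
    end = min(data_start + data_len, cow_start + cow_len, end_fsb)

    if end <= start:
        return None
    return (start, end - start)
```

With two 100-block extents starting at block 0 and a completion covering blocks 10 to 19, only those ten blocks are remapped; the rest of the COW extent has not been written and stays where it is.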
Bill O'Donnell 7024e5c267 xfs: allow read IO and FICLONE to run concurrently
JIRA: https://issues.redhat.com/browse/RHEL-62760

commit 14a537983b228cb050ceca3a5b743d01315dc4aa
Author: Catherine Hoang <catherine.hoang@oracle.com>
Date:   Tue Oct 17 13:12:08 2023 -0700

    xfs: allow read IO and FICLONE to run concurrently

    One of our VM cluster management products needs to snapshot KVM image
    files so that they can be restored in case of failure. Snapshotting is
    done by redirecting VM disk writes to a sidecar file and using reflink
    on the disk image, specifically the FICLONE ioctl as used by
    "cp --reflink". Reflink locks the source and destination files while it
    operates, which means that reads from the main VM disk image are blocked,
    causing the VM to stall. When an image file is heavily fragmented, the
    copy process could take several minutes. Some of the VM image files have
    50-100 million extent records, and duplicating that much metadata locks
    the file for 30 minutes or more. Having activities suspended for such
    a long time in a cluster node could result in node eviction.

    Clone operations and read IO do not change any data in the source file,
    so they should be able to run concurrently. Demote the exclusive locks
    taken by FICLONE to shared locks to allow reads while cloning. While a
    clone is in progress, writes will take the IOLOCK_EXCL, so they block
    until the clone completes.

    Link: https://lore.kernel.org/linux-xfs/8911B94D-DD29-4D6E-B5BC-32EAF1866245@oracle.com/
    Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-11-09 10:06:44 -06:00
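
The locking change in the commit above (7024e5c267) follows ordinary shared/exclusive lock semantics. The sketch below is a simplified model, not kernel code: `can_grant` is a hypothetical function and the lock is reduced to a list of modes currently held by other tasks.

```python
SHARED = "shared"
EXCL = "excl"


def can_grant(held, requested):
    """Return True if `requested` can be granted given the modes in `held`."""
    if requested == SHARED:
        # Shared holders coexist; only an exclusive holder blocks them.
        return EXCL not in held
    # An exclusive request needs the lock to be completely free.
    return len(held) == 0
```

Before the change, a clone holding the lock exclusively makes `can_grant([EXCL], SHARED)` false, so readers stall. After the change the clone holds it shared: reads are granted, while a writer asking for `EXCL` still waits until the clone completes.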
Bill O'Donnell 53919fd386 xfs: use deferred frees for btree block freeing
JIRA: https://issues.redhat.com/browse/RHEL-25419

Conflicts: context

commit b742d7b4f0e03df25c2a772adcded35044b625ca
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Jun 28 11:04:32 2023 -0700

    xfs: use deferred frees for btree block freeing

    Btrees that aren't freespace management trees use the normal extent
    allocation and freeing routines for their blocks. Hence when a btree
    block is freed, a direct call to xfs_free_extent() is made and the
    extent is immediately freed. This puts the entire free space
    management btrees under this path, so we are stacking btrees on
    btrees in the call stack. The inobt, finobt and refcount btrees
    all do this.

    However, the bmap btree does not do this - it calls
    xfs_free_extent_later() to defer the extent free operation via an
    XEFI and hence it gets processed in deferred operation processing
    during the commit of the primary transaction (i.e. via intent
    chaining).

    We need to change xfs_free_extent() to behave in a non-blocking
    manner so that we can avoid deadlocks with busy extents near ENOSPC
    in transactions that free multiple extents. Inserting or removing a
    record from a btree can cause a multi-level tree merge operation and
    that will free multiple blocks from the btree in a single
    transaction. i.e. we can call xfs_free_extent() multiple times, and
    hence the btree manipulation transaction is vulnerable to this busy
    extent deadlock vector.

    To fix this, convert all the remaining callers of xfs_free_extent()
    to use xfs_free_extent_later() to queue XEFIs and hence defer
    processing of the extent frees to a context that can be safely
    restarted if a deadlock condition is detected.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2024-06-06 10:32:52 -05:00
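
The idea behind the commit above (53919fd386) is to queue extent frees and process them later in a context that can be restarted, instead of freeing immediately in the middle of a btree operation. The sketch below models that pattern in user space. The `Transaction` class, its method names, and the `is_busy` callback are all hypothetical, and the real XEFI and busy-extent machinery is considerably more involved.

```python
from collections import deque


class Transaction:
    """Toy transaction that defers extent frees instead of running them inline."""

    def __init__(self):
        self.deferred = deque()   # queued (start, length) frees
        self.freed = []           # frees that have completed

    def free_extent_later(self, start, length):
        # Queue the free; nothing is touched in the free space btrees yet.
        self.deferred.append((start, length))

    def run_deferred(self, is_busy):
        """Make one pass over the queue; return how many frees remain."""
        for _ in range(len(self.deferred)):
            ext = self.deferred.popleft()
            if is_busy(ext):
                # Requeue and try again later rather than blocking.
                self.deferred.append(ext)
            else:
                self.freed.append(ext)
        return len(self.deferred)
```

A btree merge that frees several blocks calls `free_extent_later` once per block. If one of them is busy, `run_deferred` returns a non-zero count and the caller can retry, which is the restartable behavior the commit message asks for.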
Bill O'Donnell 253e8028aa xfs: validate block number being freed before adding to xefi
JIRA: https://issues.redhat.com/browse/RHEL-2002

Conflicts: diffs in xfs_alloc.c due to out of order patch application

commit 7dfee17b13e5024c5c0ab1911859ded4182de3e5
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Jun 5 14:48:15 2023 +1000

    xfs: validate block number being freed before adding to xefi

    Bad things happen in deferred extent freeing operations if they are
    passed a bad block number in the xefi. This can come from a bogus
    agno/agbno pair from deferred agfl freeing, or just a bad fsbno
    being passed to __xfs_free_extent_later(). Either way, it's very
    difficult to diagnose where a null perag oops in EFI creation
    is coming from when the operation that queued the xefi has already
    been completed and there's no longer any trace of it around....

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:26 -06:00
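
The commit above (253e8028aa) moves the sanity check to the point where the free is queued, so a bad block number is reported while the caller is still on the stack. The sketch below shows the shape of such a check. `Geometry`, `verify_extent`, and `queue_free` are hypothetical names, and the block number encoding (AG number in the high bits, AG block number in the low `agblklog` bits) is a simplified assumption.

```python
class Geometry:
    """Minimal filesystem geometry used by the check (illustrative only)."""

    def __init__(self, agcount, agblocks, agblklog):
        self.agcount = agcount      # number of allocation groups
        self.agblocks = agblocks    # blocks per allocation group
        self.agblklog = agblklog    # bits reserved for the AG block number


def verify_extent(geo, fsbno, length):
    """Return True if [fsbno, fsbno + length) lies inside one valid AG."""
    agno = fsbno >> geo.agblklog
    agbno = fsbno & ((1 << geo.agblklog) - 1)
    if agno >= geo.agcount or length <= 0:
        return False
    return agbno + length <= geo.agblocks


def queue_free(queue, geo, fsbno, length):
    """Validate first, then queue; fail while the caller is still visible."""
    if not verify_extent(geo, fsbno, length):
        raise ValueError(f"bad extent: fsbno={fsbno} length={length}")
    queue.append((fsbno, length))
```

Raising at queue time means the traceback points at the code that produced the bad value, rather than at deferred processing that runs after the original operation has finished.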
Bill O'Donnell ea5491aa47 xfs: active perag reference counting
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit c4d5660afbdcd3f0fa3bbf563e059511fba8445f
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 13 09:14:42 2023 +1100

    xfs: active perag reference counting

    We need to be able to dynamically remove instantiated AGs from
    memory safely, either for shrinking the filesystem or paging AG
    state in and out of memory (e.g. supporting millions of AGs). This
    means we need to be able to safely exclude operations from accessing
    perags while dynamic removal is in progress.

    To do this, introduce the concept of active and passive references.
    Active references are required for high level operations that make
    use of an AG for a given operation (e.g. allocation) and pin the
    perag in memory for the duration of the operation that is operating
    on the perag (e.g. transaction scope). This means we can fail to get
    an active reference to an AG, hence callers of the new active
    reference API must be able to handle lookup failure gracefully.

    Passive references are used in low level code, where we might need
    to access the perag structure for the purposes of completing high
    level operations. For example, buffers need to use passive
    references because:
    - we need to be able to do metadata IO during operations like grow
      and shrink transactions where high level active references to the
      AG have already been blocked
    - buffers need to pin the perag until they are reclaimed from
      memory, something that high level code has no direct control over.
    - unused cached buffers should not prevent a shrink from being
      started.

    Hence we have active references that will form exclusion barriers
    for operations to be performed on an AG, and passive references that
    will prevent reclaim of the perag until all objects with passive
    references have been reclaimed themselves.

    This patch introduces xfs_perag_grab()/xfs_perag_rele() as the API
    for active AG reference functionality. We also need to convert the
    for_each_perag*() iterators to use active references, which will
    start the process of converting high level code over to using active
    references. Conversion of non-iterator based code to active
    references will be done in followup patches.

    Note that the implementation using reference counting is really just
    a development vehicle for the API to ensure we don't have any leaks
    in the callers. Once we need to remove perag structures from memory
    dynamically, we will need a much more robust per-ag state transition
    mechanism for preventing new references from being taken while we
    wait for existing references to drain before removal from memory can
    occur....

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:20 -06:00
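
The distinction between active and passive references in the commit above (ea5491aa47) can be illustrated with two counters. This is a toy model under stated assumptions: the `Perag` class below is not the kernel structure, `grab`/`rele` only echo the names of `xfs_perag_grab()`/`xfs_perag_rele()`, the `get`/`put`/`take_offline`/`can_reclaim` names are hypothetical, and the "initial active reference held while the AG is online" is an assumption made for the sketch.

```python
class Perag:
    """Toy per-AG object with active and passive reference counts."""

    def __init__(self):
        self.active_ref = 1    # assumed initial reference while the AG is online
        self.passive_ref = 0

    def grab(self):
        """Take an active reference; fails once the AG is going away."""
        if self.active_ref == 0:
            return False
        self.active_ref += 1
        return True

    def rele(self):
        """Drop an active reference taken with grab()."""
        self.active_ref -= 1

    def get(self):
        """Take a passive reference; always succeeds."""
        self.passive_ref += 1

    def put(self):
        """Drop a passive reference taken with get()."""
        self.passive_ref -= 1

    def take_offline(self):
        """Drop the initial active reference so new grabs start failing."""
        self.active_ref -= 1

    def can_reclaim(self):
        """The object may be freed only when nobody references it."""
        return self.active_ref == 0 and self.passive_ref == 0
```

A cached buffer holding a passive reference does not stop `take_offline`, but it does keep `can_reclaim` false until the buffer itself is released. That matches the two roles described in the commit message: active references form the exclusion barrier, passive references only delay reclaim.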
Bill O'Donnell 21482343ad xfs: t_firstblock is tracking AGs not blocks
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 692b6cddeb65a5170c1e63d25b1ffb7822e80f7d
Author: Dave Chinner <dchinner@redhat.com>
Date:   Sat Feb 11 04:11:06 2023 +1100

    xfs: t_firstblock is tracking AGs not blocks

    The tp->t_firstblock field is now really tracking the highest AG we
    have locked, not the block number of the highest allocation we've
    made. Its purpose is to prevent AGF locking deadlocks, so rename it
    to "highest AG" and simplify the implementation to just track the
    agno rather than a fsbno.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:20 -06:00
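
The commit above (21482343ad) says the field only needs to remember the highest AG locked so far in order to prevent AGF locking deadlocks. A minimal sketch of that ordering rule follows. `Trans`, `may_lock_blocking`, and `note_locked` are hypothetical names, and the policy shown (blocking locks only on AGs at or above the highest one already locked) is an assumption about how the value is used.

```python
class Trans:
    """Toy transaction that tracks the highest AG number it has locked."""

    def __init__(self):
        self.highest_agno = None   # no AG locked yet

    def may_lock_blocking(self, agno):
        """Blocking on a lower AG than one already held risks an ABBA deadlock."""
        return self.highest_agno is None or agno >= self.highest_agno

    def note_locked(self, agno):
        """Record a newly locked AG; only the maximum matters."""
        if self.highest_agno is None or agno > self.highest_agno:
            self.highest_agno = agno
```

Tracking a plain AG number instead of a full block number keeps the comparison trivial: the ordering rule only ever needs "which AG", never "which block inside it".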
Bill O'Donnell 4be47f6ad8 xfs: don't assert if cmap covers imap after cycling lock
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit 26870c3f5b15187268bf183055c7b9f29fe66079
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Dec 26 10:11:17 2022 -0800

    xfs: don't assert if cmap covers imap after cycling lock

    In xfs_reflink_fill_cow_hole, there's a debugging assertion that trips
    if (after cycling the ILOCK to get a transaction) the requeried cow
    mapping overlaps the start of the area being written.  IOWs, it trips if
    the hole in the cow fork that it's supposed to fill has been filled.

    This is trivially possible since we cycled ILOCK_EXCL.  If we trip the
    assertion, then we know that cmap is a delalloc extent because @found is
    false.  Fortunately, the bmapi_write call below will convert the
    delalloc extent to a real unwritten cow fork extent, so all we need to
    do here is remove the assertion.

    It turns out that generic/095 trips this pretty regularly with alwayscow
    mode enabled.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-10 07:22:18 -06:00
Bill O'Donnell 92ed7bfbe2 xfs: simplify if-else condition in xfs_reflink_trim_around_shared
JIRA: https://issues.redhat.com/browse/RHEL-2002

commit a0ebf8c46d64ba96b413784f88af0a4dca95b6bc
Author: Zeng Heng <zengheng4@huawei.com>
Date:   Mon Sep 19 06:50:14 2022 +1000

    xfs: simplify if-else condition in xfs_reflink_trim_around_shared

    "else" is not generally useful after a return,
    so remove it for clean code.

    There are no logical changes.

    Signed-off-by: Zeng Heng <zengheng4@huawei.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-11-06 19:27:42 -06:00
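
The cleanup in the commit above (92ed7bfbe2) is a pure style change. The same pattern is shown here on an unrelated, hypothetical function; both versions behave identically.

```python
def classify_with_else(n):
    """Original shape: the else branch is redundant after a return."""
    if n < 0:
        return "negative"
    else:
        return "non-negative"


def classify(n):
    """Simplified shape: fall through once the early return is not taken."""
    if n < 0:
        return "negative"
    return "non-negative"
```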
Bill O'Donnell 49610cb9cc fsdax,xfs: port unshare to fsdax
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192730

commit d984648e428bf88cbd94ebe346c73632cb92fffb
Author: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Date:   Thu Dec 1 15:32:33 2022 +0000

    fsdax,xfs: port unshare to fsdax

    Implement unshare in fsdax mode: copy data from srcmap to iomap.

    Link: https://lkml.kernel.org/r/1669908753-169-1-git-send-email-ruansy.fnst@fujitsu.com
    Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: John Hubbard <jhubbard@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-06-16 10:35:49 -05:00
Bill O'Donnell 2118b9b94d xfs: add dax dedupe support
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192730

commit 13f9e267fdbba30820ce3999338b7d8fe7d6bf77
Author: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Date:   Fri Jun 3 13:37:38 2022 +0800

    xfs: add dax dedupe support

    Introduce xfs_mmaplock_two_inodes_and_break_dax_layout() for dax files that
    are going to be deduped.  After that, call the compare range function only
    when both files are DAX or neither is.

    Link: https://lkml.kernel.org/r/20220603053738.1218681-15-ruansy.fnst@fujitsu.com
    Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Dan Williams <dan.j.wiliams@intel.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
    Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
    Cc: Jane Chu <jane.chu@oracle.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Ritesh Harjani <riteshh@linux.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-06-16 10:35:47 -05:00
Bill O'Donnell 7554f41e28 fsdax: dedup file range to use a compare function
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192730

commit 6f7db3894ae23eb5d40af4efb404aa0c072a68d2
Author: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Date:   Fri Jun 3 13:37:36 2022 +0800

    fsdax: dedup file range to use a compare function

    With dax we cannot deal with readpage() etc.  So, we create a dax
    comparison function which is similar to vfs_dedupe_file_range_compare(),
    and introduce dax_remap_file_range_prep() for filesystem use.

    Link: https://lkml.kernel.org/r/20220603053738.1218681-13-ruansy.fnst@fujitsu.com
    Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
    Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Dan Williams <dan.j.wiliams@intel.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
    Cc: Jane Chu <jane.chu@oracle.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Ritesh Harjani <riteshh@linux.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-06-16 10:35:47 -05:00
Bill O'Donnell 8758b87ac1 xfs: Fix false ENOSPC when performing direct write on a delalloc extent in cow fork
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit d62113303d691bcd8d0675ae4ac63e7769afc56c
Author: Chandan Babu R <chandan.babu@oracle.com>
Date:   Thu Aug 4 08:59:27 2022 -0700

    xfs: Fix false ENOSPC when performing direct write on a delalloc extent in cow fork

    On a highly fragmented filesystem a Direct IO write can fail with -ENOSPC error
    even though the filesystem has a sufficient number of free blocks.

    This occurs if the file offset range on which the write operation is being
    performed has a delalloc extent in the cow fork and this delalloc extent
    begins much before the Direct IO range.

    In such a scenario, xfs_reflink_allocate_cow() invokes xfs_bmapi_write() to
    allocate the blocks mapped by the delalloc extent. The extent thus allocated
    may not cover the beginning of file offset range on which the Direct IO write
    was issued. Hence xfs_reflink_allocate_cow() ends up returning -ENOSPC.

    The following script reliably recreates the bug described above.

      #!/usr/bin/bash

      device=/dev/loop0
      shortdev=$(basename $device)

      mntpnt=/mnt/
      file1=${mntpnt}/file1
      file2=${mntpnt}/file2
      fragmentedfile=${mntpnt}/fragmentedfile
      punchprog=/root/repos/xfstests-dev/src/punch-alternating

      errortag=/sys/fs/xfs/${shortdev}/errortag/bmap_alloc_minlen_extent

      umount $device > /dev/null 2>&1

      echo "Create FS"
      mkfs.xfs -f -m reflink=1 $device > /dev/null 2>&1
      if [[ $? != 0 ]]; then
            echo "mkfs failed."
            exit 1
      fi

      echo "Mount FS"
      mount $device $mntpnt > /dev/null 2>&1
      if [[ $? != 0 ]]; then
            echo "mount failed."
            exit 1
      fi

      echo "Create source file"
      xfs_io -f -c "pwrite 0 32M" $file1 > /dev/null 2>&1

      sync

      echo "Create Reflinked file"
      xfs_io -f -c "reflink $file1" $file2 &>/dev/null

      echo "Set cowextsize"
      xfs_io -c "cowextsize 16M" $file1 > /dev/null 2>&1

      echo "Fragment FS"
      xfs_io -f -c "pwrite 0 64M" $fragmentedfile > /dev/null 2>&1
      sync
      $punchprog $fragmentedfile

      echo "Allocate block sized extent from now onwards"
      echo -n 1 > $errortag

      echo "Create 16MiB delalloc extent in CoW fork"
      xfs_io -c "pwrite 0 4k" $file1 > /dev/null 2>&1

      sync

      echo "Direct I/O write at offset 12k"
      xfs_io -d -c "pwrite 12k 8k" $file1

    This commit fixes the bug by invoking xfs_bmapi_write() in a loop until disk
    blocks are allocated for at least the starting file offset of the Direct IO
    write range.

    Fixes: 3c68d44a2b ("xfs: allocate direct I/O COW blocks in iomap_begin")
    Reported-and-Root-caused-by: Wengang Wang <wen.gang.wang@oracle.com>
    Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    [djwong: slight editing to make the locking less grody, and fix some style things]
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:50 -05:00
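
The loop described in the commit above (8758b87ac1) can be modeled as repeated allocation from the start of the delalloc extent until the allocated range reaches the block where the direct write begins. The sketch below is a user-space illustration: `allocate_until_covered` and its parameters are hypothetical, offsets are in filesystem blocks, and `alloc_chunk` stands in for whatever extent size a single allocation call happens to return on a fragmented filesystem.

```python
def allocate_until_covered(delalloc_start, delalloc_len, write_off, alloc_chunk):
    """Allocate from the front of a delalloc extent until write_off is covered.

    Returns (number_of_allocation_calls, first_unallocated_block).
    """
    end = delalloc_start + delalloc_len
    if not (delalloc_start <= write_off < end) or alloc_chunk <= 0:
        raise ValueError("write offset must fall inside the delalloc extent")

    pos = delalloc_start
    calls = 0
    # [delalloc_start, pos) is allocated; keep going until it contains write_off.
    while pos <= write_off:
        got = min(alloc_chunk, end - pos)
        pos += got
        calls += 1
    return calls, pos
```

In the reproducer, assuming 4k blocks, the write at offset 12k starts at block 3 of a 16MiB delalloc extent and the error tag forces single-block allocations. One call covers only block 0, which is where the old code gave up with -ENOSPC; the loop needs four calls before block 3 is backed by real blocks.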
Bill O'Donnell 08529f7680 xfs: convert XFS_IFORK_PTR to a static inline helper
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 732436ef916b4f338d672ea56accfdb11e8d0732
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Sat Jul 9 10:56:05 2022 -0700

    xfs: convert XFS_IFORK_PTR to a static inline helper

    We're about to make this logic do a bit more, so convert the macro to a
    static inline function for better typechecking and fewer shouty macros.
    No functional changes here.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:42 -05:00
Bill O'Donnell d1a077edc7 xfs: pass perag to xfs_alloc_read_agf()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 08d3e84feeb8cb8e20d54f659446b98fe17913aa
Author: Dave Chinner <dchinner@redhat.com>
Date:   Thu Jul 7 19:07:40 2022 +1000

    xfs: pass perag to xfs_alloc_read_agf()

    xfs_alloc_read_agf() initialises the perag if it hasn't been done
    yet, so it makes sense to pass it the perag rather than pull a
    reference from the buffer. This allows callers to be per-ag centric
    rather than passing mount/agno pairs everywhere.

    Whilst modifying the xfs_reflink_find_shared() function definition,
    declare it static and remove the extern declaration as it is an
    internal function only these days.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:39 -05:00
Bill O'Donnell 4df3409ee5 xfs: rewrite xfs_reflink_end_cow to use intents
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit df2fd88f8ac77f75a603d9fa5015225cc6c30edb
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Apr 25 18:38:15 2022 -0700

    xfs: rewrite xfs_reflink_end_cow to use intents

    Currently, the code that performs CoW remapping after a write has this
    odd behavior where it walks /backwards/ through the data fork to remap
    extents in reverse order.  Earlier, we rewrote the reflink remap
    function to use deferred bmap log items instead of trying to cram as
    much into the first transaction that we could.  Now do the same for the
    CoW remap code.  There doesn't seem to be any performance impact; we're
    just making better use of code that we added for the benefit of reflink.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:11 -05:00
Bill O'Donnell 755e20d26f xfs: remove a __xfs_bunmapi call from reflink
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit f1e6a8d72806d2d57560b4873d8aa42c420384ee
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Apr 25 18:38:12 2022 -0700

    xfs: remove a __xfs_bunmapi call from reflink

    This raw call isn't necessary since we can always remove a full delalloc
    extent.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:11:10 -05:00
Bill O'Donnell cf7ff3302c xfs: Conditionally upgrade existing inodes to use large extent counters
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 4f86bb4b66c999ad9ddcfd49fec93992eeba2715
Author: Chandan Babu R <chandan.babu@oracle.com>
Date:   Wed Mar 9 07:49:36 2022 +0000

    xfs: Conditionally upgrade existing inodes to use large extent counters

    This commit enables upgrading existing inodes to use large extent counters
    provided that underlying filesystem's superblock has large extent counter
    feature enabled.

    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:10:59 -05:00
Bill O'Donnell 5ba299e719 xfs: add missing cmap->br_state = XFS_EXT_NORM update
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 1a39ae415c1be1e46f5b3f97d438c7c4adc22b63
Author: Gao Xiang <xiang@kernel.org>
Date:   Fri Feb 25 16:18:30 2022 -0800

    xfs: add missing cmap->br_state = XFS_EXT_NORM update

    COW extents are already converted into written real extents after
    xfs_reflink_convert_cow_locked(), therefore cmap->br_state should
    reflect it.

    Otherwise, there is another unnecessary unwritten conversion
    triggered in xfs_dio_write_end_io() for direct I/O cases.

    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:10:49 -05:00
Bill O'Donnell 9e0bb79551 xfs: only run COW extent recovery when there are no live extents
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2167832

commit 7993f1a431bc5271369d359941485a9340658ac3
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Wed Dec 15 11:52:23 2021 -0800

    xfs: only run COW extent recovery when there are no live extents

    As part of multiple customer escalations due to file data corruption
    after copy on write operations, I wrote some fstests that use fsstress
    to hammer on COW to shake things loose.  Regrettably, I caught some
    filesystem shutdowns due to incorrect rmap operations with the following
    loop:

    mount <filesystem>                              # (0)
    fsstress <run only readonly ops> &              # (1)
    while true; do
            fsstress <run all ops>
            mount -o remount,ro                     # (2)
            fsstress <run only readonly ops>
            mount -o remount,rw                     # (3)
    done

    When (2) happens, notice that (1) is still running.  xfs_remount_ro will
    call xfs_blockgc_stop to walk the inode cache to free all the COW
    extents, but the blockgc mechanism races with (1)'s reader threads to
    take IOLOCKs and loses, which means that it doesn't clean them all out.
    Call such a file (A).

    When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which
    walks the ondisk refcount btree and frees any COW extent that it finds.
    This function does not check the inode cache, which means that incore
    COW fork of inode (A) is now inconsistent with the ondisk metadata.  If
    one of those former COW extents is allocated and mapped into another
    file (B) and someone triggers a COW to the stale reservation in (A), A's
    dirty data will be written into (B) and once that's done, those blocks
    will be transferred to (A)'s data fork without bumping the refcount.

    The results are catastrophic -- file (B) and the refcount btree are now
    corrupt.  In the first patch, we fixed the race condition in (2) so that
    (A) will always flush the COW fork.  In this second patch, we move the
    _recover_cow call to the initial mount call in (0) for safety.

    As mentioned previously, xfs_reflink_recover_cow walks the refcount
    btree looking for COW staging extents, and frees them.  This was
    intended to be run at mount time (when we know there are no live inodes)
    to clean up any leftover staging extents that may have been left behind
    during an unclean shutdown.  As a time "optimization" for readonly
    mounts, we deferred this to the ro->rw transition, not realizing that
    any failure to clean all COW forks during a rw->ro transition would
    result in catastrophic corruption.

    Therefore, remove this optimization and only run the recovery routine
    when we're guaranteed not to have any COW staging extents anywhere,
    which means we always run this at mount time.  While we're at it, move
    the callsite to xfs_log_mount_finish because any refcount btree
    expansion (however unlikely given that we're removing records from the
    right side of the index) must be fed by a per-AG reservation, which
    doesn't exist in its current location.

    Fixes: 174edb0e46 ("xfs: store in-progress CoW allocations in the refcount btree")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
2023-05-18 11:10:44 -05:00
Jeff Moyer 7af7c9943b xfs: add xfs_zero_range and xfs_truncate_page helpers
Bugzilla: https://bugzilla.redhat.com/2162211

commit f1ba5fafba9bfde4b040cd0d14256aed25a35c5e
Author: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Date:   Mon Nov 29 11:21:49 2021 +0100

    xfs: add xfs_zero_range and xfs_truncate_page helpers
    
    Add helpers to prepare for using different DAX operations.
    
    Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
    [hch: split from a larger patch + slight cleanups]
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Link: https://lore.kernel.org/r/20211129102203.2243509-16-hch@lst.de
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-03-09 03:57:06 -05:00
Carlos Maiolino 93fff0d397 xfs: rename xfs_bmap_add_free to xfs_free_extent_later
Bugzilla: https://bugzilla.redhat.com/2125724

xfs_bmap_add_free isn't a block mapping function; it schedules deferred
freeing operations for a later point in a compound transaction chain.
While it's primarily used by bunmapi, its use has expanded beyond that.
Move it to xfs_alloc.c and rename the function since it's now general
freeing functionality.  Bring the slab cache bits in line with the
way we handle the other intent items.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
(cherry picked from commit c201d9ca5392b20f04882848a071025b0e194c17)
2022-10-21 12:50:46 +02:00
Brian Foster d54a790d1d xfs: replace xfs_sb_version checks with feature flag checks
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2083143
Upstream Status: linux.git

commit 38c26bfd90e1999650d5ef40f90d721f05916643
Author: Dave Chinner <dchinner@redhat.com>
Date:   Wed Aug 18 18:46:37 2021 -0700

    xfs: replace xfs_sb_version checks with feature flag checks

    Convert the xfs_sb_version_hasfoo() to checks against
    mp->m_features. Checks of the superblock itself during disk
    operations (e.g. in the read/write verifiers and the to/from disk
    formatters) are not converted - they operate purely on the
    superblock state. Everything else should use the mount features.

    Large parts of this conversion were done with sed with commands like
    this:

    for f in `git grep -l xfs_sb_version_has fs/xfs/*.c`; do
            sed -i -e 's/xfs_sb_version_has\(.*\)(&\(.*\)->m_sb)/xfs_has_\1(\2)/' $f
    done

    With manual cleanups for things like "xfs_has_extflgbit" and other
    little inconsistencies in naming.
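
The effect of the sed expression above on a single line can be tried in
isolation. The following is only a sketch: it restates the same pattern
with Python's re module, and the helper name convert_line and the sample
line are hypothetical, chosen for illustration.

```python
import re

# Python spelling of the sed expression quoted above. In sed's basic regex
# syntax "\(...\)" is a group and "(" is literal; in Python it is the reverse.
PATTERN = re.compile(r"xfs_sb_version_has(.*)\(&(.*)->m_sb\)")


def convert_line(line: str) -> str:
    """Rewrite the first old-style feature check on a line (sed without /g)."""
    return PATTERN.sub(r"xfs_has_\1(\2)", line, count=1)


# Hypothetical input line, not taken from the patch itself.
sample = "if (xfs_sb_version_hasreflink(&mp->m_sb))"
converted = convert_line(sample)
```

Because both groups are greedy, a line with two checks would likely need
the kind of manual cleanup the message mentions.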

    The result is a lot less typing to check features and an XFS binary
    size reduced by a bit over 3kB:

    $ size -t fs/xfs/built-in.a
            text       data     bss     dec     hex filename
    before  1130866  311352     484 1442702  16038e (TOTALS)
    after   1127727  311352     484 1439563  15f74b (TOTALS)

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Signed-off-by: Brian Foster <bfoster@redhat.com>
2022-08-25 08:11:34 -04:00
Dave Chinner a81a06211f xfs: convert refcount btree cursor to use perags
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner be9fb17d88 xfs: add a perag to the btree cursor
Which will eventually completely replace the agno in it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-06-02 10:48:24 +10:00
Dave Chinner 934933c3ee xfs: convert raw ag walks to use for_each_perag
Convert the raw walks to an iterator, pulling the current AG out of
pag->pag_agno instead of the loop iterator variable.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner 9bbafc7191 xfs: move xfs_perag_get/put to xfs_ag.[ch]
They are AG functions, not superblock functions, so move them to the
appropriate location.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Darrick J. Wong d4f74e162d xfs: fix xfs_reflink_unshare usage of filemap_write_and_wait_range
The final parameter of filemap_write_and_wait_range is the end of the
range to flush, not the length of the range to flush.
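
The distinction can be shown with a small calculation. This is a sketch,
assuming the end argument is the inclusive offset of the last byte to
flush; the helper name flush_range_end is hypothetical and not part of
the kernel.

```python
def flush_range_end(offset: int, length: int) -> int:
    """Return the inclusive last byte of the range [offset, offset + length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return offset + length - 1


# Unsharing 8192 bytes starting at byte 4096: passing the length (8192) as
# the end would describe the range 4096..8192, not the intended range.
end = flush_range_end(4096, 8192)
```

Under that assumption, passing the length instead of the end flushes the
wrong range whenever the offset is nonzero.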

Fixes: 46afb0628b ("xfs: only flush the unshared range in xfs_reflink_unshare")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-04-29 07:45:44 -07:00
Christoph Hellwig 862a804aae xfs: move the XFS_IFEXTENTS check into xfs_iread_extents
Move the XFS_IFEXTENTS check from the callers into xfs_iread_extents to
simplify the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-15 09:35:50 -07:00
Christoph Hellwig 3e09ab8fdc xfs: move the di_flags2 field to struct xfs_inode
In preparation of removing the historic icinode struct, move the flags2
field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:05 -07:00
Christoph Hellwig b33ce57d3e xfs: move the di_cowextsize field to struct xfs_inode
In preparation of removing the historic icinode struct, move the
cowextsize field into the containing xfs_inode structure.  Also
switch to use the xfs_extlen_t instead of a uint32_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:04 -07:00
Christoph Hellwig 13d2c10b05 xfs: move the di_size field to struct xfs_inode
In preparation of removing the historic icinode struct, move the on-disk
size field into the containing xfs_inode structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-04-07 14:37:03 -07:00
Darrick J. Wong 766aabd599 xfs: flush eof/cowblocks if we can't reserve quota for file blocks
If a fs modification (data write, reflink, xattr set, fallocate, etc.)
is unable to reserve enough quota to handle the modification, try
clearing whatever space the filesystem might have been hanging onto in
the hopes of speeding up the filesystem.  The flushing behavior will
become particularly important when we add deferred inode inactivation
because that will increase the amount of space that isn't actively tied
to user data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-03 09:18:49 -08:00
Darrick J. Wong 4ca7420568 xfs: try worst case space reservation upfront in xfs_reflink_remap_extent
Now that we've converted xfs_reflink_remap_extent to use the new
xfs_trans_alloc_inode API, we can focus on its slightly unusual behavior
with regard to quota reservations.

Since it's valid to remap written blocks into a hole, we must be able to
increase the quota count by the number of blocks in the mapping.
However, the incore space reservation process requires us to supply an
asymptotic guess before we can gain exclusive access to resources.  We'd
like to reserve all the quota we need up front, but we also don't want
to fail a written -> allocated remap operation unnecessarily.

The solution is to make the remap_extents function call the transaction
allocation function twice.  The first time we ask to reserve enough
space and quota to handle the absolute worst case situation, but if that
fails, we can fall back to the old strategy: ask for the bare minimum
space reservation upfront and increase the quota reservation later if we
need to.
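
The two-attempt strategy described above can be sketched as follows.
This is a toy model rather than the kernel code: NoSpace, reserve and
reserve_for_remap are hypothetical names, and free space is reduced to a
single block counter.

```python
from typing import Tuple


class NoSpace(Exception):
    """Raised when a reservation request cannot be satisfied."""


def reserve(free_blocks: int, wanted: int) -> int:
    """Toy reservation: succeed only if enough blocks are free."""
    if wanted > free_blocks:
        raise NoSpace(f"wanted {wanted}, only {free_blocks} free")
    return wanted


def reserve_for_remap(free_blocks: int, minimum: int,
                      worst_case: int) -> Tuple[int, bool]:
    """Return (blocks reserved, whether the worst case was covered up front)."""
    try:
        # First attempt: enough for the absolute worst case.
        return reserve(free_blocks, worst_case), True
    except NoSpace:
        # Fallback: bare minimum now; any increase has to be requested later.
        return reserve(free_blocks, minimum), False
```

If even the minimum request fails, the error propagates to the caller,
which mirrors the point that a failure is only reported after both
attempts have been made.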

Later in this patchset we change the transaction and quota code to try
to reclaim space if we cannot reserve free space or quota.
Restructuring the remap_extent function in this manner means that if the
fallback increase fails, we can pass that back to the caller knowing
that the transaction allocation already tried freeing space.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-02-03 09:18:49 -08:00
Darrick J. Wong f273387b04 xfs: refactor reflink functions to use xfs_trans_alloc_inode
The two remaining callers of xfs_trans_reserve_quota_nblks are in the
reflink code.  These conversions aren't as uniform as the previous
conversions, so call that out in a separate patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-02-03 09:18:49 -08:00
Darrick J. Wong 02b7ee4eb6 xfs: reserve data and rt quota at the same time
Modify xfs_trans_reserve_quota_nblks so that we can reserve data and
realtime blocks from the dquot at the same time.  This change has the
theoretical side effect that for allocations to realtime files we will
reserve from the dquot both the number of rtblocks being allocated and
the number of bmbt blocks that might be needed to add the mapping.
However, since the mount code disables quota if it finds a realtime
device, this should not result in any behavior changes.

Now that we've moved the inode creation callers away from using the
_nblks function, we can repurpose the (now unused) ninos argument for
realtime blocks, so make that change.  This also replaces the flags
argument with a boolean parameter to force the reservation since we
don't need to distinguish between data and rt quota reservations any
more, and the only flag being passed in was FORCE_RES.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-03 09:18:49 -08:00
Darrick J. Wong 35b1101099 xfs: remove xfs_trans_unreserve_quota_nblks completely
xfs_trans_cancel will release all the quota resources that were reserved
on behalf of the transaction, so get rid of the explicit unreserve step.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-03 09:18:49 -08:00
Darrick J. Wong 8554650003 xfs: create convenience wrappers for incore quota block reservations
Create a couple of convenience wrappers for creating and deleting quota
block reservations against future changes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-03 09:18:49 -08:00
Darrick J. Wong 4abe21ad67 xfs: clean up quota reservation callsites
Convert a few xfs_trans_*reserve* callsites that are open-coding other
convenience functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-03 09:18:49 -08:00
Chandan Babu R ee898d78c3 xfs: Check for extent overflow when remapping an extent
Remapping an extent involves unmapping the existing extent and mapping
in the new extent. When unmapping, an extent containing the entire unmap
range can be split into two extents,
i.e. | Old extent | hole | Old extent |
Hence extent count increases by 1.

Mapping in the new extent into the destination file can increase the
extent count by 1.

Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22 16:54:48 -08:00
Chandan Babu R 5f1d5bbfb2 xfs: Check for extent overflow when moving extent from cow to data fork
Moving an extent to data fork can cause a sub-interval of an existing
extent to be unmapped. This will increase extent count by 1. Mapping in
the new extent can increase the extent count by 1 again i.e.
 | Old extent | New extent | Old extent |
Hence number of extents increases by 2.
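
The counting argument can be checked with a toy model. This is a sketch
only: extents are plain (start, count) tuples, punch_hole and map_extent
are hypothetical helpers, and neighbouring extents are never merged,
which gives the worst case the message describes.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start_block, block_count)


def punch_hole(extents: List[Extent], start: int, count: int) -> List[Extent]:
    """Remove [start, start + count) from every extent that overlaps it."""
    end = start + count
    result: List[Extent] = []
    for ext_start, ext_count in extents:
        ext_end = ext_start + ext_count
        if ext_end <= start or ext_start >= end:
            result.append((ext_start, ext_count))  # no overlap, keep as is
            continue
        if ext_start < start:
            result.append((ext_start, start - ext_start))  # left remainder
        if ext_end > end:
            result.append((end, ext_end - end))  # right remainder
    return result


def map_extent(extents: List[Extent], start: int, count: int) -> List[Extent]:
    """Insert a new extent; this toy model never merges neighbours."""
    return sorted(extents + [(start, count)])


before = [(0, 100)]                          # one old extent
after_unmap = punch_hole(before, 40, 20)     # old | hole | old
after_map = map_extent(after_unmap, 40, 20)  # old | new | old
```

Unmapping the middle of the single extent leaves two pieces, and mapping
the new extent into the hole brings the total to three.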

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22 16:54:48 -08:00
Darrick J. Wong 46afb0628b xfs: only flush the unshared range in xfs_reflink_unshare
There's no reason to flush an entire file when we're unsharing part of
a file.  Therefore, only initiate writeback on the selected range.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2020-11-04 17:41:56 -08:00
Randy Dunlap b63da6c8df xfs: delete duplicated words + other fixes
Delete repeated words in fs/xfs/.
{we, that, the, a, to, fork}
Change "it it" to "it is" in one location.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: linux-fsdevel@vger.kernel.org
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-08-05 08:49:58 -07:00
Darrick J. Wong e2aaee9cd3 xfs: move helpers that lock and unlock two inodes against userspace IO
Move the double-inode locking helpers to xfs_inode.c since they're not
specific to reflink.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-07-06 10:46:57 -07:00
Darrick J. Wong 10b4bd6c9c xfs: refactor locking and unlocking two inodes against userspace IO
Refactor the two functions that we use to lock and unlock two inodes to
block userspace from initiating IO against a file, whether via system
calls or mmap activity.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-07-06 10:46:57 -07:00
Darrick J. Wong 451d34ee07 xfs: fix xfs_reflink_remap_prep calling conventions
Fix xfs_reflink_remap_prep so that its return value conventions match
the rest of xfs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-07-06 10:46:57 -07:00
Darrick J. Wong 168eae803c xfs: reflink can skip remap existing mappings
If the source and destination map are identical, we can skip the remap
step to save some time.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-07-06 10:46:57 -07:00