Centos-kernel-stream-9/fs/btrfs
Filipe Manana 1f9b8c8fbc Btrfs: check if previous transaction aborted to avoid fs corruption
While we are committing a transaction, it's possible the previous one is
still finishing its commit and therefore we wait for it to finish first.
However, we were not checking whether that previous transaction ended up
getting aborted after we waited for it to commit, so we went on to commit
the current transaction. This can lead to fs corruption because the new
superblock can point to trees that have one or more nodes/leaves that
were never durably persisted.
The following sequence diagram exemplifies how this is possible:

          CPU 0                                                        CPU 1

  transaction N starts

  (...)

  btrfs_commit_transaction(N)

    cur_trans->state = TRANS_STATE_COMMIT_START;
    (...)
    cur_trans->state = TRANS_STATE_COMMIT_DOING;
    (...)

    cur_trans->state = TRANS_STATE_UNBLOCKED;
    root->fs_info->running_transaction = NULL;

                                                              btrfs_start_transaction()
                                                                 --> starts transaction N + 1

    btrfs_write_and_wait_transaction(trans, root);
      --> starts writing all new or COWed ebs created
          at transaction N

                                                              creates some new ebs, COWs some
                                                              existing ebs but doesn't COW or
                                                              delete eb X

                                                              btrfs_commit_transaction(N + 1)
                                                                (...)
                                                                cur_trans->state = TRANS_STATE_COMMIT_START;
                                                                (...)
                                                                wait_for_commit(root, prev_trans);
                                                                  --> prev_trans == transaction N

    btrfs_write_and_wait_transaction() continues
    writing ebs
       --> fails writing eb X, we abort transaction N
           and set bit BTRFS_FS_STATE_ERROR on
           fs_info->fs_state, so no new transactions
           can start after setting that bit

       cleanup_transaction()
         btrfs_cleanup_one_transaction()
           wakes up task at CPU 1

                                                                continues, doesn't abort because
                                                                cur_trans->aborted (transaction N + 1)
                                                                is zero, and no checks for bit
                                                                BTRFS_FS_STATE_ERROR in fs_info->fs_state
                                                                are made

                                                                btrfs_write_and_wait_transaction(trans, root);
                                                                  --> succeeds, no errors during writeback

                                                                write_ctree_super(trans, root, 0);
                                                                  --> succeeds
                                                                  --> we now have a superblock that points us
                                                                      to some root that uses eb X, which was
                                                                      never written to disk

In this scenario, future attempts to read eb X from disk result in an
error message like "parent transid verify failed on X wanted Y found Z".

So fix this by aborting the current transaction if, after waiting for the
previous transaction, we find that it was aborted.
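
The check described above can be illustrated with a minimal user-space
sketch. This is not the actual patch: struct txn, commit_txn() and
wait_for_prev() are hypothetical, heavily simplified stand-ins for the
kernel's transaction structure, btrfs_commit_transaction() and
wait_for_commit(), and the model is single-threaded, so the wait is a no-op.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's transaction states. */
enum txn_state { TXN_RUNNING, TXN_COMMIT_START, TXN_COMPLETED };

struct txn {
    enum txn_state state;
    int aborted; /* 0, or a negative errno recorded when the transaction aborted */
};

/*
 * Stand-in for wait_for_commit(). In this single-threaded model the
 * previous transaction has already finished or aborted, so there is
 * nothing to wait for.
 */
static void wait_for_prev(struct txn *prev)
{
    (void)prev;
}

/*
 * Commit 'cur', first waiting for 'prev' if it has not completed.
 * Returns 0 on success, or the previous transaction's error if it aborted.
 */
static int commit_txn(struct txn *cur, struct txn *prev)
{
    assert(cur != NULL);
    cur->state = TXN_COMMIT_START;

    if (prev != NULL && prev->state != TXN_COMPLETED) {
        wait_for_prev(prev);

        /*
         * The step that was missing: if the previous transaction
         * aborted, some of its extent buffers may never have reached
         * disk, so committing now could publish a superblock that
         * points at them. Abort the current transaction instead.
         */
        if (prev->aborted) {
            cur->aborted = prev->aborted;
            return cur->aborted;
        }
    }

    /* Writeback and the superblock write would happen here. */
    cur->state = TXN_COMPLETED;
    return 0;
}
```

In this model, committing a transaction whose predecessor has aborted with
-EIO returns -EIO and never reaches TXN_COMPLETED, while a commit after a
completed predecessor returns 0.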

Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-08-19 14:27:31 -07:00
tests btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism. 2015-06-10 09:26:05 -07:00
Kconfig
Makefile
acl.c
async-thread.c btrfs: Fix lockdep warning of wr_ctx->wr_lock in scrub_free_wr_ctx() 2015-06-10 07:04:52 -07:00
async-thread.h btrfs: Fix lockdep warning of wr_ctx->wr_lock in scrub_free_wr_ctx() 2015-06-10 07:04:52 -07:00
backref.c Btrfs: fix warning in backref walking 2015-08-09 07:33:50 -07:00
backref.h
btrfs_inode.h Btrfs: fix warning of bytes_may_use 2015-07-01 17:17:21 -07:00
check-integrity.c
check-integrity.h
compression.c
compression.h
ctree.c btrfs: abort transaction on btrfs_reloc_cow_block() 2015-08-09 07:07:14 -07:00
ctree.h Merge branch 'jeffm-discard-4.3' into for-linus-4.3 2015-08-09 07:35:33 -07:00
delayed-inode.c
delayed-inode.h
delayed-ref.c btrfs: delayed-ref: double free in btrfs_add_delayed_tree_ref() 2015-06-24 12:28:03 -07:00
delayed-ref.h btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots. 2015-06-10 09:26:23 -07:00
dev-replace.c btrfs: its btrfs_err() instead of btrfs_error() 2015-07-22 18:20:53 -07:00
dev-replace.h
dir-item.c
disk-io.c Merge branch 'jeffm-discard-4.3' into for-linus-4.3 2015-08-09 07:35:33 -07:00
disk-io.h
export.c
export.h
extent-tree.c Merge branch 'jeffm-discard-4.3' into for-linus-4.3 2015-08-09 07:35:33 -07:00
extent-tree.h btrfs: qgroup: Add new qgroup calculation function 2015-06-10 09:25:49 -07:00
extent_io.c btrfs: Prevent from early transaction abort 2015-08-19 14:25:15 -07:00
extent_io.h
extent_map.c
extent_map.h
file-item.c
file.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
free-space-cache.c btrfs: add missing discards when unpinning extents with -o discard 2015-07-29 08:15:29 -07:00
free-space-cache.h
hash.c
hash.h
inode-item.c
inode-map.c Btrfs: fix race between caching kthread and returning inode to inode cache 2015-06-30 14:36:46 -07:00
inode-map.h
inode.c Btrfs: add support for blkio controllers 2015-08-09 07:35:06 -07:00
ioctl.c btrfs: fix clone / extent-same deadlocks 2015-08-09 07:34:25 -07:00
locking.c btrfs: Add WARN_ON() for double lock in btrfs_tree_lock() 2015-08-09 07:07:14 -07:00
locking.h
lzo.c
math.h
ordered-data.c Btrfs: fix memory corruption on failure to submit bio for direct IO 2015-07-01 17:17:18 -07:00
ordered-data.h Btrfs: avoid syncing log in the fast fsync path when not necessary 2015-06-10 07:02:43 -07:00
orphan.c
print-tree.c
print-tree.h
props.c
props.h
qgroup.c btrfs: qgroup: allow user to clear the limitation on qgroup 2015-06-30 13:20:00 -07:00
qgroup.h btrfs: qgroup: Cleanup the old ref_node-oriented mechanism. 2015-06-10 09:26:11 -07:00
raid56.c Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c Btrfs: count devices correctly in readahead during RAID 5/6 replace 2015-08-09 07:34:26 -07:00
relocation.c btrfs: Remove unnecessary variants in relocation.c 2015-08-09 07:07:14 -07:00
root-tree.c
scrub.c Btrfs: fix parity scrub of RAID 5/6 with missing device 2015-08-09 07:34:26 -07:00
send.c Btrfs: use received_uuid of parent during send 2015-06-12 13:20:38 -07:00
send.h
struct-funcs.c
super.c Merge branch 'jeffm-discard-4.3' into for-linus-4.3 2015-08-09 07:35:33 -07:00
sysfs.c Btrfs: Check if kobject is initialized before put 2015-06-22 14:43:31 +02:00
sysfs.h
transaction.c Btrfs: check if previous transaction aborted to avoid fs corruption 2015-08-19 14:27:31 -07:00
transaction.h btrfs: add missing discards when unpinning extents with -o discard 2015-07-29 08:15:29 -07:00
tree-defrag.c btrfs: let tree defrag work in SSD mode 2015-06-02 19:34:33 -07:00
tree-log.c btrfs: Remove unused arguments in tree-log.c 2015-08-19 14:25:15 -07:00
tree-log.h
ulist.c btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c
volumes.c btrfs: use __GFP_NOFAIL in alloc_btrfs_bio 2015-08-19 14:25:15 -07:00
volumes.h Merge branch 'jeffm-discard-4.3' into for-linus-4.3 2015-08-09 07:35:33 -07:00
xattr.c
xattr.h
zlib.c