JIRA: https://issues.redhat.com/browse/RHEL-40250
commit 0d043351e5baf3857f915367deba2a518b6a0809
Author: Theodore Ts'o <tytso@mit.edu>
Date: Sat Nov 5 23:42:36 2022 -0400
ext4: fix fortify warning in fs/ext4/fast_commit.c:1551
With the new fortify string system, rework the memcpy to avoid this
warning:
memcpy: detected field-spanning write (size 60) of single field "&raw_inode->i_generation" at fs/ext4/fast_commit.c:1551 (size 4)
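The warning above fires when the destination of a memcpy() is named as a single struct member but the copy runs past that member. A minimal userspace sketch of the reworked pattern follows; `struct demo_inode` and `demo_copy_inode()` are hypothetical stand-ins, not the real `struct ext4_inode` or the kernel code. The copy is split so that each destination pointer is derived from the base of the whole object rather than from one 4-byte field:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an on-disk inode; not the real struct ext4_inode. */
struct demo_inode {
	uint32_t i_mode;
	uint32_t i_block[4];	/* deliberately left untouched by the copy */
	uint32_t i_generation;
	uint32_t i_tail[3];
};

/* Copy every field except i_block from src to dst. */
static void demo_copy_inode(struct demo_inode *dst, const struct demo_inode *src)
{
	size_t off_gen = offsetof(struct demo_inode, i_generation);

	/* Head: everything that precedes i_block. */
	memcpy(dst, src, offsetof(struct demo_inode, i_block));

	/*
	 * Tail: i_generation through the end of the struct. The pointers are
	 * byte offsets from the struct base, so the destination is the whole
	 * object, not the single field &dst->i_generation.
	 */
	memcpy((uint8_t *)dst + off_gen, (const uint8_t *)src + off_gen,
	       sizeof(*dst) - off_gen);
}
```

The copied bytes are the same as with a single memcpy() starting at `&dst->i_generation`; only the way the destination is expressed changes.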
Cc: stable@kernel.org
Fixes: 54d9469bc515 ("fortify: Add run-time WARN for cross-field memcpy()")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-5335
jbd2_submit_inode_data() hardcoded use of
jbd2_journal_submit_inode_data_buffers() for submission of data pages.
Make it use j_submit_inode_data_buffers hook instead. This effectively
switches ext4 fastcommits to use ext4_writepages() for data writeout
instead of generic_writepages().
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221207112722.22220-9-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit f30ff35f6266993405c8659e48fddc3180692164)
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-5335
Instead of checksumming each field as it is added to the block, just
checksum each block before it is written. This is simpler, and also
much more efficient.
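As a rough illustration of why one checksum call per block can replace one call per field, the sketch below uses a plain bitwise CRC-32C. All names (`demo_crc32c()`, `demo_csum_per_field()`, `demo_csum_block()`) are hypothetical, and the seed is an assumption; the kernel uses its own checksum helpers. For a chainable checksum like this one, both approaches should produce the same value, and the per-block version simply makes fewer calls:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C update step; a stand-in for the kernel's checksum helper. */
static uint32_t demo_crc32c(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Per-field style: fold each field into the running checksum in turn. */
static uint32_t demo_csum_per_field(const uint8_t *block,
				    const size_t *field_len, size_t nfields)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t off = 0;

	for (size_t i = 0; i < nfields; i++) {
		crc = demo_crc32c(crc, block + off, field_len[i]);
		off += field_len[i];
	}
	return crc;
}

/* Per-block style: one pass over the finished block before it is written. */
static uint32_t demo_csum_block(const uint8_t *block, size_t len)
{
	return demo_crc32c(0xFFFFFFFFu, block, len);
}
```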
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-8-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 8805dbcb3e83a4e5a6c91edc15643a7498e576ce)
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2188241
Tested: With xfstests
To avoid 'sparse' warnings about missing endianness conversions, don't
store native endianness values into struct ext4_fc_tl. Instead, use a
separate struct type, ext4_fc_tl_mem.
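The idea of keeping on-disk and in-memory headers in distinct types can be sketched in userspace as follows. `struct demo_tl_disk`, `struct demo_tl_mem`, and `demo_get_tl()` are hypothetical names chosen for illustration, and the 2+2 byte header layout is an assumption of the sketch:

```c
#include <stdint.h>
#include <string.h>

/* On-disk header: little-endian fields, kept as raw bytes in this sketch. */
struct demo_tl_disk {
	uint8_t tag_le[2];
	uint8_t len_le[2];
};

/* In-memory header: the same information in host byte order. */
struct demo_tl_mem {
	uint16_t tag;
	uint16_t len;
};

static uint16_t demo_le16_to_cpu(const uint8_t b[2])
{
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Read a header from a possibly unaligned buffer into the host-order type. */
static void demo_get_tl(struct demo_tl_mem *tl, const uint8_t *val)
{
	struct demo_tl_disk raw;

	memcpy(&raw, val, sizeof(raw));
	tl->tag = demo_le16_to_cpu(raw.tag_le);
	tl->len = demo_le16_to_cpu(raw.len_le);
}
```

Because the two structs have different member types, accidentally storing a host-order value into the on-disk type no longer compiles silently, which is the property a checker such as sparse relies on.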
Fixes: dcc5827484d6 ("ext4: factor out ext4_fc_get_tl()")
Cc: Ye Bin <yebin10@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221217050212.150665-1-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 11768cfd98136dd8399480c60b7a5d3d3c7b109b)
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 48a6a66db82b8043d298a630f22c62d43550cae5
Author: Eric Biggers <ebiggers@google.com>
Due to several different off-by-one errors, or perhaps due to a late
change in design that wasn't fully reflected in the code that was
actually merged, there are several very strange constraints on how
fast-commit blocks are filled with tlv entries:
- tlvs must start at least 10 bytes before the end of the block, even
though the minimum tlv length is 8. Otherwise, the replay code will
ignore them. (BUG: ext4_fc_reserve_space() could violate this
requirement if called with a len of blocksize - 9 or blocksize - 8.
Fortunately, this doesn't seem to happen currently.)
- tlvs must end at least 1 byte before the end of the block. Otherwise
the replay code will consider them to be invalid. This quirk
contributed to a bug (fixed by an earlier commit) where uninitialized
memory was being leaked to disk in the last byte of blocks.
Also, strangely these constraints don't apply to the replay code in
e2fsprogs, which will accept any tlvs in the blocks (with no bounds
checks at all, but that is a separate issue...).
Given that this all seems to be a bug, let's fix it by just filling
blocks with tlv entries in the natural way.
Note that old kernels will be unable to replay fast-commit journals
created by kernels that have this commit.
Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-7-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 48a6a66db82b8043d298a630f22c62d43550cae5)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 8415ce07ecf0cc25efdd5db264a7133716e503cf
Author: Eric Biggers <ebiggers@google.com>
As is done elsewhere in the file, build the struct ext4_fc_tl on the
stack and memcpy() it into the buffer, rather than directly writing it
to a potentially-unaligned location in the buffer.
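A minimal userspace sketch of the "build on the stack, then memcpy()" pattern is shown here. `struct demo_tl`, `demo_cpu_to_le16()`, and `demo_put_tl()` are hypothetical names, not the kernel's; the point is only that the destination pointer is never dereferenced as a struct:

```c
#include <stdint.h>
#include <string.h>

/* Header as stored in the buffer; both fields hold little-endian values. */
struct demo_tl {
	uint16_t tag;
	uint16_t len;
};

/* Produce a value whose in-memory bytes are little-endian on any host. */
static uint16_t demo_cpu_to_le16(uint16_t v)
{
	const uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Write a header at dst, which may be unaligned; return the next free byte. */
static uint8_t *demo_put_tl(uint8_t *dst, uint16_t tag, uint16_t len)
{
	struct demo_tl tl;	/* aligned, lives on the stack */

	tl.tag = demo_cpu_to_le16(tag);
	tl.len = demo_cpu_to_le16(len);
	memcpy(dst, &tl, sizeof(tl));
	return dst + sizeof(tl);
}
```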
Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-6-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 8415ce07ecf0cc25efdd5db264a7133716e503cf)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 64b4a25c3de81a69724e888ec2db3533b43816e2
Author: Eric Biggers <ebiggers@google.com>
Validate the inode and filename lengths in fast-commit journal records
so that a malicious fast-commit journal cannot cause a crash by having
invalid values for these. Also validate EXT4_FC_TAG_DEL_RANGE.
Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-5-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 64b4a25c3de81a69724e888ec2db3533b43816e2)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 594bc43b410316d70bb42aeff168837888d96810
Author: Eric Biggers <ebiggers@google.com>
When space at the end of fast-commit journal blocks is unused, make sure
to zero it out so that uninitialized memory is not leaked to disk.
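The zeroing step can be pictured with a small userspace helper. `demo_zero_tail()` is a hypothetical name, and `used` and `blocksize` are assumed to be tracked by the caller:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Zero the unused tail of a block so that stale memory contents are never
 * written out. Returns the number of bytes that were cleared.
 */
static size_t demo_zero_tail(uint8_t *block, size_t used, size_t blocksize)
{
	if (used >= blocksize)
		return 0;
	memset(block + used, 0, blocksize - used);
	return blocksize - used;
}
```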
Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-4-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 594bc43b410316d70bb42aeff168837888d96810)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 4c0d5778385cb3618ff26a561ce41de2b7d9de70
Author: Eric Biggers <ebiggers@google.com>
Commit a80f7fcf18 ("ext4: fixup ext4_fc_track_* functions' signature")
extended the scope of the transaction in ext4_unlink() too far, making
it include the call to ext4_find_entry(). However, ext4_find_entry()
can deadlock when called from within a transaction because it may need
to set up the directory's encryption key.
Fix this by restoring the transaction to its original scope.
Reported-by: syzbot+1a748d0007eeac3ab079@syzkaller.appspotmail.com
Fixes: a80f7fcf18 ("ext4: fixup ext4_fc_track_* functions' signature")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-3-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 4c0d5778385cb3618ff26a561ce41de2b7d9de70)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 0fbcb5251fc81b58969b272c4fb7374a7b922e3e
Author: Eric Biggers <ebiggers@google.com>
fast-commit of create, link, and unlink operations in encrypted
directories is completely broken because the unencrypted filenames are
being written to the fast-commit journal instead of the encrypted
filenames. These operations can't be replayed, as encryption keys
aren't present at journal replay time. It is also an information leak.
Until we can get this working properly, make encrypted directory
operations ineligible for fast-commit.
Note that fast-commit operations on encrypted regular files continue to
be allowed, as they seem to work.
Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-2-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 0fbcb5251fc81b58969b272c4fb7374a7b922e3e)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 1b45cc5c7b920fd8bf72e5a888ec7abeadf41e09
Author: Ye Bin <yebin10@huawei.com>
The scan loop must ensure that at least EXT4_FC_TAG_BASE_LEN bytes remain.
If fewer than EXT4_FC_TAG_BASE_LEN bytes remain, the scan performs an
out-of-bounds read when mounting a corrupt file system image.
ADD_RANGE, HEAD, and TAIL need an extra check during the journal scan: these
three tags read data during the scan, so the tag length must not be less than
the length of the data that will be read.
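The two bounds checks described above can be sketched in userspace like this. `demo_scan_tlvs()` and `DEMO_TAG_BASE_LEN` are hypothetical, the header is assumed to be a 16-bit tag followed by a 16-bit little-endian length, and the sketch treats any trailing bytes shorter than a header as padding:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_TAG_BASE_LEN 4u	/* assumed header size: 16-bit tag + 16-bit len */

/*
 * Walk the TLVs in buf. Returns the number of well-formed entries, or -1 if
 * an entry claims more value bytes than the buffer still holds.
 */
static int demo_scan_tlvs(const uint8_t *buf, size_t size)
{
	size_t cur = 0;
	int count = 0;

	/* Check 1: a full header must be available before it is read. */
	while (size - cur >= DEMO_TAG_BASE_LEN) {
		uint16_t len = (uint16_t)(buf[cur + 2] | (buf[cur + 3] << 8));

		cur += DEMO_TAG_BASE_LEN;
		/* Check 2: the value must fit in what is left of the buffer. */
		if (len > size - cur)
			return -1;
		cur += len;
		count++;
	}
	return count;
}
```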
Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20220924075233.2315259-4-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 1b45cc5c7b920fd8bf72e5a888ec7abeadf41e09)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit dcc5827484d6e53ccda12334f8bbfafcc593ceda
Author: Ye Bin <yebin10@huawei.com>
Factor out ext4_fc_get_tl() to fill 'tl' with host byte order.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20220924075233.2315259-3-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit dcc5827484d6e53ccda12334f8bbfafcc593ceda)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 7ff5fddaddf2cc8d394f71e68648e9d8d7e41da8
Author: Ye Bin <yebin10@huawei.com>
Factor out ext4_free_ext_path() to free the extent path. After the previous
patch, 'ext4_ext_drop_refs()' is only used in 'extents.c', so make it static.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220924021211.3831551-3-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 7ff5fddaddf2cc8d394f71e68648e9d8d7e41da8)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 27cd49780381c6ccbf248798e5e8fd076200ffba
Author: Ye Bin <yebin10@huawei.com>
To keep 'state->fc_regions_size' from going out of sync with
'state->fc_regions' when reallocating 'fc_regions' fails, only update
'state->fc_regions_size' after 'state->fc_regions' is allocated successfully.
Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220921064040.3693255-4-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 27cd49780381c6ccbf248798e5e8fd076200ffba)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 7069d105c1f15c442b68af43f7fde784f3126739
Author: Ye Bin <yebin10@huawei.com>
krealloc() may return NULL. In that case the old 'state->fc_regions' buffer
is not freed by krealloc(), but 'state->fc_regions' has already been set to
NULL, so the old buffer is leaked.
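The usual way to avoid this class of bug is to reallocate into a temporary pointer and commit both the pointer and the recorded size only on success. The userspace sketch below uses realloc() in place of krealloc(); `struct demo_state` and `demo_grow()` are hypothetical names:

```c
#include <stdint.h>
#include <stdlib.h>

struct demo_state {
	int *regions;
	size_t regions_size;	/* allocated capacity, in elements */
};

/* Grow the array by 'extra' elements. Returns 0 on success, -1 on failure. */
static int demo_grow(struct demo_state *s, size_t extra)
{
	size_t new_size;
	int *tmp;

	/* Refuse requests whose byte count would overflow size_t. */
	if (extra > SIZE_MAX / sizeof(*tmp) - s->regions_size)
		return -1;
	new_size = s->regions_size + extra;

	tmp = realloc(s->regions, new_size * sizeof(*tmp));
	if (!tmp)
		return -1;	/* old buffer and old size are both still valid */

	s->regions = tmp;
	s->regions_size = new_size;
	return 0;
}
```

On failure the caller still owns the original buffer and can free it on its normal cleanup path, so nothing is leaked and the size never describes memory that was not allocated.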
Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220921064040.3693255-3-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 7069d105c1f15c442b68af43f7fde784f3126739)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 9305721a309fa1bd7c194e0d4a2335bf3b29dca4
Author: Ye Bin <yebin10@huawei.com>
krealloc() may return NULL. In that case the old 'state->fc_modified_inodes'
buffer is not freed by krealloc(), but 'state->fc_modified_inodes' has
already been set to NULL, so the old buffer is leaked.
Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220921064040.3693255-2-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 9305721a309fa1bd7c194e0d4a2335bf3b29dca4)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit e64e6ca90913b0dfe14bf7d529df0753a6746e23
Author: Ye Bin <yebin10@huawei.com>
If fast commit is already disabled, there is no need to mark the inode
ineligible. So move the 'ext4_fc_disabled()' check before the
'ext4_should_journal_data(inode)' check, which avoids a pointless test.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220916083836.388347-3-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit e64e6ca90913b0dfe14bf7d529df0753a6746e23)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit b7b80a35fb51319223e1fbf84128b8e5ebb91f86
Author: Ye Bin <yebin10@huawei.com>
Factor out ext4_fc_disabled(). No functional change.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220916083836.388347-2-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit b7b80a35fb51319223e1fbf84128b8e5ebb91f86)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit ccbf8eeb39f2ff00b54726a2b20b35d788c4ecb5
Author: Ye Bin <yebin10@huawei.com>
'ext4_fc_write_inode' first calls 'ext4_get_inode_loc' to get 'iloc', but
never releases 'iloc.bh' after using it.
So release 'iloc.bh' before 'ext4_fc_write_inode' returns.
Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220914100859.1415196-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit ccbf8eeb39f2ff00b54726a2b20b35d788c4ecb5)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2145193
Tested: xfstests
Upstream Status: upstream
commit 4978c659e7b5c1926cdb4b556e4ca1fd2de8ad42
Author: Jan Kara <jack@suse.cz>
We use jbd_debug() in some places in ext4. It seems a bit strange to use
jbd2 debugging output function for ext4 code. Also these days
ext4_debug() uses dynamic printk so each debug message can be enabled /
disabled on its own so the time when it made some sense to have these
combined (to allow easier common selecting of messages to report) has
passed. Just convert all jbd_debug() uses in ext4 to ext4_debug().
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220608112355.4397-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
(cherry picked from commit 4978c659e7b5c1926cdb4b556e4ca1fd2de8ad42)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2118511
commit 67c0f556302cfcdb5b5fb7933afa08cb1de75b36
Author: Bart Van Assche <bvanassche@acm.org>
Date: Thu Jul 14 11:07:17 2022 -0700
fs/ext4: Use the new blk_opf_t type
Improve static type checking by using the new blk_opf_t type for
variables that represent request flags.
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Baokun Li <libaokun1@huawei.com>
Cc: Ye Bin <yebin10@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-52-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2118511
Conflicts: drop change on ntfs3 which isn't supported on rhel,
drop one harmless change on ext4
commit 1420c4a549bf28ffddbed827d61fb3d4d2132ddb
Author: Bart Van Assche <bvanassche@acm.org>
Date: Thu Jul 14 11:07:13 2022 -0700
fs/buffer: Combine two submit_bh() and ll_rw_block() arguments
Both submit_bh() and ll_rw_block() accept a request operation type and
request flags as their first two arguments. Micro-optimize these two
functions by combining these first two arguments into a single argument.
This patch does not change the behavior of any of the modified code.
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Song Liu <song@kernel.org> (for the md changes)
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-48-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit 5641ace54471cb5c393e71f33232088602455c6b
Author: Ritesh Harjani <riteshh@linux.ibm.com>
This adds commit_tid info in ext4_fc_commit_start/stop which is helpful
in debugging fast_commit issues.
For example, it helps with issues where FC fails to commit updates to a
file because of a jbd2 journal full commit.
Also improve the TP_printk format string: all ext4 and jbd2 trace events
start with "dev MAJOR,MINOR", so follow the same convention while we
are at it.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/ebcd6b9ab5b718db30f90854497886801ce38c63.1647057583.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit d9bf099cb980d63cd9a45a135259a6cabcb814a5
Author: Ritesh Harjani <riteshh@linux.ibm.com>
This adds commit_tid argument in ext4_fc_update_stats()
so that we can add this information too in jbd_debug logs.
This is also required in a later patch to pass the commit_tid info in
ext4_fc_commit_start/stop() trace events.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/dabda3f2919a60e01887e798bf5915216b451733.1647057583.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit 1d2e2440c5191da82b1191298909283a58f0ece8
Author: Ritesh Harjani <riteshh@linux.ibm.com>
This patch adds the transaction & inode tid info in trace events for
callers of ext4_fc_track_template(). This is helpful in debugging race
conditions where an inode could belong to two different transaction tids.
It also fixes the checkpatch warnings that say to use tabs instead of
spaces.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/c203c09dc11bb372803c430f621f25a4b8c2c8b4.1647057583.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit 78be0471da4e9fff874307d73d68f0173fa6a154
Author: Ritesh Harjani <riteshh@linux.ibm.com>
Currently ext4_fc_track_template() checks whether the trace event
path belongs to replay or whether the sb has ineligible set; if so, it
simply returns. This patch moves those checks into the callers of
ext4_fc_track_template(), before the call is made.
[ Add checks to ext4_rename() which calls the __ext4_fc_track_*()
functions directly. -- TYT ]
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/3cd025d9c490218a92e6d8fb30b6123e693373e3.1647057583.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit cdce59a1549190b66f8e3fe465c2b2f714b98a94
Author: Ritesh Harjani <riteshh@linux.ibm.com>
The current code does not fully take care of the krealloc() error case, which
could lead to silent memory corruption or a kernel bug. This patch
fixes that.
Also it cleans up some duplicated error handling logic from various
functions in fast_commit.c file.
Reported-by: luo penghao <luo.penghao@zte.com.cn>
Suggested-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/62e8b6a1cce9359682051deb736a3c0953c9d1e9.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit bdc8a53a6f2f0b1cb5f991440f2100732299eb93
Author: Xin Yin <yinxin.x@bytedance.com>
In the following scenario:
1. jbd starts transaction n
2. task A gets a new handle for transaction n+1
3. task A does some actions and adds the inode to the FC_Q_MAIN fc_q
4. jbd completes transaction n and clears the FC_Q_MAIN fc_q
5. task A calls fsync
fast commit will lose the file actions during a full commit.
We should also add updates to the staging queue during a full commit.
And in ext4_fc_cleanup(), when resetting an inode's fc track range, check
its i_sync_tid; if it is bigger than the current transaction tid, do not
reset it, or we will lose the track range.
EXT4_MF_FC_COMMITTING is not needed anymore, so drop it.
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Link: https://lore.kernel.org/r/20220117093655.35160-3-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit e85c81ba8859a4c839bcd69c5d83b32954133a5b
Author: Xin Yin <yinxin.x@bytedance.com>
Consider the following scenario:
1. jbd starts committing transaction n
2. task A gets a new handle for transaction n+1
3. task A does some ineligible actions and marks FC_INELIGIBLE
4. jbd completes transaction n and clears FC_INELIGIBLE
5. task A calls fsync
In this case fast commit will not fall back to a full commit, and
transaction n+1 is also not handled by jbd.
Make ext4_fc_mark_ineligible() also record the transaction tid of the
latest ineligible case. When ext4_fc_cleanup() is called, check the
current transaction tid; if it is smaller than the latest ineligible tid,
do not clear EXT4_MF_FC_INELIGIBLE.
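The tid bookkeeping described above can be modeled in a few lines of userspace C. Everything here (`struct demo_sb`, `demo_mark_ineligible()`, `demo_fc_cleanup()`, `demo_tid_gt()`) is a hypothetical simplification with no locking; it only illustrates the rule "do not clear the flag if a newer transaction set it":

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t demo_tid_t;

/*
 * Wraparound-tolerant "a is newer than b", assuming the usual two's
 * complement conversion to int32_t.
 */
static bool demo_tid_gt(demo_tid_t a, demo_tid_t b)
{
	return (int32_t)(a - b) > 0;
}

struct demo_sb {
	bool ineligible;
	demo_tid_t ineligible_tid;	/* newest tid that marked ineligible */
};

/* Set the flag and remember the newest transaction that required it. */
static void demo_mark_ineligible(struct demo_sb *sb, demo_tid_t running_tid)
{
	sb->ineligible = true;
	if (demo_tid_gt(running_tid, sb->ineligible_tid))
		sb->ineligible_tid = running_tid;
}

/* Called after a full commit of 'committed_tid' has finished. */
static void demo_fc_cleanup(struct demo_sb *sb, demo_tid_t committed_tid)
{
	/* A newer transaction is still ineligible: keep the flag set. */
	if (demo_tid_gt(sb->ineligible_tid, committed_tid))
		return;
	sb->ineligible = false;
	sb->ineligible_tid = 0;
}
```

Replaying the five-step scenario with n = 5: marking ineligible under tid 6 and then cleaning up for tid 5 leaves the flag set, so the later fsync still falls back to a full commit.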
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Suggested-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Link: https://lore.kernel.org/r/20220117093655.35160-2-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2079868
Tested: xfstests
Upstream Status: upstream
commit 599ea31d13617c5484c40cdf50d88301dc351cfc
Author: Xin Yin <yinxin.x@bytedance.com>
During the fast commit replay procedure, we clear the inode's blocks in the
bitmap in ext4_ext_clear_bb(). This may cause ext4_mb_new_blocks_simple() to
allocate blocks that are still in use.
Make ext4_fc_record_regions() also record the physical disk regions used by
inodes during the replay procedure. Then ext4_mb_new_blocks_simple() can
exclude these in-use blocks.
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Link: https://lore.kernel.org/r/20220110035141.1980-2-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2041486
Tested: xfstests
Upstream Status: upstream
commit a660be97eb00c4d87bf881e1226fbd9d812690b7
Author: luo penghao <luo.penghao@zte.com.cn>
The local variable assignment at the end of the function is meaningless.
The clang_analyzer complains as follows:
fs/ext4/fast_commit.c:779:2 warning:
Value stored to 'dst' is never read
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Link: https://lore.kernel.org/r/20211104063406.2747-1-luo.penghao@zte.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2041486
Tested: xfstests
Upstream Status: upstream
commit 0b5b5a62b945a141e64011b2f90ee7e46f14be98
Author: Xin Yin <yinxin.x@bytedance.com>
Currently, we use ext4_punch_hole() in the fast commit replay delete range
procedure. But it is affected by inode->i_size, which may not be
correct during the fast commit replay procedure. The following test will
fail.
-create & write foo (len 1000K)
-falloc FALLOC_FL_ZERO_RANGE foo (range 400K - 600K)
-create & fsync bar
-falloc FALLOC_FL_PUNCH_HOLE foo (range 300K-500K)
-fsync foo
-crash before a full commit
After the fast_commit replay procedure, the range 400K-500K will not be
removed, because in this case inode->i_size is 0 when ext4_punch_hole()
is called, and it just returns without doing anything.
Change to use ext4_ext_remove_space() instead of ext4_punch_hole()
to remove blocks of inode directly.
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211223032337.5198-2-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2041486
Tested: xfstests
Upstream Status: upstream
commit d1199b94474ac4513b8491a4b751a8a466e1886b
Author: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
This series takes care of a couple of TODOs and adds new ones. Update
the TODOs section to reflect current state and future work that needs
to happen.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211223202140.2061101-5-harshads@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2041486
Tested: xfstests
Upstream Status: upstream
commit 1ebf21784b19d5bc269f39a5d1eedb7f29a7d152
Author: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Since there are no blocks in an inline data inode, there's no point in
fixing iblocks field in fast commit replay path for this inode.
Similarly, there's no point in fixing any block bitmaps / global block
counters with respect to such an inode. Just bail out from these
functions if an inline data inode is encountered.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211015182513.395917-2-harshads@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2041486
Tested: xfstests
Upstream Status: upstream
commit 6c31a689b2e9e1dee5cbe16b773648a2d84dfb02
Author: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
During the commit phase in fast commits if an inode with inline data
is being committed, also commit the inline data along with
inode. Since recovery code just blindly copies entire content found in
inode TLV, there is no change needed on the recovery path. Thus, this
change is backward compatible.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211015182513.395917-1-harshads@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Bugzilla: https://bugzilla.redhat.com/2041486
Tested: xfstests
Upstream Status: upstream
commit a2c2f0826e2b75560b31daf1cd9a755ab93cf4c6
Author: Hou Tao <houtao1@huawei.com>
Now EXT4_FC_TAG_ADD_RANGE uses ext4_extent to track the
newly-added blocks, but the limit on the max value of
ee_len field is ignored, and it can lead to BUG_ON as
shown below when running command "fallocate -l 128M file"
on a fast_commit-enabled fs:
kernel BUG at fs/ext4/ext4_extents.h:199!
invalid opcode: 0000 [#1] SMP PTI
CPU: 3 PID: 624 Comm: fallocate Not tainted 5.14.0-rc6+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:ext4_fc_write_inode_data+0x1f3/0x200
Call Trace:
? ext4_fc_write_inode+0xf2/0x150
ext4_fc_commit+0x93b/0xa00
? ext4_fallocate+0x1ad/0x10d0
ext4_sync_file+0x157/0x340
? ext4_sync_file+0x157/0x340
vfs_fsync_range+0x49/0x80
do_fsync+0x3d/0x70
__x64_sys_fsync+0x14/0x20
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fix it by limiting the number of blocks
in one EXT4_FC_TAG_ADD_RANGE TLV.
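The limiting idea amounts to emitting a long range as several pieces, none longer than the per-extent maximum. In the sketch below, `demo_split_range()` is a hypothetical helper and the cap is passed in by the caller; the value 32767 used in the example is an assumption for illustration, as is the 4 KiB block size that makes 128M equal to 32768 blocks:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split a run of 'nblocks' blocks into chunks of at most 'max_len' blocks.
 * Chunk lengths are stored in 'out' (up to 'out_cap' entries); the number
 * of chunks written is returned.
 */
static size_t demo_split_range(uint32_t nblocks, uint32_t max_len,
			       uint32_t *out, size_t out_cap)
{
	size_t n = 0;

	if (max_len == 0)
		return 0;
	while (nblocks > 0 && n < out_cap) {
		uint32_t chunk = nblocks < max_len ? nblocks : max_len;

		out[n++] = chunk;
		nblocks -= chunk;
	}
	return n;
}
```

With those assumed numbers, a 32768-block range becomes two pieces of 32767 and 1 blocks instead of one over-long extent.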
Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: stable@kernel.org
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210820044505.474318-1-houtao1@huawei.com
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Fast commit recovery data on disk may not be aligned. So, when the
recovery code reads it, this patch makes sure that fast commit info
found on-disk is first memcpy-ed into an aligned variable before
accessing it. As a consequence, we also remove some macros that
could have resulted in unaligned accesses.
Cc: stable@kernel.org
Fixes: 8016e29f43 ("ext4: fast commit recovery path")
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20210519215920.2037527-1-harshads@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Using no_printk() for jbd_debug() revealed two warnings:
fs/jbd2/recovery.c: In function 'fc_do_one_pass':
fs/jbd2/recovery.c:256:30: error: format '%d' expects a matching 'int' argument [-Werror=format=]
256 | jbd_debug(3, "Processing fast commit blk with seq %d");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/ext4/fast_commit.c: In function 'ext4_fc_replay_add_range':
fs/ext4/fast_commit.c:1732:30: error: format '%d' expects argument of type 'int', but argument 2 has type 'long unsigned int' [-Werror=format=]
1732 | jbd_debug(1, "Converting from %d to %d %lld",
The first one was added incorrectly, and was also missing a few newlines
in debug output, and the second one happened when the type of an
argument changed.
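Both warnings come from the compiler being able to see the format string even though the debug call produces no output. A userspace sketch of that technique follows; `demo_no_print()` and `demo_format_range()` are hypothetical, and the argument types are assumptions chosen to mirror the second warning:

```c
#include <stdio.h>

/*
 * Compiled-out debug print: the call is never executed, but the compiler
 * still sees it and can check the format string against the arguments.
 */
#define demo_no_print(...) do { if (0) printf(__VA_ARGS__); } while (0)

/* Each conversion specifier matches its argument's type exactly. */
static int demo_format_range(char *buf, size_t size, unsigned long from,
			     unsigned long to, long long pblk)
{
	return snprintf(buf, size, "Converting from %lu to %lu %lld\n",
			from, to, pblk);
}
```

Passing an `unsigned long` where `%d` is written, or omitting an argument entirely, is the kind of mismatch `-Wformat` reports once the call is visible to the compiler.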
Reported-by: kernel test robot <lkp@intel.com>
Fixes: d556435156 ("jbd2: avoid -Wempty-body warnings")
Fixes: 6db0746189 ("ext4: use BIT() macro for BH_** state bits")
Fixes: 5b849b5f96 ("jbd2: fast commit recovery path")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210409201211.1866633-1-arnd@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>