* [PATCH AUTOSEL 5.0 057/262] jbd2: fix invalid descriptor block checksum
[not found] <20190327180158.10245-1-sashal@kernel.org>
@ 2019-03-27 17:58 ` Sasha Levin
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 058/262] ext4: fix bigalloc cluster freeing when hole punching under load Sasha Levin
2019-03-27 17:59 ` [PATCH AUTOSEL 5.0 137/262] jbd2: fix race when writing superblock Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 17:58 UTC (permalink / raw)
To: linux-kernel, stable
Cc: luojiajun, Theodore Ts'o, Sasha Levin, linux-ext4
From: luojiajun <luojiajun3@huawei.com>
[ Upstream commit 6e876c3dd205d30b0db6850e97a03d75457df007 ]
In jbd2_journal_commit_transaction(), if we are in abort mode,
we may flush the buffer without setting descriptor block checksum
by goto start_journal_io. Then fs is mounted,
jbd2_descriptor_block_csum_verify() failed.
[ 271.379811] EXT4-fs (vdd): shut down requested (2)
[ 271.381827] Aborting journal on device vdd-8.
[ 271.597136] JBD2: Invalid checksum recovering block 22199 in log
[ 271.598023] JBD2: recovery failed
[ 271.598484] EXT4-fs (vdd): error loading journal
Fix this problem by keep setting descriptor block checksum if the
descriptor buffer is not NULL.
This checksum problem can be reproduced by xfstests generic/388.
Signed-off-by: luojiajun <luojiajun3@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/jbd2/commit.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 2eb55c3361a8..efd0ce9489ae 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -694,9 +694,11 @@ void jbd2_journal_commit_transaction(journal_t *journal)
the last tag we set up. */
tag->t_flags |= cpu_to_be16(JBD2_FLAG_LAST_TAG);
-
- jbd2_descriptor_block_csum_set(journal, descriptor);
start_journal_io:
+ if (descriptor)
+ jbd2_descriptor_block_csum_set(journal,
+ descriptor);
+
for (i = 0; i < bufs; i++) {
struct buffer_head *bh = wbuf[i];
/*
--
2.19.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH AUTOSEL 5.0 058/262] ext4: fix bigalloc cluster freeing when hole punching under load
[not found] <20190327180158.10245-1-sashal@kernel.org>
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 057/262] jbd2: fix invalid descriptor block checksum Sasha Levin
@ 2019-03-27 17:58 ` Sasha Levin
2019-03-27 17:59 ` [PATCH AUTOSEL 5.0 137/262] jbd2: fix race when writing superblock Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 17:58 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Eric Whitney, Theodore Ts'o, Sasha Levin, linux-ext4
From: Eric Whitney <enwlinux@gmail.com>
[ Upstream commit 7bd75230b43727b258a4f7a59d62114cffe1b6c8 ]
Ext4 may not free clusters correctly when punching holes in bigalloc
file systems under high load conditions. If it's not possible to
extend and restart the journal in ext4_ext_rm_leaf() when preparing to
remove blocks from a punched region, a retry of the entire punch
operation is triggered in ext4_ext_remove_space(). This causes a
partial cluster to be set to the first cluster in the extent found to
the right of the punched region. However, if the punch operation
prior to the retry had made enough progress to delete one or more
extents and a partial cluster candidate for freeing had already been
recorded, the retry would overwrite the partial cluster. The loss of
this information makes it impossible to correctly free the original
partial cluster in all cases.
This bug can cause generic/476 to fail when run as part of
xfstests-bld's bigalloc and bigalloc_1k test cases. The failure is
reported when e2fsck detects bad iblocks counts greater than expected
in units of whole clusters and also detects a number of negative block
bitmap differences equal to the iblocks discrepancy in cluster units.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/ext4/extents.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 240b6dea5441..252bbbb5a2f4 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2956,14 +2956,17 @@ int ext4_ext_remove_space(struct inode *inode, ext4_lblk_t start,
if (err < 0)
goto out;
- } else if (sbi->s_cluster_ratio > 1 && end >= ex_end) {
+ } else if (sbi->s_cluster_ratio > 1 && end >= ex_end &&
+ partial.state == initial) {
/*
- * If there's an extent to the right its first cluster
- * contains the immediate right boundary of the
- * truncated/punched region. Set partial_cluster to
- * its negative value so it won't be freed if shared
- * with the current extent. The end < ee_block case
- * is handled in ext4_ext_rm_leaf().
+ * If we're punching, there's an extent to the right.
+ * If the partial cluster hasn't been set, set it to
+ * that extent's first cluster and its state to nofree
+ * so it won't be freed should it contain blocks to be
+ * removed. If it's already set (tofree/nofree), we're
+ * retrying and keep the original partial cluster info
+ * so a cluster marked tofree as a result of earlier
+ * extent removal is not lost.
*/
lblk = ex_end + 1;
err = ext4_ext_search_right(inode, path, &lblk, &pblk,
--
2.19.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH AUTOSEL 5.0 137/262] jbd2: fix race when writing superblock
[not found] <20190327180158.10245-1-sashal@kernel.org>
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 057/262] jbd2: fix invalid descriptor block checksum Sasha Levin
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 058/262] ext4: fix bigalloc cluster freeing when hole punching under load Sasha Levin
@ 2019-03-27 17:59 ` Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 17:59 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Theodore Ts'o, Sasha Levin, linux-ext4
From: Theodore Ts'o <tytso@mit.edu>
[ Upstream commit 538bcaa6261b77e71d37f5596c33127c1a3ec3f7 ]
The jbd2 superblock is lockless now, so there is probably a race
condition between writing it so disk and modifing contents of it, which
may lead to checksum error. The following race is the one case that we
have captured.
jbd2 fsstress
jbd2_journal_commit_transaction
jbd2_journal_update_sb_log_tail
jbd2_write_superblock
jbd2_superblock_csum_set jbd2_journal_revoke
jbd2_journal_set_features(revork)
modify superblock
submit_bh(checksum incorrect)
Fix this by locking the buffer head before modifing it. We always
write the jbd2 superblock after we modify it, so this just means
calling the lock_buffer() a little earlier.
This checksum corruption problem can be reproduced by xfstests
generic/475.
Reported-by: zhangyi (F) <yi.zhang@huawei.com>
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/jbd2/journal.c | 52 ++++++++++++++++++++++++-----------------------
1 file changed, 27 insertions(+), 25 deletions(-)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 8ef6b6daaa7a..88f2a49338a1 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1356,6 +1356,10 @@ static int journal_reset(journal_t *journal)
return jbd2_journal_start_thread(journal);
}
+/*
+ * This function expects that the caller will have locked the journal
+ * buffer head, and will return with it unlocked
+ */
static int jbd2_write_superblock(journal_t *journal, int write_flags)
{
struct buffer_head *bh = journal->j_sb_buffer;
@@ -1365,7 +1369,6 @@ static int jbd2_write_superblock(journal_t *journal, int write_flags)
trace_jbd2_write_superblock(journal, write_flags);
if (!(journal->j_flags & JBD2_BARRIER))
write_flags &= ~(REQ_FUA | REQ_PREFLUSH);
- lock_buffer(bh);
if (buffer_write_io_error(bh)) {
/*
* Oh, dear. A previous attempt to write the journal
@@ -1424,6 +1427,7 @@ int jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid,
jbd_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n",
tail_block, tail_tid);
+ lock_buffer(journal->j_sb_buffer);
sb->s_sequence = cpu_to_be32(tail_tid);
sb->s_start = cpu_to_be32(tail_block);
@@ -1454,18 +1458,17 @@ static void jbd2_mark_journal_empty(journal_t *journal, int write_op)
journal_superblock_t *sb = journal->j_superblock;
BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex));
- read_lock(&journal->j_state_lock);
- /* Is it already empty? */
- if (sb->s_start == 0) {
- read_unlock(&journal->j_state_lock);
+ lock_buffer(journal->j_sb_buffer);
+ if (sb->s_start == 0) { /* Is it already empty? */
+ unlock_buffer(journal->j_sb_buffer);
return;
}
+
jbd_debug(1, "JBD2: Marking journal as empty (seq %d)\n",
journal->j_tail_sequence);
sb->s_sequence = cpu_to_be32(journal->j_tail_sequence);
sb->s_start = cpu_to_be32(0);
- read_unlock(&journal->j_state_lock);
jbd2_write_superblock(journal, write_op);
@@ -1488,9 +1491,8 @@ void jbd2_journal_update_sb_errno(journal_t *journal)
journal_superblock_t *sb = journal->j_superblock;
int errcode;
- read_lock(&journal->j_state_lock);
+ lock_buffer(journal->j_sb_buffer);
errcode = journal->j_errno;
- read_unlock(&journal->j_state_lock);
if (errcode == -ESHUTDOWN)
errcode = 0;
jbd_debug(1, "JBD2: updating superblock error (errno %d)\n", errcode);
@@ -1894,28 +1896,27 @@ int jbd2_journal_set_features (journal_t *journal, unsigned long compat,
sb = journal->j_superblock;
+ /* Load the checksum driver if necessary */
+ if ((journal->j_chksum_driver == NULL) &&
+ INCOMPAT_FEATURE_ON(JBD2_FEATURE_INCOMPAT_CSUM_V3)) {
+ journal->j_chksum_driver = crypto_alloc_shash("crc32c", 0, 0);
+ if (IS_ERR(journal->j_chksum_driver)) {
+ printk(KERN_ERR "JBD2: Cannot load crc32c driver.\n");
+ journal->j_chksum_driver = NULL;
+ return 0;
+ }
+ /* Precompute checksum seed for all metadata */
+ journal->j_csum_seed = jbd2_chksum(journal, ~0, sb->s_uuid,
+ sizeof(sb->s_uuid));
+ }
+
+ lock_buffer(journal->j_sb_buffer);
+
/* If enabling v3 checksums, update superblock */
if (INCOMPAT_FEATURE_ON(JBD2_FEATURE_INCOMPAT_CSUM_V3)) {
sb->s_checksum_type = JBD2_CRC32C_CHKSUM;
sb->s_feature_compat &=
~cpu_to_be32(JBD2_FEATURE_COMPAT_CHECKSUM);
-
- /* Load the checksum driver */
- if (journal->j_chksum_driver == NULL) {
- journal->j_chksum_driver = crypto_alloc_shash("crc32c",
- 0, 0);
- if (IS_ERR(journal->j_chksum_driver)) {
- printk(KERN_ERR "JBD2: Cannot load crc32c "
- "driver.\n");
- journal->j_chksum_driver = NULL;
- return 0;
- }
-
- /* Precompute checksum seed for all metadata */
- journal->j_csum_seed = jbd2_chksum(journal, ~0,
- sb->s_uuid,
- sizeof(sb->s_uuid));
- }
}
/* If enabling v1 checksums, downgrade superblock */
@@ -1927,6 +1928,7 @@ int jbd2_journal_set_features (journal_t *journal, unsigned long compat,
sb->s_feature_compat |= cpu_to_be32(compat);
sb->s_feature_ro_compat |= cpu_to_be32(ro);
sb->s_feature_incompat |= cpu_to_be32(incompat);
+ unlock_buffer(journal->j_sb_buffer);
return 1;
#undef COMPAT_FEATURE_ON
--
2.19.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-03-27 19:30 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20190327180158.10245-1-sashal@kernel.org>
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 057/262] jbd2: fix invalid descriptor block checksum Sasha Levin
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 058/262] ext4: fix bigalloc cluster freeing when hole punching under load Sasha Levin
2019-03-27 17:59 ` [PATCH AUTOSEL 5.0 137/262] jbd2: fix race when writing superblock Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).