* [PATCH v2 0/5] ext4: Fix stale buffer loading from last failed
@ 2023-03-13 13:20 Zhihao Cheng
  2023-03-13 13:20 ` [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting Zhihao Cheng
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-13 13:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, jack, tudor.ambarus
  Cc: linux-ext4, linux-kernel, chengzhihao1, yi.zhang

Patch 1 fixes the reuse of stale buffer heads left over from a previous
failed mount.
Patches 2-5 rework 'j_format_version' initialization and checking in the
journal loading process.

v1->v2:
  Adopt suggestions from Tudor: add a Fixes tag and correct the 'stable'
  field in patch 1.
  Preserve empty lines in patch 4.

Zhang Yi (4):
  jbd2: remove unused feature macros
  jbd2: switch to check format version in superblock directly
  jbd2: factor out journal initialization from journal_get_superblock()
  jbd2: remove j_format_version

Zhihao Cheng (1):
  ext4: Fix reusing stale buffer heads from last failed mounting

 fs/ext4/super.c      | 15 +++++++------
 fs/jbd2/journal.c    | 53 +++++++++++++++++---------------------------
 include/linux/jbd2.h | 33 ++++++++++++---------------
 3 files changed, 42 insertions(+), 59 deletions(-)

-- 
2.31.1



* [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting
  2023-03-13 13:20 [PATCH v2 0/5] ext4: Fix stale buffer loading from last failed Zhihao Cheng
@ 2023-03-13 13:20 ` Zhihao Cheng
  2023-03-14 11:33   ` Jan Kara
  2023-03-13 13:20 ` [PATCH v2 2/5] jbd2: remove unused feature macros Zhihao Cheng
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-13 13:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, jack, tudor.ambarus
  Cc: linux-ext4, linux-kernel, chengzhihao1, yi.zhang

The following sequence makes ext4 load stale buffer heads, left over from
a previous failed mount, during a new mount operation:
mount_bdev
 ext4_fill_super
 | ext4_load_and_init_journal
 |  ext4_load_journal
 |   jbd2_journal_load
 |    load_superblock
 |     journal_get_superblock
 |      set_buffer_verified(bh) // buffer head is verified
 |   jbd2_journal_recover // failed caused by EIO
 | goto failed_mount3a // skip 'sb->s_root' initialization
 deactivate_locked_super
  kill_block_super
   generic_shutdown_super
    if (sb->s_root)
    // false, skip ext4_put_super->invalidate_bdev->
    // invalidate_mapping_pages->mapping_evict_folio->
    // filemap_release_folio->try_to_free_buffers, which
    // cannot drop buffer head.
   blkdev_put
    blkdev_put_whole
     if (atomic_dec_and_test(&bdev->bd_openers))
     // false, systemd-udev happens to open the device. Then
     // blkdev_flush_mapping->kill_bdev->truncate_inode_pages->
     // truncate_inode_folio->truncate_cleanup_folio->
     // folio_invalidate->block_invalidate_folio->
     // filemap_release_folio->try_to_free_buffers will be skipped,
     // dropping buffer head is missed again.

Second mount:
ext4_fill_super
 ext4_load_and_init_journal
  ext4_load_journal
   ext4_get_journal
    jbd2_journal_init_inode
     journal_init_common
      bh = getblk_unmovable
       bh = __find_get_block // Found stale bh in last failed mounting
      journal->j_sb_buffer = bh
   jbd2_journal_load
    load_superblock
     journal_get_superblock
      if (buffer_verified(bh))
      // true, skip journal->j_format_version = 2, value is 0
    jbd2_journal_recover
     do_one_pass
      next_log_block += count_tags(journal, bh)
      // According to journal_tag_bytes(), the 'tag_bytes' calculation
      // depends on jbd2_has_feature_csum3(), which returns false
      // because 'j->j_format_version >= 2' is not true, so we get a
      // wrong next_log_block. do_one_pass() may then exit early when it
      // encounters a non-JBD2_MAGIC_NUMBER block at 'next_log_block'.

The filesystem is corrupted at this point: the journal is only partially
replayed, and the new journal sequence number has already been used by
the previous mount.

invalidate_bdev() can drop all buffer heads even when racing with a bare
read of the block device (e.g. by systemd-udev), so we can fix this by
invalidating the bdev in the error handling path of __ext4_fill_super().
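
For illustration only (this is not part of the patch), the two call
traces above can be modelled in userspace C. Every toy_* name below is a
hypothetical stand-in rather than a kernel structure or function; the
'verified' flag plays the role of buffer_verified(bh), and the
'invalidate' argument plays the role of calling invalidate_bdev() on the
failed-mount path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel objects; not the real structures. */
struct toy_bh {
	bool verified;		/* models buffer_verified(bh) */
};

struct toy_journal {
	struct toy_bh *sb_buffer;
	int format_version;	/* models j_format_version, 0 = unset */
};

/* Models journal_get_superblock(): a verified buffer skips the init. */
static void toy_get_superblock(struct toy_journal *j)
{
	if (j->sb_buffer->verified)
		return;		/* stale buffer: format_version stays 0 */
	j->format_version = 2;
	j->sb_buffer->verified = true;
}

/* Models the failed-mount cleanup; 'invalidate' stands for invalidate_bdev(). */
static void toy_failed_mount_cleanup(struct toy_bh *cached, bool invalidate)
{
	if (invalidate)
		cached->verified = false;	/* dropping the bh discards its state */
}

/* Returns the format version that the second mount ends up with. */
int toy_second_mount_version(bool invalidate_on_failure)
{
	struct toy_bh cached = { .verified = false };
	struct toy_journal first = { .sb_buffer = &cached, .format_version = 0 };
	struct toy_journal second = { .sb_buffer = &cached, .format_version = 0 };

	toy_get_superblock(&first);	/* first mount verifies the buffer */
	assert(first.format_version == 2);
	toy_failed_mount_cleanup(&cached, invalidate_on_failure);

	toy_get_superblock(&second);	/* second mount finds the cached bh */
	return second.format_version;
}
```

In this model, toy_second_mount_version(false) leaves the version at 0,
which corresponds to the bug, while toy_second_mount_version(true)
re-runs the initialization.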

A reproducer is available at [Link].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217171
Fixes: 25ed6e8a54df ("jbd2: enable journal clients to enable v2 checksumming")
Cc: stable@vger.kernel.org # v3.5
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/ext4/super.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 88f7b8a88c76..7e990637bc48 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1126,6 +1126,12 @@ static void ext4_blkdev_remove(struct ext4_sb_info *sbi)
 	struct block_device *bdev;
 	bdev = sbi->s_journal_bdev;
 	if (bdev) {
+		/*
+		 * Invalidate the journal device's buffers.  We don't want them
+		 * floating about in memory - the physical journal device may
+		 * hotswapped, and it breaks the `ro-after' testing code.
+		 */
+		invalidate_bdev(bdev);
 		ext4_blkdev_put(bdev);
 		sbi->s_journal_bdev = NULL;
 	}
@@ -1271,14 +1277,8 @@ static void ext4_put_super(struct super_block *sb)
 
 	sync_blockdev(sb->s_bdev);
 	invalidate_bdev(sb->s_bdev);
-	if (sbi->s_journal_bdev && sbi->s_journal_bdev != sb->s_bdev) {
-		/*
-		 * Invalidate the journal device's buffers.  We don't want them
-		 * floating about in memory - the physical journal device may
-		 * hotswapped, and it breaks the `ro-after' testing code.
-		 */
+	if (sbi->s_journal_bdev) {
 		sync_blockdev(sbi->s_journal_bdev);
-		invalidate_bdev(sbi->s_journal_bdev);
 		ext4_blkdev_remove(sbi);
 	}
 
@@ -5610,6 +5610,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 	brelse(sbi->s_sbh);
 	ext4_blkdev_remove(sbi);
 out_fail:
+	invalidate_bdev(sb->s_bdev);
 	sb->s_fs_info = NULL;
 	return err ? err : ret;
 }
-- 
2.31.1



* [PATCH v2 2/5] jbd2: remove unused feature macros
  2023-03-13 13:20 [PATCH v2 0/5] ext4: Fix stale buffer loading from last failed Zhihao Cheng
  2023-03-13 13:20 ` [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting Zhihao Cheng
@ 2023-03-13 13:20 ` Zhihao Cheng
  2023-03-14 11:35   ` Jan Kara
  2023-03-13 13:20 ` [PATCH v2 3/5] jbd2: switch to check format version in superblock directly Zhihao Cheng
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-13 13:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, jack, tudor.ambarus
  Cc: linux-ext4, linux-kernel, chengzhihao1, yi.zhang

From: Zhang Yi <yi.zhang@huawei.com>

JBD2_HAS_[IN|RO_]COMPAT_FEATURE macros are no longer used, just remove
them.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 include/linux/jbd2.h | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 5962072a4b19..ad7bb6861143 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -274,17 +274,6 @@ typedef struct journal_superblock_s
 /* 0x0400 */
 } journal_superblock_t;
 
-/* Use the jbd2_{has,set,clear}_feature_* helpers; these will be removed */
-#define JBD2_HAS_COMPAT_FEATURE(j,mask)					\
-	((j)->j_format_version >= 2 &&					\
-	 ((j)->j_superblock->s_feature_compat & cpu_to_be32((mask))))
-#define JBD2_HAS_RO_COMPAT_FEATURE(j,mask)				\
-	((j)->j_format_version >= 2 &&					\
-	 ((j)->j_superblock->s_feature_ro_compat & cpu_to_be32((mask))))
-#define JBD2_HAS_INCOMPAT_FEATURE(j,mask)				\
-	((j)->j_format_version >= 2 &&					\
-	 ((j)->j_superblock->s_feature_incompat & cpu_to_be32((mask))))
-
 #define JBD2_FEATURE_COMPAT_CHECKSUM		0x00000001
 
 #define JBD2_FEATURE_INCOMPAT_REVOKE		0x00000001
-- 
2.31.1



* [PATCH v2 3/5] jbd2: switch to check format version in superblock directly
  2023-03-13 13:20 [PATCH v2 0/5] ext4: Fix stale buffer loading from last failed Zhihao Cheng
  2023-03-13 13:20 ` [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting Zhihao Cheng
  2023-03-13 13:20 ` [PATCH v2 2/5] jbd2: remove unused feature macros Zhihao Cheng
@ 2023-03-13 13:20 ` Zhihao Cheng
  2023-03-13 13:20 ` [PATCH v2 4/5] jbd2: factor out journal initialization from journal_get_superblock() Zhihao Cheng
  2023-03-13 13:20 ` [PATCH v2 5/5] jbd2: remove j_format_version Zhihao Cheng
  4 siblings, 0 replies; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-13 13:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, jack, tudor.ambarus
  Cc: linux-ext4, linux-kernel, chengzhihao1, yi.zhang

From: Zhang Yi <yi.zhang@huawei.com>

We should only check and set extended features if the journal format
version is 2. Currently we check the in-memory copy,
'journal->j_format_version', which relies on the order in which the
journal parameters are initialized. Checking h_blocktype in the
superblock directly is clearer.
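
As a rough userspace sketch of the idea (not the kernel code itself), the
check can be expressed against the big-endian on-disk field so that it
does not depend on any previously initialized in-memory state. The toy_*
names, the struct layout, and the numeric constants are assumptions made
for this example, and htonl() is used in place of cpu_to_be32().

```c
#include <arpa/inet.h>	/* htonl() as a stand-in for cpu_to_be32() */
#include <stdbool.h>
#include <stdint.h>

/* Placeholder block type values for the two superblock formats. */
#define TOY_SUPERBLOCK_V1 3u
#define TOY_SUPERBLOCK_V2 4u

/* Hypothetical, heavily trimmed superblock; both fields are big-endian. */
struct toy_sb {
	uint32_t h_blocktype;
	uint32_t s_feature_incompat;
};

/* Feature flags are only meaningful when the superblock is not version 1. */
bool toy_format_support_feature(const struct toy_sb *sb)
{
	return sb->h_blocktype != htonl(TOY_SUPERBLOCK_V1);
}

/* A feature test that consults the on-disk format on every call. */
bool toy_has_incompat(const struct toy_sb *sb, uint32_t mask)
{
	return toy_format_support_feature(sb) &&
	       (sb->s_feature_incompat & htonl(mask)) != 0;
}
```

Because the predicate reads the superblock on each call, a cached and
already verified buffer cannot leave it in an uninitialized state.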

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/jbd2/journal.c    | 16 +++++++---------
 include/linux/jbd2.h | 17 ++++++++++++++---
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index e80c781731f8..b991d5c21d16 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -2059,10 +2059,12 @@ int jbd2_journal_load(journal_t *journal)
 		return err;
 
 	sb = journal->j_superblock;
-	/* If this is a V2 superblock, then we have to check the
-	 * features flags on it. */
 
-	if (journal->j_format_version >= 2) {
+	/*
+	 * If this is a V2 superblock, then we have to check the
+	 * features flags on it.
+	 */
+	if (jbd2_format_support_feature(journal)) {
 		if ((sb->s_feature_ro_compat &
 		     ~cpu_to_be32(JBD2_KNOWN_ROCOMPAT_FEATURES)) ||
 		    (sb->s_feature_incompat &
@@ -2224,7 +2226,7 @@ int jbd2_journal_check_used_features(journal_t *journal, unsigned long compat,
 	if (journal->j_format_version == 0 &&
 	    journal_get_superblock(journal) != 0)
 		return 0;
-	if (journal->j_format_version == 1)
+	if (!jbd2_format_support_feature(journal))
 		return 0;
 
 	sb = journal->j_superblock;
@@ -2254,11 +2256,7 @@ int jbd2_journal_check_available_features(journal_t *journal, unsigned long comp
 	if (!compat && !ro && !incompat)
 		return 1;
 
-	/* We can support any known requested features iff the
-	 * superblock is in version 2.  Otherwise we fail to support any
-	 * extended sb features. */
-
-	if (journal->j_format_version != 2)
+	if (!jbd2_format_support_feature(journal))
 		return 0;
 
 	if ((compat   & JBD2_KNOWN_COMPAT_FEATURES) == compat &&
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index ad7bb6861143..7095c0f17ad0 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1305,11 +1305,22 @@ struct journal_s
 		rwsem_release(&j->j_trans_commit_map, _THIS_IP_); \
 	} while (0)
 
+/*
+ * We can support any known requested features iff the
+ * superblock is not in version 1.  Otherwise we fail to support any
+ * extended sb features.
+ */
+static inline bool jbd2_format_support_feature(journal_t *j)
+{
+	return j->j_superblock->s_header.h_blocktype !=
+					cpu_to_be32(JBD2_SUPERBLOCK_V1);
+}
+
 /* journal feature predicate functions */
 #define JBD2_FEATURE_COMPAT_FUNCS(name, flagname) \
 static inline bool jbd2_has_feature_##name(journal_t *j) \
 { \
-	return ((j)->j_format_version >= 2 && \
+	return (jbd2_format_support_feature(j) && \
 		((j)->j_superblock->s_feature_compat & \
 		 cpu_to_be32(JBD2_FEATURE_COMPAT_##flagname)) != 0); \
 } \
@@ -1327,7 +1338,7 @@ static inline void jbd2_clear_feature_##name(journal_t *j) \
 #define JBD2_FEATURE_RO_COMPAT_FUNCS(name, flagname) \
 static inline bool jbd2_has_feature_##name(journal_t *j) \
 { \
-	return ((j)->j_format_version >= 2 && \
+	return (jbd2_format_support_feature(j) && \
 		((j)->j_superblock->s_feature_ro_compat & \
 		 cpu_to_be32(JBD2_FEATURE_RO_COMPAT_##flagname)) != 0); \
 } \
@@ -1345,7 +1356,7 @@ static inline void jbd2_clear_feature_##name(journal_t *j) \
 #define JBD2_FEATURE_INCOMPAT_FUNCS(name, flagname) \
 static inline bool jbd2_has_feature_##name(journal_t *j) \
 { \
-	return ((j)->j_format_version >= 2 && \
+	return (jbd2_format_support_feature(j) && \
 		((j)->j_superblock->s_feature_incompat & \
 		 cpu_to_be32(JBD2_FEATURE_INCOMPAT_##flagname)) != 0); \
 } \
-- 
2.31.1



* [PATCH v2 4/5] jbd2: factor out journal initialization from journal_get_superblock()
  2023-03-13 13:20 [PATCH v2 0/5] ext4: Fix stale buffer loading from last failed Zhihao Cheng
                   ` (2 preceding siblings ...)
  2023-03-13 13:20 ` [PATCH v2 3/5] jbd2: switch to check format version in superblock directly Zhihao Cheng
@ 2023-03-13 13:20 ` Zhihao Cheng
  2023-03-13 13:20 ` [PATCH v2 5/5] jbd2: remove j_format_version Zhihao Cheng
  4 siblings, 0 replies; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-13 13:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, jack, tudor.ambarus
  Cc: linux-ext4, linux-kernel, chengzhihao1, yi.zhang

From: Zhang Yi <yi.zhang@huawei.com>

journal_get_superblock() currently couples journal superblock checking
with partial journal initialization. Factor the initialization part out
of it to make things clearer.
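
The split can be pictured with a small userspace sketch, assuming a
heavily simplified superblock with host-endian fields. The toy_* names
and the two-field structs are hypothetical; the point is only that the
checking function takes const pointers and changes nothing, while the
initialization function copies values and performs no checks.

```c
#include <stdint.h>

/* Hypothetical, simplified structures (host-endian for brevity). */
struct toy_sb {
	uint32_t blocktype;	/* 3 = V1, 4 = V2 in this sketch */
	uint32_t maxlen;
};

struct toy_journal {
	uint32_t total_len;
	uint32_t last;
};

/* Check only: never modifies the journal. Returns 0 if acceptable. */
int toy_validate_sb(const struct toy_sb *sb, const struct toy_journal *j)
{
	if (sb->blocktype != 3 && sb->blocktype != 4)
		return -1;	/* unrecognised superblock format */
	if (sb->maxlen > j->total_len)
		return -1;	/* journal file too short */
	return 0;
}

/* Initialize only: assumes toy_validate_sb() already succeeded. */
void toy_init_from_sb(const struct toy_sb *sb, struct toy_journal *j)
{
	j->last = sb->maxlen;
	if (sb->maxlen < j->total_len)
		j->total_len = sb->maxlen;
}
```

With this shape, skipping the check for an already verified buffer no
longer skips the initialization as well.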

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/jbd2/journal.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index b991d5c21d16..f99c46d880b2 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1922,21 +1922,13 @@ static int journal_get_superblock(journal_t *journal)
 		goto out;
 	}
 
-	switch(be32_to_cpu(sb->s_header.h_blocktype)) {
-	case JBD2_SUPERBLOCK_V1:
-		journal->j_format_version = 1;
-		break;
-	case JBD2_SUPERBLOCK_V2:
-		journal->j_format_version = 2;
-		break;
-	default:
+	if (be32_to_cpu(sb->s_header.h_blocktype) != JBD2_SUPERBLOCK_V1 &&
+	    be32_to_cpu(sb->s_header.h_blocktype) != JBD2_SUPERBLOCK_V2) {
 		printk(KERN_WARNING "JBD2: unrecognised superblock format ID\n");
 		goto out;
 	}
 
-	if (be32_to_cpu(sb->s_maxlen) < journal->j_total_len)
-		journal->j_total_len = be32_to_cpu(sb->s_maxlen);
-	else if (be32_to_cpu(sb->s_maxlen) > journal->j_total_len) {
+	if (be32_to_cpu(sb->s_maxlen) > journal->j_total_len) {
 		printk(KERN_WARNING "JBD2: journal file too short\n");
 		goto out;
 	}
@@ -1979,25 +1971,14 @@ static int journal_get_superblock(journal_t *journal)
 			journal->j_chksum_driver = NULL;
 			goto out;
 		}
-	}
-
-	if (jbd2_journal_has_csum_v2or3(journal)) {
 		/* Check superblock checksum */
 		if (sb->s_checksum != jbd2_superblock_csum(journal, sb)) {
 			printk(KERN_ERR "JBD2: journal checksum error\n");
 			err = -EFSBADCRC;
 			goto out;
 		}
-
-		/* Precompute checksum seed for all metadata */
-		journal->j_csum_seed = jbd2_chksum(journal, ~0, sb->s_uuid,
-						   sizeof(sb->s_uuid));
 	}
-
-	journal->j_revoke_records_per_block =
-				journal_revoke_records_per_block(journal);
 	set_buffer_verified(bh);
-
 	return 0;
 
 out:
@@ -2022,12 +2003,30 @@ static int load_superblock(journal_t *journal)
 
 	sb = journal->j_superblock;
 
+	switch (be32_to_cpu(sb->s_header.h_blocktype)) {
+	case JBD2_SUPERBLOCK_V1:
+		journal->j_format_version = 1;
+		break;
+	case JBD2_SUPERBLOCK_V2:
+		journal->j_format_version = 2;
+		break;
+	}
+
 	journal->j_tail_sequence = be32_to_cpu(sb->s_sequence);
 	journal->j_tail = be32_to_cpu(sb->s_start);
 	journal->j_first = be32_to_cpu(sb->s_first);
 	journal->j_errno = be32_to_cpu(sb->s_errno);
 	journal->j_last = be32_to_cpu(sb->s_maxlen);
 
+	if (be32_to_cpu(sb->s_maxlen) < journal->j_total_len)
+		journal->j_total_len = be32_to_cpu(sb->s_maxlen);
+	/* Precompute checksum seed for all metadata */
+	if (jbd2_journal_has_csum_v2or3(journal))
+		journal->j_csum_seed = jbd2_chksum(journal, ~0, sb->s_uuid,
+						   sizeof(sb->s_uuid));
+	journal->j_revoke_records_per_block =
+				journal_revoke_records_per_block(journal);
+
 	if (jbd2_has_feature_fast_commit(journal)) {
 		journal->j_fc_last = be32_to_cpu(sb->s_maxlen);
 		num_fc_blocks = jbd2_journal_get_num_fc_blks(sb);
@@ -2223,8 +2222,7 @@ int jbd2_journal_check_used_features(journal_t *journal, unsigned long compat,
 	if (!compat && !ro && !incompat)
 		return 1;
 	/* Load journal superblock if it is not loaded yet. */
-	if (journal->j_format_version == 0 &&
-	    journal_get_superblock(journal) != 0)
+	if (journal_get_superblock(journal))
 		return 0;
 	if (!jbd2_format_support_feature(journal))
 		return 0;
-- 
2.31.1



* [PATCH v2 5/5] jbd2: remove j_format_version
  2023-03-13 13:20 [PATCH v2 0/5] ext4: Fix stale buffer loading from last failed Zhihao Cheng
                   ` (3 preceding siblings ...)
  2023-03-13 13:20 ` [PATCH v2 4/5] jbd2: factor out journal initialization from journal_get_superblock() Zhihao Cheng
@ 2023-03-13 13:20 ` Zhihao Cheng
  4 siblings, 0 replies; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-13 13:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, jack, tudor.ambarus
  Cc: linux-ext4, linux-kernel, chengzhihao1, yi.zhang

From: Zhang Yi <yi.zhang@huawei.com>

journal->j_format_version is no longer used, remove it.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/jbd2/journal.c    | 9 ---------
 include/linux/jbd2.h | 5 -----
 2 files changed, 14 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index f99c46d880b2..c19cdd402a5f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -2003,15 +2003,6 @@ static int load_superblock(journal_t *journal)
 
 	sb = journal->j_superblock;
 
-	switch (be32_to_cpu(sb->s_header.h_blocktype)) {
-	case JBD2_SUPERBLOCK_V1:
-		journal->j_format_version = 1;
-		break;
-	case JBD2_SUPERBLOCK_V2:
-		journal->j_format_version = 2;
-		break;
-	}
-
 	journal->j_tail_sequence = be32_to_cpu(sb->s_sequence);
 	journal->j_tail = be32_to_cpu(sb->s_start);
 	journal->j_first = be32_to_cpu(sb->s_first);
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 7095c0f17ad0..b7c79f68f7ca 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -792,11 +792,6 @@ struct journal_s
 	 */
 	journal_superblock_t	*j_superblock;
 
-	/**
-	 * @j_format_version: Version of the superblock format.
-	 */
-	int			j_format_version;
-
 	/**
 	 * @j_state_lock: Protect the various scalars in the journal.
 	 */
-- 
2.31.1



* Re: [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting
  2023-03-13 13:20 ` [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting Zhihao Cheng
@ 2023-03-14 11:33   ` Jan Kara
  2023-03-14 12:01     ` Zhihao Cheng
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2023-03-14 11:33 UTC (permalink / raw)
  To: Zhihao Cheng
  Cc: tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang

On Mon 13-03-23 21:20:17, Zhihao Cheng wrote:
> Following process makes ext4 load stale buffer heads from last failed
> mounting in a new mounting operation:
> mount_bdev
>  ext4_fill_super
>  | ext4_load_and_init_journal
>  |  ext4_load_journal
>  |   jbd2_journal_load
>  |    load_superblock
>  |     journal_get_superblock
>  |      set_buffer_verified(bh) // buffer head is verified
>  |   jbd2_journal_recover // failed caused by EIO
>  | goto failed_mount3a // skip 'sb->s_root' initialization
>  deactivate_locked_super
>   kill_block_super
>    generic_shutdown_super
>     if (sb->s_root)
>     // false, skip ext4_put_super->invalidate_bdev->
>     // invalidate_mapping_pages->mapping_evict_folio->
>     // filemap_release_folio->try_to_free_buffers, which
>     // cannot drop buffer head.
>    blkdev_put
>     blkdev_put_whole
>      if (atomic_dec_and_test(&bdev->bd_openers))
>      // false, systemd-udev happens to open the device. Then
>      // blkdev_flush_mapping->kill_bdev->truncate_inode_pages->
>      // truncate_inode_folio->truncate_cleanup_folio->
>      // folio_invalidate->block_invalidate_folio->
>      // filemap_release_folio->try_to_free_buffers will be skipped,
>      // dropping buffer head is missed again.
> 
> Second mount:
> ext4_fill_super
>  ext4_load_and_init_journal
>   ext4_load_journal
>    ext4_get_journal
>     jbd2_journal_init_inode
>      journal_init_common
>       bh = getblk_unmovable
>        bh = __find_get_block // Found stale bh in last failed mounting
>       journal->j_sb_buffer = bh
>    jbd2_journal_load
>     load_superblock
>      journal_get_superblock
>       if (buffer_verified(bh))
>       // true, skip journal->j_format_version = 2, value is 0
>     jbd2_journal_recover
>      do_one_pass
>       next_log_block += count_tags(journal, bh)
>       // According to journal_tag_bytes(), 'tag_bytes' calculating is
>       // affected by jbd2_has_feature_csum3(), jbd2_has_feature_csum3()
>       // returns false because 'j->j_format_version >= 2' is not true,
>       // then we get wrong next_log_block. The do_one_pass may exit
>       // early whenoccuring non JBD2_MAGIC_NUMBER in 'next_log_block'.
> 
> The filesystem is corrupted here, journal is partially replayed, and
> new journal sequence number actually is already used by last mounting.
> 
> The invalidate_bdev() can drop all buffer heads even racing with bare
> reading block device(eg. systemd-udev), so we can fix it by invalidating
> bdev in error handling path in __ext4_fill_super().
> 
> Fetch a reproducer in [Link].
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217171
> Fixes: 25ed6e8a54df ("jbd2: enable journal clients to enable v2 checksumming")
> Cc: stable@vger.kernel.org # v3.5
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>

...

> @@ -1271,14 +1277,8 @@ static void ext4_put_super(struct super_block *sb)
>  
>  	sync_blockdev(sb->s_bdev);
>  	invalidate_bdev(sb->s_bdev);
> -	if (sbi->s_journal_bdev && sbi->s_journal_bdev != sb->s_bdev) {
> -		/*
> -		 * Invalidate the journal device's buffers.  We don't want them
> -		 * floating about in memory - the physical journal device may
> -		 * hotswapped, and it breaks the `ro-after' testing code.
> -		 */
> +	if (sbi->s_journal_bdev) {
>  		sync_blockdev(sbi->s_journal_bdev);
> -		invalidate_bdev(sbi->s_journal_bdev);
>  		ext4_blkdev_remove(sbi);
>  	}

Hum, but this will invalidate bhs only if journal is stored on a block
device. If journal is in the inode (the common case), we won't invalidate
anything (sbi->s_journal_bdev is NULL) and the same problem can happen?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 2/5] jbd2: remove unused feature macros
  2023-03-13 13:20 ` [PATCH v2 2/5] jbd2: remove unused feature macros Zhihao Cheng
@ 2023-03-14 11:35   ` Jan Kara
  2023-03-14 12:02     ` Zhihao Cheng
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2023-03-14 11:35 UTC (permalink / raw)
  To: Zhihao Cheng
  Cc: tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang

On Mon 13-03-23 21:20:18, Zhihao Cheng wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> JBD2_HAS_[IN|RO_]COMPAT_FEATURE macros are no longer used, just remove
> them.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>

I gave you my Reviewed-by on this patch (and a few others in this series).
Why didn't you include it?

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting
  2023-03-14 11:33   ` Jan Kara
@ 2023-03-14 12:01     ` Zhihao Cheng
  2023-03-14 12:11       ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-14 12:01 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang

On 2023/3/14 19:33, Jan Kara wrote:
Hi Jan,

> 
>> @@ -1271,14 +1277,8 @@ static void ext4_put_super(struct super_block *sb)
>>   
>>   	sync_blockdev(sb->s_bdev);
>>   	invalidate_bdev(sb->s_bdev);

In the journal-in-inode case, the journal bhs come from the filesystem's
block device, which means the buffers are dropped once this line,
'invalidate_bdev(sb->s_bdev)', has been executed.

>> -	if (sbi->s_journal_bdev && sbi->s_journal_bdev != sb->s_bdev) {
>> -		/*
>> -		 * Invalidate the journal device's buffers.  We don't want them
>> -		 * floating about in memory - the physical journal device may
>> -		 * hotswapped, and it breaks the `ro-after' testing code.
>> -		 */
>> +	if (sbi->s_journal_bdev) {
>>   		sync_blockdev(sbi->s_journal_bdev);
>> -		invalidate_bdev(sbi->s_journal_bdev);
>>   		ext4_blkdev_remove(sbi);
>>   	}
> Hum, but this will invalidate bhs only if journal is stored on a block
> device. If journal is in the inode (the common case), we won't invalidate
> anything (sbi->s_journal_bdev is NULL) and the same problem can happen?
> 


* Re: [PATCH v2 2/5] jbd2: remove unused feature macros
  2023-03-14 11:35   ` Jan Kara
@ 2023-03-14 12:02     ` Zhihao Cheng
  0 siblings, 0 replies; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-14 12:02 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang

Hi Jan,
> On Mon 13-03-23 21:20:18, Zhihao Cheng wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> JBD2_HAS_[IN|RO_]COMPAT_FEATURE macros are no longer used, just remove
>> them.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> 
> I gave you my Reviewed-by on this patch (and a few others in this series).
> Why didn't you include it?
> 

Sorry, my fault. Will add in v3.



* Re: [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting
  2023-03-14 12:01     ` Zhihao Cheng
@ 2023-03-14 12:11       ` Jan Kara
  2023-03-14 12:31         ` Zhihao Cheng
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2023-03-14 12:11 UTC (permalink / raw)
  To: Zhihao Cheng
  Cc: Jan Kara, tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang

On Tue 14-03-23 20:01:46, Zhihao Cheng wrote:
> On 2023/3/14 19:33, Jan Kara wrote:
> Hi Jan,
> 
> > 
> > > @@ -1271,14 +1277,8 @@ static void ext4_put_super(struct super_block *sb)
> > >   	sync_blockdev(sb->s_bdev);
> > >   	invalidate_bdev(sb->s_bdev);
> 
> For journal in the inode case, journal bhs come from block device, which
> means buffers will be dropped after this line 'invalidate_bdev(sb->s_bdev)'
> being executed.

Right, I've missed that. But then why do you remove the sbi->s_journal_bdev
!= sb->s_bdev condition below?

> > > -	if (sbi->s_journal_bdev && sbi->s_journal_bdev != sb->s_bdev) {
> > > -		/*
> > > -		 * Invalidate the journal device's buffers.  We don't want them
> > > -		 * floating about in memory - the physical journal device may
> > > -		 * hotswapped, and it breaks the `ro-after' testing code.
> > > -		 */
> > > +	if (sbi->s_journal_bdev) {
> > >   		sync_blockdev(sbi->s_journal_bdev);
> > > -		invalidate_bdev(sbi->s_journal_bdev);
> > >   		ext4_blkdev_remove(sbi);
> > >   	}
> > Hum, but this will invalidate bhs only if journal is stored on a block
> > device. If journal is in the inode (the common case), we won't invalidate
> > anything (sbi->s_journal_bdev is NULL) and the same problem can happen?
> > 

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting
  2023-03-14 12:11       ` Jan Kara
@ 2023-03-14 12:31         ` Zhihao Cheng
  2023-03-14 14:28           ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-14 12:31 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang

> On Tue 14-03-23 20:01:46, Zhihao Cheng wrote:
>> On 2023/3/14 19:33, Jan Kara wrote:
>> Hi Jan,
>>
>>>
>>>> @@ -1271,14 +1277,8 @@ static void ext4_put_super(struct super_block *sb)
>>>>    	sync_blockdev(sb->s_bdev);
>>>>    	invalidate_bdev(sb->s_bdev);
>>
>> For journal in the inode case, journal bhs come from block device, which
>> means buffers will be dropped after this line 'invalidate_bdev(sb->s_bdev)'
>> being executed.
> 
> Right, I've missed that. But then why do you remove the sbi->s_journal_bdev
> != sb->s_bdev condition below?
> 

I think 'sbi->s_journal_bdev != sb->s_bdev' is always true if
sbi->s_journal_bdev is set.


mount_bdev
  fmode_t mode = FMODE_READ | FMODE_EXCL
  bdev_a = blkdev_get_by_path(dev_name, mode, fs_type)


mount_bdev->ext4_fill_super->ext4_load_and_init_journal->ext4_load_journal->ext4_get_dev_journal:
  bdev_b = ext4_blkdev_get(j_dev, sb)
   bdev_b = blkdev_get_by_dev(dev, FMODE_READ|FMODE_WRITE|FMODE_EXCL, sb)
  EXT4_SB(sb)->s_journal_bdev = bdev_b


bdev_a cannot be bdev_b, because bd_prepare_to_claim() makes sure the
same block device cannot be opened twice with mode 'FMODE_EXCL'.
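
To make the argument concrete, here is a toy userspace model of an
exclusive claim. It assumes the claim check boils down to comparing
holder cookies, which is a simplification; toy_bdev and toy_claim() are
hypothetical names and are not the block layer API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: a device records who holds it exclusively. */
struct toy_bdev {
	const void *holder;	/* NULL when nobody has claimed the device */
};

/* Returns 0 on success, -EBUSY if a different holder already owns it. */
int toy_claim(struct toy_bdev *bdev, const void *holder)
{
	if (bdev->holder != NULL && bdev->holder != holder)
		return -EBUSY;
	bdev->holder = holder;
	return 0;
}
```

In the traces above, the first open passes fs_type as the holder and the
journal device open passes sb, so under this model a second exclusive
open of the same device fails and the two pointers can never be equal.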


* Re: [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting
  2023-03-14 12:31         ` Zhihao Cheng
@ 2023-03-14 14:28           ` Jan Kara
  2023-03-14 14:37             ` Zhihao Cheng
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2023-03-14 14:28 UTC (permalink / raw)
  To: Zhihao Cheng
  Cc: Jan Kara, tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang

On Tue 14-03-23 20:31:43, Zhihao Cheng wrote:
> > On Tue 14-03-23 20:01:46, Zhihao Cheng wrote:
> > > On 2023/3/14 19:33, Jan Kara wrote:
> > > Hi Jan,
> > > 
> > > > 
> > > > > @@ -1271,14 +1277,8 @@ static void ext4_put_super(struct super_block *sb)
> > > > >    	sync_blockdev(sb->s_bdev);
> > > > >    	invalidate_bdev(sb->s_bdev);
> > > 
> > > For journal in the inode case, journal bhs come from block device, which
> > > means buffers will be dropped after this line 'invalidate_bdev(sb->s_bdev)'
> > > being executed.
> > 
> > Right, I've missed that. But then why do you remove the sbi->s_journal_bdev
> > != sb->s_bdev condition below?
> > 
> 
> I think 'sbi->s_journal_bdev != sb->s_bdev' always becomes true if
> sbi->s_journal_bdev exists.

OK, fair point. But please move this cleanup into a separate commit with
this justification. Thanks!

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 1/5] ext4: Fix reusing stale buffer heads from last failed mounting
  2023-03-14 14:28           ` Jan Kara
@ 2023-03-14 14:37             ` Zhihao Cheng
  0 siblings, 0 replies; 14+ messages in thread
From: Zhihao Cheng @ 2023-03-14 14:37 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, adilger.kernel, jack, tudor.ambarus, linux-ext4,
	linux-kernel, yi.zhang


> OK, fair point. But please move this cleanup into a separate commit with
> this justification. Thanks!

OK, I will split it from patch 1 in v3.

