* [PATCH 1/4] ext4: persist the new uptodate buffers in ext4_journalled_zero_new_buffers
2024-08-30 5:37 [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer zhangshida
@ 2024-08-30 5:37 ` zhangshida
2024-08-30 5:37 ` [PATCH 2/4] ext4: hoist ext4_block_write_begin and replace the __block_write_begin zhangshida
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: zhangshida @ 2024-08-30 5:37 UTC (permalink / raw)
To: tytso, adilger.kernel, jack, ebiggers
Cc: linux-ext4, linux-kernel, zhangshida, starzhangzsd, Jan Kara
From: Shida Zhang <zhangshida@kylinos.cn>
For new uptodate buffers we also need to call write_end_fn() to persist the
uptodate content, similarly as folio_zero_new_buffers() does it.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 941c1c0d5c6e..a0a55cb8db53 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1389,9 +1389,9 @@ static void ext4_journalled_zero_new_buffers(handle_t *handle,
size = min(to, block_end) - start;
folio_zero_range(folio, start, size);
- write_end_fn(handle, inode, bh);
}
clear_buffer_new(bh);
+ write_end_fn(handle, inode, bh);
}
}
block_start = block_end;
--
2.33.0
^ permalink raw reply related [flat|nested] 8+ messages in thread* [PATCH 2/4] ext4: hoist ext4_block_write_begin and replace the __block_write_begin
2024-08-30 5:37 [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer zhangshida
2024-08-30 5:37 ` [PATCH 1/4] ext4: persist the new uptodate buffers in ext4_journalled_zero_new_buffers zhangshida
@ 2024-08-30 5:37 ` zhangshida
2024-08-30 5:37 ` [PATCH 3/4] ext4: fix a potential assertion failure due to improperly dirtied buffer zhangshida
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: zhangshida @ 2024-08-30 5:37 UTC (permalink / raw)
To: tytso, adilger.kernel, jack, ebiggers
Cc: linux-ext4, linux-kernel, zhangshida, starzhangzsd, Jan Kara
From: Shida Zhang <zhangshida@kylinos.cn>
Using __block_write_begin() make it inconvenient to journal the
user data dirty process. We can't tell the block layer maintainer,
‘Hey, we want to trace the dirty user data in ext4, can we add some
special code for ext4 in __block_write_begin?’:P
So use ext4_block_write_begin() instead.
The two functions are basically doing the same thing except for the
fscrypt related code. Remove the unnecessary #ifdef since
fscrypt_inode_uses_fs_layer_crypto() returns false (and it's known at
compile time) when !CONFIG_FS_ENCRYPTION.
And hoist the ext4_block_write_begin so that it can be used in other
files.
Suggested-by: Jan Kara <jack@suse.cz>
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/ext4.h | 2 ++
fs/ext4/inline.c | 10 +++++-----
fs/ext4/inode.c | 24 +++++-------------------
3 files changed, 12 insertions(+), 24 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 08acd152261e..5f8257b68190 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3851,6 +3851,8 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
return buffer_uptodate(bh);
}
+extern int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
+ get_block_t *get_block);
#endif /* __KERNEL__ */
#define EFSBADCRC EBADMSG /* Bad CRC detected */
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index e7a09a99837b..0a1a8431e281 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -601,10 +601,10 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
goto out;
if (ext4_should_dioread_nolock(inode)) {
- ret = __block_write_begin(&folio->page, from, to,
- ext4_get_block_unwritten);
+ ret = ext4_block_write_begin(folio, from, to,
+ ext4_get_block_unwritten);
} else
- ret = __block_write_begin(&folio->page, from, to, ext4_get_block);
+ ret = ext4_block_write_begin(folio, from, to, ext4_get_block);
if (!ret && ext4_should_journal_data(inode)) {
ret = ext4_walk_page_buffers(handle, inode,
@@ -856,8 +856,8 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
goto out;
}
- ret = __block_write_begin(&folio->page, 0, inline_size,
- ext4_da_get_block_prep);
+ ret = ext4_block_write_begin(folio, 0, inline_size,
+ ext4_da_get_block_prep);
if (ret) {
up_read(&EXT4_I(inode)->xattr_sem);
folio_unlock(folio);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a0a55cb8db53..4964c67e029e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1024,10 +1024,10 @@ int do_journal_get_write_access(handle_t *handle, struct inode *inode,
if (!buffer_mapped(bh) || buffer_freed(bh))
return 0;
/*
- * __block_write_begin() could have dirtied some buffers. Clean
+ * ext4_block_write_begin() could have dirtied some buffers. Clean
* the dirty bit as jbd2_journal_get_write_access() could complain
* otherwise about fs integrity issues. Setting of the dirty bit
- * by __block_write_begin() isn't a real problem here as we clear
+ * by ext4_block_write_begin() isn't a real problem here as we clear
* the bit before releasing a page lock and thus writeback cannot
* ever write the buffer.
*/
@@ -1041,9 +1041,8 @@ int do_journal_get_write_access(handle_t *handle, struct inode *inode,
return ret;
}
-#ifdef CONFIG_FS_ENCRYPTION
-static int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
- get_block_t *get_block)
+int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
+ get_block_t *get_block)
{
unsigned from = pos & (PAGE_SIZE - 1);
unsigned to = from + len;
@@ -1134,7 +1133,6 @@ static int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
return err;
}
-#endif
/*
* To preserve ordering, it is essential that the hole instantiation and
@@ -1216,19 +1214,11 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
/* In case writeback began while the folio was unlocked */
folio_wait_stable(folio);
-#ifdef CONFIG_FS_ENCRYPTION
if (ext4_should_dioread_nolock(inode))
ret = ext4_block_write_begin(folio, pos, len,
ext4_get_block_unwritten);
else
ret = ext4_block_write_begin(folio, pos, len, ext4_get_block);
-#else
- if (ext4_should_dioread_nolock(inode))
- ret = __block_write_begin(&folio->page, pos, len,
- ext4_get_block_unwritten);
- else
- ret = __block_write_begin(&folio->page, pos, len, ext4_get_block);
-#endif
if (!ret && ext4_should_journal_data(inode)) {
ret = ext4_walk_page_buffers(handle, inode,
folio_buffers(folio), from, to,
@@ -1241,7 +1231,7 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
folio_unlock(folio);
/*
- * __block_write_begin may have instantiated a few blocks
+ * ext4_block_write_begin may have instantiated a few blocks
* outside i_size. Trim these off again. Don't need
* i_size_read because we hold i_rwsem.
*
@@ -2961,11 +2951,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
if (IS_ERR(folio))
return PTR_ERR(folio);
-#ifdef CONFIG_FS_ENCRYPTION
ret = ext4_block_write_begin(folio, pos, len, ext4_da_get_block_prep);
-#else
- ret = __block_write_begin(&folio->page, pos, len, ext4_da_get_block_prep);
-#endif
if (ret < 0) {
folio_unlock(folio);
folio_put(folio);
--
2.33.0
^ permalink raw reply related [flat|nested] 8+ messages in thread* [PATCH 3/4] ext4: fix a potential assertion failure due to improperly dirtied buffer
2024-08-30 5:37 [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer zhangshida
2024-08-30 5:37 ` [PATCH 1/4] ext4: persist the new uptodate buffers in ext4_journalled_zero_new_buffers zhangshida
2024-08-30 5:37 ` [PATCH 2/4] ext4: hoist ext4_block_write_begin and replace the __block_write_begin zhangshida
@ 2024-08-30 5:37 ` zhangshida
2024-09-02 13:34 ` Jan Kara
2024-08-30 5:37 ` [PATCH 4/4] ext4: remove the special buffer dirty handling in do_journal_get_write_access zhangshida
2024-09-05 14:53 ` [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer Theodore Ts'o
4 siblings, 1 reply; 8+ messages in thread
From: zhangshida @ 2024-08-30 5:37 UTC (permalink / raw)
To: tytso, adilger.kernel, jack, ebiggers
Cc: linux-ext4, linux-kernel, zhangshida, starzhangzsd, Baolin Liu,
Jan Kara
From: Shida Zhang <zhangshida@kylinos.cn>
On an old kernel version(4.19, ext3, data=journal, pagesize=64k),
an assertion failure will occasionally be triggered by the line below:
-----------
jbd2_journal_commit_transaction
{
...
J_ASSERT_BH(bh, !buffer_dirty(bh));
/*
* The buffer on BJ_Forget list and not jbddirty means
...
}
-----------
The same condition may also be applied to the lattest kernel version.
When blocksize < pagesize and we truncate a file, there can be buffers in
the mapping tail page beyond i_size. These buffers will be filed to
transaction's BJ_Forget list by ext4_journalled_invalidatepage() during
truncation. When the transaction doing truncate starts committing, we can
grow the file again. This calls __block_write_begin() which allocates new
blocks under these buffers in the tail page we go through the branch:
if (buffer_new(bh)) {
clean_bdev_bh_alias(bh);
if (folio_test_uptodate(folio)) {
clear_buffer_new(bh);
set_buffer_uptodate(bh);
mark_buffer_dirty(bh);
continue;
}
...
}
Hence buffers on BJ_Forget list of the committing transaction get marked
dirty and this triggers the jbd2 assertion.
Teach ext4_block_write_begin() to properly handle files with data
journalling by avoiding dirtying them directly. Instead of
folio_zero_new_buffers() we use ext4_journalled_zero_new_buffers() which
takes care of handling journalling. We also don't need to mark new uptodate
buffers as dirty in ext4_block_write_begin(). That will be either done
either by block_commit_write() in case of success or by
folio_zero_new_buffers() in case of failure.
Reported-by: Baolin Liu <liubaolin@kylinos.cn>
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
---
fs/ext4/ext4.h | 3 ++-
fs/ext4/inline.c | 7 ++++---
fs/ext4/inode.c | 42 ++++++++++++++++++++++++++++++++++--------
3 files changed, 40 insertions(+), 12 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 5f8257b68190..b653bd423b11 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3851,7 +3851,8 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
return buffer_uptodate(bh);
}
-extern int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
+extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
+ loff_t pos, unsigned len,
get_block_t *get_block);
#endif /* __KERNEL__ */
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 0a1a8431e281..8d5599d5af27 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -601,10 +601,11 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
goto out;
if (ext4_should_dioread_nolock(inode)) {
- ret = ext4_block_write_begin(folio, from, to,
+ ret = ext4_block_write_begin(handle, folio, from, to,
ext4_get_block_unwritten);
} else
- ret = ext4_block_write_begin(folio, from, to, ext4_get_block);
+ ret = ext4_block_write_begin(handle, folio, from, to,
+ ext4_get_block);
if (!ret && ext4_should_journal_data(inode)) {
ret = ext4_walk_page_buffers(handle, inode,
@@ -856,7 +857,7 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
goto out;
}
- ret = ext4_block_write_begin(folio, 0, inline_size,
+ ret = ext4_block_write_begin(NULL, folio, 0, inline_size,
ext4_da_get_block_prep);
if (ret) {
up_read(&EXT4_I(inode)->xattr_sem);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4964c67e029e..a28f279fd02f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -49,6 +49,11 @@
#include <trace/events/ext4.h>
+static void ext4_journalled_zero_new_buffers(handle_t *handle,
+ struct inode *inode,
+ struct folio *folio,
+ unsigned from, unsigned to);
+
static __u32 ext4_inode_csum(struct inode *inode, struct ext4_inode *raw,
struct ext4_inode_info *ei)
{
@@ -1041,7 +1046,8 @@ int do_journal_get_write_access(handle_t *handle, struct inode *inode,
return ret;
}
-int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
+int ext4_block_write_begin(handle_t *handle, struct folio *folio,
+ loff_t pos, unsigned len,
get_block_t *get_block)
{
unsigned from = pos & (PAGE_SIZE - 1);
@@ -1055,6 +1061,7 @@ int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
struct buffer_head *bh, *head, *wait[2];
int nr_wait = 0;
int i;
+ bool should_journal_data = ext4_should_journal_data(inode);
BUG_ON(!folio_test_locked(folio));
BUG_ON(from > PAGE_SIZE);
@@ -1084,10 +1091,22 @@ int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
if (err)
break;
if (buffer_new(bh)) {
+ /*
+ * We may be zeroing partial buffers or all new
+ * buffers in case of failure. Prepare JBD2 for
+ * that.
+ */
+ if (should_journal_data)
+ do_journal_get_write_access(handle,
+ inode, bh);
if (folio_test_uptodate(folio)) {
- clear_buffer_new(bh);
+ /*
+ * Unlike __block_write_begin() we leave
+ * dirtying of new uptodate buffers to
+ * ->write_end() time or
+ * folio_zero_new_buffers().
+ */
set_buffer_uptodate(bh);
- mark_buffer_dirty(bh);
continue;
}
if (block_end > to || block_start < from)
@@ -1117,7 +1136,11 @@ int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
err = -EIO;
}
if (unlikely(err)) {
- folio_zero_new_buffers(folio, from, to);
+ if (should_journal_data)
+ ext4_journalled_zero_new_buffers(handle, inode, folio,
+ from, to);
+ else
+ folio_zero_new_buffers(folio, from, to);
} else if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
for (i = 0; i < nr_wait; i++) {
int err2;
@@ -1215,10 +1238,11 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
folio_wait_stable(folio);
if (ext4_should_dioread_nolock(inode))
- ret = ext4_block_write_begin(folio, pos, len,
+ ret = ext4_block_write_begin(handle, folio, pos, len,
ext4_get_block_unwritten);
else
- ret = ext4_block_write_begin(folio, pos, len, ext4_get_block);
+ ret = ext4_block_write_begin(handle, folio, pos, len,
+ ext4_get_block);
if (!ret && ext4_should_journal_data(inode)) {
ret = ext4_walk_page_buffers(handle, inode,
folio_buffers(folio), from, to,
@@ -2951,7 +2975,8 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
if (IS_ERR(folio))
return PTR_ERR(folio);
- ret = ext4_block_write_begin(folio, pos, len, ext4_da_get_block_prep);
+ ret = ext4_block_write_begin(NULL, folio, pos, len,
+ ext4_da_get_block_prep);
if (ret < 0) {
folio_unlock(folio);
folio_put(folio);
@@ -6205,7 +6230,8 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
if (folio_pos(folio) + len > size)
len = size - folio_pos(folio);
- err = __block_write_begin(&folio->page, 0, len, ext4_get_block);
+ err = ext4_block_write_begin(handle, folio, 0, len,
+ ext4_get_block);
if (!err) {
ret = VM_FAULT_SIGBUS;
if (ext4_journal_folio_buffers(handle, folio, len))
--
2.33.0
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH 3/4] ext4: fix a potential assertion failure due to improperly dirtied buffer
2024-08-30 5:37 ` [PATCH 3/4] ext4: fix a potential assertion failure due to improperly dirtied buffer zhangshida
@ 2024-09-02 13:34 ` Jan Kara
0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2024-09-02 13:34 UTC (permalink / raw)
To: zhangshida
Cc: tytso, adilger.kernel, jack, ebiggers, linux-ext4, linux-kernel,
zhangshida, Baolin Liu, Jan Kara
On Fri 30-08-24 13:37:38, zhangshida wrote:
> From: Shida Zhang <zhangshida@kylinos.cn>
>
> On an old kernel version(4.19, ext3, data=journal, pagesize=64k),
> an assertion failure will occasionally be triggered by the line below:
> -----------
> jbd2_journal_commit_transaction
> {
> ...
> J_ASSERT_BH(bh, !buffer_dirty(bh));
> /*
> * The buffer on BJ_Forget list and not jbddirty means
> ...
> }
> -----------
>
> The same condition may also be applied to the lattest kernel version.
>
> When blocksize < pagesize and we truncate a file, there can be buffers in
> the mapping tail page beyond i_size. These buffers will be filed to
> transaction's BJ_Forget list by ext4_journalled_invalidatepage() during
> truncation. When the transaction doing truncate starts committing, we can
> grow the file again. This calls __block_write_begin() which allocates new
> blocks under these buffers in the tail page we go through the branch:
>
> if (buffer_new(bh)) {
> clean_bdev_bh_alias(bh);
> if (folio_test_uptodate(folio)) {
> clear_buffer_new(bh);
> set_buffer_uptodate(bh);
> mark_buffer_dirty(bh);
> continue;
> }
> ...
> }
>
> Hence buffers on BJ_Forget list of the committing transaction get marked
> dirty and this triggers the jbd2 assertion.
>
> Teach ext4_block_write_begin() to properly handle files with data
> journalling by avoiding dirtying them directly. Instead of
> folio_zero_new_buffers() we use ext4_journalled_zero_new_buffers() which
> takes care of handling journalling. We also don't need to mark new uptodate
> buffers as dirty in ext4_block_write_begin(). That will be either done
> either by block_commit_write() in case of success or by
> folio_zero_new_buffers() in case of failure.
>
> Reported-by: Baolin Liu <liubaolin@kylinos.cn>
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/ext4.h | 3 ++-
> fs/ext4/inline.c | 7 ++++---
> fs/ext4/inode.c | 42 ++++++++++++++++++++++++++++++++++--------
> 3 files changed, 40 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 5f8257b68190..b653bd423b11 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -3851,7 +3851,8 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
> return buffer_uptodate(bh);
> }
>
> -extern int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
> +extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
> + loff_t pos, unsigned len,
> get_block_t *get_block);
> #endif /* __KERNEL__ */
>
> diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
> index 0a1a8431e281..8d5599d5af27 100644
> --- a/fs/ext4/inline.c
> +++ b/fs/ext4/inline.c
> @@ -601,10 +601,11 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
> goto out;
>
> if (ext4_should_dioread_nolock(inode)) {
> - ret = ext4_block_write_begin(folio, from, to,
> + ret = ext4_block_write_begin(handle, folio, from, to,
> ext4_get_block_unwritten);
> } else
> - ret = ext4_block_write_begin(folio, from, to, ext4_get_block);
> + ret = ext4_block_write_begin(handle, folio, from, to,
> + ext4_get_block);
>
> if (!ret && ext4_should_journal_data(inode)) {
> ret = ext4_walk_page_buffers(handle, inode,
> @@ -856,7 +857,7 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
> goto out;
> }
>
> - ret = ext4_block_write_begin(folio, 0, inline_size,
> + ret = ext4_block_write_begin(NULL, folio, 0, inline_size,
> ext4_da_get_block_prep);
> if (ret) {
> up_read(&EXT4_I(inode)->xattr_sem);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 4964c67e029e..a28f279fd02f 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -49,6 +49,11 @@
>
> #include <trace/events/ext4.h>
>
> +static void ext4_journalled_zero_new_buffers(handle_t *handle,
> + struct inode *inode,
> + struct folio *folio,
> + unsigned from, unsigned to);
> +
> static __u32 ext4_inode_csum(struct inode *inode, struct ext4_inode *raw,
> struct ext4_inode_info *ei)
> {
> @@ -1041,7 +1046,8 @@ int do_journal_get_write_access(handle_t *handle, struct inode *inode,
> return ret;
> }
>
> -int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
> +int ext4_block_write_begin(handle_t *handle, struct folio *folio,
> + loff_t pos, unsigned len,
> get_block_t *get_block)
> {
> unsigned from = pos & (PAGE_SIZE - 1);
> @@ -1055,6 +1061,7 @@ int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
> struct buffer_head *bh, *head, *wait[2];
> int nr_wait = 0;
> int i;
> + bool should_journal_data = ext4_should_journal_data(inode);
>
> BUG_ON(!folio_test_locked(folio));
> BUG_ON(from > PAGE_SIZE);
> @@ -1084,10 +1091,22 @@ int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
> if (err)
> break;
> if (buffer_new(bh)) {
> + /*
> + * We may be zeroing partial buffers or all new
> + * buffers in case of failure. Prepare JBD2 for
> + * that.
> + */
> + if (should_journal_data)
> + do_journal_get_write_access(handle,
> + inode, bh);
> if (folio_test_uptodate(folio)) {
> - clear_buffer_new(bh);
> + /*
> + * Unlike __block_write_begin() we leave
> + * dirtying of new uptodate buffers to
> + * ->write_end() time or
> + * folio_zero_new_buffers().
> + */
> set_buffer_uptodate(bh);
> - mark_buffer_dirty(bh);
> continue;
> }
> if (block_end > to || block_start < from)
> @@ -1117,7 +1136,11 @@ int ext4_block_write_begin(struct folio *folio, loff_t pos, unsigned len,
> err = -EIO;
> }
> if (unlikely(err)) {
> - folio_zero_new_buffers(folio, from, to);
> + if (should_journal_data)
> + ext4_journalled_zero_new_buffers(handle, inode, folio,
> + from, to);
> + else
> + folio_zero_new_buffers(folio, from, to);
> } else if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
> for (i = 0; i < nr_wait; i++) {
> int err2;
> @@ -1215,10 +1238,11 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
> folio_wait_stable(folio);
>
> if (ext4_should_dioread_nolock(inode))
> - ret = ext4_block_write_begin(folio, pos, len,
> + ret = ext4_block_write_begin(handle, folio, pos, len,
> ext4_get_block_unwritten);
> else
> - ret = ext4_block_write_begin(folio, pos, len, ext4_get_block);
> + ret = ext4_block_write_begin(handle, folio, pos, len,
> + ext4_get_block);
> if (!ret && ext4_should_journal_data(inode)) {
> ret = ext4_walk_page_buffers(handle, inode,
> folio_buffers(folio), from, to,
> @@ -2951,7 +2975,8 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
> if (IS_ERR(folio))
> return PTR_ERR(folio);
>
> - ret = ext4_block_write_begin(folio, pos, len, ext4_da_get_block_prep);
> + ret = ext4_block_write_begin(NULL, folio, pos, len,
> + ext4_da_get_block_prep);
> if (ret < 0) {
> folio_unlock(folio);
> folio_put(folio);
> @@ -6205,7 +6230,8 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
> if (folio_pos(folio) + len > size)
> len = size - folio_pos(folio);
>
> - err = __block_write_begin(&folio->page, 0, len, ext4_get_block);
> + err = ext4_block_write_begin(handle, folio, 0, len,
> + ext4_get_block);
> if (!err) {
> ret = VM_FAULT_SIGBUS;
> if (ext4_journal_folio_buffers(handle, folio, len))
> --
> 2.33.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 4/4] ext4: remove the special buffer dirty handling in do_journal_get_write_access
2024-08-30 5:37 [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer zhangshida
` (2 preceding siblings ...)
2024-08-30 5:37 ` [PATCH 3/4] ext4: fix a potential assertion failure due to improperly dirtied buffer zhangshida
@ 2024-08-30 5:37 ` zhangshida
2024-09-02 13:35 ` Jan Kara
2024-09-05 14:53 ` [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer Theodore Ts'o
4 siblings, 1 reply; 8+ messages in thread
From: zhangshida @ 2024-08-30 5:37 UTC (permalink / raw)
To: tytso, adilger.kernel, jack, ebiggers
Cc: linux-ext4, linux-kernel, zhangshida, starzhangzsd, Jan Kara
From: Shida Zhang <zhangshida@kylinos.cn>
This kinda revert the commit 56d35a4cd13e("ext4: Fix dirtying of
journalled buffers in data=journal mode") made by Jan 14 years ago,
since the do_get_write_access() itself can deal with the extra
unexpected buf dirting things in a proper way now.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
---
fs/ext4/inode.c | 18 +-----------------
1 file changed, 1 insertion(+), 17 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a28f279fd02f..2687bf451a25 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1023,27 +1023,11 @@ static int ext4_dirty_journalled_data(handle_t *handle, struct buffer_head *bh)
int do_journal_get_write_access(handle_t *handle, struct inode *inode,
struct buffer_head *bh)
{
- int dirty = buffer_dirty(bh);
- int ret;
-
if (!buffer_mapped(bh) || buffer_freed(bh))
return 0;
- /*
- * ext4_block_write_begin() could have dirtied some buffers. Clean
- * the dirty bit as jbd2_journal_get_write_access() could complain
- * otherwise about fs integrity issues. Setting of the dirty bit
- * by ext4_block_write_begin() isn't a real problem here as we clear
- * the bit before releasing a page lock and thus writeback cannot
- * ever write the buffer.
- */
- if (dirty)
- clear_buffer_dirty(bh);
BUFFER_TRACE(bh, "get write access");
- ret = ext4_journal_get_write_access(handle, inode->i_sb, bh,
+ return ext4_journal_get_write_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
- if (!ret && dirty)
- ret = ext4_dirty_journalled_data(handle, bh);
- return ret;
}
int ext4_block_write_begin(handle_t *handle, struct folio *folio,
--
2.33.0
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH 4/4] ext4: remove the special buffer dirty handling in do_journal_get_write_access
2024-08-30 5:37 ` [PATCH 4/4] ext4: remove the special buffer dirty handling in do_journal_get_write_access zhangshida
@ 2024-09-02 13:35 ` Jan Kara
0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2024-09-02 13:35 UTC (permalink / raw)
To: zhangshida
Cc: tytso, adilger.kernel, jack, ebiggers, linux-ext4, linux-kernel,
zhangshida, Jan Kara
On Fri 30-08-24 13:37:39, zhangshida wrote:
> From: Shida Zhang <zhangshida@kylinos.cn>
>
> This kinda revert the commit 56d35a4cd13e("ext4: Fix dirtying of
> journalled buffers in data=journal mode") made by Jan 14 years ago,
> since the do_get_write_access() itself can deal with the extra
> unexpected buf dirting things in a proper way now.
>
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 18 +-----------------
> 1 file changed, 1 insertion(+), 17 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index a28f279fd02f..2687bf451a25 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1023,27 +1023,11 @@ static int ext4_dirty_journalled_data(handle_t *handle, struct buffer_head *bh)
> int do_journal_get_write_access(handle_t *handle, struct inode *inode,
> struct buffer_head *bh)
> {
> - int dirty = buffer_dirty(bh);
> - int ret;
> -
> if (!buffer_mapped(bh) || buffer_freed(bh))
> return 0;
> - /*
> - * ext4_block_write_begin() could have dirtied some buffers. Clean
> - * the dirty bit as jbd2_journal_get_write_access() could complain
> - * otherwise about fs integrity issues. Setting of the dirty bit
> - * by ext4_block_write_begin() isn't a real problem here as we clear
> - * the bit before releasing a page lock and thus writeback cannot
> - * ever write the buffer.
> - */
> - if (dirty)
> - clear_buffer_dirty(bh);
> BUFFER_TRACE(bh, "get write access");
> - ret = ext4_journal_get_write_access(handle, inode->i_sb, bh,
> + return ext4_journal_get_write_access(handle, inode->i_sb, bh,
> EXT4_JTR_NONE);
> - if (!ret && dirty)
> - ret = ext4_dirty_journalled_data(handle, bh);
> - return ret;
> }
>
> int ext4_block_write_begin(handle_t *handle, struct folio *folio,
> --
> 2.33.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer
2024-08-30 5:37 [PATCH v6 0/4] Fix an error caused by improperly dirtied buffer zhangshida
` (3 preceding siblings ...)
2024-08-30 5:37 ` [PATCH 4/4] ext4: remove the special buffer dirty handling in do_journal_get_write_access zhangshida
@ 2024-09-05 14:53 ` Theodore Ts'o
4 siblings, 0 replies; 8+ messages in thread
From: Theodore Ts'o @ 2024-09-05 14:53 UTC (permalink / raw)
To: adilger.kernel, jack, ebiggers, zhangshida
Cc: Theodore Ts'o, linux-ext4, linux-kernel, zhangshida
On Fri, 30 Aug 2024 13:37:35 +0800, zhangshida wrote:
> On an old kernel version(4.19, ext3, data=journal, pagesize=64k),
> an assertion failure will occasionally be triggered by the line below:
> ---------
> jbd2_journal_commit_transaction
> {
> ...
> J_ASSERT_BH(bh, !buffer_dirty(bh));
> /*
> * The buffer on BJ_Forget list and not jbddirty means
> ...
> }
> ---------
>
> [...]
Applied, thanks!
[1/4] ext4: persist the new uptodate buffers in ext4_journalled_zero_new_buffers
commit: 3910b513fcdf33030798755c722174ef4a827d2a
[2/4] ext4: hoist ext4_block_write_begin and replace the __block_write_begin
commit: 6b730a405037501a260d6efd24782d2737e65d07
[3/4] ext4: fix a potential assertion failure due to improperly dirtied buffer
commit: cb3de5fc876ee9ef2b830c9e6cdafac5c90903ef
[4/4] ext4: remove the special buffer dirty handling in do_journal_get_write_access
commit: 183aa1d3baea18b199505455a0247b13de826e2f
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
^ permalink raw reply [flat|nested] 8+ messages in thread