* [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range()
@ 2023-11-01 16:38 Ojaswin Mujoo
2023-11-01 16:38 ` [PATCH v2 1/2] ext4: treat end of range as exclusive in ext4_zero_range() Ojaswin Mujoo
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Ojaswin Mujoo @ 2023-11-01 16:38 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o; +Cc: Ritesh Harjani, linux-kernel, Jan Kara
** Changes v1 -> v2 **
* Instead of exiting early in ext4_block_zero_page_range() for unwrit
bhs like in v1, lets let the current logic be as it is and instead
document the handing of unwritten buffer heads to make the intent
clear
v1:
https://lore.kernel.org/linux-ext4/20231019165546.norapdphdyx7g3ob@quack3/T/#mbd0ab69d55487493edbd465b3882051e5bc2365d
** Original Cover Letter **
As per discussion with Jan here [1], this patchset intends to exit early
from __ext4_block_zero_page_range() incase the block we are about to
zero (partially) is unwritten and unmapped, since such a block doesn't
require zeroing.
Further, also make sure that calls to ext4_zero_partial_blocks()
truncate the page cache completely beforehand, so that they don't rely
on ext4_zero_partial_block() -> __ext4_block_zero_page_range() to zero
out non block aligned edges of pagecache.
Reviews and comments are appreciated!
Regards,
ojaswin
[1]
https://lore.kernel.org/linux-ext4/20230914141920.lw2nlpzhcxwuz2y6@quack3/
Ojaswin Mujoo (2):
ext4: treat end of range as exclusive in ext4_zero_range()
ext4: Clarify handling of unwritten bh in
__ext4_block_zero_page_range()
fs/ext4/extents.c | 6 ++++--
fs/ext4/inode.c | 6 ++++++
2 files changed, 10 insertions(+), 2 deletions(-)
--
2.39.3
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 1/2] ext4: treat end of range as exclusive in ext4_zero_range()
2023-11-01 16:38 [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range() Ojaswin Mujoo
@ 2023-11-01 16:38 ` Ojaswin Mujoo
2023-11-02 10:08 ` Jan Kara
2023-11-01 16:38 ` [PATCH v2 2/2] ext4: Clarify handling of unwritten bh in __ext4_block_zero_page_range() Ojaswin Mujoo
2024-01-09 2:53 ` [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range() Theodore Ts'o
2 siblings, 1 reply; 6+ messages in thread
From: Ojaswin Mujoo @ 2023-11-01 16:38 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o; +Cc: Ritesh Harjani, linux-kernel, Jan Kara
The call to filemap_write_and_wait_range() assumes the range passed to be
inclusive, so fix the call to make sure we follow that.
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
fs/ext4/extents.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 880f383df684..265ae30a51b9 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4522,7 +4522,8 @@ static long ext4_zero_range(struct file *file, loff_t offset,
* Round up offset. This is not fallocate, we need to zero out
* blocks, so convert interior block aligned part of the range to
* unwritten and possibly manually zero out unaligned parts of the
- * range.
+ * range. Here, start and partial_begin are inclusive, end and
+ * partial_end are exclusive.
*/
start = round_up(offset, 1 << blkbits);
end = round_down((offset + len), 1 << blkbits);
@@ -4608,7 +4609,8 @@ static long ext4_zero_range(struct file *file, loff_t offset,
* disk in case of crash before zeroing trans is committed.
*/
if (ext4_should_journal_data(inode)) {
- ret = filemap_write_and_wait_range(mapping, start, end);
+ ret = filemap_write_and_wait_range(mapping, start,
+ end - 1);
if (ret) {
filemap_invalidate_unlock(mapping);
goto out_mutex;
--
2.39.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/2] ext4: Clarify handling of unwritten bh in __ext4_block_zero_page_range()
2023-11-01 16:38 [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range() Ojaswin Mujoo
2023-11-01 16:38 ` [PATCH v2 1/2] ext4: treat end of range as exclusive in ext4_zero_range() Ojaswin Mujoo
@ 2023-11-01 16:38 ` Ojaswin Mujoo
2023-11-02 10:08 ` Jan Kara
2024-01-09 2:53 ` [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range() Theodore Ts'o
2 siblings, 1 reply; 6+ messages in thread
From: Ojaswin Mujoo @ 2023-11-01 16:38 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o; +Cc: Ritesh Harjani, linux-kernel, Jan Kara
As an optimization, I was trying to work on exiting early from this
function if dealing with unwritten extent since they anyways read 0.
However, it was realised that there are certain code paths that can
end up calling ext4_block_zero_page_range() for an unwritten bh that
might still have data in pagecache. In this case, we can't exit early
and we do require to process the bh and zero out the pagecache to ensure
that a writeback can't kick in at a later time and flush the stale
pagecache to disk.
Since, adding the logic to exit early for unwritten bh was turning out
to be much more nuanced and the current code already handles it well,
just add a comment to explicitly document this behavior.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
fs/ext4/inode.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d7732320431a..76921e834dd4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3632,6 +3632,12 @@ void ext4_set_aops(struct inode *inode)
inode->i_mapping->a_ops = &ext4_aops;
}
+/*
+ * Here we can't skip an unwritten buffer even though it usually reads zero
+ * because it might have data in pagecache (eg, if called from ext4_zero_range,
+ * ext4_punch_hole, etc) which needs to be properly zeroed out. Otherwise a
+ * racing writeback can come later and flush the stale pagecache to disk.
+ */
static int __ext4_block_zero_page_range(handle_t *handle,
struct address_space *mapping, loff_t from, loff_t length)
{
--
2.39.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] ext4: treat end of range as exclusive in ext4_zero_range()
2023-11-01 16:38 ` [PATCH v2 1/2] ext4: treat end of range as exclusive in ext4_zero_range() Ojaswin Mujoo
@ 2023-11-02 10:08 ` Jan Kara
0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2023-11-02 10:08 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: linux-ext4, Theodore Ts'o, Ritesh Harjani, linux-kernel,
Jan Kara
On Wed 01-11-23 22:08:10, Ojaswin Mujoo wrote:
> The call to filemap_write_and_wait_range() assumes the range passed to be
> inclusive, so fix the call to make sure we follow that.
>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/extents.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 880f383df684..265ae30a51b9 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4522,7 +4522,8 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> * Round up offset. This is not fallocate, we need to zero out
> * blocks, so convert interior block aligned part of the range to
> * unwritten and possibly manually zero out unaligned parts of the
> - * range.
> + * range. Here, start and partial_begin are inclusive, end and
> + * partial_end are exclusive.
> */
> start = round_up(offset, 1 << blkbits);
> end = round_down((offset + len), 1 << blkbits);
> @@ -4608,7 +4609,8 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> * disk in case of crash before zeroing trans is committed.
> */
> if (ext4_should_journal_data(inode)) {
> - ret = filemap_write_and_wait_range(mapping, start, end);
> + ret = filemap_write_and_wait_range(mapping, start,
> + end - 1);
> if (ret) {
> filemap_invalidate_unlock(mapping);
> goto out_mutex;
> --
> 2.39.3
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 2/2] ext4: Clarify handling of unwritten bh in __ext4_block_zero_page_range()
2023-11-01 16:38 ` [PATCH v2 2/2] ext4: Clarify handling of unwritten bh in __ext4_block_zero_page_range() Ojaswin Mujoo
@ 2023-11-02 10:08 ` Jan Kara
0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2023-11-02 10:08 UTC (permalink / raw)
To: Ojaswin Mujoo
Cc: linux-ext4, Theodore Ts'o, Ritesh Harjani, linux-kernel,
Jan Kara
On Wed 01-11-23 22:08:11, Ojaswin Mujoo wrote:
> As an optimization, I was trying to work on exiting early from this
> function if dealing with unwritten extent since they anyways read 0.
> However, it was realised that there are certain code paths that can
> end up calling ext4_block_zero_page_range() for an unwritten bh that
> might still have data in pagecache. In this case, we can't exit early
> and we do require to process the bh and zero out the pagecache to ensure
> that a writeback can't kick in at a later time and flush the stale
> pagecache to disk.
>
> Since, adding the logic to exit early for unwritten bh was turning out
> to be much more nuanced and the current code already handles it well,
> just add a comment to explicitly document this behavior.
>
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d7732320431a..76921e834dd4 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3632,6 +3632,12 @@ void ext4_set_aops(struct inode *inode)
> inode->i_mapping->a_ops = &ext4_aops;
> }
>
> +/*
> + * Here we can't skip an unwritten buffer even though it usually reads zero
> + * because it might have data in pagecache (eg, if called from ext4_zero_range,
> + * ext4_punch_hole, etc) which needs to be properly zeroed out. Otherwise a
> + * racing writeback can come later and flush the stale pagecache to disk.
> + */
> static int __ext4_block_zero_page_range(handle_t *handle,
> struct address_space *mapping, loff_t from, loff_t length)
> {
> --
> 2.39.3
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range()
2023-11-01 16:38 [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range() Ojaswin Mujoo
2023-11-01 16:38 ` [PATCH v2 1/2] ext4: treat end of range as exclusive in ext4_zero_range() Ojaswin Mujoo
2023-11-01 16:38 ` [PATCH v2 2/2] ext4: Clarify handling of unwritten bh in __ext4_block_zero_page_range() Ojaswin Mujoo
@ 2024-01-09 2:53 ` Theodore Ts'o
2 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2024-01-09 2:53 UTC (permalink / raw)
To: linux-ext4, Ojaswin Mujoo
Cc: Theodore Ts'o, Ritesh Harjani, linux-kernel, Jan Kara
On Wed, 01 Nov 2023 22:08:09 +0530, Ojaswin Mujoo wrote:
> ** Changes v1 -> v2 **
>
> * Instead of exiting early in ext4_block_zero_page_range() for unwrit
> bhs like in v1, lets let the current logic be as it is and instead
> document the handing of unwritten buffer heads to make the intent
> clear
>
> [...]
Applied, thanks!
[1/2] ext4: treat end of range as exclusive in ext4_zero_range()
commit: 92573369144f40397e8514440afdf59f24905b40
[2/2] ext4: Clarify handling of unwritten bh in __ext4_block_zero_page_range()
commit: c6bfd724098457a1162a7b9fef07af176720055b
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-01-09 2:53 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-01 16:38 [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range() Ojaswin Mujoo
2023-11-01 16:38 ` [PATCH v2 1/2] ext4: treat end of range as exclusive in ext4_zero_range() Ojaswin Mujoo
2023-11-02 10:08 ` Jan Kara
2023-11-01 16:38 ` [PATCH v2 2/2] ext4: Clarify handling of unwritten bh in __ext4_block_zero_page_range() Ojaswin Mujoo
2023-11-02 10:08 ` Jan Kara
2024-01-09 2:53 ` [PATCH v2 0/2] Document handing of unwritten bh in ext4_block_zero_page_range() Theodore Ts'o
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox