* [PATCH] ext4: ext4_split_extent shoult take care about extent zeroout
@ 2013-01-31 7:24 Dmitry Monakhov
2013-01-31 13:35 ` Yongqiang Yang
2013-01-31 14:18 ` Jan Kara
0 siblings, 2 replies; 4+ messages in thread
From: Dmitry Monakhov @ 2013-01-31 7:24 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, xiaoqiangnk, Dmitry Monakhov
We have to update extent's state after first ext4_split_extent_at otherwise this result
in following trace:
->ext4_ext_handle_uninitialized_extents (ex=[1000:20:uninit], lblock 1000, max_blocks 10)
->ext4_split_extent_at(ex=[1000,128], lblk 10010) /// First split
->ext4_ext_split() -> ENOSPC
->ext4_ext_zeroout
->ext4_ext_dirty -> ex=[1000:20:init]
->ext4_split_extent_at(ex=[1000,128], lblk 10000) /// Second split
if(split == ee_block)
if (split_flag & EXT4_EXT_MARK_UNINIT2)
ext4_ext_mark_uninitialized(ex); ex=[1000:20:uninit] /// The bug!
->ext4_ext_dirty ->ex=[1000:20:uninit]
At the end ext4_convert_unwritten_extents_endio() will findout large uninitialized
extent.
TESTCASE: https://github.com/dmonakhov/xfstests/commit/1a1c4f337d4d198803436c63a56625b1a78d8a5e
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
fs/ext4/extents.c | 14 ++++++++++----
1 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 97cac01..7a3f679 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3091,18 +3091,24 @@ static int ext4_split_extent(handle_t *handle,
if (err)
goto out;
}
-
+ /* Update path is required because previous ext4_split_extent_at() may
+ * result in split of original leaf or extent zeroout.
+ */
ext4_ext_drop_refs(path);
path = ext4_ext_find_extent(inode, map->m_lblk, path);
if (IS_ERR(path))
return PTR_ERR(path);
+ depth = ext_depth(inode);
+ ex = path[depth].p_ext;
+ uninitialized = ext4_ext_is_uninitialized(ex);
if (map->m_lblk >= ee_block) {
split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT;
- if (uninitialized)
+ if (uninitialized) {
split_flag1 |= EXT4_EXT_MARK_UNINIT1;
- if (split_flag & EXT4_EXT_MARK_UNINIT2)
- split_flag1 |= EXT4_EXT_MARK_UNINIT2;
+ if (split_flag & EXT4_EXT_MARK_UNINIT2)
+ split_flag1 |= EXT4_EXT_MARK_UNINIT2;
+ }
err = ext4_split_extent_at(handle, inode, path,
map->m_lblk, split_flag1, flags);
if (err)
--
1.7.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] ext4: ext4_split_extent shoult take care about extent zeroout
2013-01-31 7:24 [PATCH] ext4: ext4_split_extent shoult take care about extent zeroout Dmitry Monakhov
@ 2013-01-31 13:35 ` Yongqiang Yang
2013-01-31 14:18 ` Jan Kara
1 sibling, 0 replies; 4+ messages in thread
From: Yongqiang Yang @ 2013-01-31 13:35 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Ext4 Developers List, Theodore Ts'o, Jan Kara
Hi Dmitry,
I am a little confusing. ext4_convert_unwritten_extents_endio() is
used to convert uninit extent to init, it finds out uninit extent is
the expected case. Am I missing something ?
I will look into the code this weekend.
In your description, extent [1000,128] is split to [1000,20] and
[1020, 108], right?
Thanks,
Yongqiang.
On Thu, Jan 31, 2013 at 3:24 PM, Dmitry Monakhov <dmonakhov@openvz.org> wrote:
> We have to update extent's state after first ext4_split_extent_at otherwise this result
> in following trace:
> ->ext4_ext_handle_uninitialized_extents (ex=[1000:20:uninit], lblock 1000, max_blocks 10)
> ->ext4_split_extent_at(ex=[1000,128], lblk 10010) /// First split
> ->ext4_ext_split() -> ENOSPC
> ->ext4_ext_zeroout
> ->ext4_ext_dirty -> ex=[1000:20:init]
> ->ext4_split_extent_at(ex=[1000,128], lblk 10000) /// Second split
> if(split == ee_block)
> if (split_flag & EXT4_EXT_MARK_UNINIT2)
> ext4_ext_mark_uninitialized(ex); ex=[1000:20:uninit] /// The bug!
> ->ext4_ext_dirty ->ex=[1000:20:uninit]
>
> At the end ext4_convert_unwritten_extents_endio() will findout large uninitialized
> extent.
>
> TESTCASE: https://github.com/dmonakhov/xfstests/commit/1a1c4f337d4d198803436c63a56625b1a78d8a5e
>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
> fs/ext4/extents.c | 14 ++++++++++----
> 1 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 97cac01..7a3f679 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3091,18 +3091,24 @@ static int ext4_split_extent(handle_t *handle,
> if (err)
> goto out;
> }
> -
> + /* Update path is required because previous ext4_split_extent_at() may
> + * result in split of original leaf or extent zeroout.
> + */
> ext4_ext_drop_refs(path);
> path = ext4_ext_find_extent(inode, map->m_lblk, path);
> if (IS_ERR(path))
> return PTR_ERR(path);
> + depth = ext_depth(inode);
> + ex = path[depth].p_ext;
> + uninitialized = ext4_ext_is_uninitialized(ex);
>
> if (map->m_lblk >= ee_block) {
> split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT;
> - if (uninitialized)
> + if (uninitialized) {
> split_flag1 |= EXT4_EXT_MARK_UNINIT1;
> - if (split_flag & EXT4_EXT_MARK_UNINIT2)
> - split_flag1 |= EXT4_EXT_MARK_UNINIT2;
> + if (split_flag & EXT4_EXT_MARK_UNINIT2)
> + split_flag1 |= EXT4_EXT_MARK_UNINIT2;
> + }
> err = ext4_split_extent_at(handle, inode, path,
> map->m_lblk, split_flag1, flags);
> if (err)
> --
> 1.7.1
>
--
Best Wishes
Yongqiang Yang
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] ext4: ext4_split_extent shoult take care about extent zeroout
2013-01-31 7:24 [PATCH] ext4: ext4_split_extent shoult take care about extent zeroout Dmitry Monakhov
2013-01-31 13:35 ` Yongqiang Yang
@ 2013-01-31 14:18 ` Jan Kara
2013-01-31 15:02 ` Dmitry Monakhov
1 sibling, 1 reply; 4+ messages in thread
From: Jan Kara @ 2013-01-31 14:18 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: linux-ext4, tytso, jack, xiaoqiangnk
On Thu 31-01-13 11:24:58, Dmitry Monakhov wrote:
> We have to update extent's state after first ext4_split_extent_at otherwise this result
> in following trace:
> ->ext4_ext_handle_uninitialized_extents (ex=[1000:20:uninit], lblock 1000, max_blocks 10)
> ->ext4_split_extent_at(ex=[1000,128], lblk 10010) /// First split
> ->ext4_ext_split() -> ENOSPC
> ->ext4_ext_zeroout
> ->ext4_ext_dirty -> ex=[1000:20:init]
> ->ext4_split_extent_at(ex=[1000,128], lblk 10000) /// Second split
> if(split == ee_block)
> if (split_flag & EXT4_EXT_MARK_UNINIT2)
> ext4_ext_mark_uninitialized(ex); ex=[1000:20:uninit] /// The bug!
> ->ext4_ext_dirty ->ex=[1000:20:uninit]
>
> At the end ext4_convert_unwritten_extents_endio() will findout large uninitialized
> extent.
Thanks for debugging this. You fix look correct so you can add
Reviewed-by: Jan Kara <jack@suse.cz>
but I have to say the above changelog isn't optimal. I had to look into
the code to verify you are actually speaking about what I think you are
speaking. I think there are some mistakes in block numbers and notation for
extents isn't comletely clear either. Can we make the changelog something
like:
When ext4_split_extent_at() ends up doing zeroout & conversion to
initialized instead of split & conversion, ext4_split_extent() gets
confused and can wrongly mark the extent back as uninitialized resulting in
end IO code getting confused from large unwritten extents (it doesn't
result in data corruption mostly by luck).
The example of problematic behavior is:
lblk len lblk len
ext4_split_extent() (ex=[1000,30,uninit], map=[1010,10])
ext4_split_extent_at() (split [1000,30,uninit] at 1020)
ext4_ext_insert_extent() -> ENOSPC
ext4_ext_zeroout()
-> extent [1000,30] is now initialized
ext4_split_extent_at() (split [1000,30,init] at 1010,
MARK_UNINIT1 | MARK_UNINIT2)
-> extent is split and parts marked as uninitialized
Fix the problem by rechecking extent type after the first
ext4_split_extent_at() returns.
---
What do you think? BTW: we don't have to further try to split the extent
once it gets initialized do we? For now I'd keep your fix just to make
ext4_split_extent() generic but noone really calls that function for
initialized extent or is interested in splitting once the extent gets
initialized. That code seriously needs some diet...
Honza
>
> TESTCASE: https://github.com/dmonakhov/xfstests/commit/1a1c4f337d4d198803436c63a56625b1a78d8a5e
>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
> fs/ext4/extents.c | 14 ++++++++++----
> 1 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 97cac01..7a3f679 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3091,18 +3091,24 @@ static int ext4_split_extent(handle_t *handle,
> if (err)
> goto out;
> }
> -
> + /* Update path is required because previous ext4_split_extent_at() may
> + * result in split of original leaf or extent zeroout.
> + */
Style nit comment should look like:
/*
* Update path is required because previous ext4_split_extent_at() may
* result in split of original leaf or extent zeroout.
*/
> ext4_ext_drop_refs(path);
> path = ext4_ext_find_extent(inode, map->m_lblk, path);
> if (IS_ERR(path))
> return PTR_ERR(path);
> + depth = ext_depth(inode);
> + ex = path[depth].p_ext;
> + uninitialized = ext4_ext_is_uninitialized(ex);
>
> if (map->m_lblk >= ee_block) {
> split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT;
> - if (uninitialized)
> + if (uninitialized) {
> split_flag1 |= EXT4_EXT_MARK_UNINIT1;
> - if (split_flag & EXT4_EXT_MARK_UNINIT2)
> - split_flag1 |= EXT4_EXT_MARK_UNINIT2;
> + if (split_flag & EXT4_EXT_MARK_UNINIT2)
> + split_flag1 |= EXT4_EXT_MARK_UNINIT2;
> + }
> err = ext4_split_extent_at(handle, inode, path,
> map->m_lblk, split_flag1, flags);
> if (err)
> --
> 1.7.1
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] ext4: ext4_split_extent shoult take care about extent zeroout
2013-01-31 14:18 ` Jan Kara
@ 2013-01-31 15:02 ` Dmitry Monakhov
0 siblings, 0 replies; 4+ messages in thread
From: Dmitry Monakhov @ 2013-01-31 15:02 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4, tytso, jack, xiaoqiangnk
On Thu, 31 Jan 2013 15:18:34 +0100, Jan Kara <jack@suse.cz> wrote:
> On Thu 31-01-13 11:24:58, Dmitry Monakhov wrote:
> > We have to update extent's state after first ext4_split_extent_at otherwise this result
> > in following trace:
> > ->ext4_ext_handle_uninitialized_extents (ex=[1000:20:uninit], lblock 1000, max_blocks 10)
> > ->ext4_split_extent_at(ex=[1000,128], lblk 10010) /// First split
> > ->ext4_ext_split() -> ENOSPC
> > ->ext4_ext_zeroout
> > ->ext4_ext_dirty -> ex=[1000:20:init]
> > ->ext4_split_extent_at(ex=[1000,128], lblk 10000) /// Second split
> > if(split == ee_block)
> > if (split_flag & EXT4_EXT_MARK_UNINIT2)
> > ext4_ext_mark_uninitialized(ex); ex=[1000:20:uninit] /// The bug!
> > ->ext4_ext_dirty ->ex=[1000:20:uninit]
> >
> > At the end ext4_convert_unwritten_extents_endio() will findout large uninitialized
> > extent.
> Thanks for debugging this. You fix look correct so you can add
> Reviewed-by: Jan Kara <jack@suse.cz>
Actually patch itself is sub-optimal because second split probably will
also hit ENOSPC and endup with second zeroout. Off course ENOSPC is not
the place where optimization should take place but still.
I've already send updated version here:
message-id:<1359643738-22435-1-git-send-email-dmonakhov@openvz.org>
> but I have to say the above changelog isn't optimal. I had to look into
> the code to verify you are actually speaking about what I think you are
> speaking. I think there are some mistakes in block numbers and notation for
> extents isn't comletely clear either. Can we make the changelog something
> like:
>
> When ext4_split_extent_at() ends up doing zeroout & conversion to
> initialized instead of split & conversion, ext4_split_extent() gets
> confused and can wrongly mark the extent back as uninitialized resulting in
> end IO code getting confused from large unwritten extents (it doesn't
> result in data corruption mostly by luck).
But it is likely to result in data loss because xxx_endio also probably
failed to split extent due to ENOSPC.
>
> The example of problematic behavior is:
> lblk len lblk len
> ext4_split_extent() (ex=[1000,30,uninit], map=[1010,10])
> ext4_split_extent_at() (split [1000,30,uninit] at 1020)
> ext4_ext_insert_extent() -> ENOSPC
> ext4_ext_zeroout()
> -> extent [1000,30] is now initialized
> ext4_split_extent_at() (split [1000,30,init] at 1010,
> MARK_UNINIT1 | MARK_UNINIT2)
> -> extent is split and parts marked as uninitialized
>
> Fix the problem by rechecking extent type after the first
> ext4_split_extent_at() returns.
> ---
>
> What do you think? BTW: we don't have to further try to split the extent
> once it gets initialized do we? For now I'd keep your fix just to make
> ext4_split_extent() generic but noone really calls that function for
> initialized extent or is interested in splitting once the extent gets
> initialized. That code seriously needs some diet...
That is correct, but it is not illegal to split initialized extents
(punch_hole theoretically may use that). But what I'm absolutely sure that
(MARK_UNINIT1|MARK_UNINIT2|MAY_ZEROOUT) flags is not applicable to
initialized extent (see new bugon in second version of my patch)
>
> Honza
>
> >
> > TESTCASE: https://github.com/dmonakhov/xfstests/commit/1a1c4f337d4d198803436c63a56625b1a78d8a5e
> >
> > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> > ---
> > fs/ext4/extents.c | 14 ++++++++++----
> > 1 files changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index 97cac01..7a3f679 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -3091,18 +3091,24 @@ static int ext4_split_extent(handle_t *handle,
> > if (err)
> > goto out;
> > }
> > -
> > + /* Update path is required because previous ext4_split_extent_at() may
> > + * result in split of original leaf or extent zeroout.
> > + */
> Style nit comment should look like:
> /*
> * Update path is required because previous ext4_split_extent_at() may
> * result in split of original leaf or extent zeroout.
> */
>
> > ext4_ext_drop_refs(path);
> > path = ext4_ext_find_extent(inode, map->m_lblk, path);
> > if (IS_ERR(path))
> > return PTR_ERR(path);
> > + depth = ext_depth(inode);
> > + ex = path[depth].p_ext;
> > + uninitialized = ext4_ext_is_uninitialized(ex);
> >
> > if (map->m_lblk >= ee_block) {
> > split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT;
> > - if (uninitialized)
> > + if (uninitialized) {
> > split_flag1 |= EXT4_EXT_MARK_UNINIT1;
> > - if (split_flag & EXT4_EXT_MARK_UNINIT2)
> > - split_flag1 |= EXT4_EXT_MARK_UNINIT2;
> > + if (split_flag & EXT4_EXT_MARK_UNINIT2)
> > + split_flag1 |= EXT4_EXT_MARK_UNINIT2;
> > + }
> > err = ext4_split_extent_at(handle, inode, path,
> > map->m_lblk, split_flag1, flags);
> > if (err)
> > --
> > 1.7.1
> >
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-01-31 15:02 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-31 7:24 [PATCH] ext4: ext4_split_extent shoult take care about extent zeroout Dmitry Monakhov
2013-01-31 13:35 ` Yongqiang Yang
2013-01-31 14:18 ` Jan Kara
2013-01-31 15:02 ` Dmitry Monakhov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).