* [PATCH] btrfs: fix outstanding extents calculation
@ 2022-03-28 12:32 Naohiro Aota
2022-03-28 12:58 ` Johannes Thumshirn
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Naohiro Aota @ 2022-03-28 12:32 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota
Running generic/406 causes the following WARNING in btrfs_destroy_inode()
which tells there is outstanding extents left.
In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
extents with btrfs_delalloc_reserve_metadata() (or indirectly from
btrfs_delalloc_reserve_space(()). We then release the outstanding extents
with btrfs_delalloc_release_extents(). However, the "len" can be modified
in the CoW case, which releasing fewer outstanding extents than expected.
Fix it by calling btrfs_delalloc_release_extents() for the original length.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
Modules linked in: btrfs blake2b_generic xor lzo_compress
lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
zsmalloc
CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
destroy_inode+0x33/0x70
dispose_list+0x43/0x60
evict_inodes+0x161/0x1b0
generic_shutdown_super+0x2d/0x110
kill_anon_super+0xf/0x20
btrfs_kill_super+0xd/0x20 [btrfs]
deactivate_locked_super+0x27/0x90
cleanup_mnt+0x12c/0x180
task_work_run+0x54/0x80
exit_to_user_mode_prepare+0x152/0x160
syscall_exit_to_user_mode+0x12/0x30
do_syscall_64+0x42/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f854a000fb7
Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
CC: stable@vger.kernel.org # 5.17
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
fs/btrfs/inode.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c7b15634fe70..5c0c54057921 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7409,6 +7409,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
u64 block_start, orig_start, orig_block_len, ram_bytes;
bool can_nocow = false;
bool space_reserved = false;
+ u64 prev_len;
int ret = 0;
/*
@@ -7436,6 +7437,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
can_nocow = true;
}
+ prev_len = len;
if (can_nocow) {
struct extent_map *em2;
@@ -7465,8 +7467,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
goto out;
}
} else {
- const u64 prev_len = len;
-
/* Our caller expects us to free the input extent map. */
free_extent_map(em);
*map = NULL;
@@ -7497,7 +7497,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
* We have created our ordered extent, so we can now release our reservation
* for an outstanding extent.
*/
- btrfs_delalloc_release_extents(BTRFS_I(inode), len);
+ btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
/*
* Need to update the i_size under the extent lock so buffered
--
2.35.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 12:32 [PATCH] btrfs: fix outstanding extents calculation Naohiro Aota
@ 2022-03-28 12:58 ` Johannes Thumshirn
2022-03-28 13:08 ` Filipe Manana
2022-03-28 19:26 ` David Sterba
2 siblings, 0 replies; 11+ messages in thread
From: Johannes Thumshirn @ 2022-03-28 12:58 UTC (permalink / raw)
To: Naohiro Aota, linux-btrfs@vger.kernel.org
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 12:32 [PATCH] btrfs: fix outstanding extents calculation Naohiro Aota
2022-03-28 12:58 ` Johannes Thumshirn
@ 2022-03-28 13:08 ` Filipe Manana
2022-03-28 13:14 ` Johannes Thumshirn
2022-03-28 13:17 ` Filipe Manana
2022-03-28 19:26 ` David Sterba
2 siblings, 2 replies; 11+ messages in thread
From: Filipe Manana @ 2022-03-28 13:08 UTC (permalink / raw)
To: Naohiro Aota; +Cc: linux-btrfs
On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> which tells there is outstanding extents left.
I can't trigger the warning with generic/406.
Any special setup or config to trigger it?
The change looks fine to me, however I'm curious why this isn't triggered
with generic/406, nor anyone else reported it before, since the test is
fully deterministic.
Thanks.
>
> In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> with btrfs_delalloc_release_extents(). However, the "len" can be modified
> in the CoW case, which releasing fewer outstanding extents than expected.
>
> Fix it by calling btrfs_delalloc_release_extents() for the original length.
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> Modules linked in: btrfs blake2b_generic xor lzo_compress
> lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
> zsmalloc
> CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
> 65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
> 83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
> RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
> RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
> RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
> R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
> R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
> FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> destroy_inode+0x33/0x70
> dispose_list+0x43/0x60
> evict_inodes+0x161/0x1b0
> generic_shutdown_super+0x2d/0x110
> kill_anon_super+0xf/0x20
> btrfs_kill_super+0xd/0x20 [btrfs]
> deactivate_locked_super+0x27/0x90
> cleanup_mnt+0x12c/0x180
> task_work_run+0x54/0x80
> exit_to_user_mode_prepare+0x152/0x160
> syscall_exit_to_user_mode+0x12/0x30
> do_syscall_64+0x42/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f854a000fb7
>
> Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> CC: stable@vger.kernel.org # 5.17
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
> fs/btrfs/inode.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index c7b15634fe70..5c0c54057921 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7409,6 +7409,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> u64 block_start, orig_start, orig_block_len, ram_bytes;
> bool can_nocow = false;
> bool space_reserved = false;
> + u64 prev_len;
> int ret = 0;
>
> /*
> @@ -7436,6 +7437,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> can_nocow = true;
> }
>
> + prev_len = len;
> if (can_nocow) {
> struct extent_map *em2;
>
> @@ -7465,8 +7467,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> goto out;
> }
> } else {
> - const u64 prev_len = len;
> -
> /* Our caller expects us to free the input extent map. */
> free_extent_map(em);
> *map = NULL;
> @@ -7497,7 +7497,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> * We have created our ordered extent, so we can now release our reservation
> * for an outstanding extent.
> */
> - btrfs_delalloc_release_extents(BTRFS_I(inode), len);
> + btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
>
> /*
> * Need to update the i_size under the extent lock so buffered
> --
> 2.35.1
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 13:08 ` Filipe Manana
@ 2022-03-28 13:14 ` Johannes Thumshirn
2022-03-28 13:21 ` Filipe Manana
2022-03-28 13:17 ` Filipe Manana
1 sibling, 1 reply; 11+ messages in thread
From: Johannes Thumshirn @ 2022-03-28 13:14 UTC (permalink / raw)
To: Filipe Manana, Naohiro Aota; +Cc: linux-btrfs@vger.kernel.org
On 28/03/2022 15:09, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
>> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
>> which tells there is outstanding extents left.
>
> I can't trigger the warning with generic/406.
> Any special setup or config to trigger it?
>
> The change looks fine to me, however I'm curious why this isn't triggered
> with generic/406, nor anyone else reported it before, since the test is
> fully deterministic.
>
I am able to trigger the WARN() with a different test (which is for a different,
not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 13:08 ` Filipe Manana
2022-03-28 13:14 ` Johannes Thumshirn
@ 2022-03-28 13:17 ` Filipe Manana
2022-03-28 13:40 ` Naohiro Aota
1 sibling, 1 reply; 11+ messages in thread
From: Filipe Manana @ 2022-03-28 13:17 UTC (permalink / raw)
To: Naohiro Aota; +Cc: linux-btrfs
On Mon, Mar 28, 2022 at 02:08:30PM +0100, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > which tells there is outstanding extents left.
>
> I can't trigger the warning with generic/406.
> Any special setup or config to trigger it?
>
> The change looks fine to me, however I'm curious why this isn't triggered
> with generic/406, nor anyone else reported it before, since the test is
> fully deterministic.
Also the subject "btrfs: fix outstanding extents calculation" is confusing,
as the problem is not in the outstanding extents calculation code, but
rather not releasing delalloc by the correct amount in a specific code path.
So something like "btrfs: release correct delalloc amount in direct IO write path"
would be more correct and specific.
>
> Thanks.
>
> >
> > In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> > extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> > btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> > with btrfs_delalloc_release_extents(). However, the "len" can be modified
> > in the CoW case, which releasing fewer outstanding extents than expected.
> >
> > Fix it by calling btrfs_delalloc_release_extents() for the original length.
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> > Modules linked in: btrfs blake2b_generic xor lzo_compress
> > lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
> > zsmalloc
> > CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> > RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> > Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
> > 65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
> > 83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
> > RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
> > RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
> > RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
> > RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
> > R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
> > R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
> > FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <TASK>
> > destroy_inode+0x33/0x70
> > dispose_list+0x43/0x60
> > evict_inodes+0x161/0x1b0
> > generic_shutdown_super+0x2d/0x110
> > kill_anon_super+0xf/0x20
> > btrfs_kill_super+0xd/0x20 [btrfs]
> > deactivate_locked_super+0x27/0x90
> > cleanup_mnt+0x12c/0x180
> > task_work_run+0x54/0x80
> > exit_to_user_mode_prepare+0x152/0x160
> > syscall_exit_to_user_mode+0x12/0x30
> > do_syscall_64+0x42/0x80
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7f854a000fb7
> >
> > Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> > CC: stable@vger.kernel.org # 5.17
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > ---
> > fs/btrfs/inode.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index c7b15634fe70..5c0c54057921 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -7409,6 +7409,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > u64 block_start, orig_start, orig_block_len, ram_bytes;
> > bool can_nocow = false;
> > bool space_reserved = false;
> > + u64 prev_len;
> > int ret = 0;
> >
> > /*
> > @@ -7436,6 +7437,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > can_nocow = true;
> > }
> >
> > + prev_len = len;
> > if (can_nocow) {
> > struct extent_map *em2;
> >
> > @@ -7465,8 +7467,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > goto out;
> > }
> > } else {
> > - const u64 prev_len = len;
> > -
> > /* Our caller expects us to free the input extent map. */
> > free_extent_map(em);
> > *map = NULL;
> > @@ -7497,7 +7497,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > * We have created our ordered extent, so we can now release our reservation
> > * for an outstanding extent.
> > */
> > - btrfs_delalloc_release_extents(BTRFS_I(inode), len);
> > + btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
> >
> > /*
> > * Need to update the i_size under the extent lock so buffered
> > --
> > 2.35.1
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 13:14 ` Johannes Thumshirn
@ 2022-03-28 13:21 ` Filipe Manana
2022-03-28 13:22 ` Johannes Thumshirn
2022-03-28 13:36 ` Naohiro Aota
0 siblings, 2 replies; 11+ messages in thread
From: Filipe Manana @ 2022-03-28 13:21 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: Naohiro Aota, linux-btrfs@vger.kernel.org
On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
> On 28/03/2022 15:09, Filipe Manana wrote:
> > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> >> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> >> which tells there is outstanding extents left.
> >
> > I can't trigger the warning with generic/406.
> > Any special setup or config to trigger it?
> >
> > The change looks fine to me, however I'm curious why this isn't triggered
> > with generic/406, nor anyone else reported it before, since the test is
> > fully deterministic.
> >
>
> I am able to trigger the WARN() with a different test (which is for a different,
> not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
I have no doubts about the fix being correct.
I'm just puzzled why I can't trigger it with generic/406, given that it's a very
deterministic test.
If there's any special config or setup (mount options, zoned fs, etc), I would
like to have it explicitly mentioned in the changelog.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 13:21 ` Filipe Manana
@ 2022-03-28 13:22 ` Johannes Thumshirn
2022-03-28 13:36 ` Naohiro Aota
1 sibling, 0 replies; 11+ messages in thread
From: Johannes Thumshirn @ 2022-03-28 13:22 UTC (permalink / raw)
To: Filipe Manana; +Cc: Naohiro Aota, linux-btrfs@vger.kernel.org
On 28/03/2022 15:21, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
>> On 28/03/2022 15:09, Filipe Manana wrote:
>>> On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
>>>> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
>>>> which tells there is outstanding extents left.
>>>
>>> I can't trigger the warning with generic/406.
>>> Any special setup or config to trigger it?
>>>
>>> The change looks fine to me, however I'm curious why this isn't triggered
>>> with generic/406, nor anyone else reported it before, since the test is
>>> fully deterministic.
>>>
>>
>> I am able to trigger the WARN() with a different test (which is for a different,
>> not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
>
> I have no doubts about the fix being correct.
> I'm just puzzled why I can't trigger it with generic/406, given that it's a very
> deterministic test.
>
> If there's any special config or setup (mount options, zoned fs, etc), I would
> like to have it explicitly mentioned in the changelog.
>
This I have to deferre to Naohiro, he was the one who was able to reproduce it
with generic/406 (on a non-zoned fs IIRC)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 13:21 ` Filipe Manana
2022-03-28 13:22 ` Johannes Thumshirn
@ 2022-03-28 13:36 ` Naohiro Aota
2022-03-28 13:45 ` Filipe Manana
1 sibling, 1 reply; 11+ messages in thread
From: Naohiro Aota @ 2022-03-28 13:36 UTC (permalink / raw)
To: Filipe Manana; +Cc: Johannes Thumshirn, linux-btrfs@vger.kernel.org
On Mon, Mar 28, 2022 at 02:21:28PM +0100, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
> > On 28/03/2022 15:09, Filipe Manana wrote:
> > > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > >> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > >> which tells there is outstanding extents left.
> > >
> > > I can't trigger the warning with generic/406.
> > > Any special setup or config to trigger it?
> > >
> > > The change looks fine to me, however I'm curious why this isn't triggered
> > > with generic/406, nor anyone else reported it before, since the test is
> > > fully deterministic.
> > >
> >
> > I am able to trigger the WARN() with a different test (which is for a different,
> > not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
>
> I have no doubts about the fix being correct.
> I'm just puzzled why I can't trigger it with generic/406, given that it's a very
> deterministic test.
>
> If there's any special config or setup (mount options, zoned fs, etc), I would
> like to have it explicitly mentioned in the changelog.
I don't think it's super special. I can always reproduce it on 1GB
zram device. Here is the mkfs setup, and no mount options are
specified.
++ mkfs.btrfs -f -d single -m single /dev/zram0
btrfs-progs v5.16.2
See http://btrfs.wiki.kernel.org for more information.
Performing full device TRIM /dev/zram0 (1.00GiB) ...
NOTE: several default settings have changed in version 5.15, please make sure
this does not affect your deployments:
- DUP for metadata (-m dup)
- enabled no-holes (-O no-holes)
- enabled free-space-tree (-R free-space-tree)
Label: (null)
UUID: b7260fb1-fa0e-4acd-8c3d-0530799a9fd3
Node size: 16384
Sector size: 4096
Filesystem size: 1.00GiB
Block group profiles:
Data: single 8.00MiB
Metadata: single 8.00MiB
System: single 4.00MiB
SSD detected: yes
Zoned device: no
Incompat features: extref, skinny-metadata, no-holes
Runtime features: free-space-tree
Checksum: crc32c
Number of devices: 1
Devices:
ID SIZE PATH
1 1.00GiB /dev/zram0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 13:17 ` Filipe Manana
@ 2022-03-28 13:40 ` Naohiro Aota
0 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2022-03-28 13:40 UTC (permalink / raw)
To: Filipe Manana; +Cc: linux-btrfs@vger.kernel.org
On Mon, Mar 28, 2022 at 02:17:49PM +0100, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 02:08:30PM +0100, Filipe Manana wrote:
> > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > > Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > > which tells there is outstanding extents left.
> >
> > I can't trigger the warning with generic/406.
> > Any special setup or config to trigger it?
> >
> > The change looks fine to me, however I'm curious why this isn't triggered
> > with generic/406, nor anyone else reported it before, since the test is
> > fully deterministic.
>
> Also the subject "btrfs: fix outstanding extents calculation" is confusing,
> as the problem is not in the outstanding extents calculation code, but
> rather not releasing delalloc by the correct amount in a specific code path.
>
> So something like "btrfs: release correct delalloc amount in direct IO write path"
> would be more correct and specific.
Thank you. That is much more informative. I'll fix the subject to it.
> >
> > Thanks.
> >
> > >
> > > In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> > > extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> > > btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> > > with btrfs_delalloc_release_extents(). However, the "len" can be modified
> > > in the CoW case, which releasing fewer outstanding extents than expected.
> > >
> > > Fix it by calling btrfs_delalloc_release_extents() for the original length.
> > >
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> > > Modules linked in: btrfs blake2b_generic xor lzo_compress
> > > lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
> > > zsmalloc
> > > CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> > > RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> > > Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
> > > 65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
> > > 83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
> > > RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
> > > RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
> > > RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
> > > RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
> > > R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
> > > R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
> > > FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > > <TASK>
> > > destroy_inode+0x33/0x70
> > > dispose_list+0x43/0x60
> > > evict_inodes+0x161/0x1b0
> > > generic_shutdown_super+0x2d/0x110
> > > kill_anon_super+0xf/0x20
> > > btrfs_kill_super+0xd/0x20 [btrfs]
> > > deactivate_locked_super+0x27/0x90
> > > cleanup_mnt+0x12c/0x180
> > > task_work_run+0x54/0x80
> > > exit_to_user_mode_prepare+0x152/0x160
> > > syscall_exit_to_user_mode+0x12/0x30
> > > do_syscall_64+0x42/0x80
> > > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > RIP: 0033:0x7f854a000fb7
> > >
> > > Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> > > CC: stable@vger.kernel.org # 5.17
> > > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > > ---
> > > fs/btrfs/inode.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > > index c7b15634fe70..5c0c54057921 100644
> > > --- a/fs/btrfs/inode.c
> > > +++ b/fs/btrfs/inode.c
> > > @@ -7409,6 +7409,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > > u64 block_start, orig_start, orig_block_len, ram_bytes;
> > > bool can_nocow = false;
> > > bool space_reserved = false;
> > > + u64 prev_len;
> > > int ret = 0;
> > >
> > > /*
> > > @@ -7436,6 +7437,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > > can_nocow = true;
> > > }
> > >
> > > + prev_len = len;
> > > if (can_nocow) {
> > > struct extent_map *em2;
> > >
> > > @@ -7465,8 +7467,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > > goto out;
> > > }
> > > } else {
> > > - const u64 prev_len = len;
> > > -
> > > /* Our caller expects us to free the input extent map. */
> > > free_extent_map(em);
> > > *map = NULL;
> > > @@ -7497,7 +7497,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > > * We have created our ordered extent, so we can now release our reservation
> > > * for an outstanding extent.
> > > */
> > > - btrfs_delalloc_release_extents(BTRFS_I(inode), len);
> > > + btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
> > >
> > > /*
> > > * Need to update the i_size under the extent lock so buffered
> > > --
> > > 2.35.1
> > >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 13:36 ` Naohiro Aota
@ 2022-03-28 13:45 ` Filipe Manana
0 siblings, 0 replies; 11+ messages in thread
From: Filipe Manana @ 2022-03-28 13:45 UTC (permalink / raw)
To: Naohiro Aota; +Cc: Johannes Thumshirn, linux-btrfs@vger.kernel.org
On Mon, Mar 28, 2022 at 01:36:36PM +0000, Naohiro Aota wrote:
> On Mon, Mar 28, 2022 at 02:21:28PM +0100, Filipe Manana wrote:
> > On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
> > > On 28/03/2022 15:09, Filipe Manana wrote:
> > > > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > > >> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > > >> which tells there is outstanding extents left.
> > > >
> > > > I can't trigger the warning with generic/406.
> > > > Any special setup or config to trigger it?
> > > >
> > > > The change looks fine to me, however I'm curious why this isn't triggered
> > > > with generic/406, nor anyone else reported it before, since the test is
> > > > fully deterministic.
> > > >
> > >
> > > I am able to trigger the WARN() with a different test (which is for a different,
> > > not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
> >
> > I have no doubts about the fix being correct.
> > I'm just puzzled why I can't trigger it with generic/406, given that it's a very
> > deterministic test.
> >
> > If there's any special config or setup (mount options, zoned fs, etc), I would
> > like to have it explicitly mentioned in the changelog.
>
> I don't think it's super special. I can always reproduce it on 1GB
> zram device. Here is the mkfs setup, and no mount options are
> specified.
Ok, that, the 1G device, explains it.
It's trigfering a short-write, due to not being able to allocate a large extent and
instead allocating a smaller one. And the test does not fail if we write less than
what we requested, as it redirects xfs_io's stdout to the .full file and does not
check that we wrote the exact amount we asked to write (which is a rather bad test
IMO, should also check we are able to read what we wrote before, etc).
The change looks good to me.
With an updated subject, you can add:
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Thanks.
>
> ++ mkfs.btrfs -f -d single -m single /dev/zram0
> btrfs-progs v5.16.2
> See http://btrfs.wiki.kernel.org for more information.
>
> Performing full device TRIM /dev/zram0 (1.00GiB) ...
> NOTE: several default settings have changed in version 5.15, please make sure
> this does not affect your deployments:
> - DUP for metadata (-m dup)
> - enabled no-holes (-O no-holes)
> - enabled free-space-tree (-R free-space-tree)
>
> Label: (null)
> UUID: b7260fb1-fa0e-4acd-8c3d-0530799a9fd3
> Node size: 16384
> Sector size: 4096
> Filesystem size: 1.00GiB
> Block group profiles:
> Data: single 8.00MiB
> Metadata: single 8.00MiB
> System: single 4.00MiB
> SSD detected: yes
> Zoned device: no
> Incompat features: extref, skinny-metadata, no-holes
> Runtime features: free-space-tree
> Checksum: crc32c
> Number of devices: 1
> Devices:
> ID SIZE PATH
> 1 1.00GiB /dev/zram0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] btrfs: fix outstanding extents calculation
2022-03-28 12:32 [PATCH] btrfs: fix outstanding extents calculation Naohiro Aota
2022-03-28 12:58 ` Johannes Thumshirn
2022-03-28 13:08 ` Filipe Manana
@ 2022-03-28 19:26 ` David Sterba
2 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2022-03-28 19:26 UTC (permalink / raw)
To: Naohiro Aota; +Cc: linux-btrfs
On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> which tells there is outstanding extents left.
>
> In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> with btrfs_delalloc_release_extents(). However, the "len" can be modified
> in the CoW case, which releasing fewer outstanding extents than expected.
>
> Fix it by calling btrfs_delalloc_release_extents() for the original length.
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> Modules linked in: btrfs blake2b_generic xor lzo_compress
> lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
> zsmalloc
> CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
> 65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
> 83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
> RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
> RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
> RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
> R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
> R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
> FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> destroy_inode+0x33/0x70
> dispose_list+0x43/0x60
> evict_inodes+0x161/0x1b0
> generic_shutdown_super+0x2d/0x110
> kill_anon_super+0xf/0x20
> btrfs_kill_super+0xd/0x20 [btrfs]
> deactivate_locked_super+0x27/0x90
> cleanup_mnt+0x12c/0x180
> task_work_run+0x54/0x80
> exit_to_user_mode_prepare+0x152/0x160
> syscall_exit_to_user_mode+0x12/0x30
> do_syscall_64+0x42/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f854a000fb7
>
> Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> CC: stable@vger.kernel.org # 5.17
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Added to misc-next, with updated subject and changelog, thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-03-28 19:31 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-28 12:32 [PATCH] btrfs: fix outstanding extents calculation Naohiro Aota
2022-03-28 12:58 ` Johannes Thumshirn
2022-03-28 13:08 ` Filipe Manana
2022-03-28 13:14 ` Johannes Thumshirn
2022-03-28 13:21 ` Filipe Manana
2022-03-28 13:22 ` Johannes Thumshirn
2022-03-28 13:36 ` Naohiro Aota
2022-03-28 13:45 ` Filipe Manana
2022-03-28 13:17 ` Filipe Manana
2022-03-28 13:40 ` Naohiro Aota
2022-03-28 19:26 ` David Sterba
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.