* [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems
@ 2026-05-22 9:02 Johannes Thumshirn
2026-05-22 9:02 ` [PATCH v3 1/5] btrfs: zoned: document RECLAIM_ZONES flush state Johannes Thumshirn
` (6 more replies)
0 siblings, 7 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2026-05-22 9:02 UTC (permalink / raw)
To: linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig,
Johannes Thumshirn
This series fixes premature ENOSPC errors and deadlock issues on zoned BTRFS
filesystems when running xfstest generic/747, which tests garbage collection
on zoned block devices using direct and buffered I/O.
There were two issues revealed in the investigation:
Async reclaim deadlock: On zoned filesystems, the flush state machine
only executes RECLAIM_ZONES and RESET_ZONES in later flush states
(FLUSH_DELALLOC and beyond). By the time these states are reached,
ticket waiters can deadlock waiting for space that can only be freed by
zone reset. The fix adds RECLAIM_ZONES and RESET_ZONES to the first
async reclaim loop (FLUSH_ALLOC) specifically for zoned filesystems,
ensuring zone reset happens early enough to free space for pending
allocation tickets.
Additionally, the series fixes a bug in data relocation block group selection
where the first block group was incorrectly skipped, and adds a new flush
state (BTRFS_RESERVE_FLUSH_ZONED_RELOCATION) to use priority reclaim for
zoned data relocation operations.
Using this, generic/747 passes when lowering the fill_percent variable
to 80% (from the 95% that XFS uses). I'm still examining if this is due
to XFS not having metadata on zoned or other statfs space estimation
bugs or something else.
Changes to v2:
- Re-do the data-relocation block-group selection algorithm
Changes to v1:
- Removed the zone_unusable statfs patch
- Removed the heavy-handed flushing patch
- Added Fixes tags where needed
- Added more comments
Johannes Thumshirn (5):
btrfs: zoned: document RECLAIM_ZONES flush state
btrfs: zoned: decode 'RECLAIM_ZONES' state in tracepoints
btrfs: zoned: always set data_relocation_bg
btrfs: zoned: don't account data relocation space-info in statfs free
space
btrfs: zoned: fix deadlock waiting for ticket during data relocation
fs/btrfs/delalloc-space.c | 2 ++
fs/btrfs/space-info.c | 9 +++++++++
fs/btrfs/space-info.h | 11 +++++++++++
fs/btrfs/super.c | 3 ++-
fs/btrfs/zoned.c | 8 +-------
include/trace/events/btrfs.h | 1 +
6 files changed, 26 insertions(+), 8 deletions(-)
--
2.54.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v3 1/5] btrfs: zoned: document RECLAIM_ZONES flush state
2026-05-22 9:02 [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Johannes Thumshirn
@ 2026-05-22 9:02 ` Johannes Thumshirn
2026-05-22 9:02 ` [PATCH v3 2/5] btrfs: zoned: decode 'RECLAIM_ZONES' state in tracepoints Johannes Thumshirn
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2026-05-22 9:02 UTC (permalink / raw)
To: linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig,
Johannes Thumshirn
Document the purpose of the RECLAIM_ZONES flush state.
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/space-info.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index f0436eea1544..739984462677 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1411,6 +1411,13 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work)
* This is where we reclaim all of the pinned space generated by running the
* iputs
*
+ * RECLAIM_ZONES
+ * This state only works for the zoned mode. We scan the block groups in the
+ * reclaim_bgs_list and check if we can relocate them. If yes perform the
+ * relocation to garbage collect the zone. On each of these runs
+ * BTRFS_ZONED_SYNC_RECLAIM_BATCH (5) block-groups will be reclaimed, after all
+ * unused block-groups have been deleted.
+ *
* RESET_ZONES
* This state works only for the zoned mode. We scan the unused block group
* list and reset the zones and reuse the block group.
--
2.54.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 2/5] btrfs: zoned: decode 'RECLAIM_ZONES' state in tracepoints
2026-05-22 9:02 [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Johannes Thumshirn
2026-05-22 9:02 ` [PATCH v3 1/5] btrfs: zoned: document RECLAIM_ZONES flush state Johannes Thumshirn
@ 2026-05-22 9:02 ` Johannes Thumshirn
2026-05-22 9:02 ` [PATCH v3 3/5] btrfs: zoned: always set data_relocation_bg Johannes Thumshirn
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2026-05-22 9:02 UTC (permalink / raw)
To: linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig,
Johannes Thumshirn
Decode the 'RECLAIM_ZONES' state in tracepoints. As of now, only the
numerical state is shown in the tracepoints.
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
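A rough userspace sketch of what "decoding" means here, not the kernel's
tracing code: a value-to-name table is built from a macro list, and a state
missing from the list falls back to its number. All demo_* and DEMO_* names
are hypothetical, and the decimal fallback format is an assumption made for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flush states; the numeric values are arbitrary. */
enum demo_flush_state {
	DEMO_COMMIT_TRANS,
	DEMO_RECLAIM_ZONES,
	DEMO_RESET_ZONES,
};

/* One EM() entry per state that should be printed by name. */
#define DEMO_FLUSH_STATES \
	EM(DEMO_COMMIT_TRANS,  "COMMIT_TRANS")  \
	EM(DEMO_RECLAIM_ZONES, "RECLAIM_ZONES") \
	EM(DEMO_RESET_ZONES,   "RESET_ZONES")

struct demo_name {
	int value;
	const char *name;
};

/* Expand the list into a lookup table. */
#define EM(sym, str) { sym, str },
static const struct demo_name demo_names[] = { DEMO_FLUSH_STATES };
#undef EM

/* Write the symbolic name into buf, or the number if no entry exists. */
static void demo_format_state(int state, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	for (size_t i = 0; i < sizeof(demo_names) / sizeof(demo_names[0]); i++) {
		if (demo_names[i].value == state) {
			snprintf(buf, len, "%s", demo_names[i].name);
			return;
		}
	}
	snprintf(buf, len, "%d", state);
}
```

Dropping the DEMO_RECLAIM_ZONES entry from the list makes that state print
as a bare number, which is the situation this patch addresses.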
include/trace/events/btrfs.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index ec1df8b94517..ed272a100fa8 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -101,6 +101,7 @@ struct find_free_extent_ctl;
EM( ALLOC_CHUNK_FORCE, "ALLOC_CHUNK_FORCE") \
EM( RUN_DELAYED_IPUTS, "RUN_DELAYED_IPUTS") \
EM( COMMIT_TRANS, "COMMIT_TRANS") \
+ EM( RECLAIM_ZONES, "RECLAIM_ZONES") \
EMe(RESET_ZONES, "RESET_ZONES")
/*
--
2.54.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 3/5] btrfs: zoned: always set data_relocation_bg
2026-05-22 9:02 [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Johannes Thumshirn
2026-05-22 9:02 ` [PATCH v3 1/5] btrfs: zoned: document RECLAIM_ZONES flush state Johannes Thumshirn
2026-05-22 9:02 ` [PATCH v3 2/5] btrfs: zoned: decode 'RECLAIM_ZONES' state in tracepoints Johannes Thumshirn
@ 2026-05-22 9:02 ` Johannes Thumshirn
2026-05-22 18:27 ` Boris Burkov
2026-05-25 2:36 ` Naohiro Aota
2026-05-22 9:02 ` [PATCH v3 4/5] btrfs: zoned: don't account data relocation space-info in statfs free space Johannes Thumshirn
` (3 subsequent siblings)
6 siblings, 2 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2026-05-22 9:02 UTC (permalink / raw)
To: linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig,
Johannes Thumshirn
When searching for a data relocation block-group on mount,
btrfs_zoned_reserve_data_reloc_bg() is looking for an empty DATA
block-group. It first skips every block-group that already has
allocations, and then additionally skips the first empty DATA block-group
it finds, so it ends up picking the second empty one.
There is actually no point in looking for the second empty DATA
block-group, as new DATA allocations will just allocate a new chunk. Pick
the first DATA block-group without any allocations and set it as the
relocation block-group.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
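To illustrate the behaviour change, here is a simplified userspace model of
the selection loop. It is only a sketch: struct demo_bg and the demo_pick_*()
helpers are hypothetical stand-ins, and the real function also migrates the
block group between space_infos and may allocate a new chunk, which is left
out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct btrfs_block_group. */
struct demo_bg {
	uint64_t alloc_offset;	/* 0 means no allocation has been done yet */
};

/* Old behaviour: skip used block groups, then also skip the first empty one. */
static const struct demo_bg *demo_pick_old(const struct demo_bg *bgs, size_t n)
{
	bool first = true;

	assert(bgs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (bgs[i].alloc_offset != 0)
			continue;
		if (first) {
			first = false;
			continue;
		}
		return &bgs[i];
	}
	return NULL;
}

/* New behaviour: take the first block group without any allocations. */
static const struct demo_bg *demo_pick_new(const struct demo_bg *bgs, size_t n)
{
	assert(bgs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (bgs[i].alloc_offset != 0)
			continue;
		return &bgs[i];
	}
	return NULL;
}
```

With a single empty block group in the list, demo_pick_old() finds nothing
while demo_pick_new() returns it.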
fs/btrfs/zoned.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 16dd87aa06f2..a4d2fb774f72 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2763,7 +2763,6 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
struct btrfs_block_group *bg;
struct list_head *bg_list;
u64 alloc_flags;
- bool first = true;
bool did_chunk_alloc = false;
int index;
int ret;
@@ -2784,13 +2783,9 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
again:
bg_list = &space_info->block_groups[index];
list_for_each_entry(bg, bg_list, list) {
- if (bg->alloc_offset != 0)
- continue;
- if (first) {
- first = false;
+ if (bg->alloc_offset != 0)
continue;
- }
if (space_info == data_sinfo) {
/* Migrate the block group to the data relocation space_info. */
@@ -2849,7 +2844,6 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
* We allocated a new block group in the data relocation space_info. We
* can take that one.
*/
- first = false;
did_chunk_alloc = true;
goto again;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 4/5] btrfs: zoned: don't account data relocation space-info in statfs free space
2026-05-22 9:02 [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Johannes Thumshirn
` (2 preceding siblings ...)
2026-05-22 9:02 ` [PATCH v3 3/5] btrfs: zoned: always set data_relocation_bg Johannes Thumshirn
@ 2026-05-22 9:02 ` Johannes Thumshirn
2026-05-22 9:02 ` [PATCH v3 5/5] btrfs: zoned: fix deadlock waiting for ticket during data relocation Johannes Thumshirn
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2026-05-22 9:02 UTC (permalink / raw)
To: linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig,
Johannes Thumshirn
Don't account the free space in a data relocation space-info sub-group as
usable free space in statfs.
This is misleading as no user allocations can be made in this space-info
sub-group. It is only a target for relocation.
Fixes: f92ee31e031c ("btrfs: introduce btrfs_space_info sub-group")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
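A minimal sketch of the accounting change, assuming a flat array of
space-infos instead of the kernel's list. The demo_* types and DEMO_*
constants are hypothetical, and the real btrfs_statfs() does more work per
space-info than this free-space sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits and sub-group ids, values chosen for the example. */
#define DEMO_BLOCK_GROUP_DATA		(1u << 0)
#define DEMO_BLOCK_GROUP_METADATA	(1u << 2)

enum demo_sub_group {
	DEMO_SUB_GROUP_PRIMARY,
	DEMO_SUB_GROUP_DATA_RELOC,
};

/* Hypothetical stand-in for struct btrfs_space_info. */
struct demo_space_info {
	unsigned int flags;
	enum demo_sub_group subgroup_id;
	uint64_t disk_total;
	uint64_t disk_used;
};

/* Sum free DATA space, leaving out the data relocation sub-group. */
static uint64_t demo_total_free_data(const struct demo_space_info *infos,
				     size_t n)
{
	uint64_t total = 0;

	assert(infos != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if ((infos[i].flags & DEMO_BLOCK_GROUP_DATA) &&
		    infos[i].subgroup_id != DEMO_SUB_GROUP_DATA_RELOC)
			total += infos[i].disk_total - infos[i].disk_used;
	}
	return total;
}
```

In this model, free space that sits in the relocation sub-group no longer
shows up in the total that would be reported to users.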
fs/btrfs/super.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index fb15decb0861..a0dbc0d2213f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1739,7 +1739,8 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
int mixed = 0;
list_for_each_entry(found, &fs_info->space_info, list) {
- if (found->flags & BTRFS_BLOCK_GROUP_DATA) {
+ if (found->flags & BTRFS_BLOCK_GROUP_DATA &&
+ found->subgroup_id != BTRFS_SUB_GROUP_DATA_RELOC) {
int i;
total_free_data += found->disk_total - found->disk_used;
--
2.54.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 5/5] btrfs: zoned: fix deadlock waiting for ticket during data relocation
2026-05-22 9:02 [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Johannes Thumshirn
` (3 preceding siblings ...)
2026-05-22 9:02 ` [PATCH v3 4/5] btrfs: zoned: don't account data relocation space-info in statfs free space Johannes Thumshirn
@ 2026-05-22 9:02 ` Johannes Thumshirn
2026-05-25 3:57 ` [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Naohiro Aota
2026-05-25 11:35 ` David Sterba
6 siblings, 0 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2026-05-22 9:02 UTC (permalink / raw)
To: linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig,
Johannes Thumshirn
When performing data relocation on a zoned filesystem, BTRFS can deadlock
in handle_reserve_ticket(). The relocation process is waiting on a space
reservation ticket that can never be fulfilled, because the relocation
itself is the operation responsible for freeing up that space.
Fix this by introducing a new flush state,
BTRFS_RESERVE_FLUSH_ZONED_RELOCATION, specifically for data chunk
allocation during zoned relocation. Like
BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE, this state uses
priority_reclaim_data_space() instead of the normal flushing path, which
avoids re-entering the relocation code and thereby breaks the deadlock cycle.
In btrfs_alloc_data_chunk_ondemand(), select this new flush state when the
inode belongs to a data relocation root on a zoned filesystem.
Fixes: e2a7fd22378f ("btrfs: zoned: add zone reclaim flush state for DATA space_info")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
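The selection and dispatch described above can be modelled roughly as
follows. This is a sketch under assumptions: the demo_* enums and functions
are hypothetical, the booleans stand in for btrfs_is_free_space_inode(),
btrfs_is_zoned() and btrfs_is_data_reloc_root(), and the function only
reports which path would be taken instead of reclaiming anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the data reservation flush modes. */
enum demo_flush {
	DEMO_NO_FLUSH,
	DEMO_FLUSH_DATA,
	DEMO_FLUSH_FREE_SPACE_INODE,
	DEMO_FLUSH_ZONED_RELOCATION,
};

/* Which path a queued ticket takes in this model. */
enum demo_path {
	DEMO_PATH_TICKET_WAIT,		/* wait for async reclaim to grant it */
	DEMO_PATH_PRIORITY_RECLAIM,	/* reclaim in the caller's context */
};

/* Pick the flush mode for a data reservation. */
static enum demo_flush demo_select_flush(bool is_free_space_inode,
					 bool is_zoned, bool is_data_reloc_root)
{
	if (is_free_space_inode)
		return DEMO_FLUSH_FREE_SPACE_INODE;
	if (is_zoned && is_data_reloc_root)
		return DEMO_FLUSH_ZONED_RELOCATION;
	return DEMO_FLUSH_DATA;
}

/* Map the flush mode to the path used to satisfy the ticket. */
static enum demo_path demo_handle_ticket(enum demo_flush flush)
{
	/* Assumption of this model: NO_FLUSH never queues a ticket. */
	assert(flush != DEMO_NO_FLUSH);

	switch (flush) {
	case DEMO_FLUSH_FREE_SPACE_INODE:
	case DEMO_FLUSH_ZONED_RELOCATION:
		return DEMO_PATH_PRIORITY_RECLAIM;
	default:
		return DEMO_PATH_TICKET_WAIT;
	}
}
```

Only the zoned data relocation case changes path; a data relocation inode on
a non-zoned filesystem still ends up with DEMO_FLUSH_DATA in this model.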
fs/btrfs/delalloc-space.c | 2 ++
fs/btrfs/space-info.c | 2 ++
fs/btrfs/space-info.h | 11 +++++++++++
3 files changed, 15 insertions(+)
diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index 0970799d0aa4..4293a6383433 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -134,6 +134,8 @@ int btrfs_alloc_data_chunk_ondemand(const struct btrfs_inode *inode, u64 bytes)
if (btrfs_is_free_space_inode(inode))
flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
+ else if (btrfs_is_zoned(fs_info) && btrfs_is_data_reloc_root(root))
+ flush = BTRFS_RESERVE_FLUSH_ZONED_RELOCATION;
return btrfs_reserve_data_bytes(data_sinfo_for_inode(inode), bytes, flush);
}
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 739984462677..e6641597b321 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1705,6 +1705,7 @@ static int handle_reserve_ticket(struct btrfs_space_info *space_info,
ARRAY_SIZE(evict_flush_states));
break;
case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
+ case BTRFS_RESERVE_FLUSH_ZONED_RELOCATION:
priority_reclaim_data_space(space_info, ticket);
break;
default:
@@ -1968,6 +1969,7 @@ int btrfs_reserve_data_bytes(struct btrfs_space_info *space_info, u64 bytes,
ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE ||
+ flush == BTRFS_RESERVE_FLUSH_ZONED_RELOCATION ||
flush == BTRFS_RESERVE_NO_FLUSH, "flush=%d", flush);
ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA,
"current->journal_info=0x%lx flush=%d",
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 24f45072ca4b..aa836e8a9d4a 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -77,6 +77,17 @@ enum btrfs_reserve_flush_enum {
*/
BTRFS_RESERVE_FLUSH_ALL_STEAL,
+ /*
+ * This is for relocation on zoned filesystems only. We need to use
+ * priority flushing for this, because otherwise we can deadlock on
+ * waiting for a ticket, that cannot be granted, because we cannot do
+ * any allocations.
+ *
+ * Apart from being specific to zoned relocation, it is equal to
+ * BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE.
+ */
+ BTRFS_RESERVE_FLUSH_ZONED_RELOCATION,
+
/*
* This is for btrfs_use_block_rsv only. We have exhausted our block
* rsv and our global block rsv. This can happen for things like
--
2.54.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v3 3/5] btrfs: zoned: always set data_relocation_bg
2026-05-22 9:02 ` [PATCH v3 3/5] btrfs: zoned: always set data_relocation_bg Johannes Thumshirn
@ 2026-05-22 18:27 ` Boris Burkov
2026-05-25 2:36 ` Naohiro Aota
1 sibling, 0 replies; 10+ messages in thread
From: Boris Burkov @ 2026-05-22 18:27 UTC (permalink / raw)
To: Johannes Thumshirn
Cc: linux-btrfs, Filipe Manana, David Sterba, Hans Holmberg,
Damien Le Moal, Naohiro Aota, Christoph Hellwig
On Fri, May 22, 2026 at 11:02:45AM +0200, Johannes Thumshirn wrote:
> When searching for a data relocation block-group on mount,
> btrfs_zoned_reserve_data_reloc_bg() is looking for an empty DATA
> block-group. It first skips every block-group that already has
> allocations, and then additionally skips the first empty DATA block-group
> it finds, so it ends up picking the second empty one.
>
> There is actually no point in looking for the second empty DATA
> block-group, as new DATA allocations will just allocate a new chunk. Pick
> the first DATA block-group without any allocations and set it as the
> relocation block-group.
>
Looks good, thanks.
Reviewed-by: Boris Burkov <boris@bur.io>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
> fs/btrfs/zoned.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 16dd87aa06f2..a4d2fb774f72 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -2763,7 +2763,6 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
> struct btrfs_block_group *bg;
> struct list_head *bg_list;
> u64 alloc_flags;
> - bool first = true;
> bool did_chunk_alloc = false;
> int index;
> int ret;
> @@ -2784,13 +2783,9 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
> again:
> bg_list = &space_info->block_groups[index];
> list_for_each_entry(bg, bg_list, list) {
> - if (bg->alloc_offset != 0)
> - continue;
>
> - if (first) {
> - first = false;
> + if (bg->alloc_offset != 0)
> continue;
> - }
>
> if (space_info == data_sinfo) {
> /* Migrate the block group to the data relocation space_info. */
> @@ -2849,7 +2844,6 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
> * We allocated a new block group in the data relocation space_info. We
> * can take that one.
> */
> - first = false;
> did_chunk_alloc = true;
> goto again;
> }
> --
> 2.54.0
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 3/5] btrfs: zoned: always set data_relocation_bg
2026-05-22 9:02 ` [PATCH v3 3/5] btrfs: zoned: always set data_relocation_bg Johannes Thumshirn
2026-05-22 18:27 ` Boris Burkov
@ 2026-05-25 2:36 ` Naohiro Aota
1 sibling, 0 replies; 10+ messages in thread
From: Naohiro Aota @ 2026-05-25 2:36 UTC (permalink / raw)
To: Johannes Thumshirn, linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig
On Fri May 22, 2026 at 6:02 PM JST, Johannes Thumshirn wrote:
> When searching for a data relocation block-group on mount,
> btrfs_zoned_reserve_data_reloc_bg() is looking for an empty DATA
> block-group. It first skips every block-group that already has
> allocations, and then additionally skips the first empty DATA block-group
> it finds, so it ends up picking the second empty one.
>
> There is actually no point in looking for the second empty DATA
> block-group, as new DATA allocations will just allocate a new chunk. Pick
> the first DATA block-group without any allocations and set it as the
> relocation block-group.
I think we can add context here.
At first, commit 694ce5e143d6 ("btrfs: zoned: reserve data_reloc
block group on mount") introduced the functionality. At that time, we
took the second unused (used == 0) block group, as the first one might be
a block group used for normal data.
Later, commit daa0fde32235 ("btrfs: zoned: fix data relocation block
group reservation") switched to looking for an empty block group
(alloc_offset == 0). At this point, there is no reason to take the second
one anymore. So, this commit is fixing an issue in commit daa0fde32235.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
> fs/btrfs/zoned.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 16dd87aa06f2..a4d2fb774f72 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -2763,7 +2763,6 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
> struct btrfs_block_group *bg;
> struct list_head *bg_list;
> u64 alloc_flags;
> - bool first = true;
> bool did_chunk_alloc = false;
> int index;
> int ret;
> @@ -2784,13 +2783,9 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
Could you please fix the comment here?
/* Scan the data space_info to find empty block groups. Take the second one. */
Other than that,
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
> again:
> bg_list = &space_info->block_groups[index];
> list_for_each_entry(bg, bg_list, list) {
> - if (bg->alloc_offset != 0)
> - continue;
>
> - if (first) {
> - first = false;
> + if (bg->alloc_offset != 0)
> continue;
> - }
>
> if (space_info == data_sinfo) {
> /* Migrate the block group to the data relocation space_info. */
> @@ -2849,7 +2844,6 @@ void btrfs_zoned_reserve_data_reloc_bg(struct btrfs_fs_info *fs_info)
> * We allocated a new block group in the data relocation space_info. We
> * can take that one.
> */
> - first = false;
> did_chunk_alloc = true;
> goto again;
> }
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems
2026-05-22 9:02 [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Johannes Thumshirn
` (4 preceding siblings ...)
2026-05-22 9:02 ` [PATCH v3 5/5] btrfs: zoned: fix deadlock waiting for ticket during data relocation Johannes Thumshirn
@ 2026-05-25 3:57 ` Naohiro Aota
2026-05-25 11:35 ` David Sterba
6 siblings, 0 replies; 10+ messages in thread
From: Naohiro Aota @ 2026-05-25 3:57 UTC (permalink / raw)
To: Johannes Thumshirn, linux-btrfs
Cc: Filipe Manana, David Sterba, Hans Holmberg, Boris Burkov,
Damien Le Moal, Naohiro Aota, Christoph Hellwig
On Fri May 22, 2026 at 6:02 PM JST, Johannes Thumshirn wrote:
> This series fixes premature ENOSPC errors and deadlock issues on zoned BTRFS
> filesystems when running xfstest generic/747, which tests garbage collection
> on zoned block devices using direct and buffered I/O.
>
> There were two issues revealed in the investigation:
>
> Async reclaim deadlock: On zoned filesystems, the flush state machine
> only executes RECLAIM_ZONES and RESET_ZONES in later flush states
> (FLUSH_DELALLOC and beyond). By the time these states are reached,
> ticket waiters can deadlock waiting for space that can only be freed by
> zone reset. The fix adds RECLAIM_ZONES and RESET_ZONES to the first
> async reclaim loop (FLUSH_ALLOC) specifically for zoned filesystems,
> ensuring zone reset happens early enough to free space for pending
> allocation tickets.
>
> Additionally, the series fixes a bug in data relocation block group selection
> where the first block group was incorrectly skipped, and adds a new flush
> state (BTRFS_RESERVE_FLUSH_ZONED_RELOCATION) to use priority reclaim for
> zoned data relocation operations.
>
> Using this, generic/747 passes when lowering the fill_percent variable
> to 80% (from the 95% that XFS uses). I'm still examining if this is due
> to XFS not having metadata on zoned or other statfs space estimation
> bugs or something else.
For the series,
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems
2026-05-22 9:02 [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Johannes Thumshirn
` (5 preceding siblings ...)
2026-05-25 3:57 ` [PATCH v3 0/5] btrfs: zoned: fix deadlock and space reporting issues for zoned filesystems Naohiro Aota
@ 2026-05-25 11:35 ` David Sterba
6 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2026-05-25 11:35 UTC (permalink / raw)
To: Johannes Thumshirn
Cc: linux-btrfs, Filipe Manana, David Sterba, Hans Holmberg,
Boris Burkov, Damien Le Moal, Naohiro Aota, Christoph Hellwig
On Fri, May 22, 2026 at 11:02:42AM +0200, Johannes Thumshirn wrote:
> This series fixes premature ENOSPC errors and deadlock issues on zoned BTRFS
> filesystems when running xfstest generic/747, which tests garbage collection
> on zoned block devices using direct and buffered I/O.
>
> There were two issues revealed in the investigation:
>
> Async reclaim deadlock: On zoned filesystems, the flush state machine
> only executes RECLAIM_ZONES and RESET_ZONES in later flush states
> (FLUSH_DELALLOC and beyond). By the time these states are reached,
> ticket waiters can deadlock waiting for space that can only be freed by
> zone reset. The fix adds RECLAIM_ZONES and RESET_ZONES to the first
> async reclaim loop (FLUSH_ALLOC) specifically for zoned filesystems,
> ensuring zone reset happens early enough to free space for pending
> allocation tickets.
>
> Additionally, the series fixes a bug in data relocation block group selection
> where the first block group was incorrectly skipped, and adds a new flush
> state (BTRFS_RESERVE_FLUSH_ZONED_RELOCATION) to use priority reclaim for
> zoned data relocation operations.
>
> Using this, generic/747 passes when lowering the fill_percent variable
> to 80% (from the 95% that XFS uses). I'm still examining if this is due
> to XFS not having metadata on zoned or other statfs space estimation
> bugs or something else.
>
> Changes to v2:
> - Re-do the data-relocation block-group selection algorithm
>
> Changes to v1:
> - Removed the zone_unusable statfs patch
> - Removed the heavy-handed flushing patch
> - Added Fixes tags where needed
> - Added more comments
>
> Johannes Thumshirn (5):
> btrfs: zoned: document RECLAIM_ZONES flush state
> btrfs: zoned: decode 'RECLAIM_ZONES' state in tracepoints
> btrfs: zoned: always set data_relocation_bg
> btrfs: zoned: don't account data relocation space-info in statfs free
> space
> btrfs: zoned: fix deadlock waiting for ticket during data relocation
Added to for-next, with updated changelog in patch 3 and deleted
comment. Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread