From: Boris Burkov <boris@bur.io>
To: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: linux-btrfs@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
David Sterba <dsterba@suse.com>,
Hans Holmberg <Hans.Holmberg@wdc.com>,
Damien Le Moal <dlemoal@kernel.org>,
Naohiro Aota <naohiro.aota@wdc.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 6/7] btrfs: zoned: fix deadlock waiting for ticket during data relocation
Date: Fri, 15 May 2026 10:26:04 -0700
Message-ID: <20260515172604.GD1197064@zen.localdomain>
In-Reply-To: <20260513123445.43197-7-johannes.thumshirn@wdc.com>

On Wed, May 13, 2026 at 02:34:44PM +0200, Johannes Thumshirn wrote:
> When performing data relocation on a zoned filesystem, BTRFS can deadlock
> in handle_reserve_ticket(). The relocation process is waiting on a space
> reservation ticket that can never be fulfilled, because the relocation
> itself is the operation responsible for freeing up that space.
>
> Fix this by introducing a new flush state,
> BTRFS_RESERVE_FLUSH_DATA_RELOCATION, specifically for data chunk
> allocation during zoned relocation. Like
> BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE, this state uses
> priority_reclaim_data_space() instead of the normal flushing path, which
> avoids re-entering the relocation code and breaks the deadlock cycle.
>
> In btrfs_alloc_data_chunk_ondemand(), select this new flush state when the
> inode belongs to a data relocation root on a zoned filesystem.

This looks good, but a general question: are any of the other data
flushers valid to run at this point, or only chunk allocation?

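For reference, a minimal userspace sketch of the selection and dispatch
logic in the quoted hunks, as I read them. Everything prefixed with
model_ or MODEL_ is a hypothetical stand-in, not a kernel symbol, and the
MODEL_FLUSH_DATA default is an assumption, since the hunk does not show
how flush is initialized. It says nothing about what the priority path
does internally.

```c
#include <stdbool.h>

/* Userspace model only: a subset of the flush states named in the patch. */
enum model_flush {
	MODEL_NO_FLUSH,
	MODEL_FLUSH_DATA,
	MODEL_FLUSH_FREE_SPACE_INODE,
	MODEL_FLUSH_DATA_RELOCATION,
};

/* Hypothetical stand-in for the three properties the patch checks. */
struct model_inode {
	bool is_free_space_inode;
	bool fs_is_zoned;
	bool in_data_reloc_root;
};

/*
 * Models the if / else-if chain in btrfs_alloc_data_chunk_ondemand().
 * Assumption: the default is the regular data flush state.
 */
static enum model_flush model_pick_flush(const struct model_inode *inode)
{
	if (inode->is_free_space_inode)
		return MODEL_FLUSH_FREE_SPACE_INODE;
	if (inode->fs_is_zoned && inode->in_data_reloc_root)
		return MODEL_FLUSH_DATA_RELOCATION;
	return MODEL_FLUSH_DATA;
}

/*
 * Models the switch in handle_reserve_ticket(): returns true when the
 * state shares the case label that calls priority_reclaim_data_space().
 */
static bool model_uses_priority_reclaim(enum model_flush flush)
{
	switch (flush) {
	case MODEL_FLUSH_FREE_SPACE_INODE:
	case MODEL_FLUSH_DATA_RELOCATION:
		return true;
	default:
		return false;
	}
}
```

In this model, a data relocation inode on a zoned filesystem ends up on
the priority path, while the same inode on a non-zoned filesystem keeps
the regular data flush state.
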
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
> fs/btrfs/delalloc-space.c | 2 ++
> fs/btrfs/space-info.c | 2 ++
> fs/btrfs/space-info.h | 11 +++++++++++
> 3 files changed, 15 insertions(+)
>
> diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
> index 0970799d0aa4..c9d3ec6bbc3c 100644
> --- a/fs/btrfs/delalloc-space.c
> +++ b/fs/btrfs/delalloc-space.c
> @@ -134,6 +134,8 @@ int btrfs_alloc_data_chunk_ondemand(const struct btrfs_inode *inode, u64 bytes)
>
> if (btrfs_is_free_space_inode(inode))
> flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
> + else if (btrfs_is_zoned(fs_info) && btrfs_is_data_reloc_root(root))
> + flush = BTRFS_RESERVE_FLUSH_DATA_RELOCATION;
>
> return btrfs_reserve_data_bytes(data_sinfo_for_inode(inode), bytes, flush);
> }
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 58256a9c056d..ec811a77ebb1 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -1703,6 +1703,7 @@ static int handle_reserve_ticket(struct btrfs_space_info *space_info,
> ARRAY_SIZE(evict_flush_states));
> break;
> case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
> + case BTRFS_RESERVE_FLUSH_DATA_RELOCATION:
> priority_reclaim_data_space(space_info, ticket);
> break;
> default:
> @@ -1966,6 +1967,7 @@ int btrfs_reserve_data_bytes(struct btrfs_space_info *space_info, u64 bytes,
>
> ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
> flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE ||
> + flush == BTRFS_RESERVE_FLUSH_DATA_RELOCATION ||
> flush == BTRFS_RESERVE_NO_FLUSH, "flush=%d", flush);
> ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA,
> "current->journal_info=0x%lx flush=%d",
> diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
> index 24f45072ca4b..f2b8be2af5c3 100644
> --- a/fs/btrfs/space-info.h
> +++ b/fs/btrfs/space-info.h
> @@ -77,6 +77,17 @@ enum btrfs_reserve_flush_enum {
> */
> BTRFS_RESERVE_FLUSH_ALL_STEAL,
>
> + /*
> + * This is for relocation on zoned filesystems only. We need to use
> + * priority flushing for this, because otherwise we can deadlock on
> + * waiting for a ticket, that cannot be granted, because we cannot do
> + * any allocations.
> + *
> + * Apart from being specific to zoned relocation, it is equal to
> + * BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE.
> + */

Should it be named ZONED_RELOCATION then?

> + BTRFS_RESERVE_FLUSH_DATA_RELOCATION,
> +
> /*
> * This is for btrfs_use_block_rsv only. We have exhausted our block
> * rsv and our global block rsv. This can happen for things like
> --
> 2.54.0
>