From: Boris Burkov <boris@bur.io>
To: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: linux-btrfs@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>,
	Hans Holmberg <Hans.Holmberg@wdc.com>,
	Damien Le Moal <dlemoal@kernel.org>,
	Naohiro Aota <naohiro.aota@wdc.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 6/7] btrfs: zoned: fix deadlock waiting for ticket during data relocation
Date: Fri, 15 May 2026 10:26:04 -0700
Message-ID: <20260515172604.GD1197064@zen.localdomain>
In-Reply-To: <20260513123445.43197-7-johannes.thumshirn@wdc.com>

On Wed, May 13, 2026 at 02:34:44PM +0200, Johannes Thumshirn wrote:
> When performing data relocation on a zoned filesystem, BTRFS can deadlock
> in handle_reserve_ticket(). The relocation process is waiting on a space
> reservation ticket that can never be fulfilled, because the relocation
> itself is the operation responsible for freeing up that space.
>
> Fix this by introducing a new flush state,
> BTRFS_RESERVE_FLUSH_DATA_RELOCATION, specifically for data chunk
> allocation during zoned relocation. Like
> BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE, this state uses
> priority_reclaim_data_space() instead of the normal flushing path, which
> avoids re-entering the relocation code and breaks the deadlock cycle.
>
> In btrfs_alloc_data_chunk_ondemand(), select this new flush state when the
> inode belongs to a data relocation root on a zoned filesystem.

This looks good, but a general question: are any of the other data
flushers valid to run at this point, or only chunk allocation?

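To make the question concrete, the dispatch that results from this patch
can be modeled in user space roughly as below. This is only a sketch, not
kernel code: every model_* name is hypothetical, and two things are
assumptions that the quoted hunks do not show, namely that the default
flush state in btrfs_alloc_data_chunk_ondemand() is
BTRFS_RESERVE_FLUSH_DATA and that such tickets fall through to the default
case and wait on the normal flushing path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the flush states named in the patch. */
enum model_flush {
	MODEL_NO_FLUSH,
	MODEL_FLUSH_DATA,
	MODEL_FLUSH_FREE_SPACE_INODE,
	MODEL_FLUSH_DATA_RELOCATION,
};

/* Hypothetical: which path a reservation ticket would take. */
enum model_path {
	MODEL_PATH_NONE,	/* assumed: no ticket is waited on */
	MODEL_PATH_NORMAL_WAIT,	/* assumed: wait on the normal flushing path */
	MODEL_PATH_PRIORITY,	/* priority_reclaim_data_space() in the patch */
};

/* Models the selection in btrfs_alloc_data_chunk_ondemand(). */
static enum model_flush model_pick_flush(bool free_space_inode, bool zoned,
					 bool data_reloc_root)
{
	if (free_space_inode)
		return MODEL_FLUSH_FREE_SPACE_INODE;
	if (zoned && data_reloc_root)
		return MODEL_FLUSH_DATA_RELOCATION;
	/* Assumption: the default is the regular data flush state. */
	return MODEL_FLUSH_DATA;
}

/* Models the switch in handle_reserve_ticket() for data reservations. */
static enum model_path model_handle_ticket(enum model_flush flush)
{
	/* Mirrors the ASSERT in btrfs_reserve_data_bytes(). */
	assert(flush == MODEL_FLUSH_DATA ||
	       flush == MODEL_FLUSH_FREE_SPACE_INODE ||
	       flush == MODEL_FLUSH_DATA_RELOCATION ||
	       flush == MODEL_NO_FLUSH);

	switch (flush) {
	case MODEL_FLUSH_FREE_SPACE_INODE:
	case MODEL_FLUSH_DATA_RELOCATION:
		return MODEL_PATH_PRIORITY;
	case MODEL_FLUSH_DATA:
		return MODEL_PATH_NORMAL_WAIT;
	default:
		return MODEL_PATH_NONE;
	}
}
```

In this model a data relocation inode on a zoned filesystem takes the same
path as a free space inode, while the same inode on a non-zoned filesystem
keeps the regular data flush state.
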
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>  fs/btrfs/delalloc-space.c |  2 ++
>  fs/btrfs/space-info.c     |  2 ++
>  fs/btrfs/space-info.h     | 11 +++++++++++
>  3 files changed, 15 insertions(+)
>
> diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
> index 0970799d0aa4..c9d3ec6bbc3c 100644
> --- a/fs/btrfs/delalloc-space.c
> +++ b/fs/btrfs/delalloc-space.c
> @@ -134,6 +134,8 @@ int btrfs_alloc_data_chunk_ondemand(const struct btrfs_inode *inode, u64 bytes)
>
>  	if (btrfs_is_free_space_inode(inode))
>  		flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
> +	else if (btrfs_is_zoned(fs_info) && btrfs_is_data_reloc_root(root))
> +		flush = BTRFS_RESERVE_FLUSH_DATA_RELOCATION;
>
>  	return btrfs_reserve_data_bytes(data_sinfo_for_inode(inode), bytes, flush);
>  }
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 58256a9c056d..ec811a77ebb1 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -1703,6 +1703,7 @@ static int handle_reserve_ticket(struct btrfs_space_info *space_info,
>  						ARRAY_SIZE(evict_flush_states));
>  		break;
>  	case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
> +	case BTRFS_RESERVE_FLUSH_DATA_RELOCATION:
>  		priority_reclaim_data_space(space_info, ticket);
>  		break;
>  	default:
> @@ -1966,6 +1967,7 @@ int btrfs_reserve_data_bytes(struct btrfs_space_info *space_info, u64 bytes,
>
>  	ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
>  	       flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE ||
> +	       flush == BTRFS_RESERVE_FLUSH_DATA_RELOCATION ||
>  	       flush == BTRFS_RESERVE_NO_FLUSH, "flush=%d", flush);
>  	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA,
>  	       "current->journal_info=0x%lx flush=%d",
> diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
> index 24f45072ca4b..f2b8be2af5c3 100644
> --- a/fs/btrfs/space-info.h
> +++ b/fs/btrfs/space-info.h
> @@ -77,6 +77,17 @@ enum btrfs_reserve_flush_enum {
>  	 */
>  	BTRFS_RESERVE_FLUSH_ALL_STEAL,
>
> +	/*
> +	 * This is for relocation on zoned filesystems only. We need to use
> +	 * priority flushing for this, because otherwise we can deadlock on
> +	 * waiting for a ticket, that cannot be granted, because we cannot do
> +	 * any allocations.
> +	 *
> +	 * Apart from being specific to zoned relocation, it is equal to
> +	 * BTRFS_FLUSH_FREE_SPACE_INODE.
> +	 */

Should it be named ZONED_RELOCATION then?

> +	BTRFS_RESERVE_FLUSH_DATA_RELOCATION,
> +
>  	/*
>  	 * This is for btrfs_use_block_rsv only. We have exhausted our block
>  	 * rsv and our global block rsv. This can happen for things like
> --
> 2.54.0
>