linux-btrfs.vger.kernel.org archive mirror
From: Josef Bacik <josef@toxicpanda.com>
To: Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org, Naohiro Aota <Naohiro.Aota@wdc.com>,
	Filipe Manana <fdmanana@suse.com>,
	Anand Jain <anand.jain@oracle.com>
Subject: Re: [PATCH v4 3/3] btrfs: zoned: automatically reclaim zones
Date: Thu, 15 Apr 2021 14:36:24 -0400
Message-ID: <63c82817-751c-b200-abfc-b7e669affa93@toxicpanda.com>
In-Reply-To: <920701be19f36b4f7ed84efd53a3d0550700f047.1618494550.git.johannes.thumshirn@wdc.com>

On 4/15/21 9:58 AM, Johannes Thumshirn wrote:
> When a file gets deleted on a zoned file system, the space freed is not
> returned back into the block group's free space, but is migrated to
> zone_unusable.
> 
> As this zone_unusable space is behind the current write pointer it is not
> possible to use it for new allocations. In the current implementation a
> zone is reset once all of the block group's space is accounted as zone
> unusable.
> 
> This behaviour can lead to premature ENOSPC errors on a busy file system.
> 
> Instead of only reclaiming the zone once it is completely unusable,
> kick off a reclaim job once the amount of unusable bytes exceeds a user
> configurable threshold between 51% and 100%. It can be set per mounted
> filesystem via the sysfs tunable bg_reclaim_threshold which is set to 75%
> per default.
> 
> Similar to reclaiming unused block groups, these dirty block groups are
> added to a to_reclaim list. On a transaction commit, the reclaim process
> is triggered, but only after unused block groups have been deleted, which
> frees space for the relocation process.
> 
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>   fs/btrfs/block-group.c       | 96 ++++++++++++++++++++++++++++++++++++
>   fs/btrfs/block-group.h       |  3 ++
>   fs/btrfs/ctree.h             |  6 +++
>   fs/btrfs/disk-io.c           | 13 +++++
>   fs/btrfs/free-space-cache.c  |  9 +++-
>   fs/btrfs/sysfs.c             | 35 +++++++++++++
>   fs/btrfs/volumes.c           |  2 +-
>   fs/btrfs/volumes.h           |  1 +
>   include/trace/events/btrfs.h | 12 +++++
>   9 files changed, 175 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index bbb5a6e170c7..3f06ea42c013 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -1485,6 +1485,92 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
>   	spin_unlock(&fs_info->unused_bgs_lock);
>   }
>   
> +void btrfs_reclaim_bgs_work(struct work_struct *work)
> +{
> +	struct btrfs_fs_info *fs_info =
> +		container_of(work, struct btrfs_fs_info, reclaim_bgs_work);
> +	struct btrfs_block_group *bg;
> +	struct btrfs_space_info *space_info;
> +	int ret = 0;
> +
> +	if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
> +		return;
> +
> +	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE))
> +		return;
> +
> +	mutex_lock(&fs_info->reclaim_bgs_lock);
> +	spin_lock(&fs_info->unused_bgs_lock);
> +	while (!list_empty(&fs_info->reclaim_bgs)) {
> +		bg = list_first_entry(&fs_info->reclaim_bgs,
> +				      struct btrfs_block_group,
> +				      bg_list);
> +		list_del_init(&bg->bg_list);
> +
> +		space_info = bg->space_info;
> +		spin_unlock(&fs_info->unused_bgs_lock);
> +
> +		/* Don't want to race with allocators so take the groups_sem */
> +		down_write(&space_info->groups_sem);
> +
> +		spin_lock(&bg->lock);
> +		if (bg->reserved || bg->pinned || bg->ro) {
> +			/*
> +			 * We want to bail if we made new allocations or have
> +			 * outstanding allocations in this block group.  We do
> +			 * the ro check in case balance is currently acting on
> +			 * this block group.
> +			 */
> +			spin_unlock(&bg->lock);
> +			up_write(&space_info->groups_sem);
> +			goto next;
> +		}
> +		spin_unlock(&bg->lock);
> +

Here I think we want a

if (btrfs_fs_closing(fs_info))
	goto next;

so we don't block out a umount for all eternity.  Thanks,

Josef
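
A minimal userspace sketch of the early exit suggested above, not the kernel code itself. The types and names (toy_bg, toy_fs, toy_reclaim_bgs, sem_depth) are hypothetical stand-ins invented for illustration; `continue` stands in for `goto next`, and a counter stands in for groups_sem. In the quoted hunk, groups_sem is still held at the point of the suggested check (the bail-out path just above it calls up_write() before jumping), so the sketch assumes the new exit would release it as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block group; not the btrfs definition. */
struct toy_bg {
	unsigned long reserved;
	unsigned long pinned;
	bool ro;
	bool relocated;
};

/* Hypothetical stand-in for fs_info. */
struct toy_fs {
	bool closing;		/* models what btrfs_fs_closing(fs_info) reports */
	int sem_depth;		/* models groups_sem: +1 on take, -1 on release */
	struct toy_bg *bgs;	/* models the reclaim list */
	size_t nr_bgs;
};

/* Walk the reclaim list and return how many block groups were relocated. */
static size_t toy_reclaim_bgs(struct toy_fs *fs)
{
	size_t done = 0;

	assert(fs != NULL);

	for (size_t i = 0; i < fs->nr_bgs; i++) {
		struct toy_bg *bg = &fs->bgs[i];

		fs->sem_depth++;	/* models down_write(&space_info->groups_sem) */

		/* Existing bail-out: outstanding allocations or balance running. */
		if (bg->reserved || bg->pinned || bg->ro) {
			fs->sem_depth--;	/* models up_write() */
			continue;		/* models "goto next" */
		}

		/* Suggested check: stop relocating once unmount has started. */
		if (fs->closing) {
			fs->sem_depth--;	/* assumed: release before leaving */
			continue;
		}

		bg->relocated = true;	/* models the relocation step */
		done++;
		fs->sem_depth--;
	}

	return done;
}
```

With the closing flag set, the loop still drains the list but relocates nothing, so a pending unmount would not wait on relocation work.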


Thread overview: 18+ messages
2021-04-15 13:58 [PATCH v4 0/3] btrfs: zoned: automatic BG reclaim Johannes Thumshirn
2021-04-15 13:58 ` [PATCH v4 1/3] btrfs: zoned: reset zones of relocated block groups Johannes Thumshirn
2021-04-15 18:26   ` Josef Bacik
2021-04-16  5:50     ` Johannes Thumshirn
2021-04-16  9:11   ` Filipe Manana
2021-04-16  9:14     ` Johannes Thumshirn
2021-04-16  9:30       ` Filipe Manana
2021-04-16  9:32         ` Johannes Thumshirn
2021-04-15 13:58 ` [PATCH v4 2/3] btrfs: rename delete_unused_bgs_mutex Johannes Thumshirn
2021-04-15 18:26   ` Josef Bacik
2021-04-16  9:15   ` Filipe Manana
2021-04-16 16:36   ` David Sterba
2021-04-15 13:58 ` [PATCH v4 3/3] btrfs: zoned: automatically reclaim zones Johannes Thumshirn
2021-04-15 18:36   ` Josef Bacik [this message]
2021-04-16  5:50     ` Johannes Thumshirn
2021-04-16  9:28   ` Filipe Manana
2021-04-16 16:38   ` David Sterba
2021-04-16 16:41   ` David Sterba
