Re: [PATCH RFC 0/3] Introduce per-profile available space array to avoid over-confident can_overcommit()

From: Josef Bacik <josef@toxicpanda.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC 0/3] Introduce per-profile available space array to avoid over-confident can_overcommit()
Date: Mon, 30 Dec 2019 09:29:30 -0500
Message-ID: <6a67c32a-668e-675b-e317-62f1aaf27fcd@toxicpanda.com>
In-Reply-To: <0a71a88b-9942-ca8e-5478-d6ea48356daf@gmx.com>

On 12/27/19 8:09 PM, Qu Wenruo wrote:
> 
> 
> On 2019/12/28 2:32 AM, Josef Bacik wrote:
>> On 12/25/19 8:39 AM, Qu Wenruo wrote:
>>> There are several bug reports of ENOSPC error in
>>> btrfs_run_delalloc_range().
>>>
>>> With some extra info from one reporter, it turns out that
>>> can_overcommit() calculates allocatable metadata space
>>> incorrectly.
>>>
>>> The most typical case would look like:
>>>     devid 1 unallocated:    1G
>>>     devid 2 unallocated:  10G
>>>     metadata profile:    RAID1
>>>
>>> In the above case, we can allocate at most a 1G chunk for metadata,
>>> due to the unbalanced free space across the disks.
>>> But the current can_overcommit() uses a factor-based calculation,
>>> which never considers how free space is balanced across disks.
>>>
>>>
>>> To address this problem, here comes the per-profile available space
>>> array, which gets updated every time a chunk gets allocated/removed or
>>> a device gets grown or shrunk.
>>>
>>> This provides a quick way for hot paths like can_overcommit() to grab
>>> an estimate of how many bytes they can over-commit.
>>>
>>> The per-profile available space calculation tries to mirror the
>>> behavior of the chunk allocator, thus it can handle uneven disks
>>> pretty well.
>>>
>>> The RFC tag is here because I'm not yet confident enough about the
>>> implementation.
>>> I'm not sure whether this is the proper way to go, or just an
>>> over-engineered mess.
>>>
>>
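[Editorial sketch, not part of the original thread.] The RAID1 example in
the cover letter (1G and 10G unallocated) can be modeled roughly as below.
This is a simplified stand-in, not the kernel code: the function names, the
"sum divided by number of copies" form of the factor-based estimate, and the
1 GiB chunk cap are all assumptions made for illustration. The simulation
only captures the idea that every RAID1 chunk needs space on two different
devices.

```python
# Rough model, NOT the kernel implementation. All names and the 1 GiB
# chunk cap are assumptions made for illustration only.

GIB = 1 << 30


def factor_based_estimate(unallocated, copies=2):
    """Total unallocated space divided by the number of copies.

    This ignores how the free space is spread across devices.
    """
    return sum(unallocated) // copies


def raid1_allocatable(unallocated, max_chunk=GIB):
    """Simulate RAID1 allocation: each chunk needs space on two devices."""
    free = list(unallocated)
    total = 0
    while len(free) >= 2:
        # Pick the two devices with the most unallocated space.
        free.sort(reverse=True)
        stripe = min(free[1], max_chunk)
        if stripe == 0:
            # Fewer than two devices still have room: no more RAID1 chunks.
            break
        free[0] -= stripe
        free[1] -= stripe
        total += stripe
    return total


if __name__ == "__main__":
    devs = [1 * GIB, 10 * GIB]  # devid 1 and devid 2 from the example
    print("factor based:", factor_based_estimate(devs) / GIB, "GiB")
    print("simulated:   ", raid1_allocatable(devs) / GIB, "GiB")
```

Under these assumptions the factor-based figure is half of the 11G total,
while the simulation stops after a single 1G chunk because devid 1 is then
full, which is the gap the cover letter describes.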
>> In general I like the approach, however re-doing the whole calculation
>> once we add or remove a chunk seems like overkill.  Can we get away with
>> just doing the big expensive calculation on mount, and then adjust
>> available up and down as we add and remove chunks?
> 
> That looks good at a quick glance, but in practice it may not work as
> expected, mostly due to small differences in sort order.
>
> The current chunk allocator sorts by max hole size as the primary sort
> key, so the result may differ in some corner cases.
> Without proper re-calculation, the difference may drift larger and larger.
>
> Thus I prefer to be a little safer and do the extra calculation each
> time a chunk gets allocated/removed.
> And that calculation is not that heavy; it just iterates the device lists
> several times, and all accesses are in-memory without sleeping, so it
> should be pretty fast.
> 

Ahh I hadn't thought of different hole sizes.  You're right that the extra
calculation shouldn't matter in practice; it's not like chunk allocation is
a fast path.  This seems reasonable to me then, I'll go through the patches
properly.  Thanks,

Josef
