From: Josef Bacik <josef@toxicpanda.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC 0/3] Introduce per-profile available space array to avoid over-confident can_overcommit()
Date: Mon, 30 Dec 2019 09:29:30 -0500 [thread overview]
Message-ID: <6a67c32a-668e-675b-e317-62f1aaf27fcd@toxicpanda.com> (raw)
In-Reply-To: <0a71a88b-9942-ca8e-5478-d6ea48356daf@gmx.com>
On 12/27/19 8:09 PM, Qu Wenruo wrote:
>
>
> On 2019/12/28 上午2:32, Josef Bacik wrote:
>> On 12/25/19 8:39 AM, Qu Wenruo wrote:
>>> There are several bug reports of ENOSPC error in
>>> btrfs_run_delalloc_range().
>>>
>>> With some extra info from one reporter, it turns out that
>>> can_overcommit() is using a wrong way to calculate allocatable metadata
>>> space.
>>>
>>> The most typical case would look like:
>>> devid 1 unallocated: 1G
>>> devid 2 unallocated: 10G
>>> metadata profile: RAID1
>>>
>>> In above case, we can at most allocate 1G chunk for metadata, due to
>>> unbalanced disk free space.
>>> But current can_overcommit() uses factor based calculation, which never
>>> consider the disk free space balance.
>>>
>>>
>>> To address this problem, here comes the per-profile available space
>>> array, which gets updated every time a chunk get allocated/removed or a
>>> device get grown or shrunk.
>>>
>>> This provides a quick way for hotter place like can_overcommit() to grab
>>> an estimation on how many bytes it can over-commit.
>>>
>>> The per-profile available space calculation tries to keep the behavior
>>> of chunk allocator, thus it can handle uneven disks pretty well.
>>>
>>> The RFC tag is here because I'm not yet confident enough about the
>>> implementation.
>>> I'm not sure this is the proper to go, or just a over-engineered mess.
>>>
>>
>> In general I like the approach, however re-doing the whole calculation
>> once we add or remove a chunk seems like overkill. Can we get away with
>> just doing the big expensive calculation on mount, and then adjust
>> available up and down as we add and remove chunks?
>
> That looks good on a quick glance, but in practice it may not work as
> expected, mostly due to the small difference in sort.
>
> Current chunk allocator works by sorting the max hole size as primary
> sort index, thus it may cause difference on some corner case.
> Without proper re-calculation, the difference may drift larger and larger.
>
> Thus I prefer to be a little safer to do extra calculation each time
> chunk get allocated/remove.
> And that calculation is not that heavy, it just iterate the device lists
> several times, and all access are in-memory without sleep, it should be
> pretty fast.
>
Ahh I hadn't thought of different hole sizes. You're right that it shouldn't
matter in practice, it's not like chunk allocation is a fast path. This seems
reasonable to me then, I'll go through the patches properly. Thanks,
Josef
prev parent reply other threads:[~2019-12-30 14:29 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-25 13:39 [PATCH RFC 0/3] Introduce per-profile available space array to avoid over-confident can_overcommit() Qu Wenruo
2019-12-25 13:39 ` [PATCH RFC 1/3] btrfs: Introduce per-profile available space facility Qu Wenruo
2019-12-30 16:14 ` Josef Bacik
2019-12-25 13:39 ` [PATCH RFC 2/3] btrfs: Update per-profile available space when device size/used space get updated Qu Wenruo
2019-12-30 16:17 ` Josef Bacik
2019-12-31 0:25 ` Qu Wenruo
2019-12-25 13:39 ` [PATCH RFC 3/3] btrfs: space-info: Use per-profile available space in can_overcommit() Qu Wenruo
2019-12-30 16:17 ` Josef Bacik
2019-12-27 18:32 ` [PATCH RFC 0/3] Introduce per-profile available space array to avoid over-confident can_overcommit() Josef Bacik
2019-12-28 1:09 ` Qu Wenruo
2019-12-30 14:29 ` Josef Bacik [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6a67c32a-668e-675b-e317-62f1aaf27fcd@toxicpanda.com \
--to=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=quwenruo.btrfs@gmx.com \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox