From: Nikolay Borisov <nborisov@suse.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
Josef Bacik <josef@toxicpanda.com>,
linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/3] btrfs: add a comment describing block-rsvs
Date: Tue, 4 Feb 2020 12:32:36 +0200
Message-ID: <e184b283-dffd-898c-53c9-681d399ace98@suse.com>
In-Reply-To: <8d33b43f-bcba-0fed-60e5-2908e219181b@gmx.com>

On 4.02.20 at 11:30, Qu Wenruo wrote:
>
>
> On 2020/2/4 4:44 AM, Josef Bacik wrote:
>> This is a giant comment at the top of block-rsv.c describing generally
>> how block rsvs work. It is purely about the block rsv's themselves, and
>> nothing to do with how the actual reservation system works.
>
> Such comment really helps!
>
> Although it looks like there are too many words but too little ascii
> arts or graphs.
> Not sure if it's really easy to read.
>
> And some questions inlined below.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>> fs/btrfs/block-rsv.c | 81 ++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 81 insertions(+)
>>
>> diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
>> index d07bd41a7c1e..54380f477f80 100644
>> --- a/fs/btrfs/block-rsv.c
>> +++ b/fs/btrfs/block-rsv.c
>> @@ -6,6 +6,87 @@
>> #include "space-info.h"
>> #include "transaction.h"
>>
<snip>
>> + *
>> + * We go to modify the tree for our operation, we allocate a tree block, which
>> + * calls btrfs_use_block_rsv(), and subtracts nodesize from
>> + * block_rsv->reserved.
>> + *
>> + * We finish our operation, we subtract our original reservation from ->size,
>> + * and then we subtract ->size from ->reserved if there is an excess and free
>> + * the excess back to the space info, by reducing space_info->bytes_may_use by
>> + * the excess amount.
>
> So I find the workflow can be expressed like this using timeline (?) graph:
>
> +--- Reserve:
> | Entrance: btrfs_block_rsv_add(), btrfs_block_rsv_refill()
> |
> | Calculate needed bytes by btrfs_calc*(), then add the needed space
> | to our ->size and our ->reserved.
> | This also contributes to space_info->bytes_may_use.
> |
> +--- Use:
> | Entrance: btrfs_use_block_rsv()
> |
> | We're allocating a tree block, will subtracts nodesize from
> | block_rsv->reserved.
> |
> +--- Finish:
> Entrance: btrfs_block_rsv_release()
>
> we subtract our original reservation from ->size,
> and then we subtract ->size from ->reserved if there is an excess
> and free the excess back to the space info, by reducing
> space_info->bytes_may_use by the excess amount.

I find this graphic helpful. IMO it's also important to state explicitly
that ->size is based on an overestimate, whereas the space subtracted
from ->reserved always reflects real usage. That is why we can end up
with excess space that can be returned.
Over-reservation is mentioned in the BLOCK_RSV_GLOBAL paragraph, but I
think it belongs here and can be removed from there.
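
To make the overestimate-vs-real-usage point concrete, here is a toy
userspace sketch of the reserve/use/finish steps quoted above. It is not
kernel code: the toy_* structs and functions, the 16-block estimate and the
3 blocks actually used are all made up for illustration, and only the
->size / ->reserved / bytes_may_use arithmetic follows the quoted comment.

```c
/* Toy userspace model of a block rsv; NOT the btrfs implementation. */
#include <assert.h>
#include <stdint.h>

struct toy_space_info {
	uint64_t bytes_may_use;
};

struct toy_block_rsv {
	uint64_t size;		/* what we estimated we might need */
	uint64_t reserved;	/* what we still hold right now */
	struct toy_space_info *space_info;
};

/* Reserve: the (over)estimate is added to ->size, ->reserved, bytes_may_use. */
static void toy_rsv_add(struct toy_block_rsv *rsv, uint64_t estimate)
{
	rsv->size += estimate;
	rsv->reserved += estimate;
	rsv->space_info->bytes_may_use += estimate;
}

/*
 * Use: only real consumption is subtracted from ->reserved.
 * How the allocator re-accounts the consumed bytes is not modelled here.
 */
static void toy_rsv_use(struct toy_block_rsv *rsv, uint64_t nodesize)
{
	assert(rsv->reserved >= nodesize);
	rsv->reserved -= nodesize;
}

/*
 * Finish: drop the original estimate from ->size, then hand anything in
 * ->reserved above ->size back to the space info. Returns the excess.
 */
static uint64_t toy_rsv_release(struct toy_block_rsv *rsv, uint64_t estimate)
{
	uint64_t excess = 0;

	assert(rsv->size >= estimate);
	rsv->size -= estimate;
	if (rsv->reserved > rsv->size) {
		excess = rsv->reserved - rsv->size;
		rsv->reserved = rsv->size;
		rsv->space_info->bytes_may_use -= excess;
	}
	return excess;
}

/* Hypothetical operation: estimate 16 tree blocks, actually touch 3. */
static uint64_t toy_demo(void)
{
	const uint64_t nodesize = 16384;
	const uint64_t estimate = 16 * nodesize;
	struct toy_space_info si = { 0 };
	struct toy_block_rsv rsv = { 0, 0, &si };
	int i;

	toy_rsv_add(&rsv, estimate);
	for (i = 0; i < 3; i++)
		toy_rsv_use(&rsv, nodesize);
	return toy_rsv_release(&rsv, estimate);	/* 13 * nodesize of excess */
}
```

With these made-up numbers, the 13 unused blocks' worth of the estimate is
exactly what gets subtracted from bytes_may_use at release time.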
>
>> + *
>> + * In some cases we may return this excess to the global block reserve or
>> + * delayed refs reserve if either of their ->size is greater than their
>> + * ->reserved.
>> + *
>
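
The "return the excess to another reserve" case quoted above can be
sketched the same way. This is again a toy userspace model with
hypothetical names (toy_rsv, toy_donate_excess), assuming the destination
is only topped up to its ->size and whatever is left over goes back to the
space info:

```c
/* Toy model of donating excess to another reserve; NOT kernel code. */
#include <assert.h>
#include <stdint.h>

struct toy_rsv {
	uint64_t size;
	uint64_t reserved;
};

/*
 * Top up @dest from @excess if dest->size > dest->reserved.
 * Returns the bytes that remain and would be freed to the space info.
 */
static uint64_t toy_donate_excess(struct toy_rsv *dest, uint64_t excess)
{
	uint64_t shortfall, donated;

	if (dest->reserved >= dest->size)
		return excess;		/* destination already full */

	shortfall = dest->size - dest->reserved;
	donated = excess < shortfall ? excess : shortfall;
	dest->reserved += donated;
	return excess - donated;
}
```

For example, a destination with ->size 100 and ->reserved 70 that is
offered 50 bytes takes 30 of them, and the other 20 are left to be freed.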
> Types of block_rsv:
>
>> + * BLOCK_RSV_TRANS, BLOCK_RSV_DELOPS, BLOCK_RSV_CHUNK
>> + * These behave normally, as described above, just within the confines of the
>> + * lifetime of ther particular operation (transaction for the whole trans
>> + * handle lifetime, for example).
>> + *
>> + * BLOCK_RSV_GLOBAL
>> + * This has existed forever, with diminishing degrees of importance.
>> + * Currently it exists to save us from ourselves. We definitely over-reserve
>> + * space most of the time, but the nature of COW is that we do not know how
>> + * much space we may need to use for any given operation. This is
>> + * particularly true about the extent tree. Modifying one extent could
>> + * balloon into 1000 modifications of the extent tree, which we have no way of
>> + * properly predicting. To cover this case we have the global reserve act as
>> + * the "root" space to allow us to not abort the transaciton when things are
nit: s/transaciton/transaction
>> + * very tight. As such we tend to treat this space as sacred, and only use it
>> + * if we are desparate. Generally we should no longer be depending on its
nit: s/desparate/desperate
>> + * space, and if new use cases arise we need to address them elsewhere.
>
> Although we all know global rsv is really important for essential tree
> updates, can we make it a little simpler?
> It looks too long to read though.

The 2nd sentence of the paragraph can be removed. It could also mention
that the global rsv is used for trees other than the extent tree, i.e.
the chunk and csum ones. Also, isn't it used to ensure progress of unlink()?
>
> I guess we don't need to put all related info here.
> Maybe just mentioning the usage of each type is enough?
> (Since the reader will still go greping for more details)
>
> This also applies to the remaining types.

I disagree: those comments provide glimpses of the problem that
necessitated having block rsvs in the first place. It's good to read this
before diving into the code.
<snip>