From: Liu Bo <bo.li.liu@oracle.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>, jeffm@suse.com
Subject: Re: btrfs metadata reclaim behavior/performance characteristics
Date: Mon, 22 May 2017 15:57:49 -0700
Message-ID: <20170522225749.GA15438@lim.localdomain>
In-Reply-To: <93c1a20e-bb0b-13f0-b601-069ea02c3898@suse.com>
On Sun, May 21, 2017 at 03:45:02PM +0300, Nikolay Borisov wrote:
>
>
> On 19.05.2017 21:32, Liu Bo wrote:
> > On Fri, May 19, 2017 at 12:54:59PM +0300, Nikolay Borisov wrote:
> >>> From: Liu Bo <bo.li.liu@oracle.com>
> >>>
> >>> Subject: [PATCH] Btrfs: skip commit transaction if we don't have enough pinned bytes
> >>>
> >>> We commit transaction in order to reclaim space from pinned bytes because it could process delayed refs, and in may_commit_transaction(), we check first if pinned bytes are enough for the required space, we then check if that plus bytes reserved for delayed insert are enough for the required space.
> >>>
> >>> This changes the code to the above logic.
> >>>
> >>> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> >>> ---
> >>> fs/btrfs/extent-tree.c | 2 +-
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> >>> index e390451c72e6..bded1ddd1bb6 100644
> >>> --- a/fs/btrfs/extent-tree.c
> >>> +++ b/fs/btrfs/extent-tree.c
> >>> @@ -4837,7 +4837,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
> >>>
> >>> spin_lock(&delayed_rsv->lock);
> >>> if (percpu_counter_compare(&space_info->total_bytes_pinned,
> >>> - bytes - delayed_rsv->size) >= 0) {
> >>> + bytes - delayed_rsv->size) < 0) {
> >>> spin_unlock(&delayed_rsv->lock);
> >>> return -ENOSPC;
> >>> }
> >>>
> >>
> >> Your patch does make a very big difference. Here are a couple of runs of
> >> slow-rm:
> >>
> >>
> >>
> >> root@ubuntu-virtual:~# ./slow-rm.sh
> >> Created 837 files before returning error, time taken 3
> >> Created 920 files before returning error, time taken 3
> >> Created 949 files before returning error, time taken 3
> >> Created 930 files before returning error, time taken 3
> >> Created 1101 files before returning error, time taken 4
> >> Created 1082 files before returning error, time taken 4
> >> Created 1608 files before returning error, time taken 5
> >> Created 1735 files before returning error, time taken 5
> >> rming took 1 seconds
> >>
> >> root@ubuntu-virtual:~# ./slow-rm.sh
> >> Created 801 files before returning error, time taken 3
> >> Created 829 files before returning error, time taken 3
> >> Created 983 files before returning error, time taken 3
> >> Created 978 files before returning error, time taken 3
> >> Created 1023 files before returning error, time taken 3
> >> Created 1126 files before returning error, time taken 3
> >> Created 1538 files before returning error, time taken 4
> >> Created 1737 files before returning error, time taken 5
> >> rming took 2 seconds
> >>
> >> root@ubuntu-virtual:~# ./slow-rm.sh
> >> Created 875 files before returning error, time taken 3
> >> Created 891 files before returning error, time taken 3
> >> Created 969 files before returning error, time taken 4
> >> Created 1002 files before returning error, time taken 4
> >> Created 1039 files before returning error, time taken 4
> >> Created 1051 files before returning error, time taken 4
> >> Created 1191 files before returning error, time taken 4
> >> Created 2137 files before returning error, time taken 8
> >> rming took 2 seconds
> >>
> >> So rming is a lot faster, but we create fewer files overall and get
> >> ENOSPC earlier. This means that most of the time bytes_pinned is not
> >> enough to satisfy the allocation, hence we are hitting the second
> >> percpu_counter comparison.
> >>
> >
> > Right, that's roughly the behavior I expected: all 1K buffered IO ends up
> > as inline extents, so it's likely to run out of metadata space very soon.
>
> Are you going to send this as an official patch to the ML ?
>
Yes, here is the link,
https://patchwork.kernel.org/patch/9737947/
Thanks,
-liubo
> >
> >> Also, the reason why the previous links showed 0 for bytes_pinned was
> >> due to me having completely forgotten that bytes_pinned is a percpu
> >> counter, hence my stap script wasn't actually reading it correctly.
> >
> > I see. bytes_pinned in space_info is different from the percpu one; they're
> > updated at different times, but overall the percpu one is the more precise
> > counter.
> >
> > -liubo
> >
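
For reference, the two-step check described in the patch's commit message can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel code: struct space_model, may_commit_old() and may_commit_new() are hypothetical names, plain signed integers stand in for the percpu counter and for delayed_rsv->size, and the locking plus the other early exits of may_commit_transaction() are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values the check reads. */
struct space_model {
    int64_t pinned;       /* models space_info->total_bytes_pinned */
    int64_t delayed_size; /* models delayed_rsv->size */
};

/* Second comparison as it stood before the patch (>= 0). */
static int may_commit_old(const struct space_model *m, int64_t bytes)
{
    assert(m != NULL);
    if (m->pinned >= bytes)
        return 0;        /* pinned bytes alone are enough: commit */
    if (m->pinned >= bytes - m->delayed_size)
        return -ENOSPC;  /* gives up although pinned + delayed would suffice */
    return 0;            /* commits although pinned + delayed falls short */
}

/* Second comparison with the patch applied (< 0). */
static int may_commit_new(const struct space_model *m, int64_t bytes)
{
    assert(m != NULL);
    if (m->pinned >= bytes)
        return 0;        /* pinned bytes alone are enough: commit */
    if (m->pinned < bytes - m->delayed_size)
        return -ENOSPC;  /* even pinned + delayed is not enough: skip commit */
    return 0;            /* pinned + delayed covers the request: commit */
}
```

In this model, a 100-byte request with 50 bytes pinned and 60 bytes in the delayed reservation returns -ENOSPC from may_commit_old() and 0 from may_commit_new(); with only 30 bytes pinned the results are reversed.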