From: Omar Sandoval <osandov@osandov.com>
To: Ethan Lien <ethanlien@synology.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2] btrfs: use customized batch size for total_bytes_pinned
Date: Thu, 12 Jul 2018 15:19:36 -0700 [thread overview]
Message-ID: <20180712221936.GA8064@vader> (raw)
In-Reply-To: <20180711155936.11511-1-ethanlien@synology.com>
On Wed, Jul 11, 2018 at 11:59:36PM +0800, Ethan Lien wrote:
> In commit b150a4f10d878 ("Btrfs: use a percpu to keep track of possibly
> pinned bytes") we use total_bytes_pinned to track how many bytes we are
> going to free in this transaction. When we are close to ENOSPC, we check it
> and know if we can make the allocation by commit the current transaction.
> For every data/metadata extent we are going to free, we add
> total_bytes_pinned in btrfs_free_extent() and btrfs_free_tree_block(), and
> release it in unpin_extent_range() when we finish the transaction. So this
> is a variable we frequently update but rarely read - just the suitable
> use of percpu_counter. But in previous commit we update total_bytes_pinned
> by default 32 batch size, making every update essentially a spin lock
> protected update. Since every spin lock/unlock operation involves syncing
> a globally used variable and some kind of barrier in a SMP system, this is
> more expensive than using total_bytes_pinned as a simple atomic64_t. So
> fix this by using a customized batch size. Since we only read
> total_bytes_pinned when we are close to ENOSPC and fail to alloc new chunk,
> we can use a really large batch size and have nearly no penalty in most
> cases.
>
>
> [Test]
> We test the patch on a 4-cores x86 machine:
> 1. falloate a 16GiB size test file.
> 2. take snapshot (so all following writes will be cow write).
> 3. run a 180 sec, 4 jobs, 4K random write fio on test file.
>
> We also add a temporary lockdep class on percpu_counter's spin lock used
> by total_bytes_pinned to track lock_stat.
>
>
> [Results]
> unpatched:
> lock_stat version 0.4
> -----------------------------------------------------------------------
> class name con-bounces contentions
> waittime-min waittime-max waittime-total waittime-avg acq-bounces
> acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg
>
> total_bytes_pinned_percpu: 82 82
> 0.21 0.61 29.46 0.36 298340
> 635973 0.09 11.01 173476.25 0.27
>
>
> patched:
> lock_stat version 0.4
> -----------------------------------------------------------------------
> class name con-bounces contentions
> waittime-min waittime-max waittime-total waittime-avg acq-bounces
> acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg
>
> total_bytes_pinned_percpu: 1 1
> 0.62 0.62 0.62 0.62 13601
> 31542 0.14 9.61 11016.90 0.35
>
>
> [Analysis]
> Since the spin lock only protect a single in-memory variable, the
> contentions (number of lock acquisitions that had to wait) in both
> unpatched and patched version are low. But when we see acquisitions and
> acq-bounces, we get much lower counts in patched version. Here the most
> important metric is acq-bounces. It means how many times the lock get
> transferred between different cpus, so the patch can really recude
> cacheline bouncing of spin lock (also the global counter of percpu_counter)
> in a SMP system.
>
> Fixes: b150a4f10d878 ("Btrfs: use a percpu to keep track of possibly
> pinned bytes")
>
> Signed-off-by: Ethan Lien <ethanlien@synology.com>
> ---
>
> V2:
> Rewrite commit comments.
> Add lock_stat test.
> Pull dirty_metadata_bytes out to a separate patch.
>
> fs/btrfs/ctree.h | 1 +
> fs/btrfs/extent-tree.c | 46 ++++++++++++++++++++++++++++--------------
> 2 files changed, 32 insertions(+), 15 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 118346aceea9..df682a521635 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -422,6 +422,7 @@ struct btrfs_space_info {
> * time the transaction commits.
> */
> struct percpu_counter total_bytes_pinned;
> + s32 total_bytes_pinned_batch;
Can this just be a constant instead of adding it to space_info?
next prev parent reply other threads:[~2018-07-12 22:31 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-11 15:59 [PATCH v2] btrfs: use customized batch size for total_bytes_pinned Ethan Lien
2018-07-12 7:07 ` Nikolay Borisov
2018-07-12 17:13 ` ethanlien
2018-07-12 22:19 ` Omar Sandoval [this message]
2018-07-13 2:27 ` ethanlien
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180712221936.GA8064@vader \
--to=osandov@osandov.com \
--cc=ethanlien@synology.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).