From: Omar Sandoval <osandov@osandov.com>
To: Ethan Lien <ethanlien@synology.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2] btrfs: use customized batch size for total_bytes_pinned
Date: Thu, 12 Jul 2018 15:19:36 -0700 [thread overview]
Message-ID: <20180712221936.GA8064@vader> (raw)
In-Reply-To: <20180711155936.11511-1-ethanlien@synology.com>
On Wed, Jul 11, 2018 at 11:59:36PM +0800, Ethan Lien wrote:
> In commit b150a4f10d878 ("Btrfs: use a percpu to keep track of possibly
> pinned bytes") we use total_bytes_pinned to track how many bytes we are
> going to free in this transaction. When we are close to ENOSPC, we check it
> and know if we can make the allocation by commit the current transaction.
> For every data/metadata extent we are going to free, we add
> total_bytes_pinned in btrfs_free_extent() and btrfs_free_tree_block(), and
> release it in unpin_extent_range() when we finish the transaction. So this
> is a variable we frequently update but rarely read - just the suitable
> use of percpu_counter. But in previous commit we update total_bytes_pinned
> by default 32 batch size, making every update essentially a spin lock
> protected update. Since every spin lock/unlock operation involves syncing
> a globally used variable and some kind of barrier in a SMP system, this is
> more expensive than using total_bytes_pinned as a simple atomic64_t. So
> fix this by using a customized batch size. Since we only read
> total_bytes_pinned when we are close to ENOSPC and fail to alloc new chunk,
> we can use a really large batch size and have nearly no penalty in most
> cases.
>
>
> [Test]
> We test the patch on a 4-cores x86 machine:
> 1. falloate a 16GiB size test file.
> 2. take snapshot (so all following writes will be cow write).
> 3. run a 180 sec, 4 jobs, 4K random write fio on test file.
>
> We also add a temporary lockdep class on percpu_counter's spin lock used
> by total_bytes_pinned to track lock_stat.
>
>
> [Results]
> unpatched:
> lock_stat version 0.4
> -----------------------------------------------------------------------
> class name con-bounces contentions
> waittime-min waittime-max waittime-total waittime-avg acq-bounces
> acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg
>
> total_bytes_pinned_percpu: 82 82
> 0.21 0.61 29.46 0.36 298340
> 635973 0.09 11.01 173476.25 0.27
>
>
> patched:
> lock_stat version 0.4
> -----------------------------------------------------------------------
> class name con-bounces contentions
> waittime-min waittime-max waittime-total waittime-avg acq-bounces
> acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg
>
> total_bytes_pinned_percpu: 1 1
> 0.62 0.62 0.62 0.62 13601
> 31542 0.14 9.61 11016.90 0.35
>
>
> [Analysis]
> Since the spin lock only protect a single in-memory variable, the
> contentions (number of lock acquisitions that had to wait) in both
> unpatched and patched version are low. But when we see acquisitions and
> acq-bounces, we get much lower counts in patched version. Here the most
> important metric is acq-bounces. It means how many times the lock get
> transferred between different cpus, so the patch can really recude
> cacheline bouncing of spin lock (also the global counter of percpu_counter)
> in a SMP system.
>
> Fixes: b150a4f10d878 ("Btrfs: use a percpu to keep track of possibly
> pinned bytes")
>
> Signed-off-by: Ethan Lien <ethanlien@synology.com>
> ---
>
> V2:
> Rewrite commit comments.
> Add lock_stat test.
> Pull dirty_metadata_bytes out to a separate patch.
>
> fs/btrfs/ctree.h | 1 +
> fs/btrfs/extent-tree.c | 46 ++++++++++++++++++++++++++++--------------
> 2 files changed, 32 insertions(+), 15 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 118346aceea9..df682a521635 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -422,6 +422,7 @@ struct btrfs_space_info {
> * time the transaction commits.
> */
> struct percpu_counter total_bytes_pinned;
> + s32 total_bytes_pinned_batch;
Can this just be a constant instead of adding it to space_info?
next prev parent reply other threads:[~2018-07-12 22:31 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-11 15:59 [PATCH v2] btrfs: use customized batch size for total_bytes_pinned Ethan Lien
2018-07-12 7:07 ` Nikolay Borisov
2018-07-12 17:13 ` ethanlien
2018-07-12 22:19 ` Omar Sandoval [this message]
2018-07-13 2:27 ` ethanlien
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180712221936.GA8064@vader \
--to=osandov@osandov.com \
--cc=ethanlien@synology.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.