[Cluster-devel] [GFS2 PATCH v1 0/2] Improve throughput through rgrp sharing

From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH v1 0/2] Improve throughput through rgrp sharing
Date: Wed, 18 Apr 2018 09:58:36 -0700
Message-ID: <20180418165838.8342-1-rpeterso@redhat.com>

This is a preliminary patch set, but the results are promising. I've been
testing it pretty hard in a variety of circumstances, and it seems to be
fairly solid, so I thought I'd get it out here for people to review.

On 17 January 2018, I posted an experimental patch set meant to improve
intra-node resource group sharing, titled "GFS2: Rework rgrp glock
congestion functions for intra-node". It reduced rgrp contention by
simply steering contending processes to different rgrps. In RHEL6 we
used "try locks", which accomplished essentially the same thing.

Steve Whitehouse suggested a better approach: to actually share the same
rgrps within a node. This patch set implements Steve's suggestion.

The first patch introduces a new glock locking mode called EXSH, meaning
exclusively shared within one node. To all other nodes (and to DLM) the
glock looks and acts like it is held EX. But to the node that has it
locked, it may be shared among processes like an SH lock.
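
To make the two views of EXSH concrete, here is a small userspace model.
It is only a sketch: the toy_* names and the enum are hypothetical, not
the identifiers used in the patch, and the rule that only identical
sharable modes coexist on a node is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode names for illustration only. */
enum toy_mode { TOY_UN, TOY_SH, TOY_EXSH, TOY_EX };

/* Mode that other nodes (and DLM) observe: EXSH looks like EX. */
enum toy_mode toy_cluster_view(enum toy_mode m)
{
	return m == TOY_EXSH ? TOY_EX : m;
}

/* Can two holders on the SAME node hold the lock at once? */
bool toy_local_compatible(enum toy_mode held, enum toy_mode wanted)
{
	if (held == TOY_UN || wanted == TOY_UN)
		return true;
	/* Assumption: only identical sharable modes coexist locally. */
	return held == wanted && (held == TOY_SH || held == TOY_EXSH);
}

/* Can holders on DIFFERENT nodes hold the lock at once? */
bool toy_remote_compatible(enum toy_mode held, enum toy_mode wanted)
{
	enum toy_mode a = toy_cluster_view(held);
	enum toy_mode b = toy_cluster_view(wanted);

	if (a == TOY_UN || b == TOY_UN)
		return true;
	/* Across nodes only SH/SH is shareable; EXSH behaves as EX. */
	return a == TOY_SH && b == TOY_SH;
}
```

In this model, assert(toy_local_compatible(TOY_EXSH, TOY_EXSH)) holds,
while assert(!toy_remote_compatible(TOY_EXSH, TOY_SH)) shows that another
node is excluded just as it would be by an EX holder.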

The second patch adds hooks to the rgrp code to use the new glock locking
mode. A new rwsem, rd_sem, ensures exclusive use of the rgrp when it is
needed. Whenever an rgrp is added to a transaction, the rwsem is taken and
it is queued to the transaction. When the transaction ends, the rwsems
for all rgrps queued to that transaction are unlocked.
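
The lock-on-add, unlock-at-end bookkeeping described above can be
sketched in userspace roughly as follows. This is a single-threaded
model, not the patch's code: the toy_* types and functions are
hypothetical, a plain flag stands in for rd_sem held exclusively, and
skipping an rgrp that is already queued is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RGRPS 8

/* Stand-in for an rgrp; 'locked' models rd_sem being held. */
struct toy_rgrp {
	bool locked;
};

/* Stand-in for a transaction with a fixed-size queue of rgrps. */
struct toy_trans {
	struct toy_rgrp *rgrps[TOY_MAX_RGRPS];
	size_t nr;
};

/*
 * Add rgd to the transaction, taking its lock the first time it is
 * queued. Returns false if the queue is full.
 */
bool toy_trans_add_rgrp(struct toy_trans *tr, struct toy_rgrp *rgd)
{
	for (size_t i = 0; i < tr->nr; i++)
		if (tr->rgrps[i] == rgd)
			return true;	/* already queued and locked by us */
	if (tr->nr == TOY_MAX_RGRPS)
		return false;
	assert(!rgd->locked);	/* a real rwsem would block here instead */
	rgd->locked = true;
	tr->rgrps[tr->nr++] = rgd;
	return true;
}

/* End the transaction: unlock every queued rgrp; returns the count. */
size_t toy_trans_end(struct toy_trans *tr)
{
	size_t released = tr->nr;

	while (tr->nr > 0) {
		struct toy_rgrp *rgd = tr->rgrps[--tr->nr];

		rgd->locked = false;
	}
	return released;
}
```

The point of the design is that a caller never unlocks an rgrp itself:
everything taken during the transaction is released in one place when
the transaction ends.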

Preliminary performance testing using iozone looks very promising.
With 16 simultaneous writers, the throughput seen by the parent is about
6 times higher with the patch. Even with 4 writers, it is doubled:

                                               7.5 kernel      Patched kernel
                                               --------------  --------------
Children see throughput for  1 initial writers 525062.81 kB/s  527972.50 kB/s
Parent sees throughput for  1 initial writers  525049.74 kB/s  527971.69 kB/s

Children see throughput for  2 initial writers 612600.62 kB/s  603398.75 kB/s
Parent sees throughput for  2 initial writers  600944.08 kB/s  603140.65 kB/s

Children see throughput for  4 initial writers 596730.64 kB/s  694901.31 kB/s
Parent sees throughput for  4 initial writers  232777.32 kB/s  472287.19 kB/s

Children see throughput for  6 initial writers 574034.05 kB/s  739531.62 kB/s
Parent sees throughput for  6 initial writers  160751.73 kB/s  515363.98 kB/s

Children see throughput for  8 initial writers 644463.33 kB/s  727810.48 kB/s
Parent sees throughput for  8 initial writers  155939.49 kB/s  559100.85 kB/s

Children see throughput for 10 initial writers 613880.30 kB/s  736029.91 kB/s
Parent sees throughput for 10 initial writers  174366.86 kB/s  663429.43 kB/s

Children see throughput for 12 initial writers 610206.54 kB/s  744490.04 kB/s
Parent sees throughput for 12 initial writers  150910.72 kB/s  682414.33 kB/s

Children see throughput for 14 initial writers 625055.97 kB/s  804518.57 kB/s
Parent sees throughput for 14 initial writers  129122.67 kB/s  781340.39 kB/s

Children see throughput for 16 initial writers 627972.96 kB/s  794149.06 kB/s
Parent sees throughput for 16 initial writers  124565.02 kB/s  764981.28 kB/s
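
The ratios behind the summary can be recomputed from the table with a
trivial helper. The function name is hypothetical; the inputs are the
kB/s figures listed above.

```c
#include <assert.h>

/* Ratio of patched to baseline throughput; both values in kB/s. */
double toy_speedup(double baseline_kbps, double patched_kbps)
{
	assert(baseline_kbps > 0.0);
	return patched_kbps / baseline_kbps;
}
```

Using the "Parent sees" rows, toy_speedup(124565.02, 764981.28) is about
6.14 for 16 writers and toy_speedup(232777.32, 472287.19) is about 2.03
for 4 writers. The "Children see" rows improve less; for 16 writers
toy_speedup(627972.96, 794149.06) is about 1.26.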

There are still some fairness/parallelism issues. It's not perfect.
But when multiple processes are sharing the same resource, I'm not sure
how much better we can do without separating them onto their own rgrps.
The statistics indicate this is well worth pursuing.
---
Bob Peterson (2):
  GFS2: Introduce EXSH (exclusively shared on one node)
  GFS2: Take advantage of new EXSH glock mode for rgrps

 fs/gfs2/bmap.c       |  2 +-
 fs/gfs2/dir.c        |  2 +-
 fs/gfs2/glock.c      | 12 +++++++-
 fs/gfs2/glock.h      | 16 +++++++---
 fs/gfs2/glops.c      |  3 +-
 fs/gfs2/incore.h     | 12 +++++---
 fs/gfs2/inode.c      |  4 +--
 fs/gfs2/lock_dlm.c   |  5 +++-
 fs/gfs2/rgrp.c       | 84 ++++++++++++++++++++++++++++++++++++++++++++++++----
 fs/gfs2/rgrp.h       |  5 ++++
 fs/gfs2/super.c      |  2 +-
 fs/gfs2/trace_gfs2.h |  2 ++
 fs/gfs2/trans.c      | 16 ++++++++++
 fs/gfs2/xattr.c      |  6 ++--
 14 files changed, 147 insertions(+), 24 deletions(-)

-- 
2.14.3
