From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH v1 0/2] Improve throughput through rgrp sharing
Date: Wed, 18 Apr 2018 09:58:36 -0700
Message-ID: <20180418165838.8342-1-rpeterso@redhat.com>

This is a preliminary patch set, but the results are promising. I've been
testing it pretty hard in a variety of circumstances, and it seems to be
fairly solid, so I thought I'd get it out here for people to review.

On 17 January 2018, I posted an experimental patch set meant to improve
intra-node resource group sharing, titled "GFS2: Rework rgrp glock
congestion functions for intra-node". It reduced rgrp contention by
simply distributing contending processes across different rgrps. In
RHEL6 we used "try locks", which accomplished basically the same thing.

Steve Whitehouse suggested a better approach: to actually share the same
rgrps within a node. This patch set implements Steve's suggestion.

The first patch introduces a new glock locking mode called EXSH, meaning
exclusively shared within one node. To all other nodes (and to DLM) the
glock looks and acts like it is held EX. But to the node that has it
locked, it may be shared among processes like an SH lock.
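
To make the two views of EXSH concrete, here is a minimal user-space
sketch. It is not the code from the patch: the enum values and the three
helper names are hypothetical, and treating a local EXSH holder as
incompatible with local SH or EX requests is my assumption for the
illustration.

```c
#include <stdbool.h>

/* Hypothetical lock modes for illustration; not the kernel's LM_ST_* values. */
enum mode { MODE_UN, MODE_SH, MODE_EX, MODE_EXSH };

/* Mode as seen by other nodes and by DLM: EXSH is presented as EX. */
static enum mode cluster_view(enum mode m)
{
	return m == MODE_EXSH ? MODE_EX : m;
}

/*
 * Can two processes on the same node hold the glock at once?
 * Assumption: EXSH is shared only with other EXSH holders.
 */
static bool local_compatible(enum mode held, enum mode req)
{
	if (held == MODE_UN || req == MODE_UN)
		return true;
	if (held == MODE_SH && req == MODE_SH)
		return true;
	return held == MODE_EXSH && req == MODE_EXSH;
}

/* Can another node's request be granted while this node holds 'held'? */
static bool remote_compatible(enum mode held, enum mode remote_req)
{
	enum mode seen = cluster_view(held);
	enum mode req = cluster_view(remote_req);

	if (seen == MODE_UN || req == MODE_UN)
		return true;
	return seen == MODE_SH && req == MODE_SH;
}
```

In this model local_compatible(MODE_EXSH, MODE_EXSH) is true, while
remote_compatible(MODE_EXSH, MODE_SH) is false, because the remote node
only ever sees EX.
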
The second patch adds hooks to the rgrp code to use the new glock locking
mode. A new rwsem, rd_sem, ensures exclusive use of the rgrp when it is
needed. Whenever an rgrp is added to a transaction, its rwsem is taken and
the rgrp is queued to the transaction. When the transaction ends, the rwsem
of every rgrp queued to that transaction is unlocked.

Preliminary performance testing using iozone looks very promising.
With 16 simultaneous writers, GFS2 is about 6 times faster with the patch,
going by the throughput the parent sees. Even with 4 writers, that figure
doubles:

                                                7.5 kernel       Patched kernel
                                                --------------   --------------
Children see throughput for  1 initial writers  525062.81 kB/s   527972.50 kB/s
Parent sees throughput for   1 initial writers  525049.74 kB/s   527971.69 kB/s
Children see throughput for  2 initial writers  612600.62 kB/s   603398.75 kB/s
Parent sees throughput for   2 initial writers  600944.08 kB/s   603140.65 kB/s
Children see throughput for  4 initial writers  596730.64 kB/s   694901.31 kB/s
Parent sees throughput for   4 initial writers  232777.32 kB/s   472287.19 kB/s
Children see throughput for  6 initial writers  574034.05 kB/s   739531.62 kB/s
Parent sees throughput for   6 initial writers  160751.73 kB/s   515363.98 kB/s
Children see throughput for  8 initial writers  644463.33 kB/s   727810.48 kB/s
Parent sees throughput for   8 initial writers  155939.49 kB/s   559100.85 kB/s
Children see throughput for 10 initial writers  613880.30 kB/s   736029.91 kB/s
Parent sees throughput for  10 initial writers  174366.86 kB/s   663429.43 kB/s
Children see throughput for 12 initial writers  610206.54 kB/s   744490.04 kB/s
Parent sees throughput for  12 initial writers  150910.72 kB/s   682414.33 kB/s
Children see throughput for 14 initial writers  625055.97 kB/s   804518.57 kB/s
Parent sees throughput for  14 initial writers  129122.67 kB/s   781340.39 kB/s
Children see throughput for 16 initial writers  627972.96 kB/s   794149.06 kB/s
Parent sees throughput for  16 initial writers  124565.02 kB/s   764981.28 kB/s
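
For anyone who wants to check the speedup claims against the table, the
ratios can be computed from the parent-seen rows. The snippet below is
just that arithmetic; struct sample and speedup_for() are names made up
for the illustration, and the numbers are copied from the table.

```c
#include <stddef.h>

/* One row of parent-seen iozone throughput, in kB/s, copied from the table. */
struct sample {
	int writers;
	double base;	/* 7.5 kernel */
	double patched;	/* patched kernel */
};

static const struct sample parent_seen[] = {
	{  1, 525049.74, 527971.69 },
	{  2, 600944.08, 603140.65 },
	{  4, 232777.32, 472287.19 },
	{  6, 160751.73, 515363.98 },
	{  8, 155939.49, 559100.85 },
	{ 10, 174366.86, 663429.43 },
	{ 12, 150910.72, 682414.33 },
	{ 14, 129122.67, 781340.39 },
	{ 16, 124565.02, 764981.28 },
};

/* Patched/baseline ratio for a writer count, or 0.0 if it is not in the table. */
static double speedup_for(int writers)
{
	size_t i;

	for (i = 0; i < sizeof(parent_seen) / sizeof(parent_seen[0]); i++)
		if (parent_seen[i].writers == writers)
			return parent_seen[i].patched / parent_seen[i].base;
	return 0.0;
}
```

speedup_for(16) comes out at roughly 6.1 and speedup_for(4) at roughly
2.0. The children-seen aggregate improves by a smaller factor, about 1.26
at 16 writers (794149.06 / 627972.96).
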

There are still some fairness/parallelism issues. It's not perfect.
But when multiple processes are sharing the same resource, I'm not sure
how much better we can do without separating them into their own rgrps.
The statistics indicate this is well worth pursuing.
---
Bob Peterson (2):
  GFS2: Introduce EXSH (exclusively shared on one node)
  GFS2: Take advantage of new EXSH glock mode for rgrps

 fs/gfs2/bmap.c       |  2 +-
 fs/gfs2/dir.c        |  2 +-
 fs/gfs2/glock.c      | 12 +++++++-
 fs/gfs2/glock.h      | 16 +++++++---
 fs/gfs2/glops.c      |  3 +-
 fs/gfs2/incore.h     | 12 +++++---
 fs/gfs2/inode.c      |  4 +--
 fs/gfs2/lock_dlm.c   |  5 +++-
 fs/gfs2/rgrp.c       | 84 ++++++++++++++++++++++++++++++++++++++++++++++++----
 fs/gfs2/rgrp.h       |  5 ++++
 fs/gfs2/super.c      |  2 +-
 fs/gfs2/trace_gfs2.h |  2 ++
 fs/gfs2/trans.c      | 16 ++++++++++
 fs/gfs2/xattr.c      |  6 ++--
 14 files changed, 147 insertions(+), 24 deletions(-)
--
2.14.3