[Cluster-devel] [PATCH 0/2] GFS2: inplace_reserve performance improvements


From: Mark Syms <mark.syms@citrix.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 0/2] GFS2: inplace_reserve performance improvements
Date: Thu, 20 Sep 2018 15:52:11 +0100
Message-ID: <1537455133-48589-1-git-send-email-mark.syms@citrix.com>

While testing GFS2 as a storage repository for virtual machines we
discovered a number of scenarios where performance was
pathologically poor.

The scenarios can be simplified to the following -

  * On a single host in the cluster grow a number of files to a
    significant proportion of the filesystem's LUN size, exceeding the
    host's preferred resource group allocation. This can be replicated
    by using fio to write to 20 different files with a job file like

[test-files]
directory=gfs2/a:gfs2/b:gfs2/c:gfs2/d:gfs2/e:gfs2/f:gfs2/g:gfs2/h:gfs2/i:gfs2/j:gfs2/k:gfs2/l:gfs2/m:gfs2/n:gfs2/o:gfs2/p:gfs2/q:gfs2/r:gfs2/s:gfs2/t
nrfiles=1
size=20G
bs=512k
rw=write
buffered=0
ioengine=libaio
fallocate=none
numjobs=20

    Throughput starts off at network wire speed but rapidly degrades,
    with the fio processes reporting large sys time.

    This was diagnosed as all the processes contending on the glock in
    gfs2_inplace_reserve, having all selected the same resource
    group. Patch 1 addresses this with an optional module parameter
    which enables "randomly" skipping a selected resource group in
    the first two passes in gfs2_inplace_reserve in order to spread
    the processes out.

    It is worth noting that this would probably also be addressed if
    the comment in Documentation/gfs2-glocks.txt about eventually
    making glock EX locally shared were acted upon. However, that
    looks like it would require quite a bit of coordination and
    design, so this stop-gap helps in the meantime.
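
    As an illustration only, the skipping idea can be sketched in
    userspace C. This is not the code from patch 1: struct
    rgrp_sketch, skip_percent (standing in for the module parameter),
    the xorshift32 generator and the three-pass loop are all
    assumptions made for the example. Only the "skip in the first two
    passes" rule comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource group; not struct gfs2_rgrpd. */
struct rgrp_sketch {
	uint32_t free_blocks;
};

/* Small deterministic PRNG for the sketch; the seed must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Decide whether to pass over an otherwise suitable group.
 * skip_percent is a hypothetical tunable (0 disables skipping).
 * Skipping is only allowed in passes 0 and 1, so the last pass
 * always accepts a suitable group.
 */
static bool should_skip_rgrp(int pass, unsigned int skip_percent,
			     uint32_t *rng_state)
{
	assert(skip_percent <= 100);
	if (pass >= 2 || skip_percent == 0)
		return false;
	return xorshift32(rng_state) % 100 < skip_percent;
}

/*
 * Walk the groups starting at 'start', for up to three passes
 * (an assumption of this sketch). Returns the chosen index, or -1
 * if no group has 'need' free blocks.
 */
static int pick_rgrp(const struct rgrp_sketch *rgs, int count, int start,
		     uint32_t need, unsigned int skip_percent,
		     uint32_t *rng_state)
{
	for (int pass = 0; pass < 3; pass++) {
		for (int i = 0; i < count; i++) {
			int idx = (start + i) % count;

			if (rgs[idx].free_blocks < need)
				continue;
			if (should_skip_rgrp(pass, skip_percent, rng_state))
				continue;
			return idx;
		}
	}
	return -1;
}
```

    With skip_percent at 0 every caller picks the first suitable group
    from its starting point, which is the contended case. With a
    non-zero value, callers that start at the same group tend to
    spread over its neighbours, while the final pass still guarantees
    an allocation whenever any group has space.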

  * With two or more hosts growing files at high data rates the
    throughput drops to a small proportion of the maximum storage
    I/O. This is the scenario of several VMs all writing to the
    filesystem. Sometimes this test would run through cleanly at
    80-90% of storage wire speed but at other times the performance
    would drop on one or more hosts to a small number of KiB/s.

    This was diagnosed as the different hosts repeatedly bouncing
    resource group glocks between them as different hosts selected
    the same resource group (having exhausted the preferred groups).

    Patch 2 addresses this by -
      * adding a hold delay to the resource group glock if there are
        local waiters, following the pattern already in place for
        inodes. This should also provide more data for
        gfs2_rgrp_congested to work on.
      * remembering when we were last asked to demote the lock on a
        resource group.
      * in the first two passes in gfs2_inplace_reserve, avoiding
        resource groups where we have been asked to demote the glock
        within the last second.
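
    The last two points can be sketched in userspace C as follows.
    This is an illustration under stated assumptions, not the code
    from patch 2: struct rgrp_demote_sketch, its fields, the
    millisecond clock passed in by the caller and the three-pass loop
    are all hypothetical. Only the one second window and the "first
    two passes" rule come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "within the last second", expressed in milliseconds */
#define DEMOTE_AVOID_MS 1000u

/* Hypothetical stand-in for a resource group; not struct gfs2_rgrpd. */
struct rgrp_demote_sketch {
	uint32_t free_blocks;
	bool demote_seen;        /* has any demote request been recorded? */
	uint64_t last_demote_ms; /* time of the most recent demote request */
};

/* Record that another host asked us to give up this group's lock. */
static void note_demote_request(struct rgrp_demote_sketch *rg,
				uint64_t now_ms)
{
	rg->demote_seen = true;
	rg->last_demote_ms = now_ms;
}

/* True if a demote request arrived less than DEMOTE_AVOID_MS ago. */
static bool recently_demoted(const struct rgrp_demote_sketch *rg,
			     uint64_t now_ms)
{
	/* The sketch assumes a clock that never runs backwards. */
	assert(!rg->demote_seen || now_ms >= rg->last_demote_ms);
	return rg->demote_seen &&
	       now_ms - rg->last_demote_ms < DEMOTE_AVOID_MS;
}

/*
 * Passes 0 and 1 avoid recently demoted groups; pass 2 takes any
 * group with enough space. Returns the chosen index, or -1.
 */
static int pick_rgrp_avoiding_demoted(const struct rgrp_demote_sketch *rgs,
				      int count, uint32_t need,
				      uint64_t now_ms)
{
	for (int pass = 0; pass < 3; pass++) {
		for (int i = 0; i < count; i++) {
			if (rgs[i].free_blocks < need)
				continue;
			if (pass < 2 && recently_demoted(&rgs[i], now_ms))
				continue;
			return i;
		}
	}
	return -1;
}
```

    A group that another host has just asked for is treated as
    contended and passed over while alternatives exist. If every
    suitable group was recently demoted, the final pass still picks
    one so that the allocation does not fail.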

Mark Syms (1):
  GFS2: Avoid recently demoted rgrps.

Tim Smith (1):
  Add some randomisation to the GFS2 resource group allocator

 fs/gfs2/glock.c      |  7 +++++--
 fs/gfs2/incore.h     |  2 ++
 fs/gfs2/main.c       |  1 +
 fs/gfs2/rgrp.c       | 49 +++++++++++++++++++++++++++++++++++++++++++++----
 fs/gfs2/trace_gfs2.h | 12 +++++++++---
 5 files changed, 62 insertions(+), 9 deletions(-)

-- 
1.8.3.1

Thread overview: 18+ messages
2018-09-20 14:52 Mark Syms [this message]
2018-09-20 14:52 ` [Cluster-devel] [PATCH 1/2] Add some randomisation to the GFS2 resource group allocator Mark Syms
2018-09-20 14:52 ` [Cluster-devel] [PATCH 2/2] GFS2: Avoid recently demoted rgrps Mark Syms
2018-09-20 17:17 ` [Cluster-devel] [PATCH 0/2] GFS2: inplace_reserve performance improvements Bob Peterson
2018-09-20 17:47   ` Mark Syms
2018-09-20 18:16     ` Steven Whitehouse
2018-09-28 12:23     ` Bob Peterson
2018-09-28 12:36       ` Mark Syms
2018-09-28 12:50         ` Mark Syms
2018-09-28 13:18           ` Steven Whitehouse
2018-09-28 13:43             ` Tim Smith
2018-09-28 13:59               ` Bob Peterson
2018-09-28 14:11                 ` Mark Syms
2018-09-28 15:09                 ` Tim Smith
2018-09-28 15:09               ` Steven Whitehouse
2018-09-28 12:55         ` Bob Peterson
2018-09-28 13:56           ` Mark Syms
2018-10-02 13:50             ` Mark Syms
