From: Logan Gunthorpe <logang@deltatee.com>
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	Song Liu <song@kernel.org>
Cc: Shaohua Li <shli@kernel.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>,
	Stephen Bates <sbates@raithlin.com>,
	Martin Oliveira <Martin.Oliveira@eideticom.com>,
	David Sloan <David.Sloan@eideticom.com>,
	Logan Gunthorpe <logang@deltatee.com>
Subject: [PATCH v1 0/8] Improve Raid5 Lock Contention
Date: Thu,  7 Apr 2022 10:45:03 -0600	[thread overview]
Message-ID: <20220407164511.8472-1-logang@deltatee.com> (raw)

Hi,

I've been doing some work trying to improve the bulk write performance of
raid5 on large systems with fast NVMe drives. The bottleneck appears
largely to be lock contention on the hash_lock and device_lock. This
series improves the situation slightly by addressing a couple of
low-hanging opportunities to take these locks fewer times in the
request path.

Patch 5 adjusts how batching works by keeping a reference to the
previous stripe_head in raid5_make_request(). In most situations,
this removes the need to take the hash_lock in stripe_add_to_batch_list(),
which should reduce the number of times the lock is taken by a factor of
about 2.
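
To illustrate the idea, here is a minimal user-space sketch (not the
actual kernel code); struct stripe, lookup_locked() and get_stripe()
are hypothetical stand-ins for the stripe_head hash machinery:

  /*
   * Minimal sketch, not the kernel code itself: cache a reference
   * to the previously found stripe so consecutive lookups that hit
   * the same stripe skip the hash_lock entirely. All names here
   * are hypothetical stand-ins for the stripe_head machinery.
   */
  #include <pthread.h>

  #define NR_STRIPES 16

  struct stripe {
          unsigned long nr;       /* which stripe this head covers */
  };

  static struct stripe table[NR_STRIPES];
  static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;

  /* Stand-in for __find_stripe(); must be called under hash_lock. */
  static struct stripe *lookup_locked(unsigned long stripe_nr)
  {
          struct stripe *sh = &table[stripe_nr % NR_STRIPES];

          sh->nr = stripe_nr;     /* pretend we found/initialized it */
          return sh;
  }

  /*
   * A run of consecutive 4KB pages usually maps to the same stripe,
   * so the cached pointer lets the fast path avoid hash_lock
   * altogether.
   */
  static struct stripe *get_stripe(unsigned long stripe_nr,
                                   struct stripe **last)
  {
          struct stripe *sh;

          if (*last && (*last)->nr == stripe_nr)
                  return *last;

          pthread_mutex_lock(&hash_lock);
          sh = lookup_locked(stripe_nr);
          pthread_mutex_unlock(&hash_lock);

          *last = sh;
          return sh;
  }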

Patch 8 pivots the way raid5_make_request() works. Before the patch, the
code must find the stripe_head for every 4KB page in the request, so each
stripe_head must be found once for every data disk. The patch changes this
so that all the data disks can be added to a stripe_head at once and the
number of times the stripe_head must be found (and thus the number of
times the hash_lock is taken) should be reduced by a factor roughly equal
to the number of data disks.
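
And a rough sketch of the pivot, reusing get_stripe() from the sketch
above; the req structure and the helpers are again hypothetical, and
the real raid5_make_request() additionally handles batching, reshape
and retries:

  #define PAGE_SECTORS 8                  /* 4KB page, 512B sectors */

  struct req {
          unsigned long start, end;       /* sector range of the bio */
          unsigned long stripe_sectors;   /* data sectors per stripe */
          int data_disks;
          struct stripe *last;            /* cache for get_stripe() */
  };

  /* Stand-ins for the per-page/per-disk add_stripe_bio() work. */
  static void add_page(struct stripe *sh, unsigned long s) { (void)sh; (void)s; }
  static void add_disk(struct stripe *sh, int dd) { (void)sh; (void)dd; }

  /* Before: one stripe lookup per 4KB page, i.e. once per data disk. */
  static void make_request_by_page(struct req *r)
  {
          unsigned long s;

          for (s = r->start; s < r->end; s += PAGE_SECTORS)
                  add_page(get_stripe(s / r->stripe_sectors, &r->last), s);
  }

  /*
   * After: one lookup per stripe, with every data disk added in a
   * single pass, so hash_lock acquisitions drop by a factor of
   * roughly the number of data disks. (The real code only adds the
   * disks the bio actually covers.)
   */
  static void make_request_by_stripe(struct req *r)
  {
          unsigned long first = r->start / r->stripe_sectors;
          unsigned long final = (r->end - 1) / r->stripe_sectors;
          unsigned long st;

          for (st = first; st <= final; st++) {
                  struct stripe *sh = get_stripe(st, &r->last);
                  int dd;

                  for (dd = 0; dd < r->data_disks; dd++)
                          add_disk(sh, dd);
          }
  }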

The remaining patches are just cleanup and prep patches for those two
patches.

In apples-to-apples testing of this series on a small VM with 5 ram disks,
I saw a bandwidth increase of roughly 14%, and contention on the
hash_lock (as reported by lockstat) dropped by more than a factor of
5 (though the lock is still significantly contended).

Testing on larger systems with NVMe drives showed similar bandwidth
increases, ranging from 3% to 20% depending on the parameters. Oddly,
smaller arrays had larger gains, likely because they start from lower
bandwidths; I would have expected larger gains with larger arrays,
given that even fewer lock acquisitions should have been needed in
raid5_make_request().

Logan

--

Logan Gunthorpe (8):
  md/raid5: Refactor raid5_make_request loop
  md/raid5: Move stripe_add_to_batch_list() call out of add_stripe_bio()
  md/raid5: Move common stripe count increment code into __find_stripe()
  md/raid5: Make common label for schedule/retry in raid5_make_request()
  md/raid5: Keep a reference to last stripe_head for batch
  md/raid5: Refactor add_stripe_bio()
  md/raid5: Check all disks in a stripe_head for reshape progress
  md/raid5: Pivot raid5_make_request()

 drivers/md/raid5.c | 442 +++++++++++++++++++++++++++------------------
 drivers/md/raid5.h |   1 +
 2 files changed, 270 insertions(+), 173 deletions(-)


base-commit: 3123109284176b1532874591f7c81f3837bbdc17
--
2.30.2
