From: Tobias Oberstein <tobias.oberstein@gmail.com>
To: Stan Hoeppner <stan@hardwarefreak.org>, linux-raid@vger.kernel.org
Subject: Re: performance collapse: 9 mio IOPS to 1.5 mio with MD RAID0
Date: Thu, 26 Jan 2017 09:35:17 +0100
Message-ID: <686aaca3-e79e-ee7b-06e8-fbd6d2d196e1@gmail.com>
In-Reply-To: <1d766141-465b-34f2-dbd6-b7f71ecc344c@hardwarefreak.org>
On 26.01.2017 at 00:01, Stan Hoeppner wrote:
> On 01/25/2017 05:45 AM, Tobias Oberstein wrote:
>> Hi,
>>
>> I have a storage setup consisting of 8 NVMe drives (16 logical drives)
>> that I have verified (with fio) can do >9 million 4 kB random-read IOPS
>> when I run fio against the set of individual NVMes.
>>
>> However, when I create a MD (RAID-0) over the 16 NVMes and run the
>> same tests, performance collapses:
>>
>> ioengine=sync, individual NVMes: IOPS=9191k
>> ioengine=sync, MD (RAID-0) over NVMes: IOPS=1562k
>>
>> Using ioengine=psync, the performance collapse isn't as dramatic, but
>> still very significant:
>>
>> ioengine=psync, individual NVMes: IOPS=9395k
>> ioengine=psync, MD (RAID-0) over NVMes: IOPS=4117k
>>
>> --
>>
>> All detail results (including runs under Linux perf) and FIO control
>> files are here
>>
>> https://github.com/oberstet/scratchbox/tree/master/cruncher/sync-engines-perf
>>
>>
>
> You don't need 1024 jobs to fill the request queues.
Here is how the measured IOPS scale with I/O concurrency for the synchronous
engines:
https://github.com/oberstet/scratchbox/raw/master/cruncher/Performance%20Results%20-%20NVMe%20Scaling%20with%20IO%20Concurrency.pdf
Note: the peak numbers there are lower than what I posted above, because
those measurements were still done with 512-byte sectors (not with 4 kB as
above).
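In case someone wants to redo the comparison at 4 kB LBAs: switching a
namespace between 512 B and 4 kB sectors can be done with nvme-cli, roughly
like this (device name and LBA format index are assumptions; check the
id-ns output for your drive first, and note that a format wipes the
namespace):

  # list the supported LBA formats; look for the entry with 4096 byte data size
  nvme id-ns /dev/nvme0n1 -H

  # destructive: reformat the namespace to that LBA format (index 1 assumed here)
  nvme format /dev/nvme0n1 --lbaf=1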
> Just out of curiosity, what are the fio results when using fewer jobs and
> a greater queue depth, say one job per core, 88 total, with a queue depth
> of 32?
iodepth doesn't apply to the synchronous I/O engines (ioengine=sync/psync/..);
with those, each job has exactly one I/O in flight, so concurrency comes from
numjobs alone.
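To make that concrete, here is a stripped-down sketch of the two flavours
(not the actual control files from the repo; the device path, job names and
option values are placeholders taken from the numbers discussed in this
thread):

  [global]
  filename=/dev/md0
  direct=1
  rw=randread
  bs=4k
  runtime=60
  time_based=1
  group_reporting=1

  ; synchronous engine: one outstanding I/O per job,
  ; so total concurrency is simply numjobs (iodepth has no effect)
  [sync-randread]
  ioengine=psync
  numjobs=1024

  ; async engine: concurrency is numjobs * iodepth,
  ; e.g. one job per core with a deeper queue
  [libaio-randread]
  stonewall
  ioengine=libaio
  numjobs=88
  iodepth=32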
>
> osq_lock appears to be a per-CPU opportunistic spinlock. Might be of
> benefit to try even fewer jobs for fewer active cores.
With reduced concurrency I am no longer able to saturate the storage.
In essence, 1 Xeon core is needed per 120k IOPS with ioengine=sync, even on
the set of raw NVMe devices (no MD RAID).
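As an aside, the osq_lock time is easy to spot in a system-wide profile
taken while fio is running, e.g. along these lines (job file name is a
placeholder):

  perf record -a -g -- fio sync-randread.fio
  perf report --sort symbol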
For comparison: libaio achieves 600k IOPS per core, and SPDK claims
(Intel's number, I haven't measured it myself) 1.8 million IOPS per core.
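Back of the envelope with the numbers above: at 120k IOPS per core, the
~9.2M IOPS from the raw-device run keeps roughly 9191k / 120k ~= 77 of the
88 cores busy just issuing synchronous reads, whereas at libaio's 600k IOPS
per core the same load would need about 15 cores.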
Cheers,
/Tobias