From: Tobias Oberstein <tobias.oberstein@gmail.com>
To: Stan Hoeppner <stan@hardwarefreak.org>, linux-raid@vger.kernel.org
Subject: Re: performance collapse: 9 mio IOPS to 1.5 mio with MD RAID0
Date: Thu, 26 Jan 2017 09:35:17 +0100
Message-ID: <686aaca3-e79e-ee7b-06e8-fbd6d2d196e1@gmail.com>
In-Reply-To: <1d766141-465b-34f2-dbd6-b7f71ecc344c@hardwarefreak.org>
On 26.01.2017 at 00:01, Stan Hoeppner wrote:
> On 01/25/2017 05:45 AM, Tobias Oberstein wrote:
>> Hi,
>>
>> I have a storage setup consisting of 8 NVMe drives (16 logical drives)
>> that I have verified (with fio) can do >9 million 4 kB random-read IOPS
>> when I run fio against the set of individual NVMes.
>>
>> However, when I create a MD (RAID-0) over the 16 NVMes and run the
>> same tests, performance collapses:
>>
>> ioengine=sync, individual NVMes: IOPS=9191k
>> ioengine=sync, MD (RAID-0) over NVMes: IOPS=1562k
>>
>> Using ioengine=psync, the performance collapse isn't as dramatic, but
>> still very significant:
>>
>> ioengine=psync, individual NVMes: IOPS=9395k
>> ioengine=psync, MD (RAID-0) over NVMes: IOPS=4117k
>>
>> --
>>
>> All detail results (including runs under Linux perf) and FIO control
>> files are here
>>
>> https://github.com/oberstet/scratchbox/tree/master/cruncher/sync-engines-perf
>>
>>
>
> You don't need 1024 jobs to fill the request queues.
Here is how the measured IOPS scale with I/O concurrency for the synchronous
engines:
https://github.com/oberstet/scratchbox/raw/master/cruncher/Performance%20Results%20-%20NVMe%20Scaling%20with%20IO%20Concurrency.pdf
Note: the peak numbers there are lower than what I posted above, because
those measurements were still done with 512-byte sectors (not with 4 kB as
above).
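In case someone wants to redo the comparison at 4 kB LBAs: switching a
namespace between 512 B and 4 kB sectors can be done with nvme-cli, roughly
like this (device name and LBA format index are assumptions; check the
id-ns output for your drive first, and note that a format wipes the
namespace):

  # list the supported LBA formats; look for the entry with 4096 byte data size
  nvme id-ns /dev/nvme0n1 -H

  # destructive: reformat the namespace to that LBA format (index 1 assumed here)
  nvme format /dev/nvme0n1 --lbaf=1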
> Just out of curiosity, what are the fio results when using fewer jobs and
> a greater queue depth, say one job per core, 88 total, with a queue depth
> of 32?
iodepth doesn't apply to the synchronous I/O engines (ioengine=sync/psync/..);
with those, each job has exactly one I/O in flight, so concurrency comes from
numjobs alone.
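To make that concrete, here is a stripped-down sketch of the two flavours
(not the actual control files from the repo; the device path, job names and
option values are placeholders taken from the numbers discussed in this
thread):

  [global]
  filename=/dev/md0
  direct=1
  rw=randread
  bs=4k
  runtime=60
  time_based=1
  group_reporting=1

  ; synchronous engine: one outstanding I/O per job,
  ; so total concurrency is simply numjobs (iodepth has no effect)
  [sync-randread]
  ioengine=psync
  numjobs=1024

  ; async engine: concurrency is numjobs * iodepth,
  ; e.g. one job per core with a deeper queue
  [libaio-randread]
  stonewall
  ioengine=libaio
  numjobs=88
  iodepth=32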
>
> osq_lock appears to be a per-CPU opportunistic spinlock. Might be of
> benefit to try even fewer jobs for fewer active cores.
With reduced concurrency I am no longer able to saturate the storage.
In essence, 1 Xeon core is needed per 120k IOPS with ioengine=sync, even on
the set of raw NVMe devices (no MD RAID).
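As an aside, the osq_lock time is easy to spot in a system-wide profile
taken while fio is running, e.g. along these lines (job file name is a
placeholder):

  perf record -a -g -- fio sync-randread.fio
  perf report --sort symbol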
For comparison: libaio achieves 600k IOPS per core, and SPDK claims
(Intel's number, I haven't measured it myself) 1.8 million IOPS per core.
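Back of the envelope with the numbers above: at 120k IOPS per core, the
~9.2M IOPS from the raw-device run keeps roughly 9191k / 120k ~= 77 of the
88 cores busy just issuing synchronous reads, whereas at libaio's 600k IOPS
per core the same load would need about 15 cores.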
Cheers,
/Tobias