Re: [SPDK] Re: Re: SPDK fio benchmark support

From: Walker, Benjamin <benjamin.walker at intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] Re: Re: SPDK fio benchmark support
Date: Fri, 17 Jun 2016 16:59:43 +0000
Message-ID: <1466182782.26925.73.camel@intel.com>
In-Reply-To: PS1PR0401MB14990B6A90163B7D66E0AE14AA570@PS1PR0401MB1499.apcprd04.prod.outlook.com

On Fri, 2016-06-17 at 00:57 +0000, Li Tianxiang wrote:
> Thanks.
> Actually, my question is: could one thread and one qpair achieve the
> maximum performance?

Yes, on the devices we've tested with (Intel). There are people from basically all of the major SSD
vendors on this mailing list too and as far as I'm aware, they can get the full performance from
their devices with SPDK using only one thread and one queue pair. It ultimately depends on the
device, though. The one queue pair per thread model is just how we chose to do things in our perf
tool as well - a user application can choose any mapping of queues to devices to threads that they
want.
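
As an illustration of the one-queue-pair-per-thread model described above, here is a minimal sketch in plain C. It does not use the SPDK API: `toy_qpair`, `toy_worker`, `toy_worker_fn`, and `toy_run_workers` are hypothetical names invented for this example, and "submitting an I/O" is reduced to incrementing a counter. The only point it makes is that when each thread owns its queue exclusively, the submission path needs no lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define TOY_MAX_THREADS 8

/* Hypothetical stand-in for a hardware queue pair; not an SPDK type. */
struct toy_qpair {
    uint64_t submitted;
};

/* Per-thread context: each worker owns exactly one queue pair. */
struct toy_worker {
    pthread_t thread;
    struct toy_qpair qpair;
    uint64_t ios_to_submit;
};

/* Thread entry point: submits on its own queue only. */
static void *toy_worker_fn(void *arg)
{
    struct toy_worker *w = arg;

    for (uint64_t i = 0; i < w->ios_to_submit; i++) {
        w->qpair.submitted++;   /* no lock: no other thread touches this queue */
    }
    return NULL;
}

/* Starts nthreads workers and returns the total number of submissions. */
uint64_t toy_run_workers(int nthreads, uint64_t ios_per_thread)
{
    struct toy_worker workers[TOY_MAX_THREADS];
    uint64_t total = 0;

    assert(nthreads > 0 && nthreads <= TOY_MAX_THREADS);

    for (int i = 0; i < nthreads; i++) {
        workers[i].qpair.submitted = 0;
        workers[i].ios_to_submit = ios_per_thread;
        int rc = pthread_create(&workers[i].thread, NULL, toy_worker_fn, &workers[i]);
        assert(rc == 0);
        (void)rc;
    }

    /* The per-queue counters are only read after the owning thread has exited. */
    for (int i = 0; i < nthreads; i++) {
        pthread_join(workers[i].thread, NULL);
        total += workers[i].qpair.submitted;
    }
    return total;
}
```

Calling `toy_run_workers(4, 100000)` returns 400000, with no synchronization on the submission path. Other mappings (several queues per thread, or one queue shared between threads behind a lock) are equally possible; this sketch just mirrors the choice made in the perf tool.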

> The perf application uses one thread per namespace, and many NVMe SSDs
> do not support multiple namespaces yet.

I suspect on most (or all) devices, namespaces are purely logical abstractions and have nothing to
do with physical NAND dies or channels. Therefore, one namespace or many namespaces won't have any
impact on performance.

> 
> It makes sense that perf achieves performance better than or comparable to
> the kernel driver, because the submission queues (or qpairs) in NVMe are
> still software queues. Vendors still hide the real parallelism, as before.

SPDK gets better performance than the kernel for a number of reasons (polling, zero copy, no
syscalls, no locking, streamlined I/O path), but I'm not sure what you are saying here. The queue
pair we expose from our NVMe driver really is the hardware queue from the NVMe controller - it is
not a software abstraction that we're presenting. The receiving end of the queue we are exposing is
really being processed by the NVMe controller ASIC on the other side. You can even elect to allocate
the queue memory from the NVMe controller's memory, if the controller supports that feature. The
queues are not physical NAND channels, if that's what you mean, but NAND doesn't support block-
granularity rewriting, so there must be some flash translation layer to translate logical block
writes to open physical locations on the NAND.
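
To make the translation step concrete, here is a deliberately tiny flash translation layer sketched in C. It is an assumption-laden toy, not a description of any vendor's firmware: the names (`toy_ftl`, `toy_ftl_init`, `toy_ftl_write`) and the sizes are invented, pages are handed out strictly in order, and there is no erase or garbage collection. It shows only the one property discussed above: rewriting a logical block never overwrites in place, it lands on the next open physical page and the old page becomes stale.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_LBAS     4u          /* logical blocks exposed to the host */
#define TOY_PAGES    8u          /* physical NAND pages available */
#define TOY_UNMAPPED UINT32_MAX  /* marker: no physical page assigned */

struct toy_ftl {
    uint32_t l2p[TOY_LBAS];      /* logical block -> physical page */
    uint8_t  stale[TOY_PAGES];   /* 1 if the page holds superseded data */
    uint32_t next_free;          /* next never-programmed page */
};

void toy_ftl_init(struct toy_ftl *ftl)
{
    for (uint32_t i = 0; i < TOY_LBAS; i++) {
        ftl->l2p[i] = TOY_UNMAPPED;
    }
    memset(ftl->stale, 0, sizeof(ftl->stale));
    ftl->next_free = 0;
}

/*
 * Writes a logical block. Returns the physical page chosen, or
 * TOY_UNMAPPED if no open page is left (a real FTL would have to
 * garbage-collect and erase at that point).
 */
uint32_t toy_ftl_write(struct toy_ftl *ftl, uint32_t lba)
{
    assert(lba < TOY_LBAS);

    if (ftl->next_free == TOY_PAGES) {
        return TOY_UNMAPPED;
    }

    uint32_t old = ftl->l2p[lba];
    if (old != TOY_UNMAPPED) {
        ftl->stale[old] = 1;     /* old copy is superseded, not overwritten */
    }

    ftl->l2p[lba] = ftl->next_free;
    return ftl->next_free++;
}
```

Writing logical block 2 twice on a fresh `toy_ftl` places the data on physical page 0 and then on page 1, and marks page 0 stale. The host sees the same logical block address both times.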

I also disagree that vendors are hiding any parallelism. If you think of an SSD as having two
distinct processing steps - a flash translation layer and a backend NAND layer - and you assume that
the available parallelism is bounded by that backend NAND layer, then it doesn't really matter if
the flash translation layer is parallel or not, as long as it can fully saturate the backend with
tasks. I would posit that you can't even experimentally tell the difference between an SSD with a
parallel flash translation layer and one with a serialized one, as long as they are both able to
keep all of the NAND channels full.
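
The two-stage argument can be written down as a one-line bound. The sketch below is a simplified model under stated assumptions, not a measurement: `toy_ssd` and its fields are hypothetical, each stage is summarized by a single sustained rate, and the observed rate is taken to be the smaller of the two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-stage SSD model: an FTL feeding a NAND backend. */
struct toy_ssd {
    uint64_t ftl_iops;          /* rate at which the FTL can issue NAND work */
    uint32_t nand_channels;     /* independent backend channels */
    uint64_t iops_per_channel;  /* sustained rate of one channel */
};

/* Backend capacity: all channels busy at once. */
uint64_t toy_backend_iops(const struct toy_ssd *ssd)
{
    assert(ssd->nand_channels > 0);
    return (uint64_t)ssd->nand_channels * ssd->iops_per_channel;
}

/* Observed rate: the slower of the two stages. */
uint64_t toy_observed_iops(const struct toy_ssd *ssd)
{
    uint64_t backend = toy_backend_iops(ssd);

    return ssd->ftl_iops < backend ? ssd->ftl_iops : backend;
}
```

With made-up numbers, take 8 channels at 56250 IOPS each, for a backend capacity of 450000 IOPS. A serialized FTL that can issue 500000 IOPS and a parallel FTL that can issue 4000000 IOPS both yield an observed 450000 IOPS in this model, so from the host side they look identical. Only an FTL slower than the backend (say 300000 IOPS) would show up in the result.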

> 
> If there were results from a general benchmark such as fio, they would be more
> convincing.
> Also, the fio plug-in mentioned earlier needs multi-job support.

Is there some inherent reason that fio is a more convincing test than our perf tool? I'm typically
convinced by benchmarks that are open and easy to understand. Our perf tool is ~1100 lines of code
and fio contains ~71k. I'm a big fan of fio as a benchmarking tool and I recommend its use, but
isn't it much easier to audit exactly what our perf tool is doing to build trust?

At some point we'll definitely add multi-core support to our fio plugin though, and if someone
volunteers to do it sooner we'd happily accept the patch.

> 
> -----
> It would be interesting if NVMe command sets could manage the mapping
> between qpairs and the real parallelism.

If by "real parallelism" you mean actual NAND channels, it's impossible to expose those directly and
still present a block interface. The rewrite rules are totally different.

> -----
> 
> 
> From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Walker, Benjamin <benjamin.walker(a)intel.com>
> Sent: June 16, 2016 15:40
> To: spdk(a)lists.01.org
> Subject: Re: [SPDK] Re: SPDK fio benchmark support
>  
> The perf tool allocates 1 queue pair per thread (worker_fn is the thread entry point in the code)
> and there is 1 thread per core. Our findings on all of the devices we've tested have been that
> using multiple queue pairs does not increase the total performance of the SSD. The full quoted SSD
> hardware performance can be achieved with a single queue pair. This finding, however, is
> completely device specific. Different SSD controllers may have different behavior.
> 
> Using multiple queue pairs is a big advantage for the host software (i.e. SPDK) though. Using 1
> queue pair per thread allows the host software to submit I/O in parallel with no locks.
> 
> On Thu, 2016-06-16 at 14:24 +0000, Li Tianxiang wrote:
> > Hi,
> > 
> > I noticed that the perf app allocates only one qpair,
> > 
> > using:
> > "ns_ctx->u.nvme.qpair = spdk_nvme_ctrlr_alloc_io_qpair(ns_ctx->entry->u.nvme.ctrlr, 0);"
> > 
> > Do more qpairs help performance?
> > 
> > Also, the P3700 has about 30 submission queues by default.
> > 
> > 
> > ________________________________________
> > From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Walker, Benjamin <benjamin.walker(a)intel.com>
> > Sent: June 15, 2016 23:29
> > To: Storage Performance Development Kit; sbradshaw(a)micron.com; Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> > 
> > The perf application should scale linearly with CPU cores - so if you get 1.6 million IOPS with
> > 2 SSDs on 1 core, you should get 3.2 million IOPS with 4 SSDs on 2 cores. Since you aren't,
> > there must be something else impacting the performance. Can you provide the following
> > information:
> > 
> > 1) The Xeon E5 SKU you are using
> > 2) The number of CPU sockets and which socket the devices are attached to (all 4 need to be
> > attached to the same socket that you are running perf on).
> > 3) The command line parameters you are passing to the perf tool.
> > 
> > Thanks,
> > Ben
> > 
> > -----Original Message-----
> > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Raj (Rajinikanth) Pandurangan
> > Sent: Wednesday, June 15, 2016 3:16 PM
> > To: Storage Performance Development Kit <spdk(a)lists.01.org>; sbradshaw(a)micron.com; Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> > 
> > Hello Ben,
> > 
> > Thanks for the details.
> > 
> > I have an Intel Xeon E5 system and am able to saturate 2 Samsung NVMe SSDs (~1.6 million IOPS) on 1
> > core. The IOPS are the same even when I had 4 SSDs on 1 core. But I expected the IOPS to go up when I
> > introduced extra cores with 4 SSDs, and they did not.
> > 
> > Thanks,
> > -Raj P
> > 
> > -----Original Message-----
> > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker, Benjamin
> > Sent: Wednesday, June 15, 2016 1:33 PM
> > To: Raj (Rajinikanth) Pandurangan; sbradshaw(a)micron.com; spdk(a)lists.01.org; Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> > 
> > The fio plugin for SPDK is new and was only tested for a single job. We'd welcome patches to
> > make it span more cores when given more jobs if someone is interested in doing the work here. At
> > some point, I'm sure we'll get to it ourselves, but it's lower priority than the other work
> > we're doing right now (NVMf target). The perf example we provide definitely does scale to many
> > cores though, and it can optionally use libaio as its backend, so it is a great alternative to
> > using fio that works today with no extra effort.
> > 
> > As far as expected performance scaling across cores with our perf tool, are you seeing the
> > maximum quoted IOPS for all of the device(s) on the system with only a single core? If you are,
> > adding more cores won't improve performance. If you are not getting the performance you expect
> > from the hardware I'd love to know the details of the devices and the platform.
> > 
> > To provide a quick example from my own system (and none of the following should be treated as
> > official benchmarking numbers for any reason), I have a Haswell Xeon E7 with 8 Intel P3700 NVMe
> > SSDs attached. Running the perf tool on my system yields about 3 million 4k read IOPS on 1 core,
> > which is the equivalent of 6 or 7 of my SSDs. Moving to two cores gives me the full hardware
> > spec IOPS for 8 SSDs. If I only test against 4 SSDs I get the same performance (the maximum that
> > the SSDs can provide) no matter how many cores I give the tool.
> > 
> > Thanks,
> > Ben
> > 
> > -----Original Message-----
> > From: Raj (Rajinikanth) Pandurangan [mailto:rajini.pandu(a)samsung.com]
> > Sent: Wednesday, June 15, 2016 11:52 AM
> > To: Walker, Benjamin <benjamin.walker(a)intel.com>; sbradshaw(a)micron.com; spdk(a)lists.01.org; Robert.Cleveland(a)skhms.com
> > Cc: Raj (Rajinikanth) Pandurangan <rajini.pandu(a)samsung.com>
> > Subject: RE: [SPDK] SPDK fio benchmark support
> > 
> > Hello All,
> > 
> > 1. I tried to run SPDK fio with multiple NVMe drives. But for some reason, I don't see more
> > than 1 core being used, even when I increase numjobs.
> > 
> > Here is my config file.
> > =============================================================
> > 
> > [global]
> > ioengine=/spdk-master_temp_experiment/examples/nvme/fio_plugin/fio_plugin
> > group_reporting=1
> > direct=1
> > verify=0
> > time_based=1
> > ramp_time=0
> > runtime=10
> > 
> > 
> > [test1]
> > iodepth=128
> > rw=randread
> > bs=4k
> > #filename=0000.09.00.0/1:0000.07.00.0/1:0000.0a.00.0/1
> > filename=0000.09.00.0/1
> > numjobs=2
> > 
> > [test2]
> > iodepth=128
> > rw=randread
> > bs=4k
> > numjobs=1
> > filename=0000.08.00.0/1
> > ================================================================
> > 
> > Has anyone succeeded using multiple cores?
> > 
> > 2. I have also tried 'perf', and I noticed that even though multiple cores were being used,
> > the performance numbers didn't scale. The IOPS were similar to what I saw with a single core.
> > 
> > Any insights would really be appreciated.
> > 
> > Thanks,
> > -Raj P
> > 
> > -----Original Message-----
> > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker, Benjamin
> > Sent: Friday, February 26, 2016 2:30 PM
> > To: sbradshaw(a)micron.com; spdk(a)ml01.01.org; Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> > 
> > As a follow up to this, I pushed some improvements to SPDK's setup scripts so that we now more
> > easily support vfio. That makes it a lot easier to switch back and forth between the unvme
> > driver and the SPDK driver. We'd happily accept more patches to improve vfio if anyone has
> > suggestions.
> > 
> > Thanks all.
> > 
> > On Thu, 2016-02-18 at 18:59 +0000, Sam Bradshaw (sbradshaw) wrote:
> > Hi Robert,
> > 
> > We built a fio ioengine plugin for benchmarking SPDK as well as our own userspace NVMe
> > driver.  Source and documentation are available here:  https://github.com/MicronSSD/unvme
> > 
> > (SPDK fio plugin is ioengine/spdk_fio.c)
> > 
> > If you have any questions on how to use it or interpret benchmark results, feel free to ask.
> > 
> > -Sam
> > 
> > From: SPDK [mailto:spdk-bounces(a)ml01.01.org] On Behalf Of Robert Cleveland
> > Sent: Thursday, February 18, 2016 10:24 AM
> > To: spdk(a)lists.01.org
> > Subject: [SPDK] SPDK fio benchmark support
> > 
> > Hello all,
> > 
> > Has anyone done any work to make it easy to benchmark the polling mode driver with something like
> > FIO?
> > 
> > Thanks,
> > Robert Cleveland
> > 
> > _______________________________________________
> > SPDK mailing list
> > SPDK(a)lists.01.org
> > https://lists.01.org/mailman/listinfo/spdk