From mboxrd@z Thu Jan  1 00:00:00 1970
Content-Type: multipart/mixed; boundary="===============4930566950001746096=="
MIME-Version: 1.0
From: Walker, Benjamin <benjamin.walker at intel.com>
Subject: Re: [SPDK] Re: Re: SPDK fio benchmark support
Date: Fri, 17 Jun 2016 16:59:43 +0000
Message-ID: <1466182782.26925.73.camel@intel.com>
In-Reply-To: PS1PR0401MB14990B6A90163B7D66E0AE14AA570@PS1PR0401MB1499.apcprd04.prod.outlook.com
List-ID: <spdk@lists.01.org>
To: spdk@lists.01.org

--===============4930566950001746096==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

On Fri, 2016-06-17 at 00:57 +0000, Li Tianxiang wrote:
> Thx,
> Actually, my question is: could one thread and one qpair achieve the
> max performance?

Yes, on the devices we've tested with (Intel). There are people from
basically all of the major SSD vendors on this mailing list too and, as
far as I'm aware, they can get the full performance from their devices
with SPDK using only one thread and one queue pair. It ultimately depends
on the device, though. The one-queue-pair-per-thread model is just how we
chose to do things in our perf tool; a user application can choose any
mapping of queues to devices to threads that it wants.
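
To make the "any mapping" point concrete, here is a minimal sketch in
plain C. It is not SPDK code: struct fake_qpair, struct qpair_table,
qpair_table_init() and fake_submit() are hypothetical names made up for
illustration, and the size limits are arbitrary. It shows one possible
mapping, one queue pair per (thread, device), where no queue is ever
touched by two threads, so the submit path needs no lock. In a real
application each slot would instead hold a queue pair returned by
spdk_nvme_ctrlr_alloc_io_qpair(), the call quoted further down in this
thread.

```c
#include <assert.h>
#include <string.h>

/* Arbitrary illustration limits; not SPDK constants. */
#define MAX_THREADS 8
#define MAX_DEVICES 8

/* Hypothetical stand-in for a hardware queue pair; not an SPDK type. */
struct fake_qpair {
    int owner;          /* the only thread allowed to use this queue */
    int device;         /* device the queue belongs to */
    unsigned submitted; /* number of I/Os queued so far */
};

struct qpair_table {
    int num_threads;
    int num_devices;
    struct fake_qpair qp[MAX_THREADS][MAX_DEVICES];
};

/* Give every (thread, device) pair its own queue. Returns 0 or -1. */
int qpair_table_init(struct qpair_table *t, int num_threads, int num_devices)
{
    if (num_threads < 1 || num_threads > MAX_THREADS ||
        num_devices < 1 || num_devices > MAX_DEVICES) {
        return -1;
    }
    memset(t, 0, sizeof(*t));
    t->num_threads = num_threads;
    t->num_devices = num_devices;
    for (int th = 0; th < num_threads; th++) {
        for (int dev = 0; dev < num_devices; dev++) {
            t->qp[th][dev].owner = th;
            t->qp[th][dev].device = dev;
        }
    }
    return 0;
}

/* Called by thread `thread` only; touches only that thread's queue. */
int fake_submit(struct qpair_table *t, int thread, int device)
{
    if (thread < 0 || thread >= t->num_threads ||
        device < 0 || device >= t->num_devices) {
        return -1;
    }
    struct fake_qpair *qp = &t->qp[thread][device];
    assert(qp->owner == thread && qp->device == device);
    qp->submitted++;
    return 0;
}
```

Because each thread indexes only its own row of the table, two threads
never write to the same fake_qpair. Other mappings (for example, one
queue shared by several threads behind a lock) are equally possible; this
is just the lock-free one.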

> The perf application uses one thread per namespace, and many NVMe SSDs
> do not support multiple namespaces yet.

I suspect that on most (or all) devices, namespaces are purely logical
abstractions and have nothing to do with physical NAND dies or channels.
Therefore, using one namespace or many namespaces won't have any impact
on performance.

>
> It makes sense that perf gets performance better than or comparable to
> the kernel driver, because the submission queues (or qpairs) in NVMe are
> still software queues; vendors still hide the real parallelism as before.

SPDK gets better performance than the kernel for a number of reasons
(polling, zero copy, no syscalls, no locking, streamlined I/O path), but
I'm not sure what you are saying here. The queue pair we expose from our
NVMe driver really is the hardware queue from the NVMe controller; it is
not a software abstraction that we're presenting. The receiving end of
the queue we are exposing is really being processed by the NVMe
controller ASIC on the other side. You can even elect to allocate the
queue memory from the NVMe controller's memory, if the controller
supports that feature. The queues are not physical NAND channels, if
that's what you mean, but NAND doesn't support block-granularity
rewriting, so there must be some flash translation layer to translate
logical block writes to open physical locations on the NAND.

I also disagree that vendors are hiding any parallelism. If you think of
an SSD as having two distinct processing steps (a flash translation layer
and a backend NAND layer) and you assume that the available parallelism
is bounded by that backend NAND layer, then it doesn't really matter
whether the flash translation layer is parallel or not, as long as it can
fully saturate the backend with tasks. I would posit that you can't even
experimentally tell the difference between an SSD with a parallel flash
translation layer and one with a serialized one, as long as they are both
able to keep all of the NAND channels full.
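
The two-stage argument can be written down as a toy model. This is only
a sketch under the assumption stated above (throughput is bounded by the
slower of the two stages); ssd_observed_iops() and its parameters are
hypothetical names, and any numbers fed to it are made up rather than
measured.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Two-stage model of an SSD: a flash translation layer (FTL) feeding a
 * NAND backend. All rates are hypothetical inputs, not measurements.
 */
uint64_t ssd_observed_iops(uint64_t ftl_iops, uint64_t channels,
                           uint64_t iops_per_channel)
{
    /* Guard the multiplication below against overflow. */
    assert(channels == 0 || iops_per_channel <= UINT64_MAX / channels);
    uint64_t backend_iops = channels * iops_per_channel;

    /* The slower stage bounds what the host can observe. */
    return ftl_iops < backend_iops ? ftl_iops : backend_iops;
}
```

With 16 channels at 30,000 IOPS each (invented figures), an FTL able to
sustain 600,000 IOPS and one able to sustain 2,000,000 IOPS both come out
at 480,000 from the host's point of view. In this model the two designs
only become distinguishable once the FTL rate drops below the backend
rate.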

>
> If there are any general benchmarks such as fio, the result could be more
> convincing.
> And the mentioned fio plug-in needs to add multi-job support.

Is there some inherent reason that fio is a more convincing test than our
perf tool? I'm typically convinced by benchmarks that are open and easy
to understand. Our perf tool is ~1100 lines of code and fio contains
~71k. I'm a big fan of fio as a benchmarking tool and I recommend its
use, but isn't it much easier to audit exactly what our perf tool is
doing to build trust?

At some point we'll definitely add multi-core support to our fio plugin,
though, and if someone volunteers to do it sooner we'd happily accept the
patch.

>
> -----
> It could be interesting if the NVMe command sets could manage the
> mapping between qpairs and real parallelism.

If by "real parallelism" you mean actual NAND channels, it's impossible
to expose those directly and still present a block interface. The rewrite
rules are totally different.

> -----
>
>
> From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Walker, Benjamin
> <benjamin.walker(a)intel.com>
> Sent: June 16, 2016 15:40
> To: spdk(a)lists.01.org
> Subject: Re: [SPDK] Re: SPDK fio benchmark support
>
> The perf tool allocates 1 queue pair per thread (worker_fn is the thread
> entry point in the code) and there is 1 thread per core. Our finding on
> all of the devices we've tested has been that using multiple queue pairs
> does not increase the total performance of the SSD. The full quoted SSD
> hardware performance can be achieved with a single queue pair. This
> finding, however, is completely device specific. Different SSD
> controllers may have different behavior.
>
> Using multiple queue pairs is a big advantage for the host software (i.e.
> SPDK) though. Using 1 queue pair per thread allows the host software to
> submit I/O in parallel with no locks.
>
> On Thu, 2016-06-16 at 14:24 +0000, Li Tianxiang wrote:
> > Hi,
> >
> > I noticed that the perf app only allocates one qpair.
> >
> > using:
> > "ns_ctx->u.nvme.qpair = spdk_nvme_ctrlr_alloc_io_qpair(ns_ctx->entry->u.nvme.ctrlr, 0);"
> >
> > Do more qpairs help performance?
> >
> > And the P3700 has about 30 submission queues by default.
> >
> >
> > ________________________________________
> > From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Walker, Benjamin
> > <benjamin.walker(a)intel.com>
> > Sent: June 15, 2016 23:29
> > To: Storage Performance Development Kit; sbradshaw(a)micron.com;
> > Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> >
> > The perf application should scale linearly with CPU cores, so if you
> > get 1.6 million IOPS with 2 SSDs on 1 core, you should get 3.2 million
> > IOPS with 4 SSDs on 2 cores. Since you aren't, there must be something
> > else impacting the performance. Can you provide the following
> > information:
> >
> > 1) The Xeon E5 SKU you are using
> > 2) The number of CPU sockets and which socket the devices are attached
> > to (all 4 need to be attached to the same socket that you are running
> > perf on).
> > 3) The command line parameters you are passing to the perf tool.
> >
> > Thanks,
> > Ben
> >
> > -----Original Message-----
> > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Raj
> > (Rajinikanth) Pandurangan
> > Sent: Wednesday, June 15, 2016 3:16 PM
> > To: Storage Performance Development Kit <spdk(a)lists.01.org>;
> > sbradshaw(a)micron.com; Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> >
> > Hello Ben,
> >
> > Thanks for the details.
> >
> > I have an Intel Xeon E5 system and am able to saturate 2 Samsung NVMe
> > SSDs (~1.6 million IOPS) on 1 core. The IOPS are the same even when I
> > have 4 SSDs on 1 core. But I expected the IOPS to go up when I
> > introduced extra cores with 4 SSDs, and they did not.
> >
> > Thanks,
> > -Raj P
> >
> > -----Original Message-----
> > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker,
> > Benjamin
> > Sent: Wednesday, June 15, 2016 1:33 PM
> > To: Raj (Rajinikanth) Pandurangan; sbradshaw(a)micron.com;
> > spdk(a)lists.01.org; Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> >
> > The fio plugin for SPDK is new and was only tested for a single job.
> > We'd welcome patches to make it span more cores when given more jobs
> > if someone is interested in doing the work here. At some point, I'm
> > sure we'll get to it ourselves, but it's lower priority than the other
> > work we're doing right now (NVMf target). The perf example we provide
> > definitely does scale to many cores though, and it can optionally use
> > libaio as its backend, so it is a great alternative to using fio that
> > works today with no extra effort.
> >
> > As far as expected performance scaling across cores with our perf tool,
> > are you seeing the maximum quoted IOPS for all of the device(s) on the
> > system with only a single core? If you are, adding more cores won't
> > improve performance. If you are not getting the performance you expect
> > from the hardware I'd love to know the details of the devices and the
> > platform.
> >
> > To provide a quick example from my own system (and none of the
> > following should be treated as official benchmarking numbers for any
> > reason), I have a Haswell Xeon E7 with 8 Intel P3700 NVMe SSDs
> > attached. Running the perf tool on my system yields about 3 million 4k
> > read IOPS on 1 core, which is the equivalent of 6 or 7 of my SSDs.
> > Moving to two cores gives me the full hardware spec IOPS for 8 SSDs.
> > If I only test against 4 SSDs I get the same performance (the maximum
> > that the SSDs can provide) no matter how many cores I give the tool.
> >
> > Thanks,
> > Ben
> >
> > -----Original Message-----
> > From: Raj (Rajinikanth) Pandurangan [mailto:rajini.pandu(a)samsung.com]
> > Sent: Wednesday, June 15, 2016 11:52 AM
> > To: Walker, Benjamin <benjamin.walker(a)intel.com>;
> > sbradshaw(a)micron.com; spdk(a)lists.01.org; Robert.Cleveland(a)skhms.com
> > Cc: Raj (Rajinikanth) Pandurangan <rajini.pandu(a)samsung.com>
> > Subject: RE: [SPDK] SPDK fio benchmark support
> >
> > Hello All,
> >
> > 1. I tried to run SPDK fio with multiple NVMe drives. But for some
> > reason, I don't see more than 1 core being used even when I increase
> > numjobs.
> >
> > Here is my config file.
> > =============================================================
> >
> > [global]
> > ioengine=/spdk-master_temp_experiment/examples/nvme/fio_plugin/fio_plugin
> > group_reporting=1
> > direct=1
> > verify=0
> > time_based=1
> > ramp_time=0
> > runtime=10
> >
> >
> > [test1]
> > iodepth=128
> > rw=randread
> > bs=4k
> > #filename=0000.09.00.0/1:0000.07.00.0/1:0000.0a.00.0/1
> > filename=0000.09.00.0/1
> > numjobs=2
> >
> > [test2]
> > iodepth=128
> > rw=randread
> > bs=4k
> > numjobs=1
> > filename=0000.08.00.0/1
> > ================================================================
> >
> > Has anyone succeeded using multiple cores?
> >
> > 2. I have also tried with 'perf' and I noticed that even though multiple
> > cores were being used, the performance numbers didn't scale. The IOPS
> > seemed similar to what I got with a single core.
> >
> > Any insights would really be appreciated.
> >
> > Thanks,
> > -Raj P
> >
> > -----Original Message-----
> > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Walker,
> > Benjamin
> > Sent: Friday, February 26, 2016 2:30 PM
> > To: sbradshaw(a)micron.com; spdk(a)ml01.01.org;
> > Robert.Cleveland(a)skhms.com
> > Subject: Re: [SPDK] SPDK fio benchmark support
> >
> > As a follow up to this, I pushed some improvements to SPDK's setup
> > scripts so that we now more easily support vfio. That makes it a lot
> > easier to switch back and forth between the unvme driver and the SPDK
> > driver. We'd happily accept more patches to improve vfio if anyone has
> > suggestions.
> >
> > Thanks all.
> >
> > On Thu, 2016-02-18 at 18:59 +0000, Sam Bradshaw (sbradshaw) wrote:
> > Hi Robert,
> >
> > We built a fio ioengine plugin for benchmarking SPDK as well as our own
> > userspace NVMe driver. Source and documentation are available here:
> > https://github.com/MicronSSD/unvme
> >
> > (SPDK fio plugin is ioengine/spdk_fio.c)
> >
> > If you have any questions on how to use it or interpret benchmark
> > results, feel free to ask.
> >
> > -Sam
> >
> > From: SPDK [mailto:spdk-bounces(a)ml01.01.org] On Behalf Of Robert
> > Cleveland
> > Sent: Thursday, February 18, 2016 10:24 AM
> > To: spdk(a)lists.01.org
> > Subject: [SPDK] SPDK fio benchmark support
> >
> > Hello all,
> >
> > Has anyone done any work to make it easy to benchmark the polling mode
> > driver with something like FIO?
> >
> > Thanks,
> > Robert Cleveland
> > The information contained in this e-mail is considered confidential of
> > SK hynix memory solutions Inc. and intended only for the persons
> > addressed or copied in this e-mail. Any unauthorized use, dissemination
> > of the information, or copying of this message is strictly prohibited.
> > If you are not the intended recipient, please contact the sender
> > immediately and permanently delete the original and any copies of this
> > email.
> >
> > _______________________________________________
> > SPDK mailing list
> > SPDK(a)lists.01.org<mailto:SPDK(a)lists.01.org>
> > https://lists.01.org/mailman/listinfo/spdk
--===============4930566950001746096==--