From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============8931437452531106384==" MIME-Version: 1.0 From: Walker, Benjamin Subject: Re: [SPDK] Callback passed to spdk_nvme_ns_cmd_read not being called sometimes Date: Wed, 06 Jul 2016 20:50:46 +0000 Message-ID: <1467838245.5999.214.camel@intel.com> In-Reply-To: 1022B12D-0679-43E6-A193-95EA33D222BE@xeograph.com List-ID: To: spdk@lists.01.org --===============8931437452531106384== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable On Wed, 2016-07-06 at 20:33 +0000, Will Del Genio wrote: > Andrey, > =C2=A0 > I was able to step into the spdk_nvme_qpair_process_completions() functio= n with gdb and found the > reason is isn=E2=80=99t returning any completions is because this check i= s failing at line 469: if (cpl- > >status.p !=3D qpair->phase) > =C2=A0 > Relevant gdb info here: > Thread 4 "xg:nvmeIo:9" hit Breakpoint 5, spdk_nvme_qpair_process_completi= ons > (qpair=3D0x7ffff0a5baa0, max_completions=3D0) at nvme_qpair.c:463 > 463=C2=A0=C2=A0 in nvme_qpair.c > (gdb) p qpair->cpl[qpair->cq_head] > $11 =3D {cdw0 =3D 0, rsvd1 =3D 0, sqhd =3D 51, sqid =3D 1, cid =3D 27, st= atus =3D {p =3D 0, sc =3D 0, sct =3D 0, rsvd2 > =3D 0, m =3D 0, dnr =3D 0}} > (gdb) p qpair->phase > $12 =3D 1 '\001' > =C2=A0 > What does this mean?=C2=A0 Does this information help at all? The NVMe hardware queue pairs consist of two arrays - one of commands and t= he other of responses - and a set of head and tail indices. The arrays are circular, so you can loo= p back around to the beginning. Each entry in the array contains a phase bit, which is either 1 = or 0. On the first pass through the array, new entries in the queue are marked by setting their pha= se to 1. On the next pass through the array, new elements are marked by setting their phase bit to 0,= etc. The current iteration's expected phase value is stored in qpair->phase. So the code you= are looking at on lines 467-470 is basically saying: 1) Grab the completion entry at the head of the queue 2) Check its phase bit. If it hasn't toggled, there is no new completion, s= o exit All that means is that there are no completions outstanding for any command= s according to the SSD, which doesn't narrow much down. At the point where you are broken in, can y= ou just dump out the whole qpair structure? Something like "p *qpair" should do it. That way I c= an see if there are any commands actually pending at the device and what state the device is in. My= expectation is that there aren't any commands outstanding. Can you also provide some background as to what kind of I/O you're submitti= ng (read or write, size, queue depth, etc.) when this occurs? > -Will > =C2=A0 > From: SPDK on behalf of Andrey Kuzmin > Reply-To: Storage Performance Development Kit > Date: Wednesday, July 6, 2016 at 2:35 PM > To: Storage Performance Development Kit > Subject: Re: [SPDK] Callback passed to spdk_nvme_ns_cmd_read not being ca= lled sometimes > = > =C2=A0 > = > =C2=A0 > = > On Wed, Jul 6, 2016, 20:56 Will Del Genio wrot= e: > = > Andrey, > That sounds exactly like what we are experiencing, however we=E2=80=99re = working off the spdk codebase > that was current as of last week and are still experiencing the issue.=C2= =A0 Do you know what the > resource allocation fault was and how we might be able to determine if th= at is still occurring? > = > = > = > I'll take a look at commit log, both SPDK and mine, and will get back to = you. > = > =C2=A0 > = > Regards, > = > Andrey > = > Ben, > We=E2=80=99re ASSERTing that the result of spdk_nvme_ns_cmd_read() =3D=3D= 0.=C2=A0 If I set our queue depth high > enough it will fail that assertion, as would be expected.=C2=A0 Whatever = other failure we=E2=80=99re > experiencing does not seem to be causing spdk_nvme_ns_cmd_read() to retur= n an error code. > =C2=A0 > Also I performed some tests with the spdk perf tool and was not able to r= eplicate our problem.=C2=A0 It > ran fine at various queue depths and core masks.=C2=A0 When the qd was se= t too high, it failed > gracefully with an error message.=C2=A0 This is all as expected. > =C2=A0 > I=E2=80=99d like to continue down the path of investigating if some resou= rce allocation or something else > is failing silently for us.=C2=A0 Any specific ideas? > =C2=A0 > Thanks! > Will > =C2=A0 > From: SPDK on behalf of Andrey Kuzmin > Reply-To: Storage Performance Development Kit > Date: Wednesday, July 6, 2016 at 12:01 PM > To: Storage Performance Development Kit > Subject: Re: [SPDK] Callback passed to spdk_nvme_ns_cmd_read not being ca= lled sometimes > = > = > = > =C2=A0 > = > On Wed, Jul 6, 2016 at 6:35 PM, Walker, Benjamin wrote: > = > Hi Will, > = > =C2=A0 > = > Since I can't see the code for your application I'd like to try and repro= duce the problem with > code that I have some visibility into. Are you able to reproduce the prob= lem using our perf tool > (examples/nvme/perf)? If you aren't, this is likely a problem with your t= est application and not > SPDK.=C2=A0 > = > = > =C2=A0 > = > I had been witnessing a similar issue with an earlier SPDK release, back = around Feb, where the > submit call was failing due to the resource allocation fault and neither = returning an error nor > invoking the callback, but my issue has been fixed in the recent release = (I can't recall the > actual commit, but there definitely was one dealing exactly with the caus= e). > =C2=A0 > = > =C2=A0 > = > Based on the symptoms, my best guess is that your memory pool ran out of = request objects. The > first thing to check is whether spdk_nvme_ns_cmd_read failed. If it fails= , it won't call the > callback. You can check for failure by looking at the return value - see = the documentation here. > Your application allocates this memory pool up front - all of our example= s allocate 8k requests > (see line 1097 in examples/nvme/perf/perf.c) You need to allocate a large= enough pool to handle > the maximum number of outstanding requests you plan to have. We recently = added a "hello_world" > style example for the NVMe driver at=C2=A0https://github.com/spdk/spdk/tr= ee/master/examples/nvme/hello_ > world=C2=A0with tons of comments. One of the comments explains this memor= y pool in detail. > = > =C2=A0 > = > That memory pool allocation is a bit of a wart on our otherwise clean API= . We're looking at > different strategies to clean that up. Let me know what the result of the= debugging is and I'll > shoot you some more ideas to try if necessary. > = > = > =C2=A0 > = > Are there any plans regarding the global request pool rework? > = > =C2=A0 > = > Regards, > = > Andrey > = > =C2=A0 > = > =C2=A0 > = > Thanks, > = > Ben > = > =C2=A0 > = > On Tue, 2016-07-05 at 21:03 +0000, Will Del Genio wrote: > = > Hello, > We have written a test application that is utilizing the spdk library to = benchmark a set of 3 > Intel P3700 drives and a single 750 drive (concurrently).=C2=A0 We=E2=80= =99ve done some testing using fio and > the kernel nvme drivers and have had no problem achieving the claimed IOP= s (4k random read) of all > drives on our system. > =C2=A0 > What we have found during our testing is that spdk will sometimes start t= o silently fail to call > the callback passed to spdk_nvme_ns_cmd_read in the following situations: > 1.=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Testing a single drive and passing= in 0 for max_completions to > spdk_nvme_qpair_process_completions().=C2=A0 We haven=E2=80=99t seen any = issues with single drive testing when > max_completions was > 0. > 2.=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Testing all four drives at once wi= ll result in one drive failing to receive callbacks, > seemingly regardless of what number we pass for max_completions (1 throug= h 128). > =C2=A0 > Here are other observations we=E2=80=99ve made > -When the callbacks fail to be called for a drive, they fail to be called= for the remaining > duration of the test. > -The drive that =E2=80=98fails=E2=80=99 when testing 4 drives concurrentl= y varies from test to test. > -=E2=80=98failure=E2=80=99 of a drive seems to be correlated with the num= ber of outstanding read operations, > though it is not a strict correlation. > =C2=A0 > Our system is a dual socket=C2=A0 E5-2630 v3.=C2=A0 One drive is on a PCI= slot for CPU 0 and the other 3 are > on PCI slots on CPU 1.=C2=A0 The master/slave threads are on the the same= cpu socket as the nvme device > they are talking to. > =C2=A0 > We=E2=80=99d like to know what is causing this issue and what we can do t= o help investigate the problem.=C2=A0 > What other information can we provide?=C2=A0 Is there some part of the sp= dk code that we can look at to > help determine the cause? > =C2=A0 > Thanks, > Will > =C2=A0 > = > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk > = > = > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk > = > = > =C2=A0 > = > = > = > = > = > = > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk > = > -- > = > Regards, > Andrey > = > = > = > = > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk --===============8931437452531106384==--