From: Walker, Benjamin <benjamin.walker at intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] SPDK aio examples
Date: Fri, 17 Jun 2016 21:57:00 +0000
Message-ID: <1466200619.26925.125.camel@intel.com>
In-Reply-To: 700345AA-DD95-4ADC-AF38-34451A3F29FF@playstation.sony.com
On Fri, 2016-06-17 at 20:52 +0000, Bhadauria, Varun wrote:
> Thanks Ben
>
> Can you also possibly shed some light on the expected behavior when more than one I/O is
> erroneously submitted on the same qpair? Do the spdk_nvme_ns_cmd_read/write*() functions return a
> specific error value in this case?
>
You can submit many I/O per queue pair at the same time as long as you do it from a single thread,
and you can submit I/O to different queue pairs on different threads simultaneously with no locks.
Are you asking what happens when I/O is submitted simultaneously from different threads to the same
queue pair? In that case, you run the risk of corrupting the memory state of the queue. The queue is
implemented as an array in memory with a head and a tail pointer. Submitting an I/O to the queue
places a command into the next slot, increments the tail pointer, and rings a doorbell register to
tell the device new commands are present. If you do this from two threads simultaneously, they'd
both be copying into the same spot and ringing the doorbell, meaning the device may receive part of
one command and part of another. The code is in lib/nvme/nvme_qpair.c:nvme_qpair_submit_tracker if
you want to look.
There is no expected error value for this case - the behavior is simply undefined. In order to catch
a user doing this, we'd have to look at some shared state (which means a lock) and the whole purpose
of queue pairs is to avoid locking.
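To make the race concrete, here is a minimal sketch of that submission path in C. It is a toy
model, not the SPDK code: the names toy_sq, toy_cmd, toy_sq_submit and the queue depth are made up
for illustration, and the doorbell is a plain struct field standing in for the device register. The
index that submission advances is called tail here, following the NVMe naming for submission queues.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_QUEUE_DEPTH 4                 /* hypothetical; real queues are much deeper */

struct toy_cmd { uint8_t bytes[64]; };    /* NVMe commands are 64 bytes */

struct toy_sq {
    struct toy_cmd slots[TOY_QUEUE_DEPTH]; /* the in-memory ring */
    uint32_t tail;                         /* next slot the host will fill */
    volatile uint32_t doorbell;            /* stand-in for the device register */
};

/* Three separate steps, with no lock or atomic operation tying them together. */
static void toy_sq_submit(struct toy_sq *sq, const struct toy_cmd *cmd)
{
    assert(sq != NULL && cmd != NULL);
    memcpy(&sq->slots[sq->tail], cmd, sizeof(*cmd)); /* 1. copy into the next slot */
    sq->tail = (sq->tail + 1) % TOY_QUEUE_DEPTH;     /* 2. advance the index */
    sq->doorbell = sq->tail;                         /* 3. ring the doorbell */
}
```

Two threads calling toy_sq_submit on the same toy_sq at once can both read the same tail before
either advances it, so both copy into the same slot. Nothing in the function can detect that
without shared state and a lock.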
> Also, does spdk_nvme_qpair_process_completions() for a qpair need to be invoked from the same
> thread that is responsible for issuing I/O on the qpair?
Yes - you need to call that function from the same thread that you submitted the I/O on. It's fairly
obvious that you can only call spdk_nvme_qpair_process_completions on a particular queue pair from 1
thread at a time, but it isn't as obvious why you can't reap your completions on a different thread
than your submissions, so let me try and explain that.
We define two objects, a request and a tracker, that are placed on lists. A request represents a
single user call to submit an I/O. A tracker is an entry on the hardware queue. We allow more
requests outstanding than available trackers. Submissions and completions manipulate the lists of
free requests and trackers using a simple linked list, which is not thread safe. Further, each time
a completion happens and frees up a tracker, we check if there are any pending requests and submit
them. If we find any on the completion side but we're on a different thread than the submission path,
this would be equivalent to doing submissions from two threads simultaneously.
I'm not sure this technical challenge couldn't be overcome, but I am fairly confident that you don't
actually want to do this in your software anyway. Not only is it more complicated, but you end up
thrashing your CPU cache. The request objects are sitting nicely in your L1 or L2 CPU cache from
submission, so when you complete on the same core it is ideal.
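The request/tracker bookkeeping described above can be sketched roughly as follows. This is a toy
model under my own assumptions, not the driver's data structures: toy_qpair, toy_request,
toy_submit, toy_complete_one and the tracker count of 2 are all hypothetical names and values
chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_TRACKERS 2               /* hypothetical number of hardware slots */

struct toy_request {
    int id;
    struct toy_request *next;            /* plain linked list, no locking */
};

struct toy_qpair {
    int free_trackers;                   /* trackers not on the hardware queue */
    struct toy_request *pending_head;    /* requests waiting for a tracker */
    struct toy_request *pending_tail;
    int submitted;                       /* requests handed to the "hardware" */
};

static void toy_qpair_init(struct toy_qpair *q)
{
    q->free_trackers = TOY_NUM_TRACKERS;
    q->pending_head = NULL;
    q->pending_tail = NULL;
    q->submitted = 0;
}

/* Consume one tracker for this request. */
static void toy_hw_submit(struct toy_qpair *q, struct toy_request *req)
{
    assert(q->free_trackers > 0);
    (void)req;
    q->free_trackers--;
    q->submitted++;
}

/* Submission path: use a tracker if one is free, otherwise queue the request. */
static void toy_submit(struct toy_qpair *q, struct toy_request *req)
{
    if (q->free_trackers > 0) {
        toy_hw_submit(q, req);
        return;
    }
    req->next = NULL;
    if (q->pending_tail != NULL) {
        q->pending_tail->next = req;
    } else {
        q->pending_head = req;
    }
    q->pending_tail = req;
}

/* Completion path: free a tracker, then submit a pending request if there is one. */
static void toy_complete_one(struct toy_qpair *q)
{
    q->free_trackers++;
    struct toy_request *req = q->pending_head;
    if (req != NULL) {
        q->pending_head = req->next;
        if (q->pending_head == NULL) {
            q->pending_tail = NULL;
        }
        toy_hw_submit(q, req);           /* a submission, made from the completion side */
    }
}
```

Note that toy_complete_one ends in toy_hw_submit, and both paths touch the same unlocked list. If
toy_submit ran on one thread while toy_complete_one ran on another, that would be two submitters
on one queue pair.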
>
> When outstanding completions are processed as a result of calling
> spdk_nvme_qpair_process_completions(), is a request's callback called on the same core?
Yes - whatever thread you call spdk_nvme_qpair_process_completions on, for each completion it finds
it will call that callback immediately inside of the current thread. So all of the callbacks for
completions found will have been called by the time spdk_nvme_qpair_process_completions returns. The
code is in lib/nvme/nvme_qpair.c:spdk_nvme_qpair_process_completions() - you can see it just loop
over the completion entries and call nvme_qpair_complete_tracker for each one. Inside of
nvme_qpair_complete_tracker, it calls the callback function.
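As a rough illustration of that polling model, here is a small self-contained sketch in C. It does
not use the SPDK API: toy_cq, toy_cq_post, toy_process_completions and toy_count_cb are hypothetical
stand-ins, and toy_cq_post plays the role of the device posting a completion entry.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_OUTSTANDING 8            /* hypothetical completion queue size */

typedef void (*toy_cb_fn)(void *cb_arg);

struct toy_cq {
    toy_cb_fn cb[TOY_MAX_OUTSTANDING];
    void *cb_arg[TOY_MAX_OUTSTANDING];
    uint32_t count;                      /* completions posted but not yet reaped */
};

/* Stand-in for the device posting a completion entry. No callback runs here. */
static void toy_cq_post(struct toy_cq *cq, toy_cb_fn fn, void *arg)
{
    assert(cq->count < TOY_MAX_OUTSTANDING);
    cq->cb[cq->count] = fn;
    cq->cb_arg[cq->count] = arg;
    cq->count++;
}

/*
 * Loop over the posted entries and call each callback directly, on the
 * caller's thread. Every callback has run by the time this returns.
 */
static uint32_t toy_process_completions(struct toy_cq *cq)
{
    uint32_t done = 0;
    for (uint32_t i = 0; i < cq->count; i++) {
        cq->cb[i](cq->cb_arg[i]);
        done++;
    }
    cq->count = 0;
    return done;
}

/* Example callback: bump a counter owned by the caller. */
static void toy_count_cb(void *cb_arg)
{
    int *counter = cb_arg;
    (*counter)++;
}
```

In this model nothing happens between posting and polling: a counter passed to toy_count_cb stays
unchanged until the application itself calls toy_process_completions, and it is fully updated when
that call returns.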
>
> Is it always necessary to call spdk_nvme_qpair_process_completions() to process completions?
Yes - there are no interrupts or background threads so the driver will only execute in response to
calls from the user.
>
> Regards,
> Varun Bhadauria
>
> On 6/17/16, 10:24 AM, "SPDK on behalf of Walker, Benjamin" <spdk-bounces(a)lists.01.org on behalf of
> benjamin.walker(a)intel.com> wrote:
>
> >
> > On Wed, 2016-06-15 at 23:56 +0000, Bhadauria, Varun wrote:
> > >
> > > Hello Ben
> > >
> > > Thank you for the clarification. I was under the false impression that Linux AIO could be made
> > > to use SPDK under the hood, which is clearly not the case since it would have to go through
> > > the filesystem.
> > I'm sure someone could wrap the AIO interface around the SPDK driver for the specific case where
> > the user is opening a block device directly with O_DIRECT. It's nearly a 1:1 translation for
> > that case. Unfortunately, most people use Linux AIO on files instead of block devices.
> >
> > >
> > > BTW, are there any known early filesystem implementations, besides Ceph's RocksDB-based
> > > Bluestore FS, that use SPDK?
> > The only publicly announced one that I'm aware of is Bluestore inside of Ceph. As long as SPDK
> > continues to be valuable, I fully expect many filesystems with different designs to appear over
> > time. If you have a particular use case where you'd like some sort of filesystem-like layer on
> > top of SPDK, I'd love to hear about it. At a minimum, it's useful to collect requirements from
> > a number of sources.
> >
> > >
> > >
> > > Regards,
> > > Varun Bhadauria
> > >
> > >
> > > On 6/15/16, 4:37 PM, "SPDK on behalf of Walker, Benjamin" <spdk-bounces(a)lists.01.org on behalf
> > > of benjamin.walker(a)intel.com> wrote:
> > >
> > > >
> > > >
> > > > Can you explain a bit more about why you want to use AIO? Are you referring to Linux AIO or
> > > > POSIX AIO? If you want to do a performance comparison of Linux AIO and the SPDK NVMe driver
> > > > then the perf tool is your best bet.
> > > >
> > > > You can run the perf tool against a block device using Linux AIO by binding your NVMe device
> > > > to the kernel ("./scripts/setup.sh reset" will hand them all back to the kernel) and then
> > > > doing something like:
> > > >
> > > > ./perf -q 1 -s 4096 -w read -t 10 /dev/nvme0n1 /dev/nvme1n1
> > > >
> > > > -----Original Message-----
> > > > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Bhadauria, Varun
> > > > Sent: Wednesday, June 15, 2016 4:30 PM
> > > > To: Storage Performance Development Kit <spdk(a)lists.01.org>
> > > > Subject: [SPDK] SPDK aio examples
> > > >
> > > > Hello
> > > >
> > > > Are there any SPDK examples which use AIO? Perf.c has very little documentation in the
> > > > usage for AIO.
> > > >
> > > > Regards,
> > > > Varun Bhadauria
> > > >
> > > >
> > > > _______________________________________________
> > > > SPDK mailing list
> > > > SPDK(a)lists.01.org
> > > > https://lists.01.org/mailman/listinfo/spdk