From: Vladislav Bolkhovitin <vst at vlnb.net>
To: spdk@lists.01.org
Subject: Re: [SPDK] SCST Usermode iSCSI Storage Server now handles Intel SPDK backing storage
Date: Wed, 06 Sep 2017 16:32:04 -0700
Message-ID: <59B08574.5020902@vlnb.net>
In-Reply-To: CALiN7ryf3J9ej0EWFBxKT_te4wgVPJDWw1JSQ+AimOHsnr0bwQ@mail.gmail.com


David Butterfield wrote on 09/06/2017 03:48 PM:
> On Tue, Sep 5, 2017 at 9:25 PM, Vladislav Bolkhovitin <vst(a)vlnb.net> wrote:
>> The only note would be that, as I have already mentioned before, tcmu does data copy
>> between user mode module and kernel, so usage with SCST zero-copy scst_user instead
>> would be more performance efficient.
> 
> Yes, it's better not to have to copy the data; but I'm not sure that's
> the limiting factor for TCMU performance.
> 
> A ring buffer mediates communication in the TCMU datapath between
> tcm_user (in the kernel) and libtcmu (in usermode).  One fairly
> fundamental characteristic of the TCMU model is that the granularity
> of transaction through the ring buffer is the CDB.  There is overhead
> cost to access and maintain the ring four times per SCSI command
> (Request+Response) * (Sender+Receiver).
> 
> Concerning me more than that is the problem of timely scheduling of
> the threads on each side of the ring.  One might expect at least one
> wakeup per SCSI command, because whichever side of the ring is faster
> to process a command must inevitably sleep waiting for the slower
> side.
> 
> In practice it averages fewer than one wakeup per command (with
> sufficient queue-depth) because multiple commands can accumulate in
> the ring during the scheduling delay for the first command, and the
> entire backlog can be processed in one wakeup.  But you only get such
> batching in return for enduring thread scheduling latency on the
> datapath (with its own issues).
> 
> It is too complicated to determine from analysis alone how all the
> factors combine into overall performance behavior under various
> loading conditions -- the only way to really know is to observe and
> measure it.  How many IOPS can get through that ring, and what happens
> if the load is not quite 100%, or the load is light at queue-depth 2
> or even 1?  Or when the required protocol work is heavier on the
> kernel side versus heavier on the usermode side?
> 
> TCMU has had some time to gain usermode clients.  Finding even *one*
> such client -- that has been well-measured under a variety of
> conditions and demonstrated to work reliably with high performance --
> would prove that it is possible to do through the TCMU API,
> substantially reducing the concern.  There may be an example out
> there, but I looked around a couple of months ago and did not find
> anything except "we haven't done performance tuning yet".  But the
> concern is toward factors that are inherent in the TCMU model, not
> amenable to simple "performance tuning at the end".  Given
> CDB-granularity, I expect the TCMU IOPS bottleneck is going to be
> around that ring.
> 
> In contrast to the CDB-ring model, Usermode SCST uses socket(2) and
> related system calls for communication with the iSCSI initiator --
> these socket calls are where the datapath crosses between the kernel
> and usermode.  Here the granularity of transaction between the two can
> theoretically be as large as the socket buffer size -- much larger
> than one SCSI command.
> 
> Especially when using SPDK for backing storage, another step is to
> re-implement the network I/O using DPDK calls, eliminating the socket
> I/O calls altogether (I expect that to be straightforward in
> iscsi-scst/kernel/nthread.c).  Then the entire datapath would be in
> usermode (down to the I/O instructions, I think).
> 
> (Caveat: this analysis is based only on considering the TCMU model,
> not any actual performance experimentation with TCMU)

I see, interesting analysis. Just one correction: in iSCSI-SCST, netlink sockets are
used for kernel-user mode communication, and only to establish the connection; after
that, everything is done entirely inside the kernel (in user space in your port).
Scst_user uses an IOCTL-based interface, with 2 calls per CDB, which could be batched
too. Everything runs inside a single thread context, with no extra inter-thread
switches. In your user space port this could be translated to just a regular function
call, leading to a very interesting marriage between an SPDK frontend and the existing
user mode SCST backends :)

Vlad

