netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Douglas Gilbert <dougg@torque.net>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: James.Smart@Emulex.Com, linux-scsi@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] Netlink and user-space buffer pointers
Date: Thu, 20 Apr 2006 16:18:12 -0400	[thread overview]
Message-ID: <4447EC84.7030502@torque.net> (raw)
In-Reply-To: <4447C8C2.30909@cs.wisc.edu>

Mike Christie wrote:
> James Smart wrote:
> 
>>Mike Christie wrote:
>>
>>>For the tasks you want to do for the fc class is performance critical?
>>
>>No, it should not be.
>>
>>
>>>If not, you could do what the iscsi class (for the netdev people this is
>>>drivers/scsi/scsi_transport_iscsi.c) does and just suffer a couple
>>>copies. For iscsi we do this in userspace to send down a login pdu:
>>>
>>>    /*
>>>     * xmitbuf is a buffer that is large enough for the iscsi_event,
>>>     * iscsi pdu (hdr_size) and iscsi pdu data (data_size)
>>>     */
>>
>>Well, the real difference is that the payload of the "message" is actually
>>the payload of the SCSI command or ELS/CT Request. Thus, the payload may
> 
> 
> I am not sure I follow. For iscsi, everything after the iscsi_event
> struct can be the iscsi request that is to be transmitted. The payload
> will not normally be Mbytes but it is not a couple if bytes.
> 
> 
>>range in size from a few hundred bytes to several kbytes (> 1 page) to
>>Mbyte's in size. Rather than buffer all of this, and push it over the
>>socket,
>>thus the extra copies - it would best to have the LLDD simply DMA the
>>payload like on a typical SCSI command.  Additionally, there will be
>>response data that can be several kbytes in length.
>>
> 
> 
> Once you have got the buffer to the class, the class can create a
> scatterlist to DMA from for the LLD. I thought. iscsi does not do this
> just because it is software right now. For qla4xxx we do not need
> something like what you are talking about (see below for what I was
> thinking about for the initiators). If you are saying the extra step of
> the copy is plain dumb, I agree, but this happens (you have to suffer
> some copy and cannot do dio) for sg io as well in some cases. I think
> for the sg driver the copy_*_user is the default.

Mike,
Indirect IO is the default in the sg driver because:
  - it has always been thus
  - the sg driver is less constrained (e.g. max number
    of scatg elements is a bigger issue with dio)
  - the only alignment to worry about is byte
    alignment (some folks would like bit alignment
    but you can't please everybody)
  - there is no need for the sg driver to pin user
    pages in memory (as there is with direct IO and
    mmaped-IO)

> Instead of netlink for scsi commands and transport requests....

With a netlink based pass through one might:
  - improve on the SG_IO ioctl and add things like
    tags that are currently missing
  - introduce a proper SCSI task management function
    pass through (no request queue please)
  - make other pass throughs for SAS: SMP and STP
  - have an alternative to sysfs for various control
    functions in a HBA (e.g. in SAS: link and hard
    reset) and fetching performance data from a HBA

Apart from how to get data efficiently between the HBA
and the user space, another major issue is the flexibility
of the bind() in s_netlink (storage netlink??).

> For scsi commands could we just use sg io, or is there something special
> about the command you want to send? If you can use sg io for scsi
> commands, maybe for transport level requests (in my example iscsi pdu)
> we could modify something like sg/bsg/block layer scsi_ioctl.c to send
> down transport requests to the classes and encapsulate them in some new
> struct transport_requests or use the existing struct request but do that
> thing people keep taling about using the request/request_queue for
> message passing.

Some SG_IO ioctl users want up to 32 MB in one transaction
and others want their data fast. Many pass through users
view the kernel as an impediment (not so much as "the way"
as "in the way").

Doug Gilbert

      parent reply	other threads:[~2006-04-20 20:18 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1145306661.4151.0.camel@localhost.localdomain>
     [not found] ` <20060418160121.GA2707@us.ibm.com>
2006-04-19 12:57   ` [RFC] Netlink and user-space buffer pointers James Smart
2006-04-19 16:22     ` Patrick McHardy
2006-04-19 17:08       ` James Smart
2006-04-19 17:16         ` Patrick McHardy
2006-04-19 16:26     ` Stephen Hemminger
2006-04-19 17:05       ` James Smart
2006-04-19 21:32     ` Mike Christie
2006-04-20 14:33       ` James Smart
2006-04-20 17:45         ` Mike Christie
2006-04-20 17:52           ` Mike Christie
2006-04-20 17:58           ` Mike Christie
2006-04-20 20:03           ` James Smart
2006-04-20 20:35             ` Mike Christie
2006-04-20 20:40               ` Mike Christie
2006-04-20 23:44             ` Andrew Vasquez
2006-04-20 20:18           ` Douglas Gilbert [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4447EC84.7030502@torque.net \
    --to=dougg@torque.net \
    --cc=James.Smart@Emulex.Com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=michaelc@cs.wisc.edu \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).