public inbox for linux-scsi@vger.kernel.org
From: James Bottomley <James.Bottomley@suse.de>
To: akataria@vmware.com
Cc: Dmitry Torokhov <dtor@vmware.com>,
	Matthew Wilcox <matthew@wil.cx>,
	Roland Dreier <rdreier@cisco.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Robert Love <robert.w.love@intel.com>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	Mike Christie <michaelc@cs.wisc.edu>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rolf Eike Beer <eike-kernel@sf-tec.de>,
	Maxime Austruy <maustruy@vmware.com>
Subject: Re: [PATCH] SCSI driver for VMware's virtual HBA.
Date: Thu, 03 Sep 2009 15:03:02 -0500
Message-ID: <1252008182.3941.61.camel@mulgrave.site>
In-Reply-To: <1251911789.23106.25.camel@ank32.eng.vmware.com>

On Wed, 2009-09-02 at 10:16 -0700, Alok Kataria wrote:
> On Wed, 2009-09-02 at 08:06 -0700, James Bottomley wrote:
> > On Tue, 2009-09-01 at 19:55 -0700, Alok Kataria wrote:
> > > On Tue, 2009-09-01 at 11:15 -0700, James Bottomley wrote:
> > > > On Tue, 2009-09-01 at 10:41 -0700, Alok Kataria wrote:
> > > > > > lguest uses the sg_ring abstraction.  Xen and KVM were certainly looking
> > > > > > at this too.
> > > > > 
> > > > > I don't see the sg_ring abstraction that you are talking about. Can you
> > > > > please give me some pointers. 
> > > > 
> > > > it's in drivers/lguest ... apparently it's vring now and the code is in
> > > > drivers/virtio
> > > > 
> > > > > Also regarding Xen and KVM I think they are using the xenbus/vbus
> > > > > interface, which is quite different than what we do here. 
> > > > 
> > > > Not sure about Xen ... KVM uses virtio above.
> > > > 
> > > > > > 
> > > > > > > And anyways how large is the DMA code that we are worrying about here ?
> > > > > > > Only about 300-400 LOC ? I don't think we might want to over-design for
> > > > > > > such small gains.
> > > > > > 
> > > > > > So even if you have different DMA code, the remaining thousand or so
> > > > > > lines would be in common.  That's a worthwhile improvement.
> > > 
> > > I don't see how; the rest of the code consists of IO/MMIO space & ring
> > > processing which is very different in each of the implementations. What
> > > is left is the setup and initialization code which obviously depends on
> > > the implementation of the driver data structures. 
> > 
> > Are there benchmarks comparing the two approaches?
> 
> Benchmarks comparing what ? 

Your approach versus virtio.

> > 
> > > > > And not just that, different HV-vendors can have different features,
> > > > > like say XYZ can come up tomorrow and implement the multiple rings
> > > > > interface so the feature set doesn't remain common and we will have less
> > > > > code to share in the not so distant future.
> > > > 
> > > > Multiple rings is really just a multiqueue abstraction.  That's fine,
> > > > but it needs a standard multiqueue control plane.
> > > > 
> > > > The desire to one up the competition by adding a new whiz bang feature
> > > > to which you code a special interface is very common in the storage
> > > > industry.  The counter pressure is that consumers really like these
> > > > things standardised.  That's what the transport class abstraction is all
> > > > about.
> > > > 
> > > > We also seem to be off on a tangent about hypervisor interfaces.  I'm
> > > > actually more interested in the utility of an SRP abstraction or at
> > > > least something SAM based.  It seems that in your driver you don't quite
> > > > do the task management functions as SAM requests, but do them over your
> > > > own protocol abstractions.
> > > 
> > > Okay, I think I need to take a step back here and understand what
> > > you are actually asking for.
> > > 
> > > 1. What do you mean by the "transport class abstraction" ? 
> > > Do you mean that the way we communicate with the hypervisor needs to be
> > > standardized ?
> > 
> > Not really.  Transport classes are designed to share code and provide a
> > uniform control plane when the underlying implementation is different.
> > 
> > > 2. Are you saying that we should use the virtio ring mechanism to handle
> > > our request and completion rings ? 
> > 
> > That's an interesting question.  Virtio is currently the standard linux
> > guest<=>hypervisor communication mechanism, but if you have comparative
> > benchmarks showing that virtual hardware emulation is faster, it doesn't
> > need to remain so.
> 
> It is a standard that KVM and lguest are using. I don't think it needs
> any benchmarks to show if a particular approach is faster or not.

It's a useful datapoint especially since the whole object of
paravirtualised drivers is supposed to be speed vs full hardware
emulation.
 
> VMware has supported paravirtualized devices in backend for more than a
> year now (maybe more, don't quote me on this), and the backend is
> common across different guest OS's. Virtual hardware emulation helps us
> give a common interface to different GOS's, whereas virtio binds this
> heavily to Linux usage. And please note that the backend implementation
> for our virtual device was done before virtio was integrated in
> mainline.

Virtio mainline integration dates from October 2007.  The mailing list
discussions obviously predate that by several months.

> Also, from your statements above it seems that you think we are
> proposing to change the standard communication mechanism (between guest
> & hypervisor) for Linux. For the record that's not the case, the
> standard that the Linux based VM's are using does not need to be
> changed. This pvscsi driver is used for a new SCSI HBA; how does it
> matter if this SCSI HBA is actually a virtual HBA and implemented by the
> hypervisor in software?
> 
> > 
> > >   We cannot do that. Our backend expects that each slot on the ring is
> > > in a particular format, whereas vring expects that each slot on the
> > > vring is in the vring_desc format.
> > 
> > Your backend is a software server, surely?
> 
> Yes it is, but the backend is as good as written in stone, as it is
> being supported by our various products which are out in the market. The
> pvscsi driver that I proposed for mainlining has also been in existence
> for some time now and was being used/tested heavily. Earlier we used to
> distribute it as part of our open-vm-tools project, and it is now that
> we are proposing to integrate it with mainline.
> 
> So if you are hinting that since the backend is software, it can be
> changed, the answer is no. The reason being, there are existing
> implementations that have that device support and we still want newer
> guests to make use of that backend implementation. 
> 
> > > 3. Also, the way we communicate with the hypervisor backend is that the
> > > driver writes to our device IO registers in a particular format. The
> > > format that we follow is to first write the command on the
> > > COMMAND_REGISTER and then write a stream of data words in the
> > > DATA_REGISTER, which is a normal device interface.
> > > The reason I make this point is to highlight we are not making any
> > > hypercalls instead we communicate with the hypervisor by writing to
> > > IO/Memory mapped regions.  So from that perspective the driver has no
> > > > knowledge that it is talking to a software backend (aka device
> > > emulation) instead it is very similar to how a driver talks to a silicon
> > > device.  The backend expects things in a certain way and we cannot
> > > > really change that interface (i.e. the ABI shared between Device driver
> > > and Device Emulation).
> > > 
> > > So sharing code with vring or virtio is not something that works well
> > > with our backend. The VMware PVSCSI driver is simply a virtual HBA and
> > > shouldn't be looked at any differently.
> > > 
> > > > Is there anything else that you are asking us to standardize?
> > 
> > I'm not really asking you to standardise anything (yet).  I was more
> > probing for why you hadn't included any of the SCSI control plane
> > > interfaces and what led you to produce a different design from the
> > current patterns in virtual I/O.  I think what I'm hearing is "Because
> > we didn't look at how modern SCSI drivers are constructed" and "Because
> > we didn't look at how virtual I/O is currently done in Linux".  That's
> > OK (it's depressingly familiar in drivers),
> 
> I am sorry, that's not the case. The reason we have a different design, as I
> have mentioned above, is that we want a generic mechanism which works
> for all/most of the GOS's out there and doesn't need to be specific to
> Linux.

Slightly confused now ... you're saying you did look at the transport
class and virtio?  But you chose not to do a virtio-like interface (for
reasons which I'm still not clear on) ... I didn't manage to extract
anything about why no transport class from the foregoing.

James

> >  but now we get to figure out
> > what, if anything, makes sense from a SCSI control plane to a hypervisor
> > interface and whether this approach to hypervisor interfaces is better
> > or worse than virtio.
> 
> I guess these points are answered above. Let me know if there is still
> something amiss. 
> 
> Thanks,
> Alok
> 
> > 
> > James
> > 
> > 
> 

