virtualization.lists.linux-foundation.org archive mirror
* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found] <1252458903.24914.73.camel@ank32.eng.vmware.com>
@ 2009-09-09 21:00 ` Anthony Liguori
  2009-09-09 21:51   ` Alok Kataria
       [not found]   ` <1252533093.24633.40.camel@ank32.eng.vmware.com>
  0 siblings, 2 replies; 10+ messages in thread
From: Anthony Liguori @ 2009-09-09 21:00 UTC (permalink / raw)
  To: akataria
  Cc: linux-scsi@vger.kernel.org, pv-drivers, LKML, virtualization,
	Chetan.Loke@Emulex.Com, James Bottomley, Rolf Eike Beer,
	Andrew Morton

Hi Alok,

Joining this a bit late, as this was just brought to my attention.  It
would have been nice to CC the virtualization mailing list; please do
so in future submissions.

Alok Kataria wrote:
> VMware PVSCSI driver - v4.
> 
>  
> diff --git a/drivers/scsi/pvscsi.h b/drivers/scsi/pvscsi.h
> new file mode 100644
> index 0000000..96bb655
> --- /dev/null
> +++ b/drivers/scsi/pvscsi.h
> @@ -0,0 +1,395 @@
> +/*
> + * VMware PVSCSI header file
> + *
> + * Copyright (C) 2008-2009, VMware, Inc. All Rights Reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by the
> + * Free Software Foundation; version 2 of the License and no later version.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT.  See the GNU General Public License for more
> + * details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
> + *
> + * Maintained by: Alok N Kataria <akataria@vmware.com>
> + *
> + */
> +
> +#ifndef _PVSCSI_H_
> +#define _PVSCSI_H_
> +
> +#include <linux/types.h>
> +
> +#define PVSCSI_DRIVER_VERSION_STRING   "1.0.0.0"
> +
> +#define PVSCSI_MAX_NUM_SG_ENTRIES_PER_SEGMENT 128
> +
> +#define MASK(n)        ((1 << (n)) - 1)        /* make an n-bit mask */
> +
> +#define PCI_VENDOR_ID_VMWARE		0x15AD
> +#define PCI_DEVICE_ID_VMWARE_PVSCSI	0x07C0
> +
> +/*
> + * host adapter status/error codes
> + */
> +typedef enum {
> +   BTSTAT_SUCCESS       = 0x00,  /* CCB complete normally with no errors */
> +   BTSTAT_LINKED_COMMAND_COMPLETED           = 0x0a,
> +   BTSTAT_LINKED_COMMAND_COMPLETED_WITH_FLAG = 0x0b,
> +   BTSTAT_DATA_UNDERRUN = 0x0c,
> +   BTSTAT_SELTIMEO      = 0x11,  /* SCSI selection timeout */
> +   BTSTAT_DATARUN       = 0x12,  /* data overrun/underrun */
> +   BTSTAT_BUSFREE       = 0x13,  /* unexpected bus free */
> +   BTSTAT_INVPHASE      = 0x14,  /* invalid bus phase or sequence requested by target */
> +   BTSTAT_LUNMISMATCH   = 0x17,  /* linked CCB has different LUN from first CCB */
> +   BTSTAT_SENSFAILED    = 0x1b,  /* auto request sense failed */
> +   BTSTAT_TAGREJECT     = 0x1c,  /* SCSI II tagged queueing message rejected by target */
> +   BTSTAT_BADMSG        = 0x1d,  /* unsupported message received by the host adapter */
> +   BTSTAT_HAHARDWARE    = 0x20,  /* host adapter hardware failed */
> +   BTSTAT_NORESPONSE    = 0x21,  /* target did not respond to SCSI ATN, sent a SCSI RST */
> +   BTSTAT_SENTRST       = 0x22,  /* host adapter asserted a SCSI RST */
> +   BTSTAT_RECVRST       = 0x23,  /* other SCSI devices asserted a SCSI RST */
> +   BTSTAT_DISCONNECT    = 0x24,  /* target device reconnected improperly (w/o tag) */
> +   BTSTAT_BUSRESET      = 0x25,  /* host adapter issued BUS device reset */
> +   BTSTAT_ABORTQUEUE    = 0x26,  /* abort queue generated */
> +   BTSTAT_HASOFTWARE    = 0x27,  /* host adapter software error */
> +   BTSTAT_HATIMEOUT     = 0x30,  /* host adapter hardware timeout error */
> +   BTSTAT_SCSIPARITY    = 0x34,  /* SCSI parity error detected */
> +} HostBusAdapterStatus;
> +
> +/*
> + * Register offsets.
> + *
> + * These registers are accessible both via I/O space and memory-mapped I/O.
> + */
> +
> +enum PVSCSIRegOffset {
> +	PVSCSI_REG_OFFSET_COMMAND        =    0x0,
> +	PVSCSI_REG_OFFSET_COMMAND_DATA   =    0x4,
> +	PVSCSI_REG_OFFSET_COMMAND_STATUS =    0x8,
> +	PVSCSI_REG_OFFSET_LAST_STS_0     =  0x100,
> +	PVSCSI_REG_OFFSET_LAST_STS_1     =  0x104,
> +	PVSCSI_REG_OFFSET_LAST_STS_2     =  0x108,
> +	PVSCSI_REG_OFFSET_LAST_STS_3     =  0x10c,
> +	PVSCSI_REG_OFFSET_INTR_STATUS    = 0x100c,
> +	PVSCSI_REG_OFFSET_INTR_MASK      = 0x2010,
> +	PVSCSI_REG_OFFSET_KICK_NON_RW_IO = 0x3014,
> +	PVSCSI_REG_OFFSET_DEBUG          = 0x3018,
> +	PVSCSI_REG_OFFSET_KICK_RW_IO     = 0x4018,
> +};
> +
> +/*
> + * Virtual h/w commands.
> + */
> +
> +enum PVSCSICommands {
> +	PVSCSI_CMD_FIRST             = 0, /* has to be first */
> +
> +	PVSCSI_CMD_ADAPTER_RESET     = 1,
> +	PVSCSI_CMD_ISSUE_SCSI        = 2,
> +	PVSCSI_CMD_SETUP_RINGS       = 3,
> +	PVSCSI_CMD_RESET_BUS         = 4,
> +	PVSCSI_CMD_RESET_DEVICE      = 5,
> +	PVSCSI_CMD_ABORT_CMD         = 6,
> +	PVSCSI_CMD_CONFIG            = 7,
> +	PVSCSI_CMD_SETUP_MSG_RING    = 8,
> +	PVSCSI_CMD_DEVICE_UNPLUG     = 9,
> +
> +	PVSCSI_CMD_LAST              = 10  /* has to be last */
> +};
> +
> +/*
> + * Command descriptor for PVSCSI_CMD_RESET_DEVICE --
> + */
> +
> +typedef struct PVSCSICmdDescResetDevice {
> +	u32	target;
> +	u8	lun[8];
> +} __packed PVSCSICmdDescResetDevice;
> +
> +/*
> + * Command descriptor for PVSCSI_CMD_ABORT_CMD --
> + *
> + * - currently does not support specifying the LUN.
> + * - _pad should be 0.
> + */
> +
> +typedef struct PVSCSICmdDescAbortCmd {
> +	u64	context;
> +	u32	target;
> +	u32	_pad;
> +} __packed PVSCSICmdDescAbortCmd;
> +
> +/*
> + * Command descriptor for PVSCSI_CMD_SETUP_RINGS --
> + *
> + * Notes:
> + * - reqRingNumPages and cmpRingNumPages need to be a power of two,
> + * - reqRingNumPages and cmpRingNumPages need to be non-zero,
> + * - reqRingNumPages and cmpRingNumPages must not exceed
> + *   PVSCSI_SETUP_RINGS_MAX_NUM_PAGES.
> + */
> +
> +#define PVSCSI_SETUP_RINGS_MAX_NUM_PAGES        32
> +typedef struct PVSCSICmdDescSetupRings {
> +	u32	reqRingNumPages;
> +	u32	cmpRingNumPages;
> +	u64	ringsStatePPN;
> +	u64	reqRingPPNs[PVSCSI_SETUP_RINGS_MAX_NUM_PAGES];
> +	u64	cmpRingPPNs[PVSCSI_SETUP_RINGS_MAX_NUM_PAGES];
> +} __packed PVSCSICmdDescSetupRings;
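For reference, the three constraints in the comment above amount to a check
like the following. This is a userspace sketch for illustration only, not
part of the patch; the function name is made up:

```c
#include <stdint.h>

#define PVSCSI_SETUP_RINGS_MAX_NUM_PAGES 32

/* A ring page count is valid if it is a non-zero power of two that
 * does not exceed PVSCSI_SETUP_RINGS_MAX_NUM_PAGES. */
static int valid_ring_pages(uint32_t n)
{
	int power_of_two = n != 0 && (n & (n - 1)) == 0;

	return power_of_two && n <= PVSCSI_SETUP_RINGS_MAX_NUM_PAGES;
}
```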
> +
> +/*
> + * Command descriptor for PVSCSI_CMD_SETUP_MSG_RING --
> + *
> + * Notes:
> + * - this command was not supported in the initial revision of the h/w
> + *   interface. Before using it, you need to check that it is supported by
> + *   writing PVSCSI_CMD_SETUP_MSG_RING to the 'command' register, then
> + *   immediately reading the 'command status' register:
> + *       * a value of -1 means that the cmd is NOT supported,
> + *       * a value != -1 means that the cmd IS supported.
> + *   If it is supported, the 'command status' register returns:
> + *      sizeof(PVSCSICmdDescSetupMsgRing) / sizeof(u32).
> + * - this command should be issued _after_ the usual SETUP_RINGS so that the
> + *   RingsState page is already set up. If it is not, the command is a nop.
> + * - numPages needs to be a non-zero power of two,
> + * - _pad should be zero.
> + */
> +
> +#define PVSCSI_SETUP_MSG_RING_MAX_NUM_PAGES  16
> +
> +typedef struct PVSCSICmdDescSetupMsgRing {
> +	u32	numPages;
> +	u32	_pad;
> +	u64	ringPPNs[PVSCSI_SETUP_MSG_RING_MAX_NUM_PAGES];
> +} __packed PVSCSICmdDescSetupMsgRing;
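The feature-detection sequence described in the comment above can be
sketched like so. The fake register block stands in for the device's MMIO
window so the handshake can be shown outside the kernel; a real driver
would use readl()/writel() on the mapped BAR, and every name below is
illustrative:

```c
#include <stdint.h>

#define PVSCSI_CMD_SETUP_MSG_RING 8

/* Hypothetical stand-in for the adapter's 'command' and 'command
 * status' registers (PVSCSI_REG_OFFSET_COMMAND / _COMMAND_STATUS). */
struct fake_regs {
	uint32_t command;
	uint32_t command_status;
};

/* Device model: a host that supports SETUP_MSG_RING answers with the
 * descriptor size in u32 units ((4 + 4 + 16 * 8) / 4 == 34); an older
 * host answers (u32)-1. */
static void fake_device_react(struct fake_regs *r, int host_supports)
{
	if (r->command == PVSCSI_CMD_SETUP_MSG_RING)
		r->command_status = host_supports ? 34u : (uint32_t)-1;
}

/* Driver side: write 'command', then immediately read 'command
 * status'; -1 means the command is not supported. */
static int msg_ring_supported(struct fake_regs *r, int host_supports)
{
	r->command = PVSCSI_CMD_SETUP_MSG_RING;
	fake_device_react(r, host_supports); /* the write reaches the device */
	return r->command_status != (uint32_t)-1;
}
```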
> +
> +enum PVSCSIMsgType {
> +	PVSCSI_MSG_DEV_ADDED          = 0,
> +	PVSCSI_MSG_DEV_REMOVED        = 1,
> +	PVSCSI_MSG_LAST               = 2,
> +};
> +
> +/*
> + * Msg descriptor.
> + *
> + * sizeof(struct PVSCSIRingMsgDesc) == 128.
> + *
> + * - type is of type enum PVSCSIMsgType.
> + * - the content of args depend on the type of event being delivered.
> + */
> +
> +typedef struct PVSCSIRingMsgDesc {
> +	u32	type;
> +	u32	args[31];
> +} __packed PVSCSIRingMsgDesc;
> +
> +typedef struct PVSCSIMsgDescDevStatusChanged {
> +	u32	type;  /* PVSCSI_MSG_DEV _ADDED / _REMOVED */
> +	u32	bus;
> +	u32	target;
> +	u8	lun[8];
> +	u32	pad[27];
> +} __packed PVSCSIMsgDescDevStatusChanged;
> +
> +/*
> + * Rings state.
> + *
> + * - the fields:
> + *    . msgProdIdx,
> + *    . msgConsIdx,
> + *    . msgNumEntriesLog2,
> + *   .. are only used once the SETUP_MSG_RING cmd has been issued.
> + * - '_pad' helps to ensure that the msg related fields are on their own
> + *   cache-line.
> + */
> +
> +typedef struct PVSCSIRingsState {
> +	u32	reqProdIdx;
> +	u32	reqConsIdx;
> +	u32	reqNumEntriesLog2;
> +
> +	u32	cmpProdIdx;
> +	u32	cmpConsIdx;
> +	u32	cmpNumEntriesLog2;
> +
> +	u8	_pad[104];
> +
> +	u32	msgProdIdx;
> +	u32	msgConsIdx;
> +	u32	msgNumEntriesLog2;
> +} __packed PVSCSIRingsState;
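As an aside, the *NumEntriesLog2 fields pair naturally with the MASK()
macro defined at the top of the header: the indices are free-running and
are reduced to a ring slot on use. A sketch; the empty/full convention
shown is an assumption of this illustration, not something the header
specifies:

```c
#include <stdint.h>

#define MASK(n) ((1 << (n)) - 1) /* same n-bit mask macro as in the header */

/* prod/cons indices run freely; the slot is just the low log2(n) bits. */
static uint32_t ring_slot(uint32_t idx, uint32_t num_entries_log2)
{
	return idx & MASK(num_entries_log2);
}

/* Assumed convention: empty when the indices match... */
static int ring_empty(uint32_t prod, uint32_t cons)
{
	return prod == cons;
}

/* ...and full when prod has run a whole ring ahead of cons. */
static int ring_full(uint32_t prod, uint32_t cons, uint32_t num_entries_log2)
{
	return prod - cons == (1u << num_entries_log2);
}
```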

All of this can be hidden behind a struct virtqueue.  You could then 
introduce a virtio-vmwring that implemented this ABI.

You could then separate out the actual scsi logic into a separate 
virtio-scsi.c driver.

There are numerous advantages to this layering.  You get to remain ABI 
compatible with yourself.  You can potentially reuse the ring logic in 
other drivers.  Other VMMs can use different ring transports without 
introducing new drivers.  For instance, in KVM, we would use virtio-pci 
+ virtio-scsi instead of using virtio-vmwring + virtio-scsi.
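The layering being argued for here might look roughly like the following
sketch: the SCSI logic programs against an abstract transport, and each
VMM supplies its own ring implementation behind it. All names
(ring_transport_ops, scsi_submit, the toy transport) are illustrative
assumptions, not existing code:

```c
#include <stddef.h>
#include <string.h>

/* Abstract ring transport: roughly what a transport-neutral SCSI
 * front-end would program against. */
struct ring_transport_ops {
	int  (*add_request)(void *ring, const void *req, size_t len);
	void (*kick)(void *ring); /* notify the host of new work */
};

/* The generic SCSI logic only ever sees the ops table; pvscsi rings,
 * virtio rings, or Xen rings would each provide their own. */
static int scsi_submit(const struct ring_transport_ops *ops, void *ring,
		       const void *req, size_t len)
{
	int rc = ops->add_request(ring, req, len);

	if (rc == 0)
		ops->kick(ring);
	return rc;
}

/* Toy one-slot transport, just to make the shape concrete. */
struct toy_ring { char buf[64]; size_t len; int kicked; };

static int toy_add(void *ring, const void *req, size_t len)
{
	struct toy_ring *t = ring;

	if (len > sizeof(t->buf))
		return -1;
	memcpy(t->buf, req, len);
	t->len = len;
	return 0;
}

static void toy_kick(void *ring)
{
	((struct toy_ring *)ring)->kicked = 1;
}

static const struct ring_transport_ops toy_ops = {
	.add_request = toy_add,
	.kick = toy_kick,
};
```

With such a split, swapping one ring transport for another is a matter of
providing a different ops table, which is precisely the reuse argument made
above.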

The virtio infrastructure has been backported to various old kernels too 
so there really is no reason not to use it.

We have the interfaces in virtio to do notification suppression and MSI 
so I don't think there's any real limitation that it would impose on you.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
  2009-09-09 21:00 ` [PATCH] SCSI driver for VMware's virtual HBA - V4 Anthony Liguori
@ 2009-09-09 21:51   ` Alok Kataria
       [not found]   ` <1252533093.24633.40.camel@ank32.eng.vmware.com>
  1 sibling, 0 replies; 10+ messages in thread
From: Alok Kataria @ 2009-09-09 21:51 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: linux-scsi@vger.kernel.org, pv-drivers@vmware.com, LKML,
	virtualization, Chetan.Loke@Emulex.Com, James Bottomley,
	Rolf Eike Beer, Andrew Morton

Hi Anthony,

On Wed, 2009-09-09 at 14:00 -0700, Anthony Liguori wrote:
> Hi Alok,
> 
> Joining this a bit late, as this was just brought to my attention.  It
> would have been nice to CC the virtualization mailing list; please do
> so in future submissions.

Sure.

> 
> Alok Kataria wrote:
> > VMware PVSCSI driver - v4.
> >
> > [pvscsi.h quoted in full; trimmed here -- see the message above]
> 
> All of this can be hidden behind a struct virtqueue.  You could then 
> introduce a virtio-vmwring that implemented this ABI.
> 
> You could then separate out the actual scsi logic into a separate 
> virtio-scsi.c driver.
> 
> There are numerous advantages to this layering.  You get to remain ABI 
> compatible with yourself.  You can potentially reuse the ring logic in 
> other drivers.  Other VMMs can use different ring transports without 
> introducing new drivers.  For instance, in KVM, we would use virtio-pci 
> + virtio-scsi instead of using virtio-vmwring + virtio-scsi.

I see your point, but the ring logic or the ABI that we use to
communicate between the hypervisor and guest is not shared between our
storage and network drivers. As a result, I don't see any benefit in
separating out this ring handling mechanism; on the contrary, it might
just add some overhead of translating between various layers for our
SCSI driver.

Having said that, I would like to add that if, in some future
iteration of our paravirtualized drivers, we decide to share this
ring mechanism across our various device drivers, this might be an
interesting proposition.

Thanks,
Alok

> 
> The virtio infrastructure has been backported to various old kernels too 
> so there really is no reason not to use it.
> 
> We have the interfaces in virtio to do notification suppression and MSI 
> so I don't think there's any real limitation that it would impose on you.
> 
> Regards,
> 
> Anthony Liguori


* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]   ` <1252533093.24633.40.camel@ank32.eng.vmware.com>
@ 2009-09-09 22:12     ` Anthony Liguori
       [not found]     ` <4AA8284A.3010908@codemonkey.ws>
  1 sibling, 0 replies; 10+ messages in thread
From: Anthony Liguori @ 2009-09-09 22:12 UTC (permalink / raw)
  To: akataria
  Cc: linux-scsi@vger.kernel.org, pv-drivers@vmware.com, LKML,
	virtualization, Chetan.Loke@Emulex.Com, James Bottomley,
	Rolf Eike Beer, Andrew Morton

Alok Kataria wrote:
> I see your point, but the ring logic or the ABI that we use to
> communicate between the hypervisor and guest is not shared between our
> storage and network drivers. As a result, I don't see any benefit in
> separating out this ring handling mechanism; on the contrary, it might
> just add some overhead of translating between various layers for our
> SCSI driver.
>   

But if you separate out the ring logic, it allows the scsi logic to be 
shared by other paravirtual device drivers.  This is significant and 
important from a Linux point of view.

There is almost nothing vmware specific about the vast majority of your 
code.  If you split out the scsi logic, then you will receive help 
debugging, adding future features, and improving performance from other 
folks interested in this.  In the long term, it will make your life 
much, much easier by making the driver relevant to a wider audience :-)

> > Having said that, I would like to add that if, in some future
> > iteration of our paravirtualized drivers, we decide to share this
> > ring mechanism across our various device drivers, this might be an
> > interesting proposition.
>   

I am certainly not the block subsystem maintainer, but I'd hate to see 
this merged without any attempt at making the code more reusable.  If 
adding the virtio layering is going to somehow hurt performance, break 
your ABI, or in some way limit you, then that's something to certainly 
discuss and would be valid concerns.  That said, I don't think it's a 
huge change to your current patch and I don't see any obvious problems 
it would cause.

Regards,

Anthony Liguori


* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]     ` <4AA8284A.3010908@codemonkey.ws>
@ 2009-09-09 23:08       ` Christoph Hellwig
       [not found]       ` <20090909230854.GA4495@infradead.org>
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2009-09-09 23:08 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Andrew Morton, linux-scsi@vger.kernel.org, pv-drivers@vmware.com,
	LKML, virtualization, Chetan.Loke@Emulex.Com, James Bottomley,
	akataria, Rolf Eike Beer

On Wed, Sep 09, 2009 at 05:12:26PM -0500, Anthony Liguori wrote:
> Alok Kataria wrote:
>> I see your point, but the ring logic or the ABI that we use to
>> communicate between the hypervisor and guest is not shared between our
>> storage and network drivers. As a result, I don't see any benefit in
>> separating out this ring handling mechanism; on the contrary, it might
>> just add some overhead of translating between various layers for our
>> SCSI driver.
>>   
>
> But if you separate out the ring logic, it allows the scsi logic to be  
> shared by other paravirtual device drivers.  This is significant and  
> important from a Linux point of view.

As someone who has been hacking on a virtio scsi prototype I don't think
it's a good idea.  The vmware driver is a horrible design and I don't
think it should be merged.  Besides being an ugly driver and ABI, we
really should not support this kind of closed protocol development.


* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]       ` <20090909230854.GA4495@infradead.org>
@ 2009-09-09 23:34         ` Anthony Liguori
       [not found]         ` <4AA83B88.2020104@codemonkey.ws>
  2009-09-10 23:43         ` Alok Kataria
  2 siblings, 0 replies; 10+ messages in thread
From: Anthony Liguori @ 2009-09-09 23:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, linux-scsi@vger.kernel.org, pv-drivers@vmware.com,
	LKML, virtualization, Chetan.Loke@Emulex.Com, James Bottomley,
	akataria, Rolf Eike Beer

Christoph Hellwig wrote:
> On Wed, Sep 09, 2009 at 05:12:26PM -0500, Anthony Liguori wrote:
>   
>> Alok Kataria wrote:
>>     
>>> I see your point, but the ring logic or the ABI that we use to
>>> communicate between the hypervisor and guest is not shared between our
>>> storage and network drivers. As a result, I don't see any benefit in
>>> separating out this ring handling mechanism; on the contrary, it might
>>> just add some overhead of translating between various layers for our
>>> SCSI driver.
>>>   
>>>       
>> But if you separate out the ring logic, it allows the scsi logic to be  
>> shared by other paravirtual device drivers.  This is significant and  
>> important from a Linux point of view.
>>     
>
> As someone who has been hacking on a virtio scsi prototype I don't think
> it's a good idea.  The vmware driver is a horrible design and I don't
> think it should be merged.

What are the issues with the design compared to how you're approaching 
virtio-scsi?

>   Besides being an ugly driver and ABI, we
> really should not support this kind of closed protocol development.
>   

I don't see how a VMM that doesn't share the source code for its
backends or doesn't implement standard ABIs is any different than the 
hundreds of hardware vendors that behave exactly the same way.

We haven't even been successful in getting the Xen folks to present 
their work on lkml before shipping it to their users.  Why would we 
expect more from VMware if we're willing to merge the Xen stuff?

Regards,

Anthony Liguori


* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]         ` <4AA83B88.2020104@codemonkey.ws>
@ 2009-09-09 23:54           ` Jeremy Fitzhardinge
       [not found]           ` <4AA84020.6040504@goop.org>
  1 sibling, 0 replies; 10+ messages in thread
From: Jeremy Fitzhardinge @ 2009-09-09 23:54 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: linux-scsi@vger.kernel.org, pv-drivers@vmware.com, LKML,
	virtualization, Christoph Hellwig, Chetan.Loke@Emulex.Com,
	Rolf Eike Beer, James Bottomley, Andrew Morton, akataria

On 09/09/09 16:34, Anthony Liguori wrote:
> We haven't even been successful in getting the Xen folks to present 
> their work on lkml before shipping it to their users.  Why would we 
> expect more from VMware if we're willing to merge the Xen stuff?
>   

The Xen code may be out of tree, but it has always been readily
available from a well-known place.  I don't think it's comparable.

    J


* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]           ` <4AA84020.6040504@goop.org>
@ 2009-09-10  0:11             ` Anthony Liguori
  0 siblings, 0 replies; 10+ messages in thread
From: Anthony Liguori @ 2009-09-10  0:11 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: linux-scsi@vger.kernel.org, pv-drivers@vmware.com, LKML,
	virtualization, Christoph Hellwig, Chetan.Loke@Emulex.Com,
	Rolf Eike Beer, James Bottomley, Andrew Morton, akataria

Jeremy Fitzhardinge wrote:
> On 09/09/09 16:34, Anthony Liguori wrote:
>   
>> We haven't even been successful in getting the Xen folks to present 
>> their work on lkml before shipping it to their users.  Why would we 
>> expect more from VMware if we're willing to merge the Xen stuff?
>>   
>>     
>
> The Xen code may be out of tree, but it has always been readily
> available from a well-known place.  I don't think it's comparable.
>   

Once an ABI is set in stone, there's very little that can be done on 
lkml to review the ABI in any serious way.

VMware has repeatedly done this in the past.  Ship a product with their 
own drivers, then submit something to lkml saying that the ABI cannot be 
changed because it's present in a shipping product.

I'll admit, Xen hasn't been as bad as VMware in this regard, but it's 
certainly not perfect either.

Regards,

Anthony Liguori

>     J
>   


* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]     ` <4AA8284A.3010908@codemonkey.ws>
  2009-09-09 23:08       ` Christoph Hellwig
       [not found]       ` <20090909230854.GA4495@infradead.org>
@ 2009-09-10 23:43       ` Alok Kataria
       [not found]       ` <1252626223.3899.64.camel@ank32.eng.vmware.com>
  3 siblings, 0 replies; 10+ messages in thread
From: Alok Kataria @ 2009-09-10 23:43 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: linux-scsi@vger.kernel.org, pv-drivers@vmware.com, LKML,
	virtualization, Chetan.Loke@Emulex.Com, James Bottomley,
	Rolf Eike Beer, Andrew Morton

Hi Anthony,

On Wed, 2009-09-09 at 15:12 -0700, Anthony Liguori wrote:
> Alok Kataria wrote:
> > I see your point, but the ring logic or the ABI that we use to
> > communicate between the hypervisor and guest is not shared between our
> > storage and network drivers. As a result, I don't see any benefit of
> > separating out this ring handling mechanism, on the contrary it might
> > just add some overhead of translating between various layers for our
> > SCSI driver.
> >   
> 
> But if you separate out the ring logic, it allows the scsi logic to be 
> shared by other paravirtual device drivers.  This is significant and 
> important from a Linux point of view.
> 
> There is almost nothing vmware specific about the vast majority of your 
> code.  If you split out the scsi logic, then you will receive help 
> debugging, adding future features, and improving performance from other 
> folks interested in this.  In the long term, it will make your life 
> much, much easier by making the driver relevant to a wider audience :-)

From what you are saying, it seems that any physical SCSI HBA's driver
could be converted to use the virtio interface; doesn't each and every
driver have something like a ring/queue and I/O-register mechanism to
talk to its device?

Also, why would you add the overhead of translation layers between APIs
or data structures just for the sake of it?  I guess you would say it
helps code reusability, but I fail to see how much of a benefit that
is.  The vast majority of the 1500-odd lines of the driver are still
very specific to, and tied to, our PCI device and register interface.

I would like to reiterate that this driver should be treated no
differently than a hardware SCSI HBA driver, albeit for a very simple
HBA.  We export a PCI device just as any other physical HBA does, and
the driver talks to the device (emulation) through device I/O
registers, without any hypercalls.

As for the virt_scsi layer, we can evaluate it whenever it is ready
for upstream and then take a more informed decision about whether we
should switch to using it.

> 
> > Having said that, I would like to add that, yes, if in some future
> > iteration of our paravirtualized drivers we decide to share this
> > ring mechanism across our various device drivers, it might be an
> > interesting proposition.
> >   
> 
> I am certainly not the block subsystem maintainer, but I'd hate to see 
> this merged without any attempt at making the code more reusable.  If 
> adding the virtio layering is going to somehow hurt performance, break 
> your ABI, or in some way limit you, then that's something to certainly 
> discuss and would be valid concerns.  That said, I don't think it's a 
> huge change to your current patch and I don't see any obvious problems 
> it would cause.
> 

I would also like to add that this is just a driver, isolated from the
rest of the core kernel. The driver is not doing anything improper
either; it uses the SCSI stack the way any other SCSI driver would.
In earlier cases, when there were changes to core kernel parts (e.g.
VMI, hypervisor_init, the TSC frequency part), VMware did work with the
community to come up with generic interfaces.

In this case, though, I don't think the advantages of using the virtio
interfaces are justified as yet. As and when the virt-scsi layer is
implemented, we can re-evaluate our design and use that layer instead.

Holding inclusion of the pvscsi driver until development of the
virt-scsi interface is completed doesn't sound right to me.

Thanks,
Alok

> Regards,
> 
> Anthony Liguori

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]       ` <20090909230854.GA4495@infradead.org>
  2009-09-09 23:34         ` Anthony Liguori
       [not found]         ` <4AA83B88.2020104@codemonkey.ws>
@ 2009-09-10 23:43         ` Alok Kataria
  2 siblings, 0 replies; 10+ messages in thread
From: Alok Kataria @ 2009-09-10 23:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-scsi@vger.kernel.org, pv-drivers@vmware.com, LKML,
	virtualization, Chetan.Loke@Emulex.Com, James Bottomley,
	Anthony Liguori, Rolf Eike Beer, Andrew Morton

Hi Christoph, 

On Wed, 2009-09-09 at 16:08 -0700, Christoph Hellwig wrote:
> On Wed, Sep 09, 2009 at 05:12:26PM -0500, Anthony Liguori wrote:
> > Alok Kataria wrote:
> >> I see your point, but the ring logic or the ABI that we use to
> >> communicate between the hypervisor and guest is not shared between our
> >> storage and network drivers. As a result, I don't see any benefit in
> >> separating out this ring-handling mechanism; on the contrary, it might
> >> just add the overhead of translating between various layers for our
> >> SCSI driver.
> >>   
> >
> > But if you separate out the ring logic, it allows the scsi logic to be  
> > shared by other paravirtual device drivers.  This is significant and  
> > important from a Linux point of view.
> 
> As someone who has been hacking on a virtio scsi prototype I don't think
> it's a good idea.

Can I take a look at your work on virtio scsi?

>   The vmware driver is a horrible design and I don't
> think it should be merged. 

Can you be more specific? What problems do you see in the driver or the
ABI?

>  Besides being an ugly driver and ABI, we
> really should not support this kind of closed protocol development.
> 

Alok

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Pv-drivers] [PATCH] SCSI driver for VMware's virtual HBA - V4.
       [not found]       ` <1252626223.3899.64.camel@ank32.eng.vmware.com>
@ 2009-09-14  3:05         ` Alok Kataria
  0 siblings, 0 replies; 10+ messages in thread
From: Alok Kataria @ 2009-09-14  3:05 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi@vger.kernel.org, pv-drivers@vmware.com, LKML,
	virtualization, Chetan.Loke@Emulex.Com, Anthony Liguori,
	Rolf Eike Beer, Andrew Morton

On Thu, 2009-09-10 at 16:43 -0700, Alok Kataria wrote:
> Hi Anthony,
> 
> On Wed, 2009-09-09 at 15:12 -0700, Anthony Liguori wrote:
> > Alok Kataria wrote:
> > > I see your point, but the ring logic or the ABI that we use to
> > > communicate between the hypervisor and guest is not shared between our
> > > storage and network drivers. As a result, I don't see any benefit in
> > > separating out this ring-handling mechanism; on the contrary, it might
> > > just add the overhead of translating between various layers for our
> > > SCSI driver.
> > >   
> > 
> > But if you separate out the ring logic, it allows the scsi logic to be 
> > shared by other paravirtual device drivers.  This is significant and 
> > important from a Linux point of view.
> > 
> > There is almost nothing vmware specific about the vast majority of your 
> > code.  If you split out the scsi logic, then you will receive help 
> > debugging, adding future features, and improving performance from other 
> > folks interested in this.  In the long term, it will make your life 
> > much, much easier by making the driver relevant to a wider audience :-)
> 
> From what you are saying, it seems that, for that matter, any physical
> SCSI HBA's driver could be converted to use the virtio interface;
> doesn't each and every driver have something like a ring/queue and I/O
> register mechanism to talk to the device?
> 
> Also, why would you add overhead for translation layers between APIs or
> data structures just for the sake of it? I guess you would say it helps
> code reusability, but I fail to see how much of a benefit that is. The
> vast majority of the 1500-odd lines of the driver are still very
> specific to, and tied to, our PCI device and register interface.
> 
> I would just like to reiterate this once again: this driver should be
> treated no differently than a hardware SCSI HBA driver, albeit a very
> simple HBA. We export a PCI device like any other physical HBA, and the
> driver talks to the device (emulation) through device I/O registers
> without any hypercalls.
> 
> As for the virt_scsi layer, we can evaluate it whenever it is ready for
> upstream and then take a more informed decision about whether we should
> switch to using it.
> 
> > 
> > > Having said that, I would like to add that, yes, if in some future
> > > iteration of our paravirtualized drivers we decide to share this
> > > ring mechanism across our various device drivers, it might be an
> > > interesting proposition.
> > >   
> > 
> > I am certainly not the block subsystem maintainer, but I'd hate to see 
> > this merged without any attempt at making the code more reusable.  If 
> > adding the virtio layering is going to somehow hurt performance, break 
> > your ABI, or in some way limit you, then that's something to certainly 
> > discuss and would be valid concerns.  That said, I don't think it's a 
> > huge change to your current patch and I don't see any obvious problems 
> > it would cause.
> > 
> 
> I would also like to add that this is just a driver, isolated from the
> rest of the core kernel. The driver is not doing anything improper
> either; it uses the SCSI stack the way any other SCSI driver would.
> In earlier cases, when there were changes to core kernel parts (e.g.
> VMI, hypervisor_init, the TSC frequency part), VMware did work with the
> community to come up with generic interfaces.
> 
> In this case, though, I don't think the advantages of using the virtio
> interfaces are justified as yet. As and when the virt-scsi layer is
> implemented, we can re-evaluate our design and use that layer instead.
> 
> Holding inclusion of the pvscsi driver until development of the
> virt-scsi interface is completed doesn't sound right to me.
> 

Hi James, 

Please let us know your views on how we should proceed on this. In my
previous mail I tried to state my reservations against going the
virt-scsi way just yet. Also, please note that we emulate a pure PCI
device, hence I don't see a reason why our driver should be treated any
differently than a driver for a physical HBA.

Thanks,
Alok

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2009-09-14  3:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1252458903.24914.73.camel@ank32.eng.vmware.com>
2009-09-09 21:00 ` [PATCH] SCSI driver for VMware's virtual HBA - V4 Anthony Liguori
2009-09-09 21:51   ` Alok Kataria
     [not found]   ` <1252533093.24633.40.camel@ank32.eng.vmware.com>
2009-09-09 22:12     ` Anthony Liguori
     [not found]     ` <4AA8284A.3010908@codemonkey.ws>
2009-09-09 23:08       ` Christoph Hellwig
     [not found]       ` <20090909230854.GA4495@infradead.org>
2009-09-09 23:34         ` Anthony Liguori
     [not found]         ` <4AA83B88.2020104@codemonkey.ws>
2009-09-09 23:54           ` Jeremy Fitzhardinge
     [not found]           ` <4AA84020.6040504@goop.org>
2009-09-10  0:11             ` Anthony Liguori
2009-09-10 23:43         ` Alok Kataria
2009-09-10 23:43       ` Alok Kataria
     [not found]       ` <1252626223.3899.64.camel@ank32.eng.vmware.com>
2009-09-14  3:05         ` [Pv-drivers] " Alok Kataria

This is a public inbox; see mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for NNTP newsgroup(s).