xen-devel.lists.xenproject.org archive mirror
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Paul Durrant <Paul.Durrant@citrix.com>
Cc: "jgross@suse.com" <jgross@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"jbeulich@suse.com" <jbeulich@suse.com>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands
Date: Mon, 29 Feb 2016 09:55:30 -0500
Message-ID: <20160229145530.GE16364@char.us.oracle.com>
In-Reply-To: <a68313d9e0964d188015d4946c483c83@AMSPEX02CL03.citrite.net>

On Mon, Feb 29, 2016 at 09:13:41AM +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: Bob Liu [mailto:bob.liu@oracle.com]
> > Sent: 29 February 2016 03:37
> > To: xen-devel@lists.xen.org
> > Cc: Ian Jackson; jbeulich@suse.com; Roger Pau Monne; jgross@suse.com;
> > Paul Durrant; konrad.wilk@oracle.com; Bob Liu
> > Subject: [RFC PATCH] xen-block: introduces extra request to pass-through
> > SCSI commands
> > 
> > 1) What is this patch about?
> > This patch introduces a new block operation (BLKIF_OP_EXTRA_FLAG).
> > A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> > extra request which is used to pass through SCSI commands.
> > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > It can be extended easily to transmit other per-request/bio data from
> > frontend to backend, e.g. Data Integrity Field per bio.
> > 
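
To make sure I read the mechanism right, here is a minimal sketch in Python of
how a backend might pair a flagged request with the extra request that follows
it. The flag value 0x80, the Request class and pair_requests() are made up for
illustration and are not taken from the patch; only the BLKIF_OP_READ and
BLKIF_OP_WRITE values match blkif.h.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BLKIF_OP_READ = 0
BLKIF_OP_WRITE = 1
# Hypothetical value: the actual bit used by the patch is not stated here.
BLKIF_OP_EXTRA_FLAG = 0x80


@dataclass
class Request:
    """Simplified stand-in for a ring request (not the real blkif layout)."""
    operation: int
    payload: bytes = b""


def pair_requests(ring: List[Request]) -> List[Tuple[Request, Optional[Request]]]:
    """Walk the ring in order; a request with EXTRA_FLAG set consumes the
    request immediately after it as its extra payload."""
    paired = []
    i = 0
    while i < len(ring):
        req = ring[i]
        extra = None
        if req.operation & BLKIF_OP_EXTRA_FLAG:
            if i + 1 >= len(ring):
                raise ValueError("EXTRA_FLAG set but no following request")
            extra = ring[i + 1]
            i += 1  # skip the extra request, it is not a normal one
        paired.append((req, extra))
        i += 1
    return paired


# SCSI INQUIRY CDB asking for VPD page 0x80 (EVPD bit set, 255-byte buffer).
inquiry_cdb = bytes([0x12, 0x01, 0x80, 0x00, 0xFF, 0x00])
ring = [
    Request(BLKIF_OP_READ),
    Request(BLKIF_OP_READ | BLKIF_OP_EXTRA_FLAG),
    Request(0, inquiry_cdb),
]
result = pair_requests(ring)
```

The point of the sketch is only that the backend has to treat the two ring
slots as one unit, which is also where the validation question under 3) comes
in.
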
> > 2) Why do we need this?
> > Currently only raw data segments are transmitted from blkfront to blkback,
> > which means some advanced features are lost.
> >  * Guest knows nothing about features of the real backend storage.
> > 	For example, in a bare-metal environment the INQUIRY SCSI command can
> > 	be used to query storage device information. If it's an SSD or flash
> > 	device we can have the option to use the device as a fast cache.
> > 	But this can't happen in current domU guests, because blkfront only
> > 	knows it's just a normal virtual disk.
> > 
> 
> That's the sort of information that should be advertised via xenstore then. There are already feature flags for specific things but if some form of throughput/latency information is meaningful to a frontend stack then perhaps that can be advertised too.

Certainly could be put on the XenStore. Do you envision this being done
pre guest creation (so toolstack does it), or the backend finds this
and populates the XenStore keys?

Or that the frontend writes a XenStore key 'scsi-inq=vpd80' and the backend
responds by populating a 'scsi-inq-vpd80=<binary blob>' key? If so, can
the XenStore accept binary payloads? Can it be more than 4K?
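
If the blob has to go through XenStore as text, one option would be to
hex-encode it. A rough sketch in Python, assuming the 4096-byte
XENSTORE_PAYLOAD_MAX from xs_wire.h applies to the whole write payload (path,
NUL, value). The key path, the helper name and the sample VPD bytes are
hypothetical:

```python
import binascii

# From xen/include/public/io/xs_wire.h; assumed here to bound path + NUL + value.
XENSTORE_PAYLOAD_MAX = 4096


def encode_for_xenstore(path: str, blob: bytes) -> str:
    """Hex-encode a binary blob so it can be stored as a printable value,
    and refuse it if the resulting write would not fit in one payload."""
    value = binascii.hexlify(blob).decode("ascii")
    needed = len(path.encode("ascii")) + 1 + len(value)
    if needed > XENSTORE_PAYLOAD_MAX:
        raise ValueError(
            f"write needs {needed} bytes, limit is {XENSTORE_PAYLOAD_MAX}"
        )
    return value


# Hypothetical key and a made-up Unit Serial Number page (VPD page 0x80).
path = "backend/vbd/1/51712/scsi-inq-vpd80"
vpd80 = bytes([0x00, 0x80, 0x00, 0x08]) + b"SN123456"
value = encode_for_xenstore(path, vpd80)

# The frontend would reverse the encoding after reading the key.
decoded = bytes.fromhex(value)
```

Hex encoding doubles the size, so under these assumptions a single key could
carry a bit under 2K of raw INQUIRY data. Anything larger would need to be
split across keys or moved by some other means.
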


> 
> >  * Failover Clusters in Windows
> > 	Failover clusters require SCSI-3 persistent reservation target disks,
> > 	but now this can't work in domU.
> > 
> 
> That's true but allowing arbitrary SCSI messages through is not the way forward IMO. Just because Windows thinks every HBA is SCSI doesn't mean other OSes do, so I think reservation/release should have dedicated messages in the blkif protocol if it's desirable to support clustering in the frontend.

Could you expand a bit on the 'dedicated message' you have in mind please?

> 
> > 3) Known issues:
> >  * Security issues: how to 'validate' this extra request payload.
> >    E.g. SCSI operates on a per-LUN basis (the whole disk) while we really
> >    just want to operate on partitions.
> > 
> >  * Can't pass SCSI commands through if the backend storage driver is
> >    bio-based instead of request-based.
> > 
> > 4) Alternative approach: Using PVSCSI instead:
> >  * Doubt PVSCSI can support as many types of backend storage devices as
> >    Xen-block.
> > 
> 
> LIO can interface to any block device in much the same way blkback does IIRC.

But it can't do multipath or LVMs - which is an integral component.

Anyhow, that is more of an implementation-specific quirk.
> 
> >  * Much longer path:
> >    ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend ->
> >    PVSCSI-backend -> Target framework(LIO?) ->
> > 
> >    With xen-block we only need:
> >    ioctl() -> blkfront -> blkback ->
> > 
> 
> ...and what happens if the block device that blkback is talking to is a SCSI LUN?
> 
> That latter path is also not true for Windows. You've got all the SCSI translation logic in the frontend when using blkif so that first path would collapse to:
> 
> Disk driver -> (SCSI) HBA Driver -> xen-scsiback -> LIO -> backstore -> XXX

I don't know whether the length of the path matters for, say, SCSI INQUIRY. It
isn't performance-critical. Neither are the clustering SCSI commands.

> 
> >  * xen-block has existed for many years, is widely used and more stable.
> > 
> 
> It's definitely widely used, but it has had stability issues in recent times.

Oh? Could you send the bug reports to me and Roger, CC xen-devel and LKML, please?

