From: Stefan Hajnoczi
Date: Wed, 30 Nov 2016 09:01:40 +0000
Subject: Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
Message-ID: <20161130090140.GB6816@stefanha-x1.localdomain>
In-Reply-To: <20161128141756.GC6411@stefanha-x1.localdomain>
To: Ketan Nilangekar
Cc: Paolo Bonzini, ashish mittal, "Daniel P. Berrange", Jeff Cody,
 qemu-devel, Kevin Wolf, Markus Armbruster, Fam Zheng, Ashish Mittal,
 Abhijit Dey, Buddhi Madhav, "Venkatesha M.G.", Nitin Jerath,
 Gaurav Bhandarkar, Abhishek Kane, Ketan Mahajan, Niranjan Pendharkar,
 Nirendra Awasthi

On Mon, Nov 28, 2016 at 02:17:56PM +0000, Stefan Hajnoczi wrote:
> Please take a look at vhost-user-scsi, which folks from Nutanix are
> currently working on. See "[PATCH v2 0/3] Introduce vhost-user-scsi and
> sample application" on qemu-devel. It is a true zero-copy local I/O tap
> because it shares guest RAM. This is more efficient than cross memory
> attach's single memory copy. It does not require running the server as
> root. This is the #1 thing you should evaluate for your final
> architecture.
>
> vhost-user-scsi works on the virtio-scsi emulation level. That means
> the server must implement the virtio-scsi vring and device emulation.
> It is not a block driver. By hooking in at this level you can achieve
> the best performance but you lose all QEMU block layer functionality and
> need to implement your own SCSI target. You also need to consider live
> migration.

To clarify why I think vhost-user-scsi is best suited to your
requirements for performance:

With vhost-user-scsi the qnio server would be notified by kvm.ko via
eventfd when the VM submits new I/O requests to the virtio-scsi HBA.
The QEMU process is completely bypassed for I/O request submission and
the qnio server processes the SCSI command instead. This avoids the
context switch to QEMU and then to the qnio server.
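To make the notification path concrete, here is a minimal sketch of how
a vhost-user backend typically blocks on the vring "kick" eventfd. This
is not code from the vhost-user-scsi patches or from qnio, and
serve_queue()/process_vring() are hypothetical names. QEMU hands the
kick eventfd to the backend with the VHOST_USER_SET_VRING_KICK message,
and kvm.ko signals it via ioeventfd when the guest notifies the
virtqueue, so no QEMU thread runs on the submission path:

  /*
   * Sketch only: vhost-user backend waiting for guest kicks on one
   * virtqueue. The kickfd is the eventfd received over the vhost-user
   * socket; guest RAM has already been mmapped from the memory region
   * table QEMU shared, so descriptors can be read in place.
   */
  #include <poll.h>
  #include <stdint.h>
  #include <unistd.h>

  /* Placeholder: pop available descriptors from the virtio-scsi request
   * queue and handle the SCSI commands they describe. */
  extern void process_vring(void);

  static void serve_queue(int kickfd)
  {
      struct pollfd pfd = { .fd = kickfd, .events = POLLIN };

      for (;;) {
          if (poll(&pfd, 1, -1) < 0) {
              continue;               /* retry on EINTR */
          }
          uint64_t count;
          if (read(kickfd, &count, sizeof(count)) == sizeof(count)) {
              process_vring();        /* zero-copy access to guest buffers */
          }
      }
  }

In a real backend the same loop would also service the virtio-scsi
control and event queues, but the request path above is the part that
never enters QEMU.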
With cross memory attach QEMU first needs to process the I/O request
and hand it to libqnio before the qnio server can be scheduled.

The vhost-user-scsi qnio server has shared memory access to guest RAM
and is therefore able to do zero-copy I/O into guest buffers. Cross
memory attach always incurs a memory copy.

Using this high-performance architecture requires significant changes
though. vhost-user-scsi hooks into the stack at a different layer, so a
QEMU block driver is not used at all. QEMU also wouldn't use libqnio.
Instead everything would live in your qnio server process (not part of
QEMU).

You'd have to rethink the resiliency strategy because you currently
rely on the QEMU block driver connecting to a different qnio server if
the local qnio server fails. In the vhost-user-scsi world it's more
like having a physical SCSI adapter - redundancy and multipathing are
used to achieve resiliency.

For example, virtio-scsi HBA #1 would connect to the local qnio server
process. virtio-scsi HBA #2 would connect to another local process
called the "proxy process", which forwards requests to a remote qnio
server (using libqnio?). If HBA #1 fails then I/O is sent to HBA #2
instead. The path can fail back to HBA #1 once it becomes operational
again.

If the qnio server is supposed to run in a VM instead of directly in
the host environment then it's worth looking at the vhost-pci work that
Wei Wang is working on. The email thread is called "[PATCH v2 0/4] ***
vhost-user spec extension for vhost-pci ***". The idea here is to allow
inter-VM virtio device emulation so that instead of terminating the
virtio-scsi device in the qnio server process on the host, you can
terminate it inside another VM with good performance characteristics.

Stefan
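For reference on the copy cost discussed above, here is a minimal
sketch of the single per-request copy that a cross-memory-attach server
pays, using process_vm_readv(2). This is not libqnio code;
copy_request_payload(), qemu_pid, guest_iov and guest_iovcnt are
hypothetical names describing one request's scatter-gather list inside
the QEMU process:

  /*
   * Sketch only: pull a guest I/O payload out of the QEMU process with
   * cross memory attach. Every request moves its payload once into a
   * server-side buffer, whereas a vhost-user backend reads the guest
   * buffers in place.
   */
  #define _GNU_SOURCE
  #include <sys/types.h>
  #include <sys/uio.h>
  #include <stdlib.h>

  ssize_t copy_request_payload(pid_t qemu_pid,
                               const struct iovec *guest_iov,
                               unsigned long guest_iovcnt,
                               size_t total_len, void **out_buf)
  {
      void *buf = malloc(total_len);
      if (!buf) {
          return -1;
      }
      struct iovec local = { .iov_base = buf, .iov_len = total_len };

      /* The unavoidable copy: guest pages -> server buffer. */
      ssize_t n = process_vm_readv(qemu_pid, &local, 1,
                                   guest_iov, guest_iovcnt, 0);
      if (n < 0) {
          free(buf);
          return -1;
      }
      *out_buf = buf;
      return n;
  }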