From: Stefan Hajnoczi
Date: Wed, 30 Nov 2016 09:01:40 +0000
Subject: Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
Message-ID: <20161130090140.GB6816@stefanha-x1.localdomain>
In-Reply-To: <20161128141756.GC6411@stefanha-x1.localdomain>
To: Ketan Nilangekar
Cc: Paolo Bonzini, ashish mittal, "Daniel P. Berrange", Jeff Cody,
 qemu-devel, Kevin Wolf, Markus Armbruster, Fam Zheng, Ashish Mittal,
 Abhijit Dey, Buddhi Madhav, "Venkatesha M.G.", Nitin Jerath,
 Gaurav Bhandarkar, Abhishek Kane, Ketan Mahajan, Niranjan Pendharkar,
 Nirendra Awasthi

On Mon, Nov 28, 2016 at 02:17:56PM +0000, Stefan Hajnoczi wrote:
> Please take a look at vhost-user-scsi, which folks from Nutanix are
> currently working on. See "[PATCH v2 0/3] Introduce vhost-user-scsi and
> sample application" on qemu-devel. It is a true zero-copy local I/O tap
> because it shares guest RAM. This is more efficient than cross memory
> attach's single memory copy. It does not require running the server as
> root. This is the #1 thing you should evaluate for your final
> architecture.
>
> vhost-user-scsi works on the virtio-scsi emulation level. That means
> the server must implement the virtio-scsi vring and device emulation.
> It is not a block driver. By hooking in at this level you can achieve
> the best performance but you lose all QEMU block layer functionality and
> need to implement your own SCSI target. You also need to consider live
> migration.

To clarify why I think vhost-user-scsi is best suited to your
requirements for performance:

With vhost-user-scsi the qnio server would be notified by kvm.ko via
eventfd when the VM submits new I/O requests to the virtio-scsi HBA.
The QEMU process is completely bypassed for I/O request submission and
the qnio server processes the SCSI command instead. This avoids the
context switch to QEMU and then to the qnio server.
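To make the notification path concrete, here is a minimal sketch of how
a vhost-user backend typically blocks on the vring "kick" eventfd. This
is not code from the vhost-user-scsi patches or from qnio, and
serve_queue()/process_vring() are hypothetical names. QEMU hands the
kick eventfd to the backend with the VHOST_USER_SET_VRING_KICK message,
and kvm.ko signals it via ioeventfd when the guest notifies the
virtqueue, so no QEMU thread runs on the submission path:

  /*
   * Sketch only: vhost-user backend waiting for guest kicks on one
   * virtqueue. The kickfd is the eventfd received over the vhost-user
   * socket; guest RAM has already been mmapped from the memory region
   * table QEMU shared, so descriptors can be read in place.
   */
  #include <poll.h>
  #include <stdint.h>
  #include <unistd.h>

  /* Placeholder: pop available descriptors from the virtio-scsi request
   * queue and handle the SCSI commands they describe. */
  extern void process_vring(void);

  static void serve_queue(int kickfd)
  {
      struct pollfd pfd = { .fd = kickfd, .events = POLLIN };

      for (;;) {
          if (poll(&pfd, 1, -1) < 0) {
              continue;               /* retry on EINTR */
          }
          uint64_t count;
          if (read(kickfd, &count, sizeof(count)) == sizeof(count)) {
              process_vring();        /* zero-copy access to guest buffers */
          }
      }
  }

In a real backend the same loop would also service the virtio-scsi
control and event queues, but the request path above is the part that
never enters QEMU.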
With cross memory attach QEMU first needs to process the I/O request
and hand it to libqnio before the qnio server can be scheduled.

The vhost-user-scsi qnio server has shared memory access to guest RAM
and is therefore able to do zero-copy I/O into guest buffers. Cross
memory attach always incurs a memory copy.

Using this high-performance architecture requires significant changes
though. vhost-user-scsi hooks into the stack at a different layer, so a
QEMU block driver is not used at all. QEMU also wouldn't use libqnio.
Instead everything would live in your qnio server process (not part of
QEMU).

You'd have to rethink the resiliency strategy because you currently
rely on the QEMU block driver connecting to a different qnio server if
the local qnio server fails. In the vhost-user-scsi world it's more
like having a physical SCSI adapter - redundancy and multipathing are
used to achieve resiliency.

For example, virtio-scsi HBA #1 would connect to the local qnio server
process. virtio-scsi HBA #2 would connect to another local process
called the "proxy process", which forwards requests to a remote qnio
server (using libqnio?). If HBA #1 fails then I/O is sent to HBA #2
instead. The path can fail back to HBA #1 once it becomes operational
again.

If the qnio server is supposed to run in a VM instead of directly in
the host environment then it's worth looking at the vhost-pci work that
Wei Wang is working on. The email thread is called "[PATCH v2 0/4] ***
vhost-user spec extension for vhost-pci ***". The idea here is to allow
inter-VM virtio device emulation so that instead of terminating the
virtio-scsi device in the qnio server process on the host, you can
terminate it inside another VM with good performance characteristics.

Stefan
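For reference on the copy cost discussed above, here is a minimal
sketch of the single per-request copy that a cross-memory-attach server
pays, using process_vm_readv(2). This is not libqnio code;
copy_request_payload(), qemu_pid, guest_iov and guest_iovcnt are
hypothetical names describing one request's scatter-gather list inside
the QEMU process:

  /*
   * Sketch only: pull a guest I/O payload out of the QEMU process with
   * cross memory attach. Every request moves its payload once into a
   * server-side buffer, whereas a vhost-user backend reads the guest
   * buffers in place.
   */
  #define _GNU_SOURCE
  #include <sys/types.h>
  #include <sys/uio.h>
  #include <stdlib.h>

  ssize_t copy_request_payload(pid_t qemu_pid,
                               const struct iovec *guest_iov,
                               unsigned long guest_iovcnt,
                               size_t total_len, void **out_buf)
  {
      void *buf = malloc(total_len);
      if (!buf) {
          return -1;
      }
      struct iovec local = { .iov_base = buf, .iov_len = total_len };

      /* The unavoidable copy: guest pages -> server buffer. */
      ssize_t n = process_vm_readv(qemu_pid, &local, 1,
                                   guest_iov, guest_iovcnt, 0);
      if (n < 0) {
          free(buf);
          return -1;
      }
      *out_buf = buf;
      return n;
  }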