From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: IVSHMEM device performance
Date: Mon, 11 Apr 2016 15:27:34 +0300
Message-ID: <20160411151918-mutt-send-email-mst@redhat.com>
References: <87fuuso755.fsf@dusky.pond.sub.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Eli Britstein , qemu-devel@nongnu.org, "kvm@vger.kernel.org"
To: Markus Armbruster
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:51509 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750971AbcDKM1j
	(ORCPT ); Mon, 11 Apr 2016 08:27:39 -0400
Content-Disposition: inline
In-Reply-To: <87fuuso755.fsf@dusky.pond.sub.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Mon, Apr 11, 2016 at 10:56:54AM +0200, Markus Armbruster wrote:
> Cc: qemu-devel
> 
> Eli Britstein writes:
> 
> > Hi
> >
> > In a VM, I add an IVSHMEM device, on which the MBUFS mempool resides,
> > as well as rings I create (I run a DPDK application in the VM).
> > I saw a performance penalty when I use such a device instead of
> > hugepages (the VM's hugepages). My VM's memory is *NOT* backed by the
> > host's hugepages.
> > The memory behind the IVSHMEM device is a host hugepage (I use a
> > patched version of QEMU, as provided by Intel).
> > I thought the reason might be that this memory is seen by the VM as a
> > mapped PCI memory region, so it is not cached, but I am not sure.
> > So my direction was to change the kernel (in the VM) so that it treats
> > this memory as regular memory (and thus cached) instead of as a PCI
> > memory region.
> > However, I am not sure my direction is correct, and even if it is, I
> > am not sure how/where to change the kernel (my starting point was
> > mm/mmap.c, but I'm not sure that's the correct place to start).
> >
> > Any suggestion is welcomed.
> > Thanks,
> > Eli.

A cleaner way is just to use virtio, keeping everything in the VM's
memory, with access either by data copies in the hypervisor, or
directly using vhost-user.
For example, with vhost-pci: https://wiki.opnfv.org/vm2vm_mst

There has been recent work on this; see slides 12-14 in
http://schd.ws/hosted_files/ons2016/36/Nakajima_and_Ergin_PreSwitch_final.pdf

This is very much work in progress, but if you are interested you
should probably get in touch with Nakajima et al.

-- 
MST
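[For readers of the archive: a minimal sketch of the vhost-user setup
suggested above. The socket path, sizes, and image name are placeholder
values, not taken from this thread; the virtio rings and buffers then
live in ordinary cacheable guest RAM rather than an uncached PCI BAR.]

```shell
# Hypothetical vhost-user invocation (paths and sizes are placeholders).
# Guest RAM must be a shared, file-backed mapping (here on hugetlbfs) so
# the external backend, e.g. a DPDK vswitch, can map it and reach the
# virtio rings and packet buffers directly.
qemu-system-x86_64 -enable-kvm -m 1024 \
  -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
  -numa node,memdev=mem0 \
  -chardev socket,id=chr0,path=/tmp/vhost-user0.sock \
  -netdev vhost-user,id=net0,chardev=chr0 \
  -device virtio-net-pci,netdev=net0 \
  -drive file=guest.img,format=raw
```

The backend (the server side of /tmp/vhost-user0.sock) must already be
listening when the guest starts; in-VM DPDK then drives the device as a
plain virtio-net PCI device, with no IVSHMEM or guest kernel changes.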