Message-ID: <5166CF56.2060105@linux.vnet.ibm.com>
Date: Thu, 11 Apr 2013 10:57:26 -0400
From: "Michael R. Hines"
References: <1365632901-15470-1-git-send-email-mrhines@linux.vnet.ibm.com> <1365632901-15470-11-git-send-email-mrhines@linux.vnet.ibm.com> <20130411073843.GB19601@redhat.com> <51667FEE.903@redhat.com> <5166B9A9.9070904@linux.vnet.ibm.com> <5166C59A.4010904@redhat.com>
In-Reply-To: <5166C59A.4010904@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v1: 10/13] introduce new command migrate_check_for_zero
To: Paolo Bonzini
Cc: aliguori@us.ibm.com, "Michael S. Tsirkin", qemu-devel@nongnu.org, owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com

We already have hardware with front side bus speeds of 13 GB/s. We also
already have 5 GB/s RDMA hardware, and we will likely have even faster
RDMA hardware in the future.

This analysis does not take into account the cycles it takes to map the
pages before they are checked for duplicate bytes, regardless of whether
very little of the page is actually cached on the processor.

This analysis also does not take into account the possibility that the
VM may be CPU-bound at the same time that QEMU is competing to execute
is_dup_page().

Thus, as you mentioned, a worst-case 5 GB/s memory bandwidth for
is_dup_page() could very easily be reached given the right conditions -
and we do have many workloads, both HPC and multi-tier, which can easily
cause QEMU's zero scanning performance to suffer.

- Michael

On 04/11/2013 10:15 AM, Paolo Bonzini wrote:
> No, I'm saying that is_dup_page() should not be a problem. I'm saying
> it should only loop a lot during the bulk phase. The only effect I can
> imagine after the bulk phase is one cache miss.
>
> Perhaps the stress-test you're using does not reproduce realistic
> conditions with respect to zero pages.
> Peter Lieven benchmarked real
> guests, both Linux and Windows, and confirmed the theory that I
> mentioned upthread. Almost all non-zero pages are detected within the
> first few words, and almost all zero pages come from the bulk phase.
>
> Considering that one cache miss, RDMA is indeed different here. TCP
> would have this cache miss later anyway, RDMA does not. Let's say 300
> cycles/miss; at 2.5 GHz that is 300/2500 microseconds, i.e. 0.12
> microseconds per page. This would say that we can run is_dup_page on 30
> GB worth of nonzero pages every second or more. Ok, the estimate is
> quite generous in many ways, but is_dup_page() is only a bottleneck if
> it can do less than 5 GB/s.
>
> Paolo
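
For reference, the back-of-the-envelope estimate quoted above can be
reproduced with a few lines of Python. This is only a sketch of the
arithmetic: the 300 cycles/miss, 2.5 GHz and 5 GB/s figures come from the
thread, while the 4 KiB page size and all variable names are assumptions
made for illustration.

```python
# Sketch of the quoted estimate: one cache miss per non-zero page.
CYCLES_PER_MISS = 300        # from the thread
CLOCK_HZ = 2.5e9             # 2.5 GHz, from the thread
PAGE_SIZE_BYTES = 4096       # ASSUMPTION: 4 KiB pages, not stated in the thread
RDMA_LINK_BYTES_PER_S = 5e9  # 5 GB/s RDMA hardware, from the thread

# Time spent on a page if the only cost is a single cache miss.
seconds_per_page = CYCLES_PER_MISS / CLOCK_HZ

# Bytes of guest memory that can be checked per second at that cost.
scan_bytes_per_s = PAGE_SIZE_BYTES / seconds_per_page

# How far the estimated scan rate sits above the RDMA link rate.
headroom = scan_bytes_per_s / RDMA_LINK_BYTES_PER_S

print(f"{seconds_per_page * 1e6:.2f} us per page")
print(f"{scan_bytes_per_s / 1e9:.1f} GB/s scan rate")
print(f"{headroom:.1f}x the 5 GB/s link rate")
```

Under these assumptions the scan rate lands in the low 30s of GB/s,
which is consistent with the "30 GB ... every second or more" figure. The
model counts only the single cache miss; the page-mapping cost and CPU
contention raised in the reply are not included, so the real margin could
be smaller.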