From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Byrne
Subject: Re: Direct I/O to domU seeing a 30% performance hit
Date: Tue, 07 Nov 2006 11:17:16 -0800
Message-ID: <4550DBBC.70807@hp.com>
References: <8A87A9A84C201449A0C56B728ACF491E01F76D@liverpoolst.ad.cl.cam.ac.uk> <4550B802.8030505@hp.com> <8A87A9A84C201449A0C56B728ACF491E01F779@liverpoolst.ad.cl.cam.ac.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <8A87A9A84C201449A0C56B728ACF491E01F779@liverpoolst.ad.cl.cam.ac.uk>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Pratt
Cc: xen-devel, Emmanuel Ackaouy
List-Id: xen-devel@lists.xenproject.org

Ian,

I had screwed up and left a process running in dom0 that was chewing up
CPU. I thought I had taken care of it. After fixing that, most of the
numbers for dom0, domU, and the base SLES kernel are within a couple of
tenths of a percent of each other. However, there are some fairly large
differences in some of the runs where the socket buffers are small.

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

262142 262142   4096    60.00     941.03   base
262142 262142   4096    60.00     939.95   dom0
262142 262142   4096    60.00     937.22   domU

 16384  16384  32768    60.00     379.68   base
 16384  16384  32768    60.00     350.15   dom0
 16384  16384  32768    60.00     367.89   domU

In the latter case, the divergence from the base performance is much
larger. I assume that when the socket buffers are small, the extra
interrupt overhead shows up more because more interrupts are required.
Overall, though, the numbers are now acceptable.

Thanks for your help. It allowed me to spot my goof. (Sorry about
wasting your time though.)

One last question: is there an easy way to break out the amount of CPU
time spent in the hypervisor?

Thanks,

John Byrne

Ian Pratt wrote:
>> Both dom0 and the domU are SLES 10, so I don't know why the "idle"
>> performance of the two should be different. The obvious asymmetry is the
>> disk. Since the disk isn't direct, any disk I/O by the domU would
>> certainly impact dom0, but I don't think there should be much, if any. I
>> did run a dom0 test with the domU started, but idle and there was no
>> real change to dom0's numbers.
>>
>> What's the best way to gather information about what is going on with
>> the domains without perturbing them? (Or, at least, perturbing everyone
>> equally.)
>>
>> As to the test, I am running netperf 2.4.1 on an outside machine to the
>> dom0 and the domU. (So the doms are running the netserver portion.) I
>> was originally running it in the doms to the outside machine, but when
>> the bad numbers showed up I moved it to the outside machine because I
>> wondered if the bad numbers were due to something happening to the
>> system time in domU. The numbers in the "outside" test to domU look worse.
>
> It might be worth checking that there's no interrupt sharing happening.
> While running the test against the domU, see how much CPU dom0 burns in
> the same period using 'xm vcpu-list'.
>
> To keep things simple, have dom0 and domU as uniprocessor guests.
>
> Ian
>
>> Ian Pratt wrote:
>>>> There have been a couple of network receive throughput
>>>> performance regressions to domUs over time that were
>>>> subsequently fixed. I think one may have crept in to 3.0.3.
>>> The report was (I believe) with a NIC directly assigned to the domU, so
>>> not using netfront/back at all.
>>>
>>> John: please can you give more details on your config.
>>>
>>> Ian
>>>
>>>> Are you seeing any dropped packets on the vif associated with
>>>> your domU in your dom0? If so, propagating changeset
>>>> 11861 from unstable may help:
>>>>
>>>> changeset: 11861:637eace6d5c6
>>>> user: kfraser@localhost.localdomain
>>>> date: Mon Oct 23 11:20:37 2006 +0100
>>>> summary: [NET] back: Fix packet queuing so that packets
>>>> are drained if the
>>>>
>>>> In the past, we also had receive throughput issues to domUs
>>>> that were due to socket buffer size logic but those were
>>>> fixed a while ago.
>>>>
>>>> Can you send netstat -i output from dom0?
>>>>
>>>> Emmanuel.
>>>>
>>>> On Mon, Nov 06, 2006 at 09:55:17PM -0800, John Byrne wrote:
>>>>> I was asked to test direct I/O to a PV domU. Since I had a system
>>>>> with two NICs, I gave one to a domU and one to dom0. (Each is
>>>>> running the same kernel: xen 3.0.3 x86_64.)
>>>>>
>>>>> I'm running netperf from an outside system to the domU and dom0
>>>>> and I am seeing 30% less throughput for the domU vs dom0.
>>>>>
>>>>> Is this to be expected? If so, why? If not, does anyone have a
>>>>> guess as to what I might be doing wrong or what the issue might be?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> John Byrne
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@lists.xensource.com
>>>> http://lists.xensource.com/xen-devel
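
The "divergence from the base performance" that John describes can be
checked directly from the netperf figures in his message. The following
is only a minimal sketch: the throughput values are copied from the
table, while the helper name percent_below and the run labels are
illustrative and do not come from the thread or from netperf.

```python
def percent_below(base, value):
    """Return how far `value` falls short of `base`, as a percentage."""
    return (base - value) / base * 100.0


# Throughput in 10^6 bits/sec, copied from the netperf table in the message.
# The dictionary keys are illustrative labels, not netperf output.
runs = {
    "262142-byte socket buffers, 4096-byte messages": {
        "base": 941.03, "dom0": 939.95, "domU": 937.22,
    },
    "16384-byte socket buffers, 32768-byte messages": {
        "base": 379.68, "dom0": 350.15, "domU": 367.89,
    },
}

for label, results in runs.items():
    base = results["base"]
    print(label)
    for name in ("dom0", "domU"):
        shortfall = percent_below(base, results[name])
        print(f"  {name}: {shortfall:.2f}% below base")
```

With the large socket buffers, dom0 and domU come out roughly 0.1% and
0.4% below the base kernel; with the small buffers the gaps grow to
roughly 7.8% and 3.1%, which matches the observation that the
small-buffer runs diverge much more.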