From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)
Date: Fri, 03 Dec 2010 12:55:53 -0600
Message-ID: <4CF93D39.9010100@codemonkey.ws>
References: <1291298357-5695-1-git-send-email-aliguori@us.ibm.com>
 <20101202191416.GQ10050@sequoia.sous-sol.org>
 <20101203115752.GD27994@linux.vnet.ibm.com>
 <20101203162731.GA11725@linux.vnet.ibm.com>
 <20101203172906.GD10050@sequoia.sous-sol.org>
 <20101203175744.GE13515@linux.vnet.ibm.com>
 <20101203175854.GF10050@sequoia.sous-sol.org>
 <4CF931D3.6000204@codemonkey.ws>
 <20101203182015.GG10050@sequoia.sous-sol.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Srivatsa Vaddagiri, kvm@vger.kernel.org, Avi Kivity, Marcelo Tosatti
To: Chris Wright
Return-path:
Received: from mail-qy0-f181.google.com ([209.85.216.181]:58508 "EHLO
 mail-qy0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753384Ab0LCSz4 (ORCPT ); Fri, 3 Dec 2010 13:55:56 -0500
Received: by qyk12 with SMTP id 12so12015867qyk.19 for ;
 Fri, 03 Dec 2010 10:55:55 -0800 (PST)
In-Reply-To: <20101203182015.GG10050@sequoia.sous-sol.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 12/03/2010 12:20 PM, Chris Wright wrote:
> * Anthony Liguori (anthony@codemonkey.ws) wrote:
>
>> On 12/03/2010 11:58 AM, Chris Wright wrote:
>>
>>> * Srivatsa Vaddagiri (vatsa@linux.vnet.ibm.com) wrote:
>>>
>>>> On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote:
>>>>
>>>>> That's what Marcelo's suggestion does w/out a fill thread.
>>>>>
>>>> There's one complication though even with that. How do we compute the
>>>> real utilization of VM (given that it will appear to be burning 100%
>>>> cycles)? We need to have scheduler discount the cycles burnt post
>>>> halt-exit, so more stuff is needed than those simple 3-4 lines!
>>>>
>>> Heh, was just about to say the same thing ;)
>>>
>> My first reaction is that it's not terribly important to account the
>> non-idle time in the guest because of the use-case for this model.
>>
> Depends on the chargeback model. This would put guest vcpu runtime vs
> host running guest vcpu time really out of skew. ('course w/out steal
> and that time it's already out of skew). But I think most models are
> more uptime based rather than actual runtime now.
>

Right. I'm not familiar with any models that actually charge based on
CPU consumption. In general, the feedback I've received is that
predictable accounting is pretty critical, so I don't anticipate
something as volatile as CPU consumption ever being something that's
explicitly charged for in a granular fashion.

>> Eventually, it might be nice to have idle time accounting but I
>> don't see it as a critical feature here.
>>
>> Non-idle time simply isn't as meaningful here as it normally would
>> be. If you have 10 VMs in a normal environment and saw that you had
>> only 50% CPU utilization, you might be inclined to add more VMs.
>>
> Who is "you"? cloud user, or cloud service provider's scheduler?
> On the user side, 50% cpu utilization wouldn't trigger me to add new
> VMs. On the host side, 50% cpu utilization would have to be measured
> solely in terms of guest vcpu count.
>
>
>> But if you're offering deterministic execution, it doesn't matter if
>> you only have "50%" utilization. If you add another VM, the guests
>> will get exactly the same impact as if they were using 100%
>> utilization.
>>
> Sorry, didn't follow here?
>

The question is: why would anything care about host CPU utilization?
The only answer I can think of is that something wants to measure host
CPU utilization in order to identify an underutilized node. Once the
underutilized node is identified, more work can be given to it, and
adding work to a genuinely underutilized node doesn't affect the work
already running on it.
More concretely: one PCPU, four independent VCPUs, consuming 25%, 25%,
25%, and 12% respectively. My management software says, ah hah, I can
stick a fifth VCPU on this box that's only using 5%. The other VCPUs
are unaffected.

However, in a no-yield-on-hlt model, if I have four VCPUs, they each
get 25%, 25%, 25%, 25% on the host. Three of the VCPUs are running
100% in the guest and one is running about 50%. If I add a fifth VCPU,
even if it's only using 5%, each VCPU drops to 20%. That means the
three VCPUs that are consuming 100% now see a 20% drop in their
performance even though you've only added an idle guest.

Basically, the traditional view of density simply doesn't apply in
this model.

Regards,

Anthony Liguori

> thanks,
> -chris
>
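P.S. A rough sketch of the share arithmetic above (hypothetical Python,
not KVM code; it just models a fair scheduler splitting one PCPU, and
assumes total demand stays at or below 100%):

```python
def shares_with_hlt_trap(demands):
    # With HLT trapping, an idle vcpu yields the PCPU, so each vcpu's
    # host share tracks its actual demand (assuming total demand <= 100).
    assert sum(demands) <= 100
    return list(demands)

def shares_without_hlt_trap(demands):
    # Without HLT trapping, every vcpu spins instead of halting, so it
    # is always runnable and a fair scheduler gives each of the N vcpus
    # an equal 1/N slice of the PCPU, regardless of real demand.
    n = len(demands)
    return [100.0 / n] * n

# Four vcpus demanding 25/25/25/12 on one PCPU:
print(shares_without_hlt_trap([25, 25, 25, 12]))     # each gets 25.0
# Add a fifth vcpu that only wants 5%; everyone drops to 20%:
print(shares_without_hlt_trap([25, 25, 25, 12, 5]))  # each gets 20.0
```

Under HLT trapping the fifth vcpu only takes its 5%, which is why the
placement decision is safe in that model and not in this one.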