From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v2)
Date: Fri, 03 Dec 2010 12:55:53 -0600
Message-ID: <4CF93D39.9010100@codemonkey.ws>
References: <1291298357-5695-1-git-send-email-aliguori@us.ibm.com>
 <20101202191416.GQ10050@sequoia.sous-sol.org>
 <20101203115752.GD27994@linux.vnet.ibm.com>
 <20101203162731.GA11725@linux.vnet.ibm.com>
 <20101203172906.GD10050@sequoia.sous-sol.org>
 <20101203175744.GE13515@linux.vnet.ibm.com>
 <20101203175854.GF10050@sequoia.sous-sol.org>
 <4CF931D3.6000204@codemonkey.ws>
 <20101203182015.GG10050@sequoia.sous-sol.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Srivatsa Vaddagiri, kvm@vger.kernel.org, Avi Kivity, Marcelo Tosatti
To: Chris Wright
Return-path:
Received: from mail-qy0-f181.google.com ([209.85.216.181]:58508 "EHLO
 mail-qy0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753384Ab0LCSz4 (ORCPT ); Fri, 3 Dec 2010 13:55:56 -0500
Received: by qyk12 with SMTP id 12so12015867qyk.19 for ;
 Fri, 03 Dec 2010 10:55:55 -0800 (PST)
In-Reply-To: <20101203182015.GG10050@sequoia.sous-sol.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 12/03/2010 12:20 PM, Chris Wright wrote:
> * Anthony Liguori (anthony@codemonkey.ws) wrote:
>
>> On 12/03/2010 11:58 AM, Chris Wright wrote:
>>
>>> * Srivatsa Vaddagiri (vatsa@linux.vnet.ibm.com) wrote:
>>>
>>>> On Fri, Dec 03, 2010 at 09:29:06AM -0800, Chris Wright wrote:
>>>>
>>>>> That's what Marcelo's suggestion does w/out a fill thread.
>>>>>
>>>> There's one complication though even with that. How do we compute the
>>>> real utilization of VM (given that it will appear to be burning 100%
>>>> cycles)? We need to have scheduler discount the cycles burnt post
>>>> halt-exit, so more stuff is needed than those simple 3-4 lines!
>>>>
>>> Heh, was just about to say the same thing ;)
>>>
>> My first reaction is that it's not terribly important to account the
>> non-idle time in the guest because of the use-case for this model.
>>
> Depends on the chargeback model. This would put guest vcpu runtime vs
> host running guest vcpu time really out of skew. ('course w/out steal
> and that time it's already out of skew). But I think most models are
> more uptime based rather than actual runtime now.
>

Right. I'm not familiar with any models that actually charge based on
CPU consumption. In general, the feedback I've received is that
predictable accounting is pretty critical, so I don't anticipate
something as volatile as CPU consumption ever being something that's
explicitly charged for in a granular fashion.

>> Eventually, it might be nice to have idle time accounting but I
>> don't see it as a critical feature here.
>>
>> Non-idle time simply isn't as meaningful here as it normally would
>> be. If you have 10 VMs in a normal environment and saw that you had
>> only 50% CPU utilization, you might be inclined to add more VMs.
>>
> Who is "you"? cloud user, or cloud service provider's scheduler?
> On the user side, 50% cpu utilization wouldn't trigger me to add new
> VMs. On the host side, 50% cpu utilization would have to be measured
> solely in terms of guest vcpu count.
>
>
>> But if you're offering deterministic execution, it doesn't matter if
>> you only have "50%" utilization. If you add another VM, the guests
>> will get exactly the same impact as if they were using 100%
>> utilization.
>>
> Sorry, didn't follow here?
>

The question is: why would anything care about host CPU utilization?
The only answer I can think of is that something wants to measure host
CPU utilization in order to identify an underutilized node. Once the
underutilized node is identified, more work can be given to it, and
adding work to a genuinely underutilized node doesn't affect the work
already running on it.
More concretely: one PCPU, four independent VCPUs, consuming 25%, 25%,
25%, and 12% respectively. My management software says, ah hah, I can
stick a fifth VCPU on this box that's only using 5%. The other VCPUs
are unaffected.

However, in a no-yield-on-hlt model, if I have four VCPUs, they each
get 25%, 25%, 25%, 25% on the host. Three of the VCPUs are running
100% in the guest and one is running about 50%. If I add a fifth VCPU,
even if it's only using 5%, each VCPU drops to 20%. That means the
three VCPUs that are consuming 100% now see a 20% drop in their
performance even though you've only added an idle guest.

Basically, the traditional view of density simply doesn't apply in
this model.

Regards,

Anthony Liguori

> thanks,
> -chris
>
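P.S. A rough sketch of the share arithmetic above (hypothetical Python,
not KVM code; it just models a fair scheduler splitting one PCPU, and
assumes total demand stays at or below 100%):

```python
def shares_with_hlt_trap(demands):
    # With HLT trapping, an idle vcpu yields the PCPU, so each vcpu's
    # host share tracks its actual demand (assuming total demand <= 100).
    assert sum(demands) <= 100
    return list(demands)

def shares_without_hlt_trap(demands):
    # Without HLT trapping, every vcpu spins instead of halting, so it
    # is always runnable and a fair scheduler gives each of the N vcpus
    # an equal 1/N slice of the PCPU, regardless of real demand.
    n = len(demands)
    return [100.0 / n] * n

# Four vcpus demanding 25/25/25/12 on one PCPU:
print(shares_without_hlt_trap([25, 25, 25, 12]))     # each gets 25.0
# Add a fifth vcpu that only wants 5%; everyone drops to 20%:
print(shares_without_hlt_trap([25, 25, 25, 12, 5]))  # each gets 20.0
```

Under HLT trapping the fifth vcpu only takes its 5%, which is why the
placement decision is safe in that model and not in this one.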