From: john cooper <john.cooper@third-harmonic.com>
To: Avi Kivity <avi@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>, Gleb Natapov <gleb@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org, hpa@zytor.com,
	mingo@elte.hu, npiggin@suse.de, tglx@linutronix.de,
	mtosatti@redhat.com, john cooper <john.cooper@redhat.com>
Subject: Re: [PATCH] use unfair spinlock when running on hypervisor.
Date: Tue, 01 Jun 2010 13:54:07 -0400
Message-ID: <4C05493F.1040107@third-harmonic.com>
In-Reply-To: <4C053ACC.5020708@redhat.com>

Avi Kivity wrote:
> On 06/01/2010 07:38 PM, Andi Kleen wrote:
>>>> Your new code would starve again, right?
>>>>
>>> Yes, of course it may starve with an unfair spinlock.  Since vcpus
>>> are not always running, there is a much smaller chance that a vcpu
>>> on a remote memory node will starve forever.  Old kernels with
>>> unfair spinlocks are running fine in VMs on NUMA machines with
>>> various loads.
>>>
>> Try it on a NUMA system with unfair memory.
>>
> 
> We are running everything on NUMA (since all modern machines are now
> NUMA).  At what scale do the issues become observable?
> 
>>> I understand that reason and do not propose going back to the old
>>> spinlock on physical HW!  But with virtualization the performance
>>> hit is unbearable.
>>>
>> Extreme unfairness can be unbearable too.
>>
> 
> Well, the question is what happens first.  In our experience, vcpu
> overcommit is a lot more painful.  People will never see the NUMA
> unfairness issue if they can't use kvm due to the vcpu overcommit problem.

Gleb's observed performance hit seems to be a rather mild
throughput depression compared with the worst case created by
enforcing vcpu overcommit.  Running a single guest with 2:1
overcommit on a 4-core machine, I saw over an order of magnitude
slowdown vs. 1:1 commit with the same kernel build test.
Others have reported similar results.

How close you'll get to that scenario depends on host
scheduling dynamics and, statistically, on the number of open,
stalled lock-held paths waiting to be contended.  So I'd expect
to see quite variable numbers for guest-to-guest aggravation of
this problem.
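
For reference, a minimal userspace sketch of the two lock styles
in question (GCC atomic builtins; hypothetical code, not the
actual arch/x86 spinlock implementation).  With the ticket lock,
a preempted vcpu holding the next ticket stalls every later
waiter even while the lock itself is free; the test-and-set lock
lets any runnable vcpu proceed, at the cost of possible
starvation:

struct ticket_lock {
	unsigned int next;	/* next ticket to hand out */
	unsigned int owner;	/* ticket now permitted to take the lock */
};

static void ticket_lock(struct ticket_lock *l)
{
	unsigned int me = __sync_fetch_and_add(&l->next, 1);

	/* strict FIFO: spin until our ticket comes up, even if the
	 * vcpu ahead of us has been preempted and the lock is free */
	while (*(volatile unsigned int *)&l->owner != me)
		__asm__ __volatile__("pause");	/* cpu_relax() */
}

static void ticket_unlock(struct ticket_lock *l)
{
	__sync_fetch_and_add(&l->owner, 1);
}

/* unfair test-and-set lock: whichever runnable vcpu gets here
 * first wins, so a preempted waiter never blocks the others */
static void unfair_lock(int *l)
{
	while (__sync_lock_test_and_set(l, 1))
		__asm__ __volatile__("pause");
}

static void unfair_unlock(int *l)
{
	__sync_lock_release(l);
}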

> What I'd like to see eventually is a short-term-unfair, long-term-fair
> spinlock.  Might make sense for bare metal as well.  But it won't be
> easy to write.

Collecting contention/usage statistics on a per-spinlock basis
seems complex.  I believe a practical approximation is the
adaptive mutex: upon hitting a spin-time threshold, punt and let
the scheduler reconcile fairness.
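
Roughly, the idea looks like this (a sketch only; SPIN_THRESHOLD
and the userspace sched_yield() punt are stand-ins for whatever
threshold and scheduler hook an in-kernel version would use):

#include <sched.h>

#define SPIN_THRESHOLD	1024	/* assumed tuning knob */

static void adaptive_lock(int *l)
{
	unsigned int spins = 0;

	while (__sync_lock_test_and_set(l, 1)) {
		if (++spins < SPIN_THRESHOLD) {
			__asm__ __volatile__("pause");	/* brief busy-wait */
		} else {
			spins = 0;
			sched_yield();	/* punt to the scheduler */
		}
	}
}

static void adaptive_unlock(int *l)
{
	__sync_lock_release(l);
}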

-john

-- 
john.cooper@third-harmonic.com
