virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: Avi Kivity <avi@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: the arch/x86 maintainers <x86@kernel.org>,
	KVM <kvm@vger.kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andi Kleen <andi@firstfloor.org>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Attilio Rao <attilio.rao@citrix.com>, Ingo Molnar <mingo@elte.hu>,
	Virtualization <virtualization@lists.linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Xen Devel <xen-devel@lists.xensource.com>,
	Stephan Diestelhorst <stephan.diestelhorst@amd.com>
Subject: Re: [PATCH RFC V6 0/11] Paravirtualized ticketlocks
Date: Sun, 01 Apr 2012 16:31:44 +0300	[thread overview]
Message-ID: <4F7858C0.90405@redhat.com> (raw)
In-Reply-To: <alpine.LFD.2.02.1203302333560.2542@ionos>

On 03/31/2012 01:07 AM, Thomas Gleixner wrote:
> On Fri, 30 Mar 2012, H. Peter Anvin wrote:
>
> > What is the current status of this patchset?  I haven't looked at it too
> > closely because I have been focused on 3.4 up until now...
>
> The real question is whether these heuristics are the correct approach
> or not.
>
> If I look at it from the non virtualized kernel side then this is ass
> backwards. We know already that we are holding a spinlock which might
> cause other (v)cpus going into eternal spin. The non virtualized
> kernel solves this by disabling preemption and therefor getting out of
> the critical section as fast as possible,
>
> The virtualization problem reminds me a lot of the problem which RT
> kernels are observing where non raw spinlocks are turned into
> "sleeping spinlocks" and therefor can cause throughput issues for non
> RT workloads.
>
> Though the virtualized situation is even worse. Any preempted guest
> section which holds a spinlock is prone to cause unbound delays.
>
> The paravirt ticketlock solution can only mitigate the problem, but
> not solve it. With massive overcommit there is always a way to trigger
> worst case scenarious unless you are educating the scheduler to cope
> with that.
>
> So if we need to fiddle with the scheduler and frankly that's the only
> way to get a real gain (the numbers, which are achieved by this
> patches, are not that impressive) then the question arises whether we
> should turn the whole thing around.
>
> I know that Peter is going to go berserk on me, but if we are running
> a paravirt guest then it's simple to provide a mechanism which allows
> the host (aka hypervisor) to check that in the guest just by looking
> at some global state.
>
> So if a guest exits due to an external event it's easy to inspect the
> state of that guest and avoid to schedule away when it was interrupted
> in a spinlock held section. That guest/host shared state needs to be
> modified to indicate the guest to invoke an exit when the last nested
> lock has been released.

Interesting idea (I think it has been raised before btw, don't recall by
who).

One thing about it is that it can give many false positives.  Consider a
fine-grained spinlock that is being accessed by many threads.  That is,
the lock is taken and released with high frequency, but there is no
contention, because each vcpu is accessing a different instance.  So the
host scheduler will needlessly delay preemption of vcpus that happen to
be holding a lock, even though this gains nothing.

A second issue may happen with a lock that is taken and released with
high frequency, with a high hold percentage.  The host scheduler may
always sample the guest in a held state, leading it to conclude that
it's exceeding its timeout when in fact the lock is held for a short
time only.

> Of course this needs to be time bound, so a rogue guest cannot
> monopolize the cpu forever, but that's the least to worry about
> problem simply because a guest which does not get out of a spinlocked
> region within a certain amount of time is borked and elegible to
> killing anyway.

Hopefully not killing!  Just because a guest doesn't scale well, or even
if it's deadlocked, doesn't mean it should be killed.  Just preempt it.

> Thoughts ?

It's certainly interesting.  Maybe a combination is worthwhile - prevent
lockholder preemption for a short period of time AND put waiters to
sleep in case that period is insufficient to release the lock.

-- 
error compiling committee.c: too many arguments to function

  parent reply	other threads:[~2012-04-01 13:31 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-21 10:20 [PATCH RFC V6 0/11] Paravirtualized ticketlocks Raghavendra K T
2012-03-21 10:20 ` [PATCH RFC V6 1/11] x86/spinlock: replace pv spinlocks with pv ticketlocks Raghavendra K T
2012-03-21 13:04   ` Attilio Rao
     [not found]   ` <4F69D1D9.9080107@citrix.com>
2012-03-21 13:22     ` Stephan Diestelhorst
     [not found]     ` <2425963.NBpIGX9T40@chlor>
2012-03-21 13:49       ` Attilio Rao
     [not found]       ` <4F69DC68.6080200@citrix.com>
2012-03-21 14:25         ` Stephan Diestelhorst
     [not found]         ` <1363312.nixp29LUbv@chlor>
2012-03-21 14:33           ` Attilio Rao
     [not found]           ` <4F69E6BB.508@citrix.com>
2012-03-21 14:49             ` Raghavendra K T
2012-03-21 10:21 ` [PATCH RFC V6 2/11] x86/ticketlock: don't inline _spin_unlock when using paravirt spinlocks Raghavendra K T
2012-03-21 10:21 ` [PATCH RFC V6 3/11] x86/ticketlock: collapse a layer of functions Raghavendra K T
2012-03-21 10:21 ` [PATCH RFC V6 4/11] xen: defer spinlock setup until boot CPU setup Raghavendra K T
2012-03-21 10:21 ` [PATCH RFC V6 5/11] xen/pvticketlock: Xen implementation for PV ticket locks Raghavendra K T
2012-03-21 10:21 ` [PATCH RFC V6 6/11] xen/pvticketlocks: add xen_nopvspin parameter to disable xen pv ticketlocks Raghavendra K T
2012-03-21 10:21 ` [PATCH RFC V6 7/11] x86/pvticketlock: use callee-save for lock_spinning Raghavendra K T
2012-03-21 10:22 ` [PATCH RFC V6 8/11] x86/pvticketlock: when paravirtualizing ticket locks, increment by 2 Raghavendra K T
2012-03-21 10:22 ` [PATCH RFC V6 9/11] x86/ticketlock: add slowpath logic Raghavendra K T
2012-03-21 10:22 ` [PATCH RFC V6 10/11] xen/pvticketlock: allow interrupts to be enabled while blocking Raghavendra K T
2012-03-21 10:22 ` [PATCH RFC V6 11/11] xen: enable PV ticketlocks on HVM Xen Raghavendra K T
     [not found] ` <20120321102107.473.89777.sendpatchset@codeblue.in.ibm.com>
2012-03-21 17:13   ` [PATCH RFC V6 2/11] x86/ticketlock: don't inline _spin_unlock when using paravirt spinlocks Linus Torvalds
2012-03-22 10:06     ` Raghavendra K T
2012-03-26 14:25 ` [PATCH RFC V6 0/11] Paravirtualized ticketlocks Avi Kivity
2012-03-27  7:37   ` Raghavendra K T
     [not found]   ` <4F716E31.3000803@linux.vnet.ibm.com>
2012-03-28 16:09     ` Alan Meadows
2012-03-28 18:21       ` Raghavendra K T
2012-03-29  9:58         ` Avi Kivity
2012-03-29 18:03           ` Raghavendra K T
2012-03-30 10:07             ` Raghavendra K T
     [not found]             ` <4F7585EE.7060203@linux.vnet.ibm.com>
2012-04-01 13:18               ` Avi Kivity
2012-04-01 13:48                 ` Raghavendra K T
2012-04-01 13:53                   ` Avi Kivity
2012-04-01 13:56                     ` Raghavendra K T
2012-04-02  9:51                     ` Raghavendra K T
2012-04-02  9:51                     ` Raghavendra K T
2012-04-05  8:43                     ` Raghavendra K T
     [not found]                     ` <4F7976B6.5050000@linux.vnet.ibm.com>
2012-04-02 12:15                       ` Raghavendra K T
2012-04-05  9:01                       ` Avi Kivity
2012-04-05 10:40                         ` Raghavendra K T
2012-03-30 20:26 ` H. Peter Anvin
2012-03-30 22:07   ` Thomas Gleixner
2012-03-30 22:18     ` Andi Kleen
2012-03-30 23:04       ` Thomas Gleixner
2012-03-31  0:08         ` Andi Kleen
2012-03-31  8:11           ` Ingo Molnar
2012-03-31  4:07     ` Srivatsa Vaddagiri
2012-03-31  4:09       ` Srivatsa Vaddagiri
2012-04-16 15:44       ` Konrad Rzeszutek Wilk
2012-04-16 16:36         ` [Xen-devel] " Ian Campbell
2012-04-16 16:42           ` Jeremy Fitzhardinge
2012-04-17  2:54           ` Srivatsa Vaddagiri
2012-04-01 13:31     ` Avi Kivity [this message]
2012-04-02  9:26       ` Thomas Gleixner
2012-04-05  9:15         ` Avi Kivity
2012-04-02  4:36     ` [Xen-devel] " Juergen Gross
2012-04-02  9:42     ` Ian Campbell
2012-04-11  1:29     ` Marcelo Tosatti
2012-03-31  0:51   ` Raghavendra K T

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F7858C0.90405@redhat.com \
    --to=avi@redhat.com \
    --cc=andi@firstfloor.org \
    --cc=attilio.rao@citrix.com \
    --cc=hpa@zytor.com \
    --cc=jeremy.fitzhardinge@citrix.com \
    --cc=konrad.wilk@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=raghavendra.kt@linux.vnet.ibm.com \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=stephan.diestelhorst@amd.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vatsa@linux.vnet.ibm.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).