Re: [PATCH RFC V11 15/18] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor


From: Gleb Natapov <gleb@redhat.com>
To: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: jeremy@goop.org, gregkh@suse.de, kvm@vger.kernel.org,
	linux-doc@vger.kernel.org, peterz@infradead.org,
	drjones@redhat.com, virtualization@lists.linux-foundation.org,
	andi@firstfloor.org, hpa@zytor.com,
	stefano.stabellini@eu.citrix.com, xen-devel@lists.xensource.com,
	x86@kernel.org, mingo@redhat.com, habanero@linux.vnet.ibm.com,
	riel@redhat.com, konrad.wilk@oracle.com, ouyang@cs.pitt.edu,
	avi.kivity@gmail.com, tglx@linutronix.de, chegu_vinod@hp.com,
	linux-kernel@vger.kernel.org, srivatsa.vaddagiri@gmail.com,
	attilio.rao@citrix.com, pbonzini@redhat.com,
	torvalds@linux-foundation.org
Subject: Re: [PATCH RFC V11 15/18] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
Date: Thu, 1 Aug 2013 10:45:30 +0300
Message-ID: <20130801074529.GO7484@redhat.com>
In-Reply-To: <51FA1087.9080908@linux.vnet.ibm.com>

On Thu, Aug 01, 2013 at 01:08:47PM +0530, Raghavendra K T wrote:
> On 07/31/2013 11:54 AM, Gleb Natapov wrote:
> >On Tue, Jul 30, 2013 at 10:13:12PM +0530, Raghavendra K T wrote:
> >>On 07/25/2013 03:08 PM, Raghavendra K T wrote:
> >>>On 07/25/2013 02:45 PM, Gleb Natapov wrote:
> >>>>On Thu, Jul 25, 2013 at 02:47:37PM +0530, Raghavendra K T wrote:
> >>>>>On 07/24/2013 06:06 PM, Raghavendra K T wrote:
> >>>>>>On 07/24/2013 05:36 PM, Gleb Natapov wrote:
> >>>>>>>On Wed, Jul 24, 2013 at 05:30:20PM +0530, Raghavendra K T wrote:
> >>>>>>>>On 07/24/2013 04:09 PM, Gleb Natapov wrote:
> >>>>>>>>>On Wed, Jul 24, 2013 at 03:15:50PM +0530, Raghavendra K T wrote:
> >>>>>>>>>>On 07/23/2013 08:37 PM, Gleb Natapov wrote:
> >>>>>>>>>>>On Mon, Jul 22, 2013 at 11:50:16AM +0530, Raghavendra K T wrote:
> >>>>>>>>>>>>+static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
> >>>>>>>>>>[...]
> >>>>>>>>>>>>+
> >>>>>>>>>>>>+    /*
> >>>>>>>>>>>>+     * halt until it's our turn and kicked. Note that we do safe halt
> >>>>>>>>>>>>+     * for irq enabled case to avoid hang when lock info is overwritten
> >>>>>>>>>>>>+     * in irq spinlock slowpath and no spurious interrupt occur to save us.
> >>>>>>>>>>>>+     */
> >>>>>>>>>>>>+    if (arch_irqs_disabled_flags(flags))
> >>>>>>>>>>>>+        halt();
> >>>>>>>>>>>>+    else
> >>>>>>>>>>>>+        safe_halt();
> >>>>>>>>>>>>+
> >>>>>>>>>>>>+out:
> >>>>>>>>>>>So here now interrupts can be either disabled or enabled. Previous
> >>>>>>>>>>>version disabled interrupts here, so are we sure it is safe to
> >>>>>>>>>>>have them
> >>>>>>>>>>>enabled at this point? I do not see any problem yet, will keep
> >>>>>>>>>>>thinking.
> >>>>>>>>>>
> >>>>>>>>>>If we enable interrupt here, then
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>>+    cpumask_clear_cpu(cpu, &waiting_cpus);
> >>>>>>>>>>
> >>>>>>>>>>and if we start serving lock for an interrupt that came here,
> >>>>>>>>>>cpumask clear and w->lock=null may not happen atomically.
> >>>>>>>>>>if irq spinlock does not take slow path we would have non null
> >>>>>>>>>>value
> >>>>>>>>>>for lock, but with no information in waitingcpu.
> >>>>>>>>>>
> >>>>>>>>>>I am still thinking what would be problem with that.
> >>>>>>>>>>
> >>>>>>>>>Exactly, for kicker waiting_cpus and w->lock updates are
> >>>>>>>>>non atomic anyway.
> >>>>>>>>>
> >>>>>>>>>>>>+    w->lock = NULL;
> >>>>>>>>>>>>+    local_irq_restore(flags);
> >>>>>>>>>>>>+    spin_time_accum_blocked(start);
> >>>>>>>>>>>>+}
> >>>>>>>>>>>>+PV_CALLEE_SAVE_REGS_THUNK(kvm_lock_spinning);
> >>>>>>>>>>>>+
> >>>>>>>>>>>>+/* Kick vcpu waiting on @lock->head to reach value @ticket */
> >>>>>>>>>>>>+static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
> >>>>>>>>>>>>+{
> >>>>>>>>>>>>+    int cpu;
> >>>>>>>>>>>>+
> >>>>>>>>>>>>+    add_stats(RELEASED_SLOW, 1);
> >>>>>>>>>>>>+    for_each_cpu(cpu, &waiting_cpus) {
> >>>>>>>>>>>>+        const struct kvm_lock_waiting *w = &per_cpu(lock_waiting, cpu);
> >>>>>>>>>>>>+        if (ACCESS_ONCE(w->lock) == lock &&
> >>>>>>>>>>>>+            ACCESS_ONCE(w->want) == ticket) {
> >>>>>>>>>>>>+            add_stats(RELEASED_SLOW_KICKED, 1);
> >>>>>>>>>>>>+            kvm_kick_cpu(cpu);
> >>>>>>>>>>>What about using NMI to wake sleepers? I think it was
> >>>>>>>>>>>discussed, but
> >>>>>>>>>>>forgot why it was dismissed.
> >>>>>>>>>>
> >>>>>>>>>>I think I have missed that discussion. I'll go back and check. So
> >>>>>>>>>>what is the idea here? we can easily wake up the halted vcpus that
> >>>>>>>>>>have interrupts disabled?
> >>>>>>>>>We can of course. IIRC the objection was that NMI handling path
> >>>>>>>>>is very
> >>>>>>>>>fragile and handling NMI on each wakeup will be more expensive than
> >>>>>>>>>waking up a guest without injecting an event, but it is still
> >>>>>>>>>interesting
> >>>>>>>>>to see the numbers.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>Haam, now I remember, We had tried request based mechanism. (new
> >>>>>>>>request like REQ_UNHALT) and process that. It had worked, but had
> >>>>>>>>some
> >>>>>>>>complex hacks in vcpu_enter_guest to avoid guest hang in case of
> >>>>>>>>request cleared.  So had left it there..
> >>>>>>>>
> >>>>>>>>https://lkml.org/lkml/2012/4/30/67
> >>>>>>>>
> >>>>>>>>But I do not remember performance impact though.
> >>>>>>>No, this is something different. Wakeup with NMI does not need KVM
> >>>>>>>changes at
> >>>>>>>all. Instead of kvm_kick_cpu(cpu) in kvm_unlock_kick you send NMI IPI.
> >>>>>>>
> >>>>>>
> >>>>>>True. It was not NMI.
> >>>>>>just to confirm, are you talking about something like this to be
> >>>>>>tried ?
> >>>>>>
> >>>>>>apic->send_IPI_mask(cpumask_of(cpu), APIC_DM_NMI);
> >>>>>
> >>>>>When I started benchmark, I started seeing
> >>>>>"Dazed and confused, but trying to continue" from unknown nmi error
> >>>>>handling.
> >>>>>Did I miss anything (because we did not register any NMI handler)? or
> >>>>>is it that spurious NMIs are trouble because we could get spurious NMIs
> >>>>>if next waiter already acquired the lock.
> >>>>There is a default NMI handler that tries to detect the reason why NMI
> >>>>happened (which is not so easy on x86) and prints this message if it
> >>>>fails. You need to add logic to detect spinlock slow path there. Check
> >>>>bit in waiting_cpus for instance.
> >>>
> >>>aha.. Okay. will check that.
> >>
> >>yes. Thanks.. that did the trick.
> >>
> >>I did like below in unknown_nmi_error():
> >>if (cpumask_test_cpu(smp_processor_id(), &waiting_cpus))
> >>    return;
> >>
> >>But I believe you asked NMI method only for experimental purpose to
> >>check the upperbound. because as I doubted above, for spurious NMI
> >>(i.e. when unlocker kicks when waiter already got the lock), we would
> >>still hit unknown NMI error.
> >>
> >>I had hit spurious NMI over 1656 times over entire benchmark run.
> >>along with
> >>INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too
> >>long to run: 24.886 msecs etc...
> >>
> >I wonder why this happens.
> >
> >>(and we cannot get away with that too because it means we bypass the
> >>unknown NMI error even in genuine cases too)
> >>
> >>Here was the result for my dbench test (32-core machine with a
> >>32-vcpu guest, HT off)
> >>
> >>             ---------- % improvement ----------
> >>             pvspinlock   pvspin_ipi   pvspin_nmi
> >>dbench_1x    0.9016       0.7442       0.7522
> >>dbench_2x    14.7513      18.0164      15.9421
> >>dbench_3x    14.7571      17.0793      13.3572
> >>dbench_4x    6.3625       8.7897       5.3800
> >>
> >>So I am seeing over 2-4% improvement with IPI method.
> >>
> >Yeah, this was expected.
> >
> >>Gleb,
> >>  do you think the current series looks good to you? [one patch I
> >>have resent with in_nmi() check] or do you think I have to respin the
> >>series with IPI method etc. or is there any concerns that I have to
> >>address. Please let me know..
> >>
> >The current code looks fine to me.
> 
> Gleb,
> 
> Shall I consider this as an ack for kvm part?
> 
For everything except 18/18. For that I still want to see numbers. But
18/18 is pretty independent from the rest of the series so it should
not stop the rest from going in.

--
			Gleb.
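
For readers following the quoted kvm_unlock_kick() hunk, here is a
minimal user-space sketch of the matching step it performs: walk the
set of waiting CPUs and pick one whose recorded (lock, want) pair
matches the lock being released and the ticket now being served. This
is an illustration only, not the patch code. NR_CPUS, ticket_t,
struct lock, start_waiting(), stop_waiting() and find_cpu_to_kick()
are hypothetical stand-ins, a plain bool array replaces the cpumask,
and there are no memory barriers or ACCESS_ONCE(), so it models the
logic but not the concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4                 /* hypothetical, kept small for illustration */

typedef uint16_t ticket_t;        /* stand-in for the kernel's __ticket_t */

struct lock { int unused; };      /* stand-in for struct arch_spinlock */

struct lock_waiting {
	const struct lock *lock;  /* lock this CPU is blocked on, or NULL */
	ticket_t want;            /* ticket value this CPU is waiting for */
};

static struct lock_waiting lock_waiting[NR_CPUS];
static bool waiting_cpus[NR_CPUS];  /* stand-in for the waiting_cpus cpumask */

/* Waiter side: record what we are waiting for, then mark ourselves waiting. */
static void start_waiting(int cpu, const struct lock *lock, ticket_t want)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	lock_waiting[cpu].want = want;
	lock_waiting[cpu].lock = lock;
	waiting_cpus[cpu] = true;
}

/*
 * Waiter side, on the way out: same order as the quoted hunk, clear the
 * mask bit first and then the lock pointer. The two stores are separate,
 * which is the non-atomicity discussed in the thread.
 */
static void stop_waiting(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	waiting_cpus[cpu] = false;
	lock_waiting[cpu].lock = NULL;
}

/*
 * Unlocker side: return the first waiting CPU whose (lock, want) pair
 * matches, or -1 if nobody is waiting for this ticket on this lock.
 */
static int find_cpu_to_kick(const struct lock *lock, ticket_t ticket)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		const struct lock_waiting *w = &lock_waiting[cpu];

		if (!waiting_cpus[cpu])
			continue;
		if (w->lock == lock && w->want == ticket)
			return cpu;
	}
	return -1;
}
```

A caller would invoke start_waiting() before halting, the unlocker
would kick whatever find_cpu_to_kick() returns, and the waiter would
call stop_waiting() once woken. If the waiter has already taken the
lock and called stop_waiting() by the time the unlocker scans, the
scan finds nothing. A kick sent just before that point is the
spurious wakeup mentioned above.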
