From: Marcelo Tosatti <mtosatti@redhat.com>
To: "Zhai, Edwin" <edwin.zhai@intel.com>,
Mark Langsdorf <mark.langsdorf@amd.com>
Cc: Avi Kivity <avi@redhat.com>, "kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting
Date: Fri, 2 Oct 2009 15:28:40 -0300
Message-ID: <20091002182840.GA6533@amt.cnet>
In-Reply-To: <20090930162249.GA7440@amt.cnet>
On Wed, Sep 30, 2009 at 01:22:49PM -0300, Marcelo Tosatti wrote:
> On Wed, Sep 30, 2009 at 09:01:51AM +0800, Zhai, Edwin wrote:
> > Avi,
> > I modified it according to your comments. The only thing I want to keep is
> > the module params ple_gap/ple_window. Although they are not per-guest, they
> > can be used to find the right value, and to disable PLE for debugging purposes.
> >
> > Thanks,
> >
> >
> > Avi Kivity wrote:
> >> On 09/28/2009 11:33 AM, Zhai, Edwin wrote:
> >>
> >>> Avi Kivity wrote:
> >>>
> >>>> +#define KVM_VMX_DEFAULT_PLE_GAP 41
> >>>> +#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
> >>>> +static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
> >>>> +module_param(ple_gap, int, S_IRUGO);
> >>>> +
> >>>> +static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
> >>>> +module_param(ple_window, int, S_IRUGO);
> >>>>
> >>>> Shouldn't be __read_mostly since they're read very rarely
> >>>> (__read_mostly should be for variables that are very often read,
> >>>> and rarely written).
> >>>>
> >>> In general, they are read-only, except that an experienced user may try
> >>> different parameters for perf tuning.
> >>>
> >>
> >>
> >> __read_mostly doesn't just mean it's read mostly. It also means it's
> >> read often. Otherwise it's just wasting space in hot cachelines.
> >>
> >>
> >>>> I'm not even sure they should be parameters.
> >>>>
> >>> For different spinlocks in different OSes, and for different workloads,
> >>> we need different parameters for tuning. It's similar to
> >>> enable_ept.
> >>>
> >>
> >> No, global parameters don't work for tuning workloads and guests since
> >> they cannot be modified on a per-guest basis. enable_ept is only
> >> useful for debugging and testing.
> >>
> >>
> >>>>> + set_current_state(TASK_INTERRUPTIBLE);
> >>>>> + schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
> >>>>> +
> >>>>>
> >>>> Please add a tracepoint for this (since it can cause significant
> >>>> change in behaviour),
> >>> Isn't trace_kvm_exit(exit_reason, ...) enough? We can tell the PLE
> >>> vmexit from other vmexits.
> >>>
> >>
> >> Right. I thought of the software spinlock detector, but that's another
> >> problem.
> >>
> >> I think you can drop the sleep_time parameter, it can be part of the
> >> function. Also kvm_vcpu_sleep() is confusing, we also sleep on halt.
> >> Please call it kvm_vcpu_on_spin() or something (since that's what the
> >> guest is doing).
>
> kvm_vcpu_on_spin() should add the vcpu to vcpu->wq (so a new pending
> interrupt wakes it up immediately).
Updated version (also please send it separately from the vmx.c patch):
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 894a56e..43125dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -231,6 +231,7 @@ int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
 void kvm_vcpu_block(struct kvm_vcpu *vcpu);
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
 void kvm_resched(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4d0dd39..e788d70 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1479,6 +1479,21 @@ void kvm_resched(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_resched);
 
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
+{
+	ktime_t expires;
+	DEFINE_WAIT(wait);
+
+	prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+
+	/* Sleep for 100 us, and hope lock-holder got scheduled */
+	expires = ktime_add_ns(ktime_get(), 100000UL);
+	schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
+
+	finish_wait(&vcpu->wq, &wait);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);
+
 static int kvm_vcpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct kvm_vcpu *vcpu = vma->vm_file->private_data;
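
For anyone who wants to experiment with the idea outside the kernel, below is
a rough userspace analogue of the pattern in kvm_vcpu_on_spin(): sleep for a
bounded time, but let a pending event end the sleep early. It is only a sketch
and not KVM code. It assumes POSIX threads, with a condition variable standing
in for vcpu->wq; struct waiter, WAITER_INIT, add_ns(), wait_on_spin() and
post_event() are hypothetical names made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the vcpu wait queue plus a "pending" flag. */
struct waiter {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool event_pending;
};

#define WAITER_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Add ns nanoseconds to *ts, keeping tv_nsec below one second. */
static void add_ns(struct timespec *ts, long ns)
{
	ts->tv_nsec += ns;
	while (ts->tv_nsec >= 1000000000L) {
		ts->tv_nsec -= 1000000000L;
		ts->tv_sec++;
	}
}

/*
 * Sleep until an event is posted or timeout_ns has elapsed, whichever
 * comes first. Returns true if an event ended the wait, false on timeout.
 */
static bool wait_on_spin(struct waiter *w, long timeout_ns)
{
	struct timespec expires;
	bool woken;

	/* Absolute deadline, like HRTIMER_MODE_ABS in the patch. */
	clock_gettime(CLOCK_REALTIME, &expires);
	add_ns(&expires, timeout_ns);

	pthread_mutex_lock(&w->lock);
	while (!w->event_pending) {
		/* Loop again on spurious wakeups; stop once the deadline passes. */
		if (pthread_cond_timedwait(&w->cond, &w->lock, &expires) == ETIMEDOUT)
			break;
	}
	woken = w->event_pending;
	w->event_pending = false;
	pthread_mutex_unlock(&w->lock);

	return woken;
}

/* Mark an event pending and wake a sleeper, if there is one. */
static void post_event(struct waiter *w)
{
	pthread_mutex_lock(&w->lock);
	w->event_pending = true;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->lock);
}
```

Calling wait_on_spin(&w, 100000L) mirrors the 100 us sleep above; a
post_event() from another thread plays the role of the pending interrupt that
should cut the sleep short. Older toolchains may need -pthread to build it.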