From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Yang Zhang <yang.zhang.wz@gmail.com>,
xen-devel@lists.xensource.com, jgross@suse.com,
Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
rkrcmar@redhat.com, kvm@vger.kernel.org, mst@redhat.com,
peterz@infradead.org, Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>,
virtualization@lists.linux-foundation.org,
"H. Peter Anvin" <hpa@zytor.com>,
Alok Kataria <akataria@vmware.com>,
wanpeng.li@hotmail.com, x86@kernel.org,
Ingo Molnar <mingo@redhat.com>, Kees Cook <keescook@chromium.org>,
Chris Wright <chrisw@sous-sol.org>,
Andy Lutomirski <luto@kernel.org>,
dmatlack@google.com, tglx@linutronix.de,
Quan Xu <quan.xu0@gmail.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
pbonzini@redhat.com,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [RFC PATCH v2 1/7] x86/paravirt: Add pv_idle_ops to paravirt ops
Date: Tue, 29 Aug 2017 09:55:48 -0400
Message-ID: <20170829135548.GG32175@char.us.oracle.com>
In-Reply-To: <1504007201-12904-2-git-send-email-yang.zhang.wz@gmail.com>
On Tue, Aug 29, 2017 at 11:46:35AM +0000, Yang Zhang wrote:
> So far, pv_idle_ops.poll is the only op in pv_idle_ops. .poll is called in
> the idle path and polls for a while before we enter the real idle
> state.
>
> In a virtualized guest, the idle path includes several heavy operations,
> including timer access (LAPIC timer or TSC deadline timer), which hurt
> performance, especially for latency-sensitive workloads such as message
> passing tasks. The cost comes mainly from the vmexit, which is a
> hardware context switch between the VM and the hypervisor. Our solution is
> to poll for a while and skip the real idle path if a schedule event
> arrives during polling.
>
> Polling may waste CPU cycles, so we adopt a smart polling mechanism to
> reduce useless polling.
>
> Signed-off-by: Yang Zhang <yang.zhang.wz@gmail.com>
> Signed-off-by: Quan Xu <quan.xu0@gmail.com>
> Cc: Jeremy Fitzhardinge <jeremy@goop.org>
> Cc: Chris Wright <chrisw@sous-sol.org>
> Cc: Alok Kataria <akataria@vmware.com>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: virtualization@lists.linux-foundation.org
> Cc: linux-kernel@vger.kernel.org
Adding xen-devel.
Juergen, we really should replace Jeremy's name with xen-devel or
your name. Wasn't there a patch by you that took over some of the
maintainership?
> ---
> arch/x86/include/asm/paravirt.h | 5 +++++
> arch/x86/include/asm/paravirt_types.h | 6 ++++++
> arch/x86/kernel/paravirt.c | 6 ++++++
> 3 files changed, 17 insertions(+)
>
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index 9ccac19..6d46760 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -202,6 +202,11 @@ static inline unsigned long long paravirt_read_pmc(int counter)
>
> #define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
>
> +static inline void paravirt_idle_poll(void)
> +{
> + PVOP_VCALL0(pv_idle_ops.poll);
> +}
> +
> static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
> {
> PVOP_VCALL2(pv_cpu_ops.alloc_ldt, ldt, entries);
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index 9ffc36b..cf45726 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -324,6 +324,10 @@ struct pv_lock_ops {
> struct paravirt_callee_save vcpu_is_preempted;
> } __no_randomize_layout;
>
> +struct pv_idle_ops {
> + void (*poll)(void);
> +} __no_randomize_layout;
> +
> /* This contains all the paravirt structures: we get a convenient
> * number for each function using the offset which we use to indicate
> * what to patch. */
> @@ -334,6 +338,7 @@ struct paravirt_patch_template {
> struct pv_irq_ops pv_irq_ops;
> struct pv_mmu_ops pv_mmu_ops;
> struct pv_lock_ops pv_lock_ops;
> + struct pv_idle_ops pv_idle_ops;
> } __no_randomize_layout;
>
> extern struct pv_info pv_info;
> @@ -343,6 +348,7 @@ struct paravirt_patch_template {
> extern struct pv_irq_ops pv_irq_ops;
> extern struct pv_mmu_ops pv_mmu_ops;
> extern struct pv_lock_ops pv_lock_ops;
> +extern struct pv_idle_ops pv_idle_ops;
>
> #define PARAVIRT_PATCH(x) \
> (offsetof(struct paravirt_patch_template, x) / sizeof(void *))
> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index bc0a849..1b5b247 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -128,6 +128,7 @@ static void *get_call_destination(u8 type)
> #ifdef CONFIG_PARAVIRT_SPINLOCKS
> .pv_lock_ops = pv_lock_ops,
> #endif
> + .pv_idle_ops = pv_idle_ops,
> };
> return *((void **)&tmpl + type);
> }
> @@ -312,6 +313,10 @@ struct pv_time_ops pv_time_ops = {
> .steal_clock = native_steal_clock,
> };
>
> +struct pv_idle_ops pv_idle_ops = {
> + .poll = paravirt_nop,
> +};
> +
> __visible struct pv_irq_ops pv_irq_ops = {
> .save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
> .restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
> @@ -471,3 +476,4 @@ struct pv_mmu_ops pv_mmu_ops __ro_after_init = {
> EXPORT_SYMBOL (pv_mmu_ops);
> EXPORT_SYMBOL_GPL(pv_info);
> EXPORT_SYMBOL (pv_irq_ops);
> +EXPORT_SYMBOL (pv_idle_ops);
> --
> 1.8.3.1
>
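For readers following along, here is a minimal userspace sketch of the pattern the patch sets up: an ops table whose .poll hook defaults to a no-op (as with paravirt_nop), which a guest may later override with a bounded poll that runs before the real idle path. Everything in it is a stand-in and not kernel code: struct idle_ops, work_pending, poll_budget, real_idle_entries, guest_idle_poll() and enter_idle() are hypothetical names chosen for illustration, and the counter only models "we would have halted here".

```c
#include <stdbool.h>

/* Userspace stand-in for the patch's struct pv_idle_ops: a single hook. */
struct idle_ops {
	void (*poll)(void);
};

/* Hypothetical flag standing in for "a schedule event arrived". */
static volatile bool work_pending;

/* Hypothetical counter: how often the expensive idle path was taken. */
static unsigned int real_idle_entries;

/* Hypothetical upper bound on poll iterations, so polling cannot spin forever. */
static unsigned int poll_budget = 1000;

/* Default hook does nothing, mirroring .poll = paravirt_nop in the patch. */
static void idle_nop(void)
{
}

static struct idle_ops idle_ops = {
	.poll = idle_nop,
};

/* A hook a guest might register: spin until work shows up or the budget ends. */
static void guest_idle_poll(void)
{
	for (unsigned int i = 0; i < poll_budget && !work_pending; i++)
		; /* a kernel implementation would relax the CPU here */
}

/* Idle entry: poll first, and take the costly path only if nothing arrived. */
static void enter_idle(void)
{
	idle_ops.poll();
	if (work_pending)
		return;
	real_idle_entries++; /* stands in for the halt and its vmexit */
}
```

With the default hook, enter_idle() always reaches the counter increment. After idle_ops.poll = guest_idle_poll, a call made while work_pending is set returns without touching the counter, which is the saving the commit message is after.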