From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>,
xen-devel <xen-devel@lists.xenproject.org>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
Keir Fraser <keir@xen.org>, Tim Deegan <tim@xen.org>
Subject: Re: [PATCH 1/2] x86: suppress event check IPI to MWAITing CPUs
Date: Thu, 11 Sep 2014 11:02:54 +0100 [thread overview]
Message-ID: <5411734E.8010303@citrix.com> (raw)
In-Reply-To: <54118A190200007800033BC9@mail.emea.novell.com>
[-- Attachment #1.1: Type: text/plain, Size: 4388 bytes --]
On 11/09/14 10:40, Jan Beulich wrote:
> Mass wakeups (via vlapic_ipi()) can take enormous amounts of time,
> especially when many of the remote pCPU-s are in deep C-states. For
> 64-vCPU Windows Server 2012 R2 guests on Ivybridge hardware,
> accumulated times of over 2ms were observed (average 1.1ms).
> Considering that Windows broadcasts IPIs from its timer interrupt,
> which at least at certain times can run at 1kHz, it is clear that this
> can't result in good guest behavior. In fact, on said hardware guests
> with significantly beyond 40 vCPU-s simply hung when e.g. ServerManager
> gets started.
>
> Recognizing that writes to softirq_pending() already have the effect of
> waking remote CPUs from MWAITing (due to being co-located on the same
> cache line with mwait_wakeup()), we can avoid sending IPIs to CPUs we
> know are in a (deep) C-state entered via MWAIT.
>
> With this, average broadcast times for a 64-vCPU guest went down to a
> measured maximum of 255us (which is still quite a lot).
>
> One aspect worth noting is that cpumask_raise_softirq() gets brought in
> sync here with cpu_raise_softirq() in that now both don't attempt to
> raise a self-IPI on the processing CPU.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -330,6 +330,16 @@ void cpuidle_wakeup_mwait(cpumask_t *mas
> cpumask_andnot(mask, mask, &target);
> }
>
> +bool_t arch_skip_send_event_check(unsigned int cpu)
> +{
> + /*
> + * This relies on softirq_pending() and mwait_wakeup() to access data
> + * on the same cache line.
> + */
> + smp_mb();
> + return !!cpumask_test_cpu(cpu, &cpuidle_mwait_flags);
> +}
> +
> void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
> {
> unsigned int cpu = smp_processor_id();
> @@ -349,7 +359,7 @@ void mwait_idle_with_hints(unsigned int
> * Timer deadline passing is the event on which we will be woken via
> * cpuidle_mwait_wakeup. So check it now that the location is armed.
> */
> - if ( expires > NOW() || expires == 0 )
> + if ( (expires > NOW() || expires == 0) && !softirq_pending(cpu) )
> {
> cpumask_set_cpu(cpu, &cpuidle_mwait_flags);
Don't you need a smp_wmb() or better here for the results of
cpumask_set_cpu() to be guaranteed to be externally visible? mwait does
not appear to be a serialising instruction, and doesn't appear to have
any ordering guarantees in the manual.
~Andrew
> __mwait(eax, ecx);
> --- a/xen/common/softirq.c
> +++ b/xen/common/softirq.c
> @@ -70,12 +70,14 @@ void open_softirq(int nr, softirq_handle
>
> void cpumask_raise_softirq(const cpumask_t *mask, unsigned int nr)
> {
> - int cpu;
> + unsigned int cpu, this_cpu = smp_processor_id();
> cpumask_t send_mask;
>
> cpumask_clear(&send_mask);
> for_each_cpu(cpu, mask)
> - if ( !test_and_set_bit(nr, &softirq_pending(cpu)) )
> + if ( !test_and_set_bit(nr, &softirq_pending(cpu)) &&
> + cpu != this_cpu &&
> + !arch_skip_send_event_check(cpu) )
> cpumask_set_cpu(cpu, &send_mask);
>
> smp_send_event_check_mask(&send_mask);
> @@ -84,7 +86,8 @@ void cpumask_raise_softirq(const cpumask
> void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
> {
> if ( !test_and_set_bit(nr, &softirq_pending(cpu))
> - && (cpu != smp_processor_id()) )
> + && (cpu != smp_processor_id())
> + && !arch_skip_send_event_check(cpu) )
> smp_send_event_check_cpu(cpu);
> }
>
> --- a/xen/include/asm-arm/softirq.h
> +++ b/xen/include/asm-arm/softirq.h
> @@ -3,6 +3,8 @@
>
> #define NR_ARCH_SOFTIRQS 0
>
> +#define arch_skip_send_event_check(cpu) 0
> +
> #endif /* __ASM_SOFTIRQ_H__ */
> /*
> * Local variables:
> --- a/xen/include/asm-x86/softirq.h
> +++ b/xen/include/asm-x86/softirq.h
> @@ -9,4 +9,6 @@
> #define PCI_SERR_SOFTIRQ (NR_COMMON_SOFTIRQS + 4)
> #define NR_ARCH_SOFTIRQS 5
>
> +bool_t arch_skip_send_event_check(unsigned int cpu);
> +
> #endif /* __ASM_SOFTIRQ_H__ */
>
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
[-- Attachment #1.2: Type: text/html, Size: 5033 bytes --]
[-- Attachment #2: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
next prev parent reply other threads:[~2014-09-11 10:03 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-11 9:36 [PATCH 0/2] x86: improve remote CPU wakeup Jan Beulich
2014-09-11 9:40 ` [PATCH 1/2] x86: suppress event check IPI to MWAITing CPUs Jan Beulich
2014-09-11 10:02 ` Andrew Cooper [this message]
2014-09-11 10:07 ` Jan Beulich
2014-09-11 10:09 ` Andrew Cooper
2014-09-11 10:26 ` Jan Beulich
2014-09-11 9:40 ` [PATCH 2/2] x86/HVM: batch vCPU wakeups Jan Beulich
2014-09-11 10:48 ` Andrew Cooper
2014-09-11 11:03 ` Jan Beulich
2014-09-11 11:11 ` Andrew Cooper
2014-09-18 10:59 ` [PATCH 0/2] x86: improve remote CPU wakeup Tim Deegan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5411734E.8010303@citrix.com \
--to=andrew.cooper3@citrix.com \
--cc=Ian.Campbell@eu.citrix.com \
--cc=Ian.Jackson@eu.citrix.com \
--cc=JBeulich@suse.com \
--cc=keir@xen.org \
--cc=tim@xen.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).