xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Keir Fraser <keir@xen.org>, Tim Deegan <tim@xen.org>
Subject: Re: [PATCH 1/2] x86: suppress event check IPI to MWAITing CPUs
Date: Thu, 11 Sep 2014 11:02:54 +0100	[thread overview]
Message-ID: <5411734E.8010303@citrix.com> (raw)
In-Reply-To: <54118A190200007800033BC9@mail.emea.novell.com>


[-- Attachment #1.1: Type: text/plain, Size: 4388 bytes --]

On 11/09/14 10:40, Jan Beulich wrote:
> Mass wakeups (via vlapic_ipi()) can take enormous amounts of time,
> especially when many of the remote pCPU-s are in deep C-states. For
> 64-vCPU Windows Server 2012 R2 guests on Ivybridge hardware,
> accumulated times of over 2ms were observed (average 1.1ms).
> Considering that Windows broadcasts IPIs from its timer interrupt,
> which at least at certain times can run at 1kHz, it is clear that this
> can't result in good guest behavior. In fact, on said hardware guests
> with significantly beyond 40 vCPU-s simply hung when e.g. ServerManager
> gets started.
>
> Recognizing that writes to softirq_pending() already have the effect of
> waking remote CPUs from MWAITing (due to being co-located on the same
> cache line with mwait_wakeup()), we can avoid sending IPIs to CPUs we
> know are in a (deep) C-state entered via MWAIT.
>
> With this, average broadcast times for a 64-vCPU guest went down to a
> measured maximum of 255us (which is still quite a lot).
>
> One aspect worth noting is that cpumask_raise_softirq() gets brought in
> sync here with cpu_raise_softirq() in that now both don't attempt to
> raise a self-IPI on the processing CPU.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -330,6 +330,16 @@ void cpuidle_wakeup_mwait(cpumask_t *mas
>      cpumask_andnot(mask, mask, &target);
>  }
>  
> +bool_t arch_skip_send_event_check(unsigned int cpu)
> +{
> +    /*
> +     * This relies on softirq_pending() and mwait_wakeup() to access data
> +     * on the same cache line.
> +     */
> +    smp_mb();
> +    return !!cpumask_test_cpu(cpu, &cpuidle_mwait_flags);
> +}
> +
>  void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>  {
>      unsigned int cpu = smp_processor_id();
> @@ -349,7 +359,7 @@ void mwait_idle_with_hints(unsigned int 
>       * Timer deadline passing is the event on which we will be woken via
>       * cpuidle_mwait_wakeup. So check it now that the location is armed.
>       */
> -    if ( expires > NOW() || expires == 0 )
> +    if ( (expires > NOW() || expires == 0) && !softirq_pending(cpu) )
>      {
>          cpumask_set_cpu(cpu, &cpuidle_mwait_flags);

Don't you need a smp_wmb() or better here for the results of
cpumask_set_cpu() to be guaranteed to be externally visible?  mwait does
not appear to be a serialising instruction, and doesn't appear to have
any ordering guarantees in the manual.

~Andrew

>          __mwait(eax, ecx);
> --- a/xen/common/softirq.c
> +++ b/xen/common/softirq.c
> @@ -70,12 +70,14 @@ void open_softirq(int nr, softirq_handle
>  
>  void cpumask_raise_softirq(const cpumask_t *mask, unsigned int nr)
>  {
> -    int cpu;
> +    unsigned int cpu, this_cpu = smp_processor_id();
>      cpumask_t send_mask;
>  
>      cpumask_clear(&send_mask);
>      for_each_cpu(cpu, mask)
> -        if ( !test_and_set_bit(nr, &softirq_pending(cpu)) )
> +        if ( !test_and_set_bit(nr, &softirq_pending(cpu)) &&
> +             cpu != this_cpu &&
> +             !arch_skip_send_event_check(cpu) )
>              cpumask_set_cpu(cpu, &send_mask);
>  
>      smp_send_event_check_mask(&send_mask);
> @@ -84,7 +86,8 @@ void cpumask_raise_softirq(const cpumask
>  void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
>  {
>      if ( !test_and_set_bit(nr, &softirq_pending(cpu))
> -         && (cpu != smp_processor_id()) )
> +         && (cpu != smp_processor_id())
> +         && !arch_skip_send_event_check(cpu) )
>          smp_send_event_check_cpu(cpu);
>  }
>  
> --- a/xen/include/asm-arm/softirq.h
> +++ b/xen/include/asm-arm/softirq.h
> @@ -3,6 +3,8 @@
>  
>  #define NR_ARCH_SOFTIRQS       0
>  
> +#define arch_skip_send_event_check(cpu) 0
> +
>  #endif /* __ASM_SOFTIRQ_H__ */
>  /*
>   * Local variables:
> --- a/xen/include/asm-x86/softirq.h
> +++ b/xen/include/asm-x86/softirq.h
> @@ -9,4 +9,6 @@
>  #define PCI_SERR_SOFTIRQ       (NR_COMMON_SOFTIRQS + 4)
>  #define NR_ARCH_SOFTIRQS       5
>  
> +bool_t arch_skip_send_event_check(unsigned int cpu);
> +
>  #endif /* __ASM_SOFTIRQ_H__ */
>
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


[-- Attachment #1.2: Type: text/html, Size: 5033 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2014-09-11 10:03 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-11  9:36 [PATCH 0/2] x86: improve remote CPU wakeup Jan Beulich
2014-09-11  9:40 ` [PATCH 1/2] x86: suppress event check IPI to MWAITing CPUs Jan Beulich
2014-09-11 10:02   ` Andrew Cooper [this message]
2014-09-11 10:07     ` Jan Beulich
2014-09-11 10:09       ` Andrew Cooper
2014-09-11 10:26         ` Jan Beulich
2014-09-11  9:40 ` [PATCH 2/2] x86/HVM: batch vCPU wakeups Jan Beulich
2014-09-11 10:48   ` Andrew Cooper
2014-09-11 11:03     ` Jan Beulich
2014-09-11 11:11       ` Andrew Cooper
2014-09-18 10:59 ` [PATCH 0/2] x86: improve remote CPU wakeup Tim Deegan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5411734E.8010303@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=Ian.Campbell@eu.citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).