All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Keir Fraser <keir@xen.org>, Tim Deegan <tim@xen.org>
Subject: Re: [PATCH 1/2] x86: suppress event check IPI to MWAITing CPUs
Date: Thu, 11 Sep 2014 11:02:54 +0100	[thread overview]
Message-ID: <5411734E.8010303@citrix.com> (raw)
In-Reply-To: <54118A190200007800033BC9@mail.emea.novell.com>


[-- Attachment #1.1: Type: text/plain, Size: 4388 bytes --]

On 11/09/14 10:40, Jan Beulich wrote:
> Mass wakeups (via vlapic_ipi()) can take enormous amounts of time,
> especially when many of the remote pCPU-s are in deep C-states. For
> 64-vCPU Windows Server 2012 R2 guests on Ivybridge hardware,
> accumulated times of over 2ms were observed (average 1.1ms).
> Considering that Windows broadcasts IPIs from its timer interrupt,
> which at least at certain times can run at 1kHz, it is clear that this
> can't result in good guest behavior. In fact, on said hardware guests
> with significantly beyond 40 vCPU-s simply hung when e.g. ServerManager
> gets started.
>
> Recognizing that writes to softirq_pending() already have the effect of
> waking remote CPUs from MWAITing (due to being co-located on the same
> cache line with mwait_wakeup()), we can avoid sending IPIs to CPUs we
> know are in a (deep) C-state entered via MWAIT.
>
> With this, average broadcast times for a 64-vCPU guest went down to a
> measured maximum of 255us (which is still quite a lot).
>
> One aspect worth noting is that cpumask_raise_softirq() gets brought in
> sync here with cpu_raise_softirq() in that now both don't attempt to
> raise a self-IPI on the processing CPU.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -330,6 +330,16 @@ void cpuidle_wakeup_mwait(cpumask_t *mas
>      cpumask_andnot(mask, mask, &target);
>  }
>  
> +bool_t arch_skip_send_event_check(unsigned int cpu)
> +{
> +    /*
> +     * This relies on softirq_pending() and mwait_wakeup() to access data
> +     * on the same cache line.
> +     */
> +    smp_mb();
> +    return !!cpumask_test_cpu(cpu, &cpuidle_mwait_flags);
> +}
> +
>  void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>  {
>      unsigned int cpu = smp_processor_id();
> @@ -349,7 +359,7 @@ void mwait_idle_with_hints(unsigned int 
>       * Timer deadline passing is the event on which we will be woken via
>       * cpuidle_mwait_wakeup. So check it now that the location is armed.
>       */
> -    if ( expires > NOW() || expires == 0 )
> +    if ( (expires > NOW() || expires == 0) && !softirq_pending(cpu) )
>      {
>          cpumask_set_cpu(cpu, &cpuidle_mwait_flags);

Don't you need a smp_wmb() or better here for the results of
cpumask_set_cpu() to be guaranteed to be externally visible?  mwait does
not appear to be a serialising instruction, and doesn't appear to have
any ordering guarantees in the manual.

~Andrew

>          __mwait(eax, ecx);
> --- a/xen/common/softirq.c
> +++ b/xen/common/softirq.c
> @@ -70,12 +70,14 @@ void open_softirq(int nr, softirq_handle
>  
>  void cpumask_raise_softirq(const cpumask_t *mask, unsigned int nr)
>  {
> -    int cpu;
> +    unsigned int cpu, this_cpu = smp_processor_id();
>      cpumask_t send_mask;
>  
>      cpumask_clear(&send_mask);
>      for_each_cpu(cpu, mask)
> -        if ( !test_and_set_bit(nr, &softirq_pending(cpu)) )
> +        if ( !test_and_set_bit(nr, &softirq_pending(cpu)) &&
> +             cpu != this_cpu &&
> +             !arch_skip_send_event_check(cpu) )
>              cpumask_set_cpu(cpu, &send_mask);
>  
>      smp_send_event_check_mask(&send_mask);
> @@ -84,7 +86,8 @@ void cpumask_raise_softirq(const cpumask
>  void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
>  {
>      if ( !test_and_set_bit(nr, &softirq_pending(cpu))
> -         && (cpu != smp_processor_id()) )
> +         && (cpu != smp_processor_id())
> +         && !arch_skip_send_event_check(cpu) )
>          smp_send_event_check_cpu(cpu);
>  }
>  
> --- a/xen/include/asm-arm/softirq.h
> +++ b/xen/include/asm-arm/softirq.h
> @@ -3,6 +3,8 @@
>  
>  #define NR_ARCH_SOFTIRQS       0
>  
> +#define arch_skip_send_event_check(cpu) 0
> +
>  #endif /* __ASM_SOFTIRQ_H__ */
>  /*
>   * Local variables:
> --- a/xen/include/asm-x86/softirq.h
> +++ b/xen/include/asm-x86/softirq.h
> @@ -9,4 +9,6 @@
>  #define PCI_SERR_SOFTIRQ       (NR_COMMON_SOFTIRQS + 4)
>  #define NR_ARCH_SOFTIRQS       5
>  
> +bool_t arch_skip_send_event_check(unsigned int cpu);
> +
>  #endif /* __ASM_SOFTIRQ_H__ */
>
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


[-- Attachment #1.2: Type: text/html, Size: 5033 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2014-09-11 10:03 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-11  9:36 [PATCH 0/2] x86: improve remote CPU wakeup Jan Beulich
2014-09-11  9:40 ` [PATCH 1/2] x86: suppress event check IPI to MWAITing CPUs Jan Beulich
2014-09-11 10:02   ` Andrew Cooper [this message]
2014-09-11 10:07     ` Jan Beulich
2014-09-11 10:09       ` Andrew Cooper
2014-09-11 10:26         ` Jan Beulich
2014-09-11  9:40 ` [PATCH 2/2] x86/HVM: batch vCPU wakeups Jan Beulich
2014-09-11 10:48   ` Andrew Cooper
2014-09-11 11:03     ` Jan Beulich
2014-09-11 11:11       ` Andrew Cooper
2014-09-18 10:59 ` [PATCH 0/2] x86: improve remote CPU wakeup Tim Deegan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5411734E.8010303@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=Ian.Campbell@eu.citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.