From: Gautham R Shenoy <ego@linux.vnet.ibm.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>,
Parth Shah <parth@linux.ibm.com>,
Ihor Pasichnyk <Ihor.Pasichnyk@ibm.com>,
Waiman Long <longman@redhat.com>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH 1/2] powerpc/vcpu: Assume dedicated processors as non-preempt
Date: Thu, 5 Dec 2019 10:37:54 +0530
Message-ID: <20191205050754.GA32581@in.ibm.com>
In-Reply-To: <20191204134459.22470-1-srikar@linux.vnet.ibm.com>
Hi Srikar,
On Wed, Dec 04, 2019 at 07:14:58PM +0530, Srikar Dronamraju wrote:
> With commit 247f2f6f3c70 ("sched/core: Don't schedule threads on pre-empted
> vCPUs"), the scheduler avoids scheduling tasks on preempted vCPUs at wakeup.
> This leads to a wrong choice of CPU, which in turn leads to larger wakeup
> latencies. Eventually, it leads to a performance regression in
> latency-sensitive benchmarks like soltp, schbench etc.
The regression in the latency-sensitive benchmarks is due to the
scheduler preferring potentially busy vCPUs over idle vCPUs in the
CEDE state, which vcpu_is_preempted() reports as preempted.
>
> On Powerpc, vcpu_is_preempted only looks at yield_count. If the
> yield_count is odd, the vCPU is assumed to be preempted. However
> yield_count is increased whenever LPAR enters CEDE state. So any CPU
> that has entered CEDE state is assumed to be preempted.
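To make the parity argument concrete, here is a minimal userspace sketch,
not kernel code. The struct and the model_* helpers are hypothetical names
used only for illustration, and the increment on dispatch is my assumption
about how the count returns to even; the quoted text only states that the
count is bumped on entering CEDE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU yield_count; not the kernel's lppaca. */
struct model_vcpu {
	uint32_t yield_count;	/* host-endian here; the real field is big-endian */
};

/* The vCPU gives up the physical CPU, whether preempted or ceded while idle. */
static inline void model_vcpu_leave(struct model_vcpu *v)
{
	v->yield_count++;
}

/* Assumption: the count is bumped again when the vCPU is dispatched. */
static inline void model_vcpu_dispatch(struct model_vcpu *v)
{
	v->yield_count++;
}

/* Same parity test as the pre-patch vcpu_is_preempted(): odd means "preempted". */
static inline bool model_vcpu_is_preempted(const struct model_vcpu *v)
{
	return !!(v->yield_count & 1);
}
```

In this model a vCPU that has merely ceded because it is idle is
indistinguishable from one that was preempted, which is exactly the case
the wakeup path ends up avoiding.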
>
> Even if a vCPU of a dedicated LPAR is preempted/donated, it should have
> right of first-use since the LPAR is supposed to own the vCPU.
>
> On a Power9 System with 32 cores
> # lscpu
> Architecture: ppc64le
> Byte Order: Little Endian
> CPU(s): 128
> On-line CPU(s) list: 0-127
> Thread(s) per core: 8
> Core(s) per socket: 1
> Socket(s): 16
> NUMA node(s): 2
> Model: 2.2 (pvr 004e 0202)
> Model name: POWER9 (architected), altivec supported
> Hypervisor vendor: pHyp
> Virtualization type: para
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 512K
> L3 cache: 10240K
> NUMA node0 CPU(s): 0-63
> NUMA node1 CPU(s): 64-127
>
>
> # perf stat -a -r 5 ./schbench
> v5.4                              v5.4 + patch
> Latency percentiles (usec)        Latency percentiles (usec)
>         50.0000th: 47                     50.0000th: 33
>         75.0000th: 64                     75.0000th: 44
>         90.0000th: 76                     90.0000th: 50
>         95.0000th: 83                     95.0000th: 53
>         *99.0000th: 103                   *99.0000th: 57
>         99.5000th: 2124                   99.5000th: 59
>         99.9000th: 7976                   99.9000th: 83
>         min=-1, max=10519                 min=0, max=117
> Latency percentiles (usec)        Latency percentiles (usec)
>         50.0000th: 45                     50.0000th: 34
>         75.0000th: 61                     75.0000th: 45
>         90.0000th: 70                     90.0000th: 52
>         95.0000th: 77                     95.0000th: 56
>         *99.0000th: 504                   *99.0000th: 62
>         99.5000th: 4012                   99.5000th: 64
>         99.9000th: 8168                   99.9000th: 79
>         min=-1, max=14500                 min=0, max=123
> Latency percentiles (usec)        Latency percentiles (usec)
>         50.0000th: 48                     50.0000th: 35
>         75.0000th: 65                     75.0000th: 47
>         90.0000th: 76                     90.0000th: 55
>         95.0000th: 82                     95.0000th: 59
>         *99.0000th: 1098                  *99.0000th: 67
>         99.5000th: 3988                   99.5000th: 71
>         99.9000th: 9360                   99.9000th: 98
>         min=-1, max=19283                 min=0, max=137
> Latency percentiles (usec)        Latency percentiles (usec)
>         50.0000th: 46                     50.0000th: 35
>         75.0000th: 63                     75.0000th: 46
>         90.0000th: 73                     90.0000th: 53
>         95.0000th: 78                     95.0000th: 57
>         *99.0000th: 113                   *99.0000th: 63
>         99.5000th: 2316                   99.5000th: 65
>         99.9000th: 7704                   99.9000th: 83
>         min=-1, max=17976                 min=0, max=139
> Latency percentiles (usec)        Latency percentiles (usec)
>         50.0000th: 46                     50.0000th: 34
>         75.0000th: 62                     75.0000th: 46
>         90.0000th: 73                     90.0000th: 53
>         95.0000th: 79                     95.0000th: 57
>         *99.0000th: 97                    *99.0000th: 64
>         99.5000th: 1398                   99.5000th: 70
>         99.9000th: 8136                   99.9000th: 100
>         min=-1, max=10008                 min=0, max=142
>
> Performance counter stats for 'system wide' (5 runs):
>
> context-switches    42,604 ( +-  0.87% )    45,397 ( +- 0.25% )
> cpu-migrations       0,195 ( +-  2.70% )       230 ( +- 7.23% )
> page-faults         16,783 ( +- 14.87% )    16,781 ( +- 9.77% )
>
> Waiman Long suggested using static_keys.
This needs to be Cc'ed to stable for kernels 4.18+.
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>
> Reported-by: Parth Shah <parth@linux.ibm.com>
> Reported-by: Ihor Pasichnyk <Ihor.Pasichnyk@ibm.com>
> Cc: Parth Shah <parth@linux.ibm.com>
> Cc: Ihor Pasichnyk <Ihor.Pasichnyk@ibm.com>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Waiman Long <longman@redhat.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/spinlock.h | 5 +++--
> arch/powerpc/mm/numa.c | 4 ++++
> 2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
> index e9a960e28f3c..866f6ca0427a 100644
> --- a/arch/powerpc/include/asm/spinlock.h
> +++ b/arch/powerpc/include/asm/spinlock.h
> @@ -35,11 +35,12 @@
> #define LOCK_TOKEN 1
> #endif
>
> -#ifdef CONFIG_PPC_PSERIES
> +#if defined(CONFIG_PPC_PSERIES) && defined(CONFIG_PPC_SPLPAR)
> +DECLARE_STATIC_KEY_FALSE(shared_processor);
> #define vcpu_is_preempted vcpu_is_preempted
> static inline bool vcpu_is_preempted(int cpu)
> {
> - if (!firmware_has_feature(FW_FEATURE_SPLPAR))
> + if (!static_branch_unlikely(&shared_processor))
> return false;
> return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
> }
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 50d68d21ddcc..ffb971f3a63c 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1568,9 +1568,13 @@ int prrn_is_enabled(void)
> return prrn_enabled;
> }
>
> +DEFINE_STATIC_KEY_FALSE(shared_processor);
> +EXPORT_SYMBOL_GPL(shared_processor);
> +
> void __init shared_proc_topology_init(void)
> {
> if (lppaca_shared_proc(get_lppaca())) {
> + static_branch_enable(&shared_processor);
> bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
> nr_cpumask_bits);
> numa_update_cpu_topology(false);
> --
> 2.18.1
>
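For readers following the diff, the behaviour after the patch can be
modelled in userspace as below. This is only a sketch under stated
assumptions: a plain bool stands in for the static key (the real code uses
DEFINE_STATIC_KEY_FALSE() and static_branch_enable(), which patch the
branch at runtime), and the model_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the shared_processor static key; false by default. */
static bool model_shared_processor;

/* Models shared_proc_topology_init(): flip the gate once for shared LPARs. */
static void model_shared_proc_init(bool lpar_is_shared)
{
	if (lpar_is_shared)
		model_shared_processor = true;
}

/* Models the patched vcpu_is_preempted(): dedicated LPARs never report preemption. */
static bool model_vcpu_is_preempted_patched(uint32_t yield_count)
{
	if (!model_shared_processor)
		return false;
	return !!(yield_count & 1);
}
```

With the gate left at its default, an odd yield_count no longer marks the
vCPU as preempted, so a ceded vCPU on a dedicated LPAR stays eligible at
wakeup; on a shared LPAR the parity check applies as before.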