From: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
To: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Nathan Lynch <nathanl@linux.ibm.com>,
Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr
Date: Wed, 01 Apr 2020 15:28:48 +0530
Message-ID: <1585734367.oqwn7dzljo.naveen@linux.ibm.com>
In-Reply-To: <1585308760-28792-7-git-send-email-ego@linux.vnet.ibm.com>

Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> Currently purr, spurr, idle_purr, idle_spurr are exposed for every CPU
> via the sysfs interface
> /sys/devices/system/cpu/cpuX/[idle_][s]purr. Each sysfs read currently
> generates an IPI to obtain the desired value from the target CPU X.
> Since these aforementioned sysfs are typically read one after another,
> we end up generating 4 IPIs per CPU in a short duration.
>
> In order to minimize the IPI noise, this patch caches the values of
> all the four entities whenever one of them is read. If subsequently
> any of these are read within the next 10ms, the cached value is
> returned. With this, we will generate at most one IPI every 10ms for
> every CPU.
>
> Test-results: While reading the four sysfs files back-to-back for a
> given CPU every second for 100 seconds.
>
> Without the patch:
> 16 [XICS 2 Edge IPI] = 422 times
> DBL [Doorbell interrupts] = 13 times
> Total : 435 IPIs.
>
> With the patch:
> 16 [XICS 2 Edge IPI] = 111 times
> DBL [Doorbell interrupts] = 17 times
> Total : 128 IPIs.
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
> arch/powerpc/kernel/sysfs.c | 117 ++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 97 insertions(+), 20 deletions(-)
>
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index 571b325..bd92023 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -586,8 +586,6 @@ void ppc_enable_pmcs(void)
> * SPRs which are not related to PMU.
> */
> #ifdef CONFIG_PPC64
> -SYSFS_SPRSETUP(purr, SPRN_PURR);
> -SYSFS_SPRSETUP(spurr, SPRN_SPURR);
> SYSFS_SPRSETUP(pir, SPRN_PIR);
> SYSFS_SPRSETUP(tscr, SPRN_TSCR);
>
> @@ -596,8 +594,6 @@ void ppc_enable_pmcs(void)
> enable write when needed with a separate function.
> Lets be conservative and default to pseries.
> */
> -static DEVICE_ATTR(spurr, 0400, show_spurr, NULL);
> -static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
> static DEVICE_ATTR(pir, 0400, show_pir, NULL);
> static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr);
> #endif /* CONFIG_PPC64 */
> @@ -761,22 +757,110 @@ static void create_svm_file(void)
> }
> #endif /* CONFIG_PPC_SVM */
>
> +#ifdef CONFIG_PPC64
> +/*
> + * The duration (in ms) from the last IPI to the target CPU until
> + * which a cached value of purr, spurr, idle_purr, idle_spurr can be
> + * reported to the user on a corresponding sysfs file read. Beyond
> + * this duration, fresh values need to be obtained by sending IPIs to
> + * the target CPU when the sysfs files are read.
> + */
> +static unsigned long util_stats_staleness_tolerance_ms = 10;

This is a nice optimization for our use in lparstat, though I have a
concern below.

> +struct util_acct_stats {
> +	u64 latest_purr;
> +	u64 latest_spurr;
> +#ifdef CONFIG_PPC_PSERIES
> +	u64 latest_idle_purr;
> +	u64 latest_idle_spurr;
> +#endif

You can probably drop the 'latest_' prefix.

> +	unsigned long last_update_jiffies;
> +};
> +
> +DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats);

Per snowpatch, this should be static, and so should get_util_stats_ptr()
below:
https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/16601//artifact/linux/report.txt

> +
> +static void update_util_acct_stats(void *ptr)
> +{
> +	struct util_acct_stats *stats = ptr;
> +
> +	stats->latest_purr = mfspr(SPRN_PURR);
> +	stats->latest_spurr = mfspr(SPRN_SPURR);
>  #ifdef CONFIG_PPC_PSERIES
> -static void read_idle_purr(void *val)
> +	stats->latest_idle_purr = read_this_idle_purr();
> +	stats->latest_idle_spurr = read_this_idle_spurr();
> +#endif
> +	stats->last_update_jiffies = jiffies;
> +}
> +
> +struct util_acct_stats *get_util_stats_ptr(int cpu)
> +{
> +	struct util_acct_stats *stats = per_cpu_ptr(&util_acct_stats, cpu);
> +	unsigned long delta_jiffies;
> +
> +	delta_jiffies = jiffies - stats->last_update_jiffies;
> +
> +	/*
> +	 * If we have a recent enough data, reuse that instead of
> +	 * sending an IPI.
> +	 */
> +	if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms)
> +		return stats;
> +
> +	smp_call_function_single(cpu, update_util_acct_stats, stats, 1);
> +	return stats;
> +}
> +
> +static ssize_t show_purr(struct device *dev,
> +			 struct device_attribute *attr, char *buf)
>  {
> -	u64 *ret = val;
> +	struct cpu *cpu = container_of(dev, struct cpu, dev);
> +	struct util_acct_stats *stats;
>
> -	*ret = read_this_idle_purr();
> +	stats = get_util_stats_ptr(cpu->dev.id);
> +	return sprintf(buf, "%llx\n", stats->latest_purr);

This alters the behavior of the current sysfs purr file. I am not sure
it is reasonable to return the same PURR value for every read within a
10ms window.

I wonder if we should introduce a sysctl interface to control the
staleness threshold. It can default to 0, which disables caching so
that the existing behavior continues. Applications (lparstat) can
optionally set it to suit their use.

- Naveen