From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Nathan Lynch <nathanl@linux.ibm.com>, mpe@ellerman.id.au
Cc: naveen.n.rao@linux.ibm.com, tyreld@linux.ibm.com,
linuxppc-dev@lists.ozlabs.org, mahesh@linux.ibm.com,
npiggin@gmail.com
Subject: Re: [PATCH] powerpc/pseries: Add pool idle time at LPAR boot
Date: Fri, 5 Apr 2024 22:32:24 +0530 [thread overview]
Message-ID: <4b1d3241-2caa-4bd1-b9cc-871d176c409e@linux.ibm.com> (raw)
In-Reply-To: <87msq86kju.fsf@li-e15d104c-2135-11b2-a85c-d7ef17e56be6.ibm.com>
On 4/5/24 6:19 PM, Nathan Lynch wrote:
> Shrikanth Hegde <sshegde@linux.ibm.com> writes:
Hi Nathan, Thanks for reviewing this.
>> When there are no options specified for lparstat, it is expected to
>> give reports since LPAR(Logical Partition) boot. App is an indicator
>> for available processor pool in an Shared Processor LPAR(SPLPAR). App is
>> derived using pool_idle_time which is obtained using H_PIC call.
>
> If "App" is short for "Available Procesoor Pool" then it should be
> written "APP" and the it should be introduced and defined more clearly
> than this.
>
Ok.I reworded it for v2.
yes APP is Available Processor Pool.
>
>> The interval based reports show correct App value while since boot
>> report shows very high App values. This happens because in that case app
>> is obtained by dividing pool idle time by LPAR uptime. Since pool idle
>> time is reported by the PowerVM hypervisor since its boot, it need not
>> align with LPAR boot. This leads to large App values.
>>
>> To fix that export boot pool idle time in lparcfg and powerpc-utils will
>> use this info to derive App as below for since boot reports.
>>
>> App = (pool idle time - boot pool idle time) / (uptime * timebase)
>>
>> Results:: Observe app values.
>> ====================== Shared LPAR ================================
>> lparstat
>> System Configuration
>> type=Shared mode=Uncapped smt=8 lcpu=12 mem=15573440 kB cpus=37 ent=12.00
>>
>> reboot
>> stress-ng --cpu=$(nproc) -t 600
>> sleep 600
>> So in this case app is expected to close to 37-6=31.
>>
>> ====== 6.9-rc1 and lparstat 1.3.10 =============
>> %user %sys %wait %idle physc %entc lbusy app vcsw phint
>> ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
>> 47.48 0.01 0.00 52.51 0.00 0.00 47.49 69099.72 541547 21
>>
>> === With this patch and powerpc-utils patch to do the above equation ===
>> %user %sys %wait %idle physc %entc lbusy app vcsw phint
>> ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
>> 47.48 0.01 0.00 52.51 5.73 47.75 47.49 31.21 541753 21
>> =====================================================================
>>
>> Note: physc, purr/idle purr being inaccurate is being handled in a
>> separate patch in powerpc-utils tree.
>>
>> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
>> ---
>> Note:
>>
>> This patch needs to merged first in the kernel for the powerpc-utils
>> patches to work. powerpc-utils patches will be posted to its mailing
>> list and link would be found in the reply to this patch if available.
>>
>> arch/powerpc/platforms/pseries/lparcfg.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
>> index f73c4d1c26af..8df4e7c529d7 100644
>> --- a/arch/powerpc/platforms/pseries/lparcfg.c
>> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
>> @@ -184,6 +184,8 @@ static unsigned h_pic(unsigned long *pool_idle_time,
>> return rc;
>> }
>>
>> +unsigned long boot_pool_idle_time;
>
> Should be static, and u64. Better to use explicitly sized types for data
> at the kernel-hypervisor boundary.
Current usage of h_pic doesn't follow this either. Are you suggesting we change that
as well? Or is this applicable to only boot_pool_idle_time?
For example in parse_ppp_data:
if (lppaca_shared_proc()) {
unsigned long pool_idle_time, pool_procs;
h_pic(&pool_idle_time, &pool_procs);
seq_printf(m, "pool_idle_time=%ld\n", pool_idle_time);
seq_printf(m, "pool_num_procs=%ld\n", pool_procs);
>
>> +
>> /*
>> * parse_ppp_data
>> * Parse out the data returned from h_get_ppp and h_pic
>> @@ -218,6 +220,7 @@ static void parse_ppp_data(struct seq_file *m)
>> h_pic(&pool_idle_time, &pool_procs);
>> seq_printf(m, "pool_idle_time=%ld\n", pool_idle_time);
>> seq_printf(m, "pool_num_procs=%ld\n", pool_procs);
>> + seq_printf(m, "boot_pool_idle_time=%ld\n", boot_pool_idle_time);
>
> If boot_pool_idle_time is unsigned then the format string should be %ul
> or similar, not %ld.
>
>> }
>>
>> seq_printf(m, "unallocated_capacity_weight=%d\n",
>> @@ -792,6 +795,7 @@ static const struct proc_ops lparcfg_proc_ops = {
>> static int __init lparcfg_init(void)
>> {
>> umode_t mode = 0444;
>> + unsigned long num_procs;
>>
>> /* Allow writing if we have FW_FEATURE_SPLPAR */
>> if (firmware_has_feature(FW_FEATURE_SPLPAR))
>> @@ -801,6 +805,9 @@ static int __init lparcfg_init(void)
>> printk(KERN_ERR "Failed to create powerpc/lparcfg\n");
>> return -EIO;
>> }
>> +
>> + h_pic(&boot_pool_idle_time, &num_procs);
>
> h_pic() can fail, leaving the out parameters uninitialized.
Naveen pointed to me this a while ago, but I forgot.
Currently h_pic return value is not checked at all, either at boor or at runtime.
When it fails, should we re-try or just print a kernel debug? What would be expected
behavior? because if it fails, it would anyway result in wrong values of app even
if the variables are initialized to 0.
>
>> +
>> return 0;
>> }
>> machine_device_initcall(pseries, lparcfg_init);
>> --
>> 2.39.3
next prev parent reply other threads:[~2024-04-05 17:03 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-05 10:13 [PATCH] powerpc/pseries: Add pool idle time at LPAR boot Shrikanth Hegde
2024-04-05 11:14 ` Shrikanth Hegde
2024-04-05 12:49 ` Nathan Lynch
2024-04-05 17:02 ` Shrikanth Hegde [this message]
2024-04-05 20:03 ` Nathan Lynch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4b1d3241-2caa-4bd1-b9cc-871d176c409e@linux.ibm.com \
--to=sshegde@linux.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mahesh@linux.ibm.com \
--cc=mpe@ellerman.id.au \
--cc=nathanl@linux.ibm.com \
--cc=naveen.n.rao@linux.ibm.com \
--cc=npiggin@gmail.com \
--cc=tyreld@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox