public inbox for linux-kernel@vger.kernel.org
From: "Moger, Babu" <babu.moger@amd.com>
To: James Morse <james.morse@arm.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>,
	shameerali.kolothum.thodi@huawei.com,
	D Scott Phillips OS <scott@os.amperecomputing.com>,
	carl@os.amperecomputing.com, lcherian@marvell.com,
	bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
	baolin.wang@linux.alibaba.com,
	Jamie Iles <quic_jiles@quicinc.com>,
	Xin Hao <xhao@linux.alibaba.com>,
	peternewman@google.com, dfustini@baylibre.com,
	amitsinght@marvell.com
Subject: Re: [PATCH v7 13/24] x86/resctrl: Queue mon_event_read() instead of sending an IPI
Date: Thu, 9 Nov 2023 14:40:33 -0600
Message-ID: <c0cc5cfc-c22b-4a27-b512-75f2e27b59cb@amd.com>
In-Reply-To: <20231025180345.28061-14-james.morse@arm.com>

Hi James,

On 10/25/23 13:03, James Morse wrote:
> Intel is blessed with an abundance of monitors, one per RMID, that can be
> read from any CPU in the domain. MPAM's monitors reside in the MMIO MSC,
> and the number implemented is up to the manufacturer. This means when there are
> fewer monitors than needed, they need to be allocated and freed.
> 
> MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The
> CSU counter is allowed to return 'not ready' for a small number of
> micro-seconds after programming. To allow one CSU hardware monitor to be
> used for multiple control or monitor groups, the CPU accessing the
> monitor needs to be able to block when configuring and reading the
> counter.
> 
> Worse, the domain may be broken up into slices, and the MMIO accesses
> for each slice may need performing from different CPUs.
> 
> These two details mean MPAM's monitor code needs to be able to sleep, and
> IPI another CPU in the domain to read from a resource that has been sliced.
> 
> mon_event_read() already invokes mon_event_count() via IPI, which means
> this isn't possible. On systems using nohz-full, some CPUs need to be
> interrupted to run kernel work as they otherwise stay in user-space
> running realtime workloads. Interrupting these CPUs should be avoided,
> and scheduling work on them may never complete.
> 
> Change mon_event_read() to pick a housekeeping CPU, (one that is not using
> nohz_full) and schedule mon_event_count() and wait. If all the CPUs
> in a domain are using nohz-full, then an IPI is used as the fallback.
> 
> This function is only used in response to a user-space filesystem request
> (not the timing sensitive overflow code).
> 
> This allows MPAM to hide the slice behaviour from resctrl, and to keep
> the monitor-allocation in monitor.c. When the IPI fallback is used on
> machines where MPAM needs to make an access on multiple CPUs, the counter
> read will always fail.
> 
> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
> Reviewed-by: Peter Newman <peternewman@google.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v2:
>  * Use cpumask_any_housekeeping() and fallback to an IPI if needed.
> 
> Changes since v3:
>  * Actually include the IPI fallback code.
> 
> Changes since v4:
>  * Tinkered with existing capitalisation.
> 
> Changes since v5:
>  * Added a newline.
> 
> Changes since v6:
>  * Moved lockdep annotations to a later patch.
> ---
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 26 +++++++++++++++++++++--
>  arch/x86/kernel/cpu/resctrl/monitor.c     |  2 +-
>  2 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index beccb0e87ba7..d07f99245851 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -19,6 +19,8 @@
>  #include <linux/kernfs.h>
>  #include <linux/seq_file.h>
>  #include <linux/slab.h>
> +#include <linux/tick.h>
> +
>  #include "internal.h"
>  
>  /*
> @@ -522,12 +524,21 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
>  	return ret;
>  }
>  
> +static int smp_mon_event_count(void *arg)
> +{
> +	mon_event_count(arg);
> +
> +	return 0;
> +}

Shouldn't this function be defined as "void", similar to mon_event_count()?
The return code is not used anywhere.
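
One thing that may bear on this: as far as I know, smp_call_on_cpu()
takes an int (*)(void *) callback, while smp_call_function_any() takes a
void (*)(void *) one, so a thin wrapper might be needed just to match the
prototype. Below is a minimal userspace sketch of that adapter shape. All
of the *_sketch names are hypothetical stand-ins for illustration and are
not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rmid_read; only one field is modelled. */
struct rmid_read_sketch {
	unsigned long long val;
};

/* Stand-in for mon_event_count(): a void callback, the shape used for IPIs. */
void mon_event_count_sketch(void *info)
{
	struct rmid_read_sketch *rr = info;

	rr->val += 1;
}

/* Adapter that gives the void callback an int-returning signature. */
int smp_mon_event_count_sketch(void *arg)
{
	mon_event_count_sketch(arg);

	return 0;
}

/*
 * Stand-in for an API that only accepts int (*)(void *) callbacks.
 * Assumption: the real call would run func on the given CPU; this
 * sketch simply runs it on the calling thread and returns its result.
 */
int call_on_cpu_sketch(unsigned int cpu, int (*func)(void *), void *par)
{
	(void)cpu;
	assert(func != NULL);

	return func(par);
}
```

Passing mon_event_count_sketch directly to call_on_cpu_sketch() would be
a pointer type mismatch, which is the only reason the adapter exists in
this sketch.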

> +
>  void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>  		    struct rdt_domain *d, struct rdtgroup *rdtgrp,
>  		    int evtid, int first)
>  {
> +	int cpu;
> +
>  	/*
> -	 * setup the parameters to send to the IPI to read the data.
> +	 * Setup the parameters to pass to mon_event_count() to read the data.
>  	 */
>  	rr->rgrp = rdtgrp;
>  	rr->evtid = evtid;
> @@ -536,7 +547,18 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>  	rr->val = 0;
>  	rr->first = first;
>  
> -	smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
> +	cpu = cpumask_any_housekeeping(&d->cpu_mask);
> +
> +	/*
> +	 * cpumask_any_housekeeping() prefers housekeeping CPUs, but
> +	 * are all the CPUs nohz_full? If yes, pick a CPU to IPI.
> +	 * MPAM's resctrl_arch_rmid_read() is unable to read the
> +	 * counters on some platforms if its called in irq context.
> +	 */
> +	if (tick_nohz_full_cpu(cpu))
> +		smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
> +	else
> +		smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
>  }
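
To check my reading of the selection logic above, here is a minimal
userspace sketch of the decision: prefer a housekeeping CPU in the domain
and schedule the read there, otherwise fall back to an IPI. The bitmask
representation and every *_sketch name are assumptions made for
illustration. In particular, picking the lowest-numbered CPU is a
simplification and not necessarily what cpumask_any_housekeeping() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum read_path_sketch { READ_VIA_IPI, READ_VIA_WORK };

/* Hypothetical model: bit n set means CPU n is in the mask. */
struct cpu_topology_sketch {
	uint64_t domain_mask;		/* CPUs belonging to the domain */
	uint64_t nohz_full_mask;	/* CPUs running nohz_full */
};

/*
 * Return the lowest housekeeping CPU in the domain, or the lowest CPU
 * in the domain if every CPU in it is nohz_full.
 */
int any_housekeeping_sketch(const struct cpu_topology_sketch *t)
{
	uint64_t hk = t->domain_mask & ~t->nohz_full_mask;
	uint64_t pick = hk ? hk : t->domain_mask;
	int cpu = 0;

	assert(pick != 0);	/* the sketch assumes a non-empty domain */
	while (!(pick & 1)) {
		pick >>= 1;
		cpu++;
	}

	return cpu;
}

bool is_nohz_full_sketch(const struct cpu_topology_sketch *t, int cpu)
{
	return (t->nohz_full_mask >> cpu) & 1;
}

/* Mirror the if/else in the hunk: IPI only when no housekeeping CPU exists. */
enum read_path_sketch choose_read_path_sketch(const struct cpu_topology_sketch *t,
					      int *cpu)
{
	*cpu = any_housekeeping_sketch(t);

	return is_nohz_full_sketch(t, *cpu) ? READ_VIA_IPI : READ_VIA_WORK;
}
```

With a domain of CPUs 0-3 where CPUs 0-1 are nohz_full, the sketch picks
CPU 2 and takes the scheduled-work path. With a domain whose CPUs are all
nohz_full, it takes the IPI path.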
>  
>  int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 718770aea2af..fa3319021881 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -588,7 +588,7 @@ static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
>  }
>  
>  /*
> - * This is called via IPI to read the CQM/MBM counters
> + * This is scheduled by mon_event_read() to read the CQM/MBM counters
>   * on a domain.
>   */
>  void mon_event_count(void *info)

-- 
Thanks
Babu Moger
