From: Dave Martin <Dave.Martin@arm.com>
To: "Moger, Babu" <bmoger@amd.com>
Cc: Babu Moger <babu.moger@amd.com>,
	corbet@lwn.net, reinette.chatre@intel.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	tony.luck@intel.com, peternewman@google.com, x86@kernel.org,
	hpa@zytor.com, paulmck@kernel.org, akpm@linux-foundation.org,
	thuth@redhat.com, rostedt@goodmis.org,
	xiongwei.song@windriver.com, pawan.kumar.gupta@linux.intel.com,
	daniel.sneddon@linux.intel.com, jpoimboe@kernel.org,
	perry.yuan@amd.com, sandipan.das@amd.com, kai.huang@intel.com,
	xiaoyao.li@intel.com, seanjc@google.com, xin3.li@intel.com,
	andrew.cooper3@citrix.com, ebiggers@google.com,
	mario.limonciello@amd.com, james.morse@arm.com,
	tan.shaopeng@fujitsu.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, maciej.wieczor-retman@intel.com,
	eranian@google.com
Subject: Re: [PATCH v11 23/23] x86/resctrl: Introduce interface to modify assignment states of the groups
Date: Thu, 20 Feb 2025 15:21:33 +0000
Message-ID: <Z7dIfWAk+f4Gc54X@e133380.arm.com>
In-Reply-To: <1ccb907b-e8c9-4997-bc45-4a457ee84494@amd.com>

Hi,

On Wed, Feb 19, 2025 at 06:34:42PM -0600, Moger, Babu wrote:
> Hi Dave,
> 
> On 2/19/2025 10:07 AM, Dave Martin wrote:
> > Hi,
> > 
> > On Wed, Jan 22, 2025 at 02:20:31PM -0600, Babu Moger wrote:

> > [...]
> > 
> > > diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > > index 6e29827239e0..299839bcf23f 100644
> > > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > > @@ -1050,6 +1050,244 @@ static int resctrl_mbm_assign_control_show(struct kernfs_open_file *of,
> > 
> > [...]
> > 
> > > +static ssize_t resctrl_mbm_assign_control_write(struct kernfs_open_file *of,
> > > +						char *buf, size_t nbytes, loff_t off)
> > > +{

[...]

> > > +	while ((token = strsep(&buf, "\n")) != NULL) {
> > > +		/*
> > > +		 * The write command follows the following format:
> > > +		 * “<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>”
> > > +		 * Extract the CTRL_MON group.
> > > +		 */
> > > +		cmon_grp = strsep(&token, "/");
> > > +
> > 
> > As when reading this file, I think that the data can grow larger than a
> > page and get split into multiple write() calls.
> > 
> > I don't currently think the file needs to be redesigned, but there are
> > some concerns about how userspace will work with it that need to be
> > sorted out.
> > 
> > Every monitoring group can contribute a line to this file:
> > 
> > 	CTRL_GROUP / MON_GROUP / DOMAIN = [t][l] [ ; DOMAIN = [t][l] ]* LF
> > 
> > so, 2 * (NAME_MAX + 1) + NUM_DOMAINS * 5 - 1 + 1
> > 
> > NAME_MAX on Linux is 255, so with, say, up to 16 domains, that's about
> > 600 bytes per monitoring group in the worst case.
> > 
> > We don't need to have many control and monitoring groups for this to
> > grow potentially over 4K.
> > 
> > 
> > We could simply place a limit on how much userspace is allowed to write
> > to this file in one go, although this restriction feels difficult for
> > userspace to follow -- but maybe this is workable in the short term, on
> > current systems (?)
> > 
> > Otherwise, since we expect this interface to be written using scripting
> > languages, I think we need to be prepared to accept fully-buffered
> > I/O.  That means that the data may be cut at random places, not
> > necessarily at newlines.  (For smaller files such as schemata this is
> > not such an issue, since the whole file is likely to be small enough to
> > fit into the default stdio buffers -- this is how sysfs gets away with
> > it IIUC.)
> > 
> > For fully-buffered I/O, we may have to cache an incomplete line in
> > between write() calls.  If there is a dangling incomplete line when the
> > file is closed then it is hard to tell userspace, because people often
> > don't bother to check the return value of close(), fclose() etc.
> > However, since it's an ABI violation for userspace to end this file
> > with a partial line, I think it's sufficient to report that via
> > last_cmd_status.  (Making close() return -EIO still seems a good idea
> > though, just in case userspace is listening.)
> 
> Seems like we can add a check in resctrl_mbm_assign_control_write() to
> compare nbytes > PAGE_SIZE.

This might be a reasonable stopgap approach, if we are confident that the
number of RMIDs and monitoring domains is small enough on known
platforms that the problem is unlikely to be hit.  I can't really judge
on this.
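
To put rough numbers on the estimate quoted above, here is a small
sketch of the arithmetic.  The helper names are made up for
illustration, and the 5 bytes per domain entry is simply the figure
used in the estimate, not something taken from the patch:

```c
#include <stddef.h>

/*
 * Worst-case size in bytes of one line, following the quoted estimate:
 * two group names, each followed by '/', then num_domains entries of
 * 5 bytes each, with the final ';' replaced by a newline.
 * (Hypothetical helper, for illustration only.)
 */
static size_t worst_case_line_bytes(size_t name_max, size_t num_domains)
{
	return 2 * (name_max + 1) + num_domains * 5 - 1 + 1;
}

/* Number of complete worst-case lines that fit in one buffer. */
static size_t whole_lines_per_buffer(size_t buf_bytes, size_t line_bytes)
{
	return buf_bytes / line_bytes;
}
```

With NAME_MAX = 255 and 16 domains this gives 592 bytes per line, so a
4096-byte buffer holds only 6 complete worst-case lines; a 7th would
straddle the boundary.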

> But do we really need this? I have no way of testing this. Help me
> understand.

It's easy to demonstrate this using the schemata file (which works in a
similar way).  Open /sys/fs/resctrl/schemata for writing as f, then:

	int n = 0;

	for (n = 0; n < 1000; n++)
		if (fputs("MB:0=100;1=100\n", f) == EOF)
			fprintf(stderr, "Failed on iteration %d\n", n);

This will succeed a certain number of times (272, for me) and then fail
when the stdio buffer for f overflows, triggering a write().

Putting an explicit fflush() after every fputs() call (or doing a
setlinebuf(f) before the loop) makes it work.  But this is awkward and
unexpected for the user, and doing the right thing from a scripting
language may be tricky.
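
As a sketch of the line-buffering workaround (the function name is
hypothetical, and this assumes every string ends in a newline and is
shorter than the stdio buffer), setvbuf() with _IOLBF is the standard
C equivalent of setlinebuf():

```c
#include <stdio.h>

/*
 * Put the stream into line-buffered mode so that stdio flushes at each
 * newline instead of when its buffer happens to fill.  Must be called
 * before any other I/O on f.  Each string in lines[] is assumed to end
 * with '\n'.  Returns 0 on success, -1 on the first failure.
 */
static int write_lines_linebuffered(FILE *f, const char *const *lines,
				    size_t n)
{
	size_t i;

	if (setvbuf(f, NULL, _IOLBF, BUFSIZ) != 0)
		return -1;

	for (i = 0; i < n; i++) {
		if (fputs(lines[i], f) == EOF)
			return -1;
	}

	/* Push out anything still pending and report a late error. */
	return fflush(f) == EOF ? -1 : 0;
}
```

A line longer than the stdio buffer could still be split across
write() calls, so this only helps while individual lines stay small.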

In this example I am doing something a bit artificial -- we don't
officially say what happens when a pre-opened schemata file handle is
reused in this way, AFAICT.  But for mbm_assign_control it is
legitimate to write many lines, and we can hit this kind of problem.


I'll leave it to others to judge whether we _need_ to fix this, but it
feels like a problem waiting to happen.


> All these file operations go thru generic call kernfs_fop_write_iter().
> Doesn't it take care of buffer check and overflow?

No, this is called for each iovec segment (where userspace used one of
the iovec-based I/O syscalls).  But there is no buffering or
concatenation of the data read in: each segment gets passed down to the
individual kernfs_file_operations write method for the file:

	len = ops->write(of, buf, len, iocb->ki_pos)

calls down to

	resctrl_mbm_assign_control_write(of, buf, len, iocb->ki_pos).


I'll try to port my buffering hack on top of the series -- that should
help to illustrate what I mean.
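
In the meantime, here is a rough userspace sketch of the general idea
of carrying an incomplete line over between write() calls.  This is
not the hack mentioned above and not kernel code: the names (lb_feed,
lb_dangling, LB_MAX, log_line) are invented for illustration, and the
fixed line cap is an assumption.

```c
#include <stddef.h>
#include <string.h>

#define LB_MAX 1024	/* assumed cap on one line; not from the patch */

struct line_buf {
	char data[LB_MAX];
	size_t len;		/* bytes of incomplete line carried over */
};

typedef int (*line_fn)(const char *line, void *ctx);

/*
 * Feed one write()-sized chunk.  Calls fn once per complete line, with
 * the newline stripped and the line NUL-terminated.  Returns 0 on
 * success, -1 if a line exceeds LB_MAX - 1 bytes or fn fails.
 */
static int lb_feed(struct line_buf *lb, const char *chunk, size_t n,
		   line_fn fn, void *ctx)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (chunk[i] == '\n') {
			lb->data[lb->len] = '\0';
			lb->len = 0;
			if (fn(lb->data, ctx))
				return -1;
		} else {
			if (lb->len >= LB_MAX - 1)
				return -1;
			lb->data[lb->len++] = chunk[i];
		}
	}
	return 0;
}

/* At close time: nonzero means the input ended with a partial line. */
static int lb_dangling(const struct line_buf *lb)
{
	return lb->len != 0;
}

/* Example callback: count lines and remember the most recent one. */
struct line_log {
	unsigned int count;
	char last[LB_MAX];
};

static int log_line(const char *line, void *ctx)
{
	struct line_log *log = ctx;

	log->count++;
	strcpy(log->last, line);	/* line is always shorter than LB_MAX */
	return 0;
}
```

Feeding "a/b/0=t\nc/d" and then "/1=l\n" delivers the two lines
"a/b/0=t" and "c/d/1=l" to the callback, regardless of where the cut
fell.  A nonzero lb_dangling() at close time corresponds to the
partial-line case that would be reported via last_cmd_status.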

Cheers
---Dave
