linux-doc.vger.kernel.org archive mirror
From: Dave Martin <Dave.Martin@arm.com>
To: "Moger, Babu" <bmoger@amd.com>
Cc: Babu Moger <babu.moger@amd.com>,
	corbet@lwn.net, reinette.chatre@intel.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	tony.luck@intel.com, peternewman@google.com, x86@kernel.org,
	hpa@zytor.com, paulmck@kernel.org, akpm@linux-foundation.org,
	thuth@redhat.com, rostedt@goodmis.org,
	xiongwei.song@windriver.com, pawan.kumar.gupta@linux.intel.com,
	daniel.sneddon@linux.intel.com, jpoimboe@kernel.org,
	perry.yuan@amd.com, sandipan.das@amd.com, kai.huang@intel.com,
	xiaoyao.li@intel.com, seanjc@google.com, xin3.li@intel.com,
	andrew.cooper3@citrix.com, ebiggers@google.com,
	mario.limonciello@amd.com, james.morse@arm.com,
	tan.shaopeng@fujitsu.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, maciej.wieczor-retman@intel.com,
	eranian@google.com
Subject: Re: [PATCH v11 23/23] x86/resctrl: Introduce interface to modify assignment states of the groups
Date: Thu, 20 Feb 2025 15:21:33 +0000
Message-ID: <Z7dIfWAk+f4Gc54X@e133380.arm.com>
In-Reply-To: <1ccb907b-e8c9-4997-bc45-4a457ee84494@amd.com>

Hi,

On Wed, Feb 19, 2025 at 06:34:42PM -0600, Moger, Babu wrote:
> Hi Dave,
> 
> On 2/19/2025 10:07 AM, Dave Martin wrote:
> > Hi,
> > 
> > On Wed, Jan 22, 2025 at 02:20:31PM -0600, Babu Moger wrote:

> > [...]
> > 
> > > diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > > index 6e29827239e0..299839bcf23f 100644
> > > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > > @@ -1050,6 +1050,244 @@ static int resctrl_mbm_assign_control_show(struct kernfs_open_file *of,
> > 
> > [...]
> > 
> > > +static ssize_t resctrl_mbm_assign_control_write(struct kernfs_open_file *of,
> > > +						char *buf, size_t nbytes, loff_t off)
> > > +{

[...]

> > > +	while ((token = strsep(&buf, "\n")) != NULL) {
> > > +		/*
> > > +		 * The write command follows the following format:
> > > +		 * “<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>”
> > > +		 * Extract the CTRL_MON group.
> > > +		 */
> > > +		cmon_grp = strsep(&token, "/");
> > > +
> > 
> > As when reading this file, I think that the data can grow larger than a
> > page and get split into multiple write() calls.
> > 
> > I don't currently think the file needs to be redesigned, but there are
> > some concerns about how userspace will work with it that need to be
> > sorted out.
> > 
> > Every monitoring group can contribute a line to this file:
> > 
> > 	CTRL_GROUP / MON_GROUP / DOMAIN = [t][l] [ ; DOMAIN = [t][l] ]* LF
> > 
> > so, 2 * (NAME_MAX + 1) + NUM_DOMAINS * 5 - 1 + 1
> > 
> > NAME_MAX on Linux is 255, so with, say, up to 16 domains, that's about
> > 600 bytes per monitoring group in the worst case.
> > 
> > We don't need to have many control and monitoring groups for this to
> > grow potentially over 4K.
> > 
> > 
> > We could simply place a limit on how much userspace is allowed to write
> > to this file in one go, although this restriction feels difficult for
> > userspace to follow -- but maybe this is workable in the short term, on
> > current systems (?)
> > 
> > Otherwise, since we expect this interface to be written using scripting
> > languages, I think we need to be prepared to accept fully-buffered
> > I/O.  That means that the data may be cut at random places, not
> > necessarily at newlines.  (For smaller files such as schemata this is
> > not such an issue, since the whole file is likely to be small enough to
> > fit into the default stdio buffers -- this is how sysfs gets away with
> > it IIUC.)
> > 
> > For fully-buffered I/O, we may have to cache an incomplete line in
> > between write() calls.  If there is a dangling incomplete line when the
> > file is closed then it is hard to tell userspace, because people often
> > don't bother to check the return value of close(), fclose() etc.
> > However, since it's an ABI violation for userspace to end this file
> > with a partial line, I think it's sufficient to report that via
> > last_cmd_status.  (Making close() return -EIO still seems a good idea
> > though, just in case userspace is listening.)
> 
> Seems like we can add a check in resctrl_mbm_assign_control_write() to
> compare nbytes > PAGE_SIZE.

This might be a reasonable stopgap approach, if we are confident that the
number of RMIDs and monitoring domains is small enough on known
platforms that the problem is unlikely to be hit.  I can't really judge
on this.
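
To put rough numbers on the estimate quoted above, here is a minimal
sketch in C.  The helper names worst_case_line() and groups_to_exceed()
are made up for illustration, and the 5-bytes-per-domain figure is taken
straight from the quoted formula rather than from the kernel source:

```c
#include <limits.h>
#include <stddef.h>

#ifndef NAME_MAX
#define NAME_MAX 255	/* Linux value, as stated in the quoted estimate */
#endif

/*
 * Worst-case bytes for one line, following the quoted formula:
 * two group names, each followed by '/', then 5 bytes per domain
 * entry, minus the ';' that the last entry lacks, plus the final LF.
 */
size_t worst_case_line(size_t num_domains)
{
	return 2 * (NAME_MAX + 1) + num_domains * 5 - 1 + 1;
}

/* Smallest number of worst-case lines whose total exceeds limit bytes. */
size_t groups_to_exceed(size_t limit, size_t num_domains)
{
	return limit / worst_case_line(num_domains) + 1;
}
```

With 16 domains, worst_case_line(16) is 592, and groups_to_exceed(4096, 16)
is 7, so under these assumptions seven monitoring groups with maximal
names are already enough to pass a 4K page.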

> But do we really need this? I have no way of testing this. Help me
> understand.

It's easy to demonstrate this using the schemata file (which works in a
similar way).  Open /sys/fs/resctrl/schemata for writing as f, then:

	int n;

	for (n = 0; n < 1000; n++)
		if (fputs("MB:0=100;1=100\n", f) == EOF)
			fprintf(stderr, "Failed on iteration %d\n", n);

This will succeed a certain number of times (272, for me) and then fail
when the stdio buffer for f overflows, triggering a write().

Putting an explicit fflush() after every fputs() call (or doing a
setlinebuf(f) before the loop) makes it work.  But this is awkward and
unexpected for the user, and doing the right thing from a scripting
language may be tricky.
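
For reference, a minimal sketch of the line-buffered workaround in C.
The function name write_lines() is hypothetical.  It uses
setvbuf(f, NULL, _IOLBF, 0), which is the standard C equivalent of
setlinebuf(f), and it assumes that every string ends in '\n' and is
short enough to fit in the stdio buffer, so that each line reaches the
kernel in a single write():

```c
#include <stdio.h>

/*
 * Write n newline-terminated lines to path with line buffering, so
 * stdio flushes at every newline instead of when its buffer fills.
 * Returns 0 on success, -1 on any failure.
 */
int write_lines(const char *path, const char *const *lines, size_t n)
{
	FILE *f = fopen(path, "w");
	size_t i;

	if (!f)
		return -1;

	/* Must be called before any other operation on the stream. */
	if (setvbuf(f, NULL, _IOLBF, 0) != 0) {
		fclose(f);
		return -1;
	}

	for (i = 0; i < n; i++) {
		/* In line-buffered mode a failed write() shows up here. */
		if (fputs(lines[i], f) == EOF) {
			fclose(f);
			return -1;
		}
	}

	/* fclose() flushes whatever is left, so check its result too. */
	return fclose(f) == 0 ? 0 : -1;
}
```

This only helps callers that control their own buffering; it does not
change what the kernel sees from a fully-buffered writer.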

In this example I am doing something a bit artificial -- we don't
officially say what happens when a pre-opened schemata file handle is
reused in this way, AFAICT.  But for mbm_assign_control it is
legitimate to write many lines, and we can hit this kind of problem.


I'll leave it to others to judge whether we _need_ to fix this, but it
feels like a problem waiting to happen.


> All these file operations go thru generic call kernfs_fop_write_iter().
> Doesn't it take care of buffer check and overflow?

No, this is called for each iovec segment (where userspace used one of
the iovec-based I/O syscalls).  But there is no buffering or
concatenation of the data written: each segment gets passed down to the
individual kernfs_file_operations write method for the file:

	len = ops->write(of, buf, len, iocb->ki_pos)

calls down to

	resctrl_mbm_assign_control_write(of, buf, len, iocb->ki_pos).


I'll try to port my buffering hack on top of the series -- that should
help to illustrate what I mean.
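
As a rough illustration of the idea only (this is a userspace sketch,
not the kernel patch), caching an incomplete line between write()-sized
chunks could look like the following.  All names here (struct line_buf,
LINE_CAP, line_buf_feed(), line_buf_finish(), count_line()) are
hypothetical, and the 1024-byte cap is an arbitrary assumption:

```c
#include <stddef.h>

#define LINE_CAP 1024	/* assumed upper bound on one line */

struct line_buf {
	char data[LINE_CAP];
	size_t len;	/* bytes of the incomplete line cached so far */
};

typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/*
 * Feed one chunk, cut at an arbitrary place.  Each complete line is
 * passed to fn without its '\n' (and without NUL termination); a
 * trailing partial line stays cached for the next call.
 * Returns 0 on success, -1 if a line grows beyond LINE_CAP.
 */
int line_buf_feed(struct line_buf *lb, const char *chunk, size_t n,
		  line_fn fn, void *ctx)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (chunk[i] == '\n') {
			fn(lb->data, lb->len, ctx);
			lb->len = 0;
		} else if (lb->len < LINE_CAP) {
			lb->data[lb->len++] = chunk[i];
		} else {
			return -1;
		}
	}
	return 0;
}

/* At close time: 0 if nothing is pending, -1 for a dangling partial line. */
int line_buf_finish(const struct line_buf *lb)
{
	return lb->len == 0 ? 0 : -1;
}

/* Example handler: counts lines and remembers the last one's length. */
struct line_stats {
	size_t count;
	size_t last_len;
};

void count_line(const char *line, size_t len, void *ctx)
{
	struct line_stats *st = ctx;

	(void)line;
	st->count++;
	st->last_len = len;
}
```

A nonzero result from line_buf_finish() corresponds to the case
discussed earlier in the thread, where the file is closed with a
partial line still pending and the error has to be reported somewhere
such as last_cmd_status.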

Cheers
---Dave
