From: Dave Martin <Dave.Martin@arm.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Chatre, Reinette" <reinette.chatre@intel.com>,
"Moger, Babu" <babu.moger@amd.com>,
"corbet@lwn.net" <corbet@lwn.net>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"mingo@redhat.com" <mingo@redhat.com>,
"bp@alien8.de" <bp@alien8.de>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"peternewman@google.com" <peternewman@google.com>,
"x86@kernel.org" <x86@kernel.org>,
"hpa@zytor.com" <hpa@zytor.com>,
"paulmck@kernel.org" <paulmck@kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"thuth@redhat.com" <thuth@redhat.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"xiongwei.song@windriver.com" <xiongwei.song@windriver.com>,
"pawan.kumar.gupta@linux.intel.com"
<pawan.kumar.gupta@linux.intel.com>,
"daniel.sneddon@linux.intel.com" <daniel.sneddon@linux.intel.com>,
"jpoimboe@kernel.org" <jpoimboe@kernel.org>,
"perry.yuan@amd.com" <perry.yuan@amd.com>,
"sandipan.das@amd.com" <sandipan.das@amd.com>,
"Huang, Kai" <kai.huang@intel.com>,
"Li, Xiaoyao" <xiaoyao.li@intel.com>,
"seanjc@google.com" <seanjc@google.com>,
"Li, Xin3" <xin3.li@intel.com>,
"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
"ebiggers@google.com" <ebiggers@google.com>,
"mario.limonciello@amd.com" <mario.limonciello@amd.com>,
"james.morse@arm.com" <james.morse@arm.com>,
"tan.shaopeng@fujitsu.com" <tan.shaopeng@fujitsu.com>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>,
"Eranian, Stephane" <eranian@google.com>
Subject: Re: [PATCH v11 22/23] x86/resctrl: Introduce interface to list assignment states of all the groups
Date: Wed, 5 Mar 2025 18:06:55 +0000 [thread overview]
Message-ID: <Z8iSrcTKSnxLx9n4@e133380.arm.com> (raw)
In-Reply-To: <SJ1PR11MB6083186EB2D63441E2D4BC04FCC92@SJ1PR11MB6083.namprd11.prod.outlook.com>
On Mon, Mar 03, 2025 at 07:30:48PM +0000, Luck, Tony wrote:
> > After having spent a bit of time looking into this, I think we are probably
> > OK, at least for reading these files.
> >
> > seq_file will loop over the file's show() callback, growing the seq_file
> > buffer until show() can run without overrunning the buffer.
> >
> > This means that the show() callback receives a buffer that is magically big
> > enough, but there may be some "speculative" calls whose output never goes
> > to userspace. Once seq_file has the data, it deals with the userspace-
> > facing I/O buffering internally, so we shouldn't have to worry about that.
>
> Doesn't this depend on the size of the user read(2) syscall request?
Yes and no.
If I've understood correctly:
To service a given read() call, seq_file calls down into the backend to
generate some whole record, then copies it out to userspace, then
repeats this process so long as there is any space left in the user
buffer.
For resctrl files, we don't implement a seq_file iterator: there is no
.next(), no .llseek(), and we don't implement any notion of file
position. So our _show() functions generate a single big record that
contains the whole dump -- frequently multiple lines of text.
(This might or might not be desirable, but it is at least simple.)
If a _show() function in resctrl holds rdtgroup_mutex throughout, then
whatever it dumps will be dumped atomically with respect to other
resctrl operations that take this mutex.
So, to flesh out your scenario:
>
> If the total size of the resctrl file is very large, we have a potential issue:
Let's say it's 5KB.
> 1) User asks for 4KB, owns the resctrl mutex.
(Note, rdtgroup_mutex is only held temporarily inside the resctrlfs
backend to these operations; at the start of the process, it is not
held.)
> 2) resctrl uses seq_file and fills with more than 4KB
(It's actually seq_file that uses resctrl here via callbacks: seq_file
sits in between the VFS layer and resctrl.)
When a .show() callback is called, resctrl doesn't know how much data
to generate; it just writes stuff out with seq_printf() etc.
If there's too much to fit in the default seq_file buffer, the data
gets truncated and the seq_file will get internally marked as having
overflowed. resctrl could check for this condition in order to avoid
formatting text that will get thrown away due to truncation, but this
is not required. When the .show() callback returns, the seq_file
implementation will respond to the overflow by growing the buffer and
retrying the whole thing until this doesn't occur (see the loop
preceding the "Fill" label in seq_file.c:seq_read_iter().)
This terminates with a seq_file buffer that contains all the output
(untruncated), or with an -ENOMEM failure (which would be punted to
userspace).
So, assuming nothing went wrong, the seq_file buffer now has the 5KB of
data. rdtgroup_mutex is not held (it was only held in the _show()
callback).
> 3) User gets the first 4KB, releases the resctrl mutex
Userspace gets the first 4KB, and seq_file's notion of the file
position is advanced by this amount, and the generated text is
kept in the seq_file's buffer.
> 4) Some other pending resctrl operation now gets the mutex and makes changes that affect the contents of this file
The un-read data remains buffered in seq_file. Other resctrl
operations can happen, so the buffered data may become stale, but it is
still an atomic snapshot.
> 5) User asks for next 4K (when it reaquires resctrl mutex)
If an iterator is implemented, seq_file might try to generate another
record to fill the requested space. But we don't have an iterator, so
the generated data remains as-is.
> 6) resctrl uses seq_file() to construct new image of file incorporating changes because of step 4
I think this happens only if the file is reopened or lseek()'d, and
only if .llseek() is wired up in struct file_operations. Resctrl
doesn't seem to do this (whether by accident or by design).
So userspace just sees a non-seekable file.
> 7) User gets the second 4KB from the seq_file buffer (which doesn't fit cleanly next to data it got in step 3).
Userspace gets the final 1K of the data that was generated in response
to the original read() call.
If userspace tries to read again, it will get EOF (again, because we
don't have an iterator -- meaning that no additional records can be
generated).
I haven't traced in detail through the code, but that's my
understanding.
Cheers
---Dave
next prev parent reply other threads:[~2025-03-05 18:07 UTC|newest]
Thread overview: 209+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-22 20:20 [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
2025-01-22 20:20 ` [PATCH v11 01/23] x86/resctrl: Add __init attribute to functions called from resctrl_late_init() Babu Moger
2025-02-05 22:22 ` Reinette Chatre
2025-02-19 13:28 ` Dave Martin
2025-02-19 16:53 ` Moger, Babu
2025-02-20 13:29 ` Dave Martin
2025-01-22 20:20 ` [PATCH v11 02/23] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
2025-01-22 20:20 ` [PATCH v11 03/23] x86/resctrl: Add ABMC feature in the command line options Babu Moger
2025-01-22 20:20 ` [PATCH v11 04/23] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
2025-01-22 20:20 ` [PATCH v11 05/23] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
2025-01-22 20:20 ` [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature Babu Moger
2025-02-05 22:49 ` Reinette Chatre
2025-02-06 16:15 ` Moger, Babu
2025-02-06 18:42 ` Reinette Chatre
2025-02-06 22:57 ` Moger, Babu
2025-02-06 23:28 ` Reinette Chatre
2025-02-21 18:05 ` James Morse
2025-02-21 18:25 ` Reinette Chatre
2025-01-22 20:20 ` [PATCH v11 07/23] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
2025-02-06 18:01 ` Reinette Chatre
2025-02-06 23:41 ` Moger, Babu
2025-02-21 18:06 ` James Morse
2025-02-21 19:44 ` Moger, Babu
2025-01-22 20:20 ` [PATCH v11 08/23] x86/resctrl: Introduce interface to display number of monitoring counters Babu Moger
2025-02-05 23:17 ` Reinette Chatre
2025-02-07 17:18 ` Moger, Babu
2025-02-07 18:52 ` Moger, Babu
2025-02-10 18:08 ` Reinette Chatre
2025-02-10 20:26 ` Moger, Babu
2025-01-22 20:20 ` [PATCH v11 09/23] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain Babu Moger
2025-01-22 20:20 ` [PATCH v11 10/23] x86/resctrl: Remove MSR reading of event configuration value Babu Moger
2025-02-05 23:58 ` Reinette Chatre
2025-02-06 0:51 ` Luck, Tony
2025-02-06 1:41 ` Reinette Chatre
2025-02-06 15:56 ` Luck, Tony
2025-02-21 18:08 ` James Morse
2025-02-19 13:28 ` Dave Martin
2025-02-21 18:08 ` James Morse
2025-02-07 17:30 ` Moger, Babu
2025-02-06 6:24 ` Xin Li
2025-02-06 16:17 ` Reinette Chatre
2025-02-07 10:07 ` Xin Li
2025-02-11 19:44 ` Moger, Babu
2025-02-12 8:33 ` Xin Li
2025-01-22 20:20 ` [PATCH v11 11/23] x86/resctrl: Introduce mbm_cntr_cfg to track assignable counters at domain Babu Moger
2025-02-05 23:57 ` Reinette Chatre
2025-02-07 18:23 ` Moger, Babu
2025-02-10 18:10 ` Reinette Chatre
2025-02-19 13:30 ` Dave Martin
2025-02-19 18:07 ` Moger, Babu
2025-02-20 13:33 ` Dave Martin
2025-02-21 18:07 ` James Morse
2025-02-21 18:35 ` Reinette Chatre
2025-02-21 20:10 ` Moger, Babu
2025-01-22 20:20 ` [PATCH v11 12/23] x86/resctrl: Introduce interface to display number of free counters Babu Moger
2025-02-06 0:19 ` Reinette Chatre
2025-02-07 18:59 ` Moger, Babu
2025-02-19 13:31 ` Dave Martin
2025-01-22 20:20 ` [PATCH v11 13/23] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
2025-01-22 20:20 ` [PATCH v11 14/23] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
2025-02-19 13:32 ` Dave Martin
2025-02-19 21:00 ` Moger, Babu
2025-02-21 18:06 ` James Morse
2025-02-21 22:24 ` Moger, Babu
2025-01-22 20:20 ` [PATCH v11 15/23] x86/resctrl: Add the functionality to assigm MBM events Babu Moger
2025-02-06 1:05 ` Reinette Chatre
2025-02-07 21:10 ` Moger, Babu
2025-02-10 18:25 ` Reinette Chatre
2025-01-22 20:20 ` [PATCH v11 16/23] x86/resctrl: Add the functionality to unassigm " Babu Moger
2025-02-06 3:54 ` Reinette Chatre
2025-02-10 16:23 ` Moger, Babu
2025-02-10 18:30 ` Reinette Chatre
2025-02-22 0:36 ` Moger, Babu
2025-01-22 20:20 ` [PATCH v11 17/23] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled Babu Moger
2025-02-06 18:03 ` Reinette Chatre
2025-02-10 17:27 ` Moger, Babu
2025-02-10 18:34 ` Reinette Chatre
2025-02-19 13:41 ` Dave Martin
2025-02-19 14:09 ` Peter Newman
2025-02-19 17:55 ` Reinette Chatre
2025-02-20 10:35 ` Peter Newman
2025-02-20 13:40 ` Dave Martin
2025-02-20 17:08 ` Reinette Chatre
2025-02-21 17:14 ` Dave Martin
2025-02-21 18:23 ` Moger, Babu
2025-02-21 22:48 ` Reinette Chatre
2025-02-21 23:42 ` Moger, Babu
2025-02-27 11:07 ` Peter Newman
2025-01-22 20:20 ` [PATCH v11 18/23] x86/resctrl: Report "Unassigned" for MBM events in mbm_cntr_assign mode Babu Moger
2025-02-06 18:04 ` Reinette Chatre
2025-02-10 17:39 ` Moger, Babu
2025-01-22 20:20 ` [PATCH v11 19/23] x86/resctrl: Introduce the interface to switch between monitor modes Babu Moger
2025-02-06 18:05 ` Reinette Chatre
2025-02-10 18:54 ` Moger, Babu
2025-01-22 20:20 ` [PATCH v11 20/23] x86/resctrl: Configure mbm_cntr_assign mode if supported Babu Moger
2025-02-21 18:06 ` James Morse
2025-02-24 15:49 ` Moger, Babu
2025-02-24 17:01 ` Reinette Chatre
2025-02-24 21:18 ` Moger, Babu
2025-02-24 22:20 ` Reinette Chatre
2025-01-22 20:20 ` [PATCH v11 21/23] x86/resctrl: Update assignments on event configuration changes Babu Moger
2025-01-22 20:20 ` [PATCH v11 22/23] x86/resctrl: Introduce interface to list assignment states of all the groups Babu Moger
2025-02-19 13:53 ` Dave Martin
2025-02-19 21:09 ` Moger, Babu
2025-02-20 15:44 ` Dave Martin
2025-02-20 21:29 ` Moger, Babu
2025-02-21 16:00 ` Dave Martin
2025-02-21 20:10 ` Reinette Chatre
2025-02-24 17:17 ` Dave Martin
2025-02-24 17:23 ` Luck, Tony
2025-02-28 17:50 ` Dave Martin
2025-03-03 19:30 ` Luck, Tony
2025-03-05 18:06 ` Dave Martin [this message]
2025-01-22 20:20 ` [PATCH v11 23/23] x86/resctrl: Introduce interface to modify assignment states of " Babu Moger
2025-02-06 18:48 ` Reinette Chatre
2025-02-10 19:46 ` Moger, Babu
2025-02-19 16:07 ` Dave Martin
2025-02-19 17:43 ` Luck, Tony
2025-02-20 14:57 ` Dave Martin
2025-02-20 0:34 ` Moger, Babu
2025-02-20 15:21 ` Dave Martin
2025-02-20 20:57 ` Moger, Babu
2025-02-21 15:53 ` Dave Martin
2025-02-21 20:16 ` Reinette Chatre
2025-02-21 18:07 ` James Morse
2025-02-24 20:49 ` Moger, Babu
2025-02-03 14:54 ` [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Peter Newman
2025-02-03 20:49 ` Moger, Babu
2025-02-13 17:51 ` Dave Martin
2025-02-13 18:08 ` Luck, Tony
2025-02-12 17:46 ` Dave Martin
2025-02-12 23:33 ` Reinette Chatre
2025-02-12 23:40 ` Reinette Chatre
2025-02-13 0:11 ` Luck, Tony
2025-02-13 17:56 ` Dave Martin
2025-02-13 17:37 ` Dave Martin
2025-02-14 6:26 ` Reinette Chatre
2025-02-14 18:31 ` Moger, Babu
2025-02-14 19:18 ` Reinette Chatre
2025-02-14 19:51 ` Moger, Babu
2025-02-17 10:26 ` Peter Newman
2025-02-17 16:45 ` Moger, Babu
2025-02-18 12:30 ` Dave Martin
2025-02-18 15:39 ` Moger, Babu
2025-02-18 18:14 ` Reinette Chatre
2025-02-18 19:32 ` Moger, Babu
2025-02-18 21:29 ` Reinette Chatre
2025-02-19 12:26 ` Dave Martin
2025-02-19 12:24 ` Dave Martin
2025-02-18 16:51 ` Luck, Tony
2025-02-18 18:27 ` Reinette Chatre
2025-02-18 19:08 ` Luck, Tony
2025-02-18 21:32 ` Reinette Chatre
2025-02-18 17:49 ` Reinette Chatre
2025-02-19 11:28 ` Peter Newman
2025-02-19 12:26 ` Dave Martin
2025-02-19 17:56 ` Reinette Chatre
2025-02-20 14:53 ` Peter Newman
2025-02-20 18:36 ` Reinette Chatre
2025-02-21 13:12 ` Peter Newman
2025-02-21 22:43 ` Reinette Chatre
2025-02-25 17:11 ` Peter Newman
2025-02-25 21:31 ` Moger, Babu
2025-02-26 13:27 ` Peter Newman
2025-02-26 16:25 ` Reinette Chatre
2025-02-26 17:12 ` Moger, Babu
2025-03-03 19:16 ` Moger, Babu
2025-03-04 16:44 ` Peter Newman
2025-03-04 21:49 ` Moger, Babu
2025-03-05 10:40 ` Peter Newman
2025-03-05 19:34 ` Moger, Babu
2025-03-10 22:48 ` Moger, Babu
2025-03-10 23:22 ` Luck, Tony
2025-03-11 1:44 ` Moger, Babu
2025-03-11 3:51 ` Reinette Chatre
2025-03-11 20:35 ` Moger, Babu
2025-03-11 20:53 ` Luck, Tony
2025-03-12 15:14 ` Moger, Babu
2025-03-12 15:15 ` Reinette Chatre
2025-03-12 15:07 ` Reinette Chatre
2025-03-12 16:03 ` Moger, Babu
2025-03-12 17:14 ` Reinette Chatre
2025-03-12 18:14 ` Moger, Babu
2025-03-13 16:08 ` Reinette Chatre
2025-03-13 20:13 ` Moger, Babu
2025-03-13 20:36 ` Luck, Tony
2025-03-14 14:49 ` Moger, Babu
2025-03-13 21:21 ` Reinette Chatre
2025-03-14 16:18 ` Moger, Babu
2025-03-19 18:36 ` Reinette Chatre
2025-03-20 18:12 ` Moger, Babu
2025-03-20 22:35 ` Reinette Chatre
2025-03-21 0:35 ` Moger, Babu
2025-03-17 16:27 ` Peter Newman
2025-03-17 23:00 ` Moger, Babu
2025-03-19 20:53 ` Reinette Chatre
2025-03-20 20:29 ` Moger, Babu
2025-02-25 21:41 ` Reinette Chatre
2025-02-20 16:46 ` Dave Martin
2025-02-20 17:46 ` Dave Martin
2025-02-20 18:36 ` Reinette Chatre
2025-02-21 16:47 ` Dave Martin
2025-02-21 22:43 ` Reinette Chatre
2025-02-13 16:19 ` Moger, Babu
2025-02-13 18:18 ` Dave Martin
2025-02-13 18:39 ` Luck, Tony
2025-02-14 6:34 ` Reinette Chatre
2025-02-14 7:23 ` Reinette Chatre
2025-02-21 18:07 ` James Morse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z8iSrcTKSnxLx9n4@e133380.arm.com \
--to=dave.martin@arm.com \
--cc=akpm@linux-foundation.org \
--cc=andrew.cooper3@citrix.com \
--cc=babu.moger@amd.com \
--cc=bp@alien8.de \
--cc=corbet@lwn.net \
--cc=daniel.sneddon@linux.intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=ebiggers@google.com \
--cc=eranian@google.com \
--cc=hpa@zytor.com \
--cc=james.morse@arm.com \
--cc=jpoimboe@kernel.org \
--cc=kai.huang@intel.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maciej.wieczor-retman@intel.com \
--cc=mario.limonciello@amd.com \
--cc=mingo@redhat.com \
--cc=paulmck@kernel.org \
--cc=pawan.kumar.gupta@linux.intel.com \
--cc=perry.yuan@amd.com \
--cc=peternewman@google.com \
--cc=reinette.chatre@intel.com \
--cc=rostedt@goodmis.org \
--cc=sandipan.das@amd.com \
--cc=seanjc@google.com \
--cc=tan.shaopeng@fujitsu.com \
--cc=tglx@linutronix.de \
--cc=thuth@redhat.com \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
--cc=xiaoyao.li@intel.com \
--cc=xin3.li@intel.com \
--cc=xiongwei.song@windriver.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).