public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Peter Newman <peternewman@google.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	James Morse <james.morse@arm.com>
Cc: Stephane Eranian <eranian@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>, <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Juri Lelli" <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	"Valentin Schneider" <vschneid@redhat.com>,
	Uros Bizjak <ubizjak@gmail.com>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Xin Li <xin3.li@intel.com>, Babu Moger <babu.moger@amd.com>,
	Shaopeng Tan <tan.shaopeng@fujitsu.com>,
	"Maciej Wieczor-Retman" <maciej.wieczor-retman@intel.com>,
	Jens Axboe <axboe@kernel.dk>,
	Christian Brauner <brauner@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tycho Andersen <tandersen@netflix.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Beau Belgrave <beaub@linux.microsoft.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v1 0/6] x86/resctrl: Avoid searching tasklist during mongrp_reparent
Date: Thu, 4 Apr 2024 16:09:12 -0700	[thread overview]
Message-ID: <4681c9c5-3364-4628-a581-0366d0024211@intel.com> (raw)
In-Reply-To: <20240325172707.73966-1-peternewman@google.com>

Hi Peter,

On 3/25/2024 10:27 AM, Peter Newman wrote:
> Hi Reinette,
> 
> I've been working with users of the recently-added mongroup rename
> operation[1] who have observed the impact of back-to-back operations on
> latency-sensitive, thread pool-based services. Because changing a
> resctrl group's CLOSID (or RMID) requires all task_structs in the system
> to be inspected with the tasklist_lock read-locked, a series of move
> operations can block out thread creation for long periods of time, as
> creating threads needs to write-lock the tasklist_lock.

Could you please give some insight into the delays experienced? "long
periods of time" mean different things to different people and this
series seems to get more ominous as is progresses with the cover letter
starting with "long periods of time" and by the time the final patch
appears it has become "disastrous".

> 
> To avoid this issue, this series replaces the CLOSID and RMID values
> cached in the task_struct with a pointer to the task's rdtgroup, through
> which the current CLOSID and RMID can be obtained indirectly during a

On a high level this seems fair but using a pointer to the rdtgroup in the
task_struct means that the scheduling code needs to dereference that pointer
without any lock held. The changes do take great care and it
would be valuable to clearly document how these accesses are safe. (please
see patch #4 and #6)

> context switch. Updating a group's ID then only requires the current
> task to be switched back in on all CPUs. On server hosts with very large
> thread counts, this is much less disruptive than making thread creation
> globally unavailable. However, this is less desirable on CPU-isolated,
> realtime workloads, so I am interested in suggestions on how to reach a
> compromise for the two use cases.

As I understand this only impacts moving a monitor group? To me this sounds
like a user space triggered event associated with a particular use case that
may not be relevant to the workloads that you refer to. I think this could be
something that can be documented for users with this type of workloads.
(please see patch #6)

> 
> Also note that the soft-ABMC[2] proposal is based on swapping the RMID
> values in mongroups when monitors are assigned to them, which will
> result in the same slowdown as was encountered with re-parenting
> monitoring groups.
> 
> Using pointers to rdtgroups from the task_struct been used internally at

"have been used internally"?

> Google for a few years to support an alternate CLOSID management
> technique[3], which was replaced by mongroup rename. Usage of this
> technique during production revealed an issue where threads in an
> exiting process become unreachable via for_each_process_thread() before
> destruction, but can still switch in and out. Consequently, an rmdir
> operation can fail to remove references to an rdtgroup before it frees
> it, causing the pointer to freed memory to be dereferenced after the
> structure is freed. This would result in invalid CLOSID/RMID values
> being written into MSRs, causing an exception. The workaround for this
> is to clear a task's rdtgroup pointer when it exits.

This does not sound like a problem unique to this new implementation but the
"invalid CLOSID/RMID values being written into MSRs" sounds like something
that happens today? Probably not worth pulling out for this reason because
in its current form the exiting process will keep using the original
CLOSID/RMID that have no issue being written to MSRs.

Reinette

  parent reply	other threads:[~2024-04-04 23:09 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-25 17:27 [PATCH v1 0/6] x86/resctrl: Avoid searching tasklist during mongrp_reparent Peter Newman
2024-03-25 17:27 ` [PATCH v1 1/6] x86/resctrl: Move __resctrl_sched_in() out-of-line Peter Newman
2024-04-04 23:09   ` Reinette Chatre
2024-04-05 22:04     ` Peter Newman
2024-04-07 19:21       ` Reinette Chatre
2024-04-08 19:05         ` Peter Newman
2024-04-08 20:59           ` Reinette Chatre
2024-04-08 21:41             ` Peter Newman
2024-04-09  3:53               ` Reinette Chatre
2024-03-25 17:27 ` [PATCH v1 2/6] x86/resctrl: Add hook for releasing task_struct references Peter Newman
2024-04-04 23:10   ` Reinette Chatre
2024-03-25 17:27 ` [PATCH v1 3/6] x86/resctrl: Disallow mongroup rename on MPAM Peter Newman
2024-04-04 23:11   ` Reinette Chatre
2024-04-05 22:10     ` Peter Newman
2024-12-05 15:03       ` Peter Newman
2024-12-06  4:14         ` Reinette Chatre
2024-12-06 10:23           ` Peter Newman
2024-12-06 16:14             ` Reinette Chatre
2024-03-25 17:27 ` [PATCH v1 4/6] x86/resctrl: Use rdtgroup pointer to indicate task membership Peter Newman
2024-04-04 23:15   ` Reinette Chatre
2024-03-25 17:27 ` [PATCH v1 5/6] x86/resctrl: Abstract PQR_ASSOC from generic code Peter Newman
2024-04-04 23:16   ` Reinette Chatre
2024-04-05 17:15   ` Reinette Chatre
2024-04-07 19:21   ` Reinette Chatre
2024-03-25 17:27 ` [PATCH v1 6/6] x86/resctrl: Don't search tasklist in mongroup rename Peter Newman
2024-04-04 23:18   ` Reinette Chatre
2024-04-04 23:09 ` Reinette Chatre [this message]
2024-04-05 21:30   ` [PATCH v1 0/6] x86/resctrl: Avoid searching tasklist during mongrp_reparent Peter Newman
2024-04-07 19:13     ` Reinette Chatre
2024-04-11 22:34 ` Moger, Babu
2024-04-11 22:44   ` Peter Newman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4681c9c5-3364-4628-a581-0366d0024211@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=babu.moger@amd.com \
    --cc=beaub@linux.microsoft.com \
    --cc=bp@alien8.de \
    --cc=brauner@kernel.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=eranian@google.com \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=james.morse@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=npiggin@gmail.com \
    --cc=oleg@redhat.com \
    --cc=peternewman@google.com \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=tan.shaopeng@fujitsu.com \
    --cc=tandersen@netflix.com \
    --cc=tglx@linutronix.de \
    --cc=ubizjak@gmail.com \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    --cc=xin3.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox