From: Valentin Schneider <valentin.schneider@arm.com>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
Fenghua Yu <fenghua.yu@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
James Morse <James.Morse@arm.com>
Subject: Re: [PATCH v2 0/3] x86/intel_rdt: task_work vs task_struct rmid/closid write race
Date: Wed, 25 Nov 2020 15:01:16 +0000 [thread overview]
Message-ID: <jhjpn41v5tv.mognet@arm.com> (raw)
In-Reply-To: <87be8915-21b0-5214-9742-ccc7515c298b@intel.com>
Hi Reinette,
On 24/11/20 21:37, Reinette Chatre wrote:
> Hi Valentin,
>
> On 11/22/2020 6:24 PM, Valentin Schneider wrote:
>> This is a small cleanup + a fix for a race I stumbled upon while staring at
>> resctrl stuff.
>>
>> Briefly tested on a Xeon Gold 5120 (m2.xlarge.x86 on Equinix) by bouncing
>> a few tasks around control groups.
>>
>
> ...
>
> Thank you very much for taking this on. Unfortunately this race is one
> of a few issues with the way in which tasks moving to a new resource
> group is handled.
>
> Other issues are:
>
> 1.
> Until the queued work is run, the moved task runs with old (and even
> invalid in the case when its original resource group has been removed)
> closid and rmid.
>
For a userspace task, that queued work should be run as soon as possible
(& relevant). If said task is currently running, then task_work_add() will
lead to an IPI; the other cases (task moving itself or not currently
running) are covered by the return to userspace path.
Kernel threads however are a prickly matter because they quite explicitly
don't have this return to userspace - they only run their task_work
callbacks on exit. So we currently have to wait for those kthreads to go
through a context switch to update the relevant register, but I don't
see any other alternative that wouldn't involve interrupting every other
CPU (the kthread could move between us triggering some remote work and its
previous CPU receiving the IPI).
> 2.
> Work to update the PQR_ASSOC register is queued every time the user
> writes a task id to the "tasks" file, even if the task already belongs
> to the resource group and in addition to any other pending work for that
> task. This could result in multiple pending work items associated with a
> single task even if they are all identical and even though only a single
> update with most recent values is needed. This could result in
> significant system resource waste, especially on tasks sleeping for a
> long time.
>
> Fenghua solved these issues by replacing the callback with a synchronous
> update, similar to how tasks are currently moved when a resource group
> is deleted. We plan to submit this work next week.
>
> This new solution will make patch 1 and 2 of this series unnecessary. As
> I understand it patch 3 would still be a welcome addition but would
> require changes. As you prefer you could either submit patch 3 on its
> own for the code as it is now and we will rework the task related
> changes on top of that, or you could wait for the task related changes
> to land first?
>
Please do Cc me on those - I'll re-evaluate the need for patch 3 then.
Thanks!
>>
>> Valentin Schneider (3):
>> x86/intel_rdt: Check monitor group vs control group membership earlier
>> x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race
>> x86/intel_rdt: Apply READ_ONCE/WRITE_ONCE to task_struct .rmid &
>> .closid
>>
>
> Thank you very much
>
> Reinette
next prev parent reply other threads:[~2020-11-25 15:01 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-23 2:24 [PATCH v2 0/3] x86/intel_rdt: task_work vs task_struct rmid/closid write race Valentin Schneider
2020-11-23 2:24 ` [PATCH v2 1/3] x86/intel_rdt: Check monitor group vs control group membership earlier Valentin Schneider
2020-11-23 2:24 ` [PATCH v2 2/3] x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race Valentin Schneider
2020-11-23 2:24 ` [PATCH v2 3/3] x86/intel_rdt: Apply READ_ONCE/WRITE_ONCE to task_struct .rmid & .closid Valentin Schneider
2020-11-24 21:37 ` [PATCH v2 0/3] x86/intel_rdt: task_work vs task_struct rmid/closid write race Reinette Chatre
2020-11-25 15:01 ` Valentin Schneider [this message]
2020-11-25 17:20 ` Reinette Chatre
2020-11-25 18:39 ` Valentin Schneider
2020-11-25 19:06 ` Reinette Chatre
2020-11-25 23:23 ` Valentin Schneider
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jhjpn41v5tv.mognet@arm.com \
--to=valentin.schneider@arm.com \
--cc=James.Morse@arm.com \
--cc=bp@alien8.de \
--cc=fenghua.yu@intel.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=reinette.chatre@intel.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox