From: Valentin Schneider <valentin.schneider@arm.com>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
Fenghua Yu <fenghua.yu@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
James Morse <James.Morse@arm.com>
Subject: Re: [PATCH v2 0/3] x86/intel_rdt: task_work vs task_struct rmid/closid write race
Date: Wed, 25 Nov 2020 15:01:16 +0000 [thread overview]
Message-ID: <jhjpn41v5tv.mognet@arm.com> (raw)
In-Reply-To: <87be8915-21b0-5214-9742-ccc7515c298b@intel.com>
Hi Reinette,
On 24/11/20 21:37, Reinette Chatre wrote:
> Hi Valentin,
>
> On 11/22/2020 6:24 PM, Valentin Schneider wrote:
>> This is a small cleanup + a fix for a race I stumbled upon while staring at
>> resctrl stuff.
>>
>> Briefly tested on a Xeon Gold 5120 (m2.xlarge.x86 on Equinix) by bouncing
>> a few tasks around control groups.
>>
>
> ...
>
> Thank you very much for taking this on. Unfortunately this race is one
> of a few issues with the way in which tasks moving to a new resource
> group is handled.
>
> Other issues are:
>
> 1.
> Until the queued work is run, the moved task runs with old (and even
> invalid in the case when its original resource group has been removed)
> closid and rmid.
>
For a userspace task, that queued work should be run as soon as possible
(& relevant). If said task is currently running, then task_work_add() will
lead to an IPI; the other cases (task moving itself or not currently
running) are covered by the return to userspace path.
Kernel threads however are a prickly matter because they quite explicitly
don't have this return to userspace - they only run their task_work
callbacks on exit. So we currently have to wait for those kthreads to go
through a context switch to update the relevant register, but I don't
see any other alternative that wouldn't involve interrupting every other
CPU (the kthread could move between us triggering some remote work and its
previous CPU receiving the IPI).
> 2.
> Work to update the PQR_ASSOC register is queued every time the user
> writes a task id to the "tasks" file, even if the task already belongs
> to the resource group and in addition to any other pending work for that
> task. This could result in multiple pending work items associated with a
> single task even if they are all identical and even though only a single
> update with most recent values is needed. This could result in
> significant system resource waste, especially on tasks sleeping for a
> long time.
>
> Fenghua solved these issues by replacing the callback with a synchronous
> update, similar to how tasks are currently moved when a resource group
> is deleted. We plan to submit this work next week.
>
> This new solution will make patch 1 and 2 of this series unnecessary. As
> I understand it patch 3 would still be a welcome addition but would
> require changes. As you prefer you could either submit patch 3 on its
> own for the code as it is now and we will rework the task related
> changes on top of that, or you could wait for the task related changes
> to land first?
>
Please do Cc me on those - I'll re-evaluate the need for patch 3 then.
Thanks!
>>
>> Valentin Schneider (3):
>> x86/intel_rdt: Check monitor group vs control group membership earlier
>> x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race
>> x86/intel_rdt: Apply READ_ONCE/WRITE_ONCE to task_struct .rmid &
>> .closid
>>
>
> Thank you very much
>
> Reinette
next prev parent reply other threads:[~2020-11-25 15:01 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-23 2:24 [PATCH v2 0/3] x86/intel_rdt: task_work vs task_struct rmid/closid write race Valentin Schneider
2020-11-23 2:24 ` [PATCH v2 1/3] x86/intel_rdt: Check monitor group vs control group membership earlier Valentin Schneider
2020-11-23 2:24 ` [PATCH v2 2/3] x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race Valentin Schneider
2020-11-23 2:24 ` [PATCH v2 3/3] x86/intel_rdt: Apply READ_ONCE/WRITE_ONCE to task_struct .rmid & .closid Valentin Schneider
2020-11-24 21:37 ` [PATCH v2 0/3] x86/intel_rdt: task_work vs task_struct rmid/closid write race Reinette Chatre
2020-11-25 15:01 ` Valentin Schneider [this message]
2020-11-25 17:20 ` Reinette Chatre
2020-11-25 18:39 ` Valentin Schneider
2020-11-25 19:06 ` Reinette Chatre
2020-11-25 23:23 ` Valentin Schneider
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jhjpn41v5tv.mognet@arm.com \
--to=valentin.schneider@arm.com \
--cc=James.Morse@arm.com \
--cc=bp@alien8.de \
--cc=fenghua.yu@intel.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=reinette.chatre@intel.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.