public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Valentin Schneider <valentin.schneider@arm.com>
To: James Morse <james.morse@arm.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	Fenghua Yu <fenghua.yu@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 2/2] x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race
Date: Fri, 20 Nov 2020 15:54:53 +0000	[thread overview]
Message-ID: <jhjtutkuipe.mognet@arm.com> (raw)
In-Reply-To: <c3370c24-98f2-dac8-e390-9c80a3321cf0@arm.com>


Hi James,

On 20/11/20 14:53, James Morse wrote:
> Hi Valentin,
>
> On 18/11/2020 18:00, Valentin Schneider wrote:
>> Upon moving a task to a new control / monitor group, said task's {closid,
>> rmid} fields are updated *after* triggering the move_myself() task_work
>> callback. This can cause said callback to miss the update, e.g. if the
>> triggering thread got preempted before fiddling with task_struct, or if the
>> targeted task was already on its way to return to userspace.
>
> So, if move_myself() runs after task_work_add() but before tsk is written to.
> Sounds fun!
>
>
>> Update the task_struct's {closid, rmid} tuple *before* invoking
>> task_work_add(). As they can happen concurrently, wrap {closid, rmid}
>> accesses with READ_ONCE() and WRITE_ONCE(). Highlight the required ordering
>> with a pair of comments.
>
> ... and this one is if move_myself() or __resctrl_sched_in() runs while tsk is being
> written to on another CPU. It might get torn values, or multiple-reads get different values.
>
> The READ_ONCE/WRITE_ONCEry would have been easier to read as a separate patch as you touch
> all sites, and move/change some of them.
>

True, I initially only fixed up the reads/writes involved with
__rdtgroup_move_task(), but ended up coccinelle'ing the whole lot - which I
should have then moved to a dedicated patch. Thanks for powering through
it, I'll send a v2 with a neater split. 

> Regardless:
> Reviewed-by: James Morse <james.morse@arm.com>
>

Thanks!

>
> I don't 'get' memory-ordering, so one curiosity below:
>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index b6b5b95df833..135a51529f70 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -524,11 +524,13 @@ static void move_myself(struct callback_head *head)
>>  	 * If resource group was deleted before this task work callback
>>  	 * was invoked, then assign the task to root group and free the
>>  	 * resource group.
>> +	 *
>> +	 * See pairing atomic_inc() in __rdtgroup_move_task()
>>  	 */
>>  	if (atomic_dec_and_test(&rdtgrp->waitcount) &&
>>  	    (rdtgrp->flags & RDT_DELETED)) {
>> -		current->closid = 0;
>> -		current->rmid = 0;
>> +		WRITE_ONCE(current->closid, 0);
>> +		WRITE_ONCE(current->rmid, 0);
>>  		kfree(rdtgrp);
>>  	}
>>  
>> @@ -553,14 +555,32 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
>
>>  	/*
>>  	 * Take a refcount, so rdtgrp cannot be freed before the
>>  	 * callback has been invoked.
>> +	 *
>> +	 * Also ensures above {closid, rmid} writes are observed by
>> +	 * move_myself(), as it can run immediately after task_work_add().
>> +	 * Otherwise old values may be loaded, and the move will only actually
>> +	 * happen at the next context switch.
>
> But __resctrl_sched_in() can still occur at anytime and READ_ONCE() a pair of values that
> don't go together?

Yes, the thought did cross my mind...

> I don't think this is a problem for RDT as with old-rmid the task was a member of that
> monitor-group previously, and 'freed' rmid are kept in limbo for a while after.
> (old-closid is the same as the task having not schedule()d since the change, which is fine).
>
> For MPAM, this is more annoying as changing just the closid may put the task in a
> monitoring group that never existed, meaning its surprise dirty later.
>
> If this all makes sense, I guess the fix (for much later) is to union closid/rmid, and
> WRITE_ONCE() them together where necessary.
> (I've made a note for when I next pass that part of the MPAM tree)
>

It does make sense to me - one more question back to you: can RDT exist on
an X86_32 system? It shouldn't be a stopper, but would be an inconvenience.

FWIW kernel/sched/fair.c uses two synced u64's for this; see

  struct cfs_rq { .min_vruntime, .min_vruntime_copy }

and

  kernel/sched/fair.c:update_min_vruntime()
  kernel/sched/fair.c:migrate_task_rq_fair()
>
>> +	 *
>> +	 * Pairs with atomic_dec() in move_myself().
>>  	 */
>>  	atomic_inc(&rdtgrp->waitcount);
>> +
>>  	ret = task_work_add(tsk, &callback->work, TWA_RESUME);
>>  	if (ret) {
>>  		/*
>
>
> Thanks!
>
> James


      reply	other threads:[~2020-11-20 15:54 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-18 18:00 [PATCH 0/2] x86/intel_rdt: task_work vs task_struct rmid/closid write race Valentin Schneider
2020-11-18 18:00 ` [PATCH 1/2] x86/intel_rdt: Check monitor group vs control group membership earlier Valentin Schneider
2020-11-20 14:53   ` James Morse
2020-11-20 15:53     ` Valentin Schneider
2020-11-18 18:00 ` [PATCH 2/2] x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race Valentin Schneider
2020-11-20 14:53   ` James Morse
2020-11-20 15:54     ` Valentin Schneider [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jhjtutkuipe.mognet@arm.com \
    --to=valentin.schneider@arm.com \
    --cc=bp@alien8.de \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=reinette.chatre@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox