public inbox for linux-safety@lists.elisa.tech
 help / color / mirror / Atom feed
From: "Singh, Balbir" <sblbir@amazon.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Nathan Chancellor <natechancellor@gmail.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"clang-built-linux@googlegroups.com"
	<clang-built-linux@googlegroups.com>,
	"kernel-janitors@vger.kernel.org"
	<kernel-janitors@vger.kernel.org>,
	"linux-safety@lists.elisa.tech" <linux-safety@lists.elisa.tech>
Subject: Re: [PATCH -next for tip:x86/pti] x86/tlb: drop unneeded local vars in enable_l1d_flush_for_task()
Date: Thu, 1 Oct 2020 10:48:44 +1000	[thread overview]
Message-ID: <044e9835-f4fe-6670-90df-15fe376ecadd@amazon.com> (raw)
In-Reply-To: <19f57cbe-ba33-17d5-440c-2765e670782f@amazon.com>

On 1/10/20 9:49 am, Singh, Balbir wrote:
> On 1/10/20 7:38 am, Thomas Gleixner wrote:
> 
>>
>>
>>
>> On Wed, Sep 30 2020 at 20:35, Peter Zijlstra wrote:
>>> On Wed, Sep 30, 2020 at 08:00:59PM +0200, Thomas Gleixner wrote:
>>>> On Wed, Sep 30 2020 at 19:03, Peter Zijlstra wrote:
>>>>> On Wed, Sep 30, 2020 at 05:40:08PM +0200, Thomas Gleixner wrote:
>>>>> Also, that preempt_disable() in there doesn't actually do anything.
>>>>> Worse, preempt_disable(); for_each_cpu(); is an anti-pattern. It mixes
>>>>> static_cpu_has() and boot_cpu_has() in the same bloody condition and has
>>>>> a pointless ret variable.
>>>
>>> Also, I forgot to add, it accesses ->cpus_mask without the proper
>>> locking, so it could be reading intermediate state from whatever cpumask
>>> operation that's in progress.
>>
>> Yes. I saw that after hitting send. :(
>>
>>>> I absolutely agree and I really missed it when looking at it before
>>>> merging. cpus_read_lock()/unlock() is the right thing to do if at all.
>>>>
>>>>> It's shoddy code, that only works if you align the planets right. We
>>>>> really shouldn't provide interfaces that are this bad.
>>>>>
>>>>> It's correct operation is only by accident.
>>>>
>>>> True :(
>>>>
>>>> I understand Balbirs problem and it makes some sense to provide a
>>>> solution. We can:
>>>>
>>>>     1) reject set_affinity() if the task has that flush muck enabled
>>>>        and user space tries to move it to a SMT enabled core
>>>>
>>>>     2) disable the muck if if detects that it is runs on a SMT enabled
>>>>        core suddenly (hotplug says hello)
>>>>
>>>>        This one is nasty because there is no feedback to user space
>>>>        about the wreckage.
>>>
>>> That's and, right, not or. because 1) deals with sched_setffinity()
>>> and 2) deals wit hotplug.
>>
>> It was meant as AND of course.
>>
>>> Now 1) requires an arch hook in sched_setaffinity(), something I'm not
>>> keen on providing, once we provide it, who knows what strange and
>>> wonderful things archs will dream up.
>>
>> I don't think so. We can have that magic in core:
>>
>> #ifdef CONFIG_HAS_PARANOID_L1D_FLUSH
>> static bool paranoid_l1d_valid(struct task_struct *tsk,
>>                                const struct cpumask *msk)
>> {
>>         if (!test_tsk_thread_flag(tsk, TIF_SPEC_L1D_FLUSH))
>>                 return true;
>>         /* Do magic stuff */
>>         return res;
>> }
>> #else
>> static bool paranoid_l1d_valid(struct task_struct *tsk,
>>                                const struct cpumask *msk)
>> {
>>         return true;
>> }
>> #endif
>>
>> It's a pretty well defined problem and having the magic in core code
>> prevents an arch hook which allows abuse of all sorts.
>>
>> And the same applies to enable_l1d_flush_for_task(). The only
>> architecture specific nonsense are the checks whether the CPU bug is
>> there and whether the hardware supports L1D flushing.
>>
>> So we can have:
>>
>> #ifdef CONFIG_HAS_PARANOID_L1D_FLUSH
>> int paranoid_l1d_enable(struct task_struct *tsk)
>> {
>>         /* Do the SMT validation under the proper locks */
>>         if (!res)
>>                 set_task_thread_flag(tsk, TIF_SPEC_L1D_FLUSH);
>>         return res;
>> }
>> #endif
>>
>>> And 2) also happens on hot-un-plug, when the task's affinity gets
>>> forced because it became empty. No user feedback there either, and
>>> information is lost.
>>
>> Of course. It's both that suddenly SMT gets enabled on a core which was
>> isolated and when the last isolated core in the tasks CPU mask goes
>> offline.
>>
>>> I suppose we can do 2) but send a signal. That would cover all cases and
>>> keep it in arch code. But yes, that's pretty terrible too.
>>
>> Bah. I just looked at the condition to flush:
>>
>>         if (sched_smt_active() && !this_cpu_read(cpu_info.smt_active) &&
>>                 (prev_mm & LAST_USER_MM_L1D_FLUSH))
>>                 l1d_flush_hw();
>>
>> That fails to flush when SMT is disabled globally. Balbir?
>>
>> Of course this should be:
>>
>>         if (!this_cpu_read(cpu_info.smt_active) && (prev_mm & LAST_USER_MM_L1D_FLUSH))
>>                 l1d_flush_hw();
>>
>> Now we can make this:
>>
>>         if (unlikely(prev_mm & LAST_USER_MM_L1D_FLUSH)) {
>>                 if (!this_cpu_read(cpu_info.smt_active))
>>                         l1d_flush_hw();
>>                 else
>>                         task_work_add(...);
>>
>> And that task work clears the flag and sends a signal. We're not going
>> to send a signal from switch_mm() ....
>>
>> Thanks,
>>
> 
> 
> So this is the change I am playing with, I don't like the idea of killing the task, but it's better than silently not flushing, I guess system administrators will learn with time not to correctly the affinity of tasks flushing
> L1D. For the affinity bits, not being able to change the affinity is better, but not being able to provide feedback on as to why is a bit weird as well, but I wonder if there are other cases where we might want to lock the affinity of a task for it's lifetime.
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 6b0f4c88b07c..6b0d0a9cd447 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -320,26 +320,15 @@ int enable_l1d_flush_for_task(struct task_struct *tsk)
>  
>  	/*
>  	 * Do not enable L1D_FLUSH_OUT if
> -	 * b. The CPU is not affected by the L1TF bug
> -	 * c. The CPU does not have L1D FLUSH feature support
> -	 * c. The task's affinity is on cores with SMT on.
> +	 * a. The CPU is not affected by the L1TF bug
> +	 * b. The CPU does not have L1D FLUSH feature support
>  	 */
>  
>  	if (!boot_cpu_has_bug(X86_BUG_L1TF) ||
> -			!static_cpu_has(X86_FEATURE_FLUSH_L1D))
> +		!boot_cpu_has(X86_FEATURE_FLUSH_L1D))
>  		return -EINVAL;
>  
> -	cpu = get_cpu();
> -
> -	for_each_cpu(i, &tsk->cpus_mask) {
> -		if (cpu_data(i).smt_active == true) {
> -			put_cpu();
> -			return -EINVAL;
> -		}
> -	}
> -
>  	set_ti_thread_flag(&tsk->thread_info, TIF_SPEC_L1D_FLUSH);
> -	put_cpu();
>  	return ret;
>  }
>  
> @@ -349,6 +338,12 @@ int disable_l1d_flush_for_task(struct task_struct *tsk)
>  	return 0;
>  }
>  
> +static void l1d_flush_kill(struct callback_head *ch)
> +{
> +	clear_ti_thread_flag(&current->thread_info, TIF_SPEC_L1D_FLUSH);
> +	force_signal(SIGBUS);
> +}
> +
>  void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  	       struct task_struct *tsk)
>  {
> @@ -443,12 +438,14 @@ static void cond_mitigation(struct task_struct *next)
>  	}
>  
>  	/*
> -	 * Flush only if SMT is disabled as per the contract, which is checked
> -	 * when the feature is enabled.
> +	 * Flush only if SMT is disabled, if flushing is enabled
> +	 * and we are on an SMT enabled core, kill the task
>  	 */
> -	if (sched_smt_active() && !this_cpu_read(cpu_info.smt_active) &&
> -		(prev_mm & LAST_USER_MM_L1D_FLUSH))
> -		l1d_flush_hw();
> +	if (unlikely(prev_mm & LAST_USER_MM_L1D_FLUSH)) {
> +		if (!this_cpu_read(cpu_info.smt_active))
> +			l1d_flush_hw();
> +		else
> +			task_work_add(prev, l1d_flush_kill, true);

We have no access the to the previous task and mm->owner depends on MEMCG :)
We can do the magic in mm_mangle_tif_spec_bits(), I suppose

Balbir

  reply	other threads:[~2020-10-01  0:48 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-28 12:44 [PATCH -next for tip:x86/pti] x86/tlb: drop unneeded local vars in enable_l1d_flush_for_task() Lukas Bulwahn
2020-09-28 20:43 ` Nathan Chancellor
2020-09-29  7:12 ` Peter Zijlstra
2020-09-29  8:33   ` Lukas Bulwahn
2020-09-29  8:37   ` Peter Zijlstra
2020-09-30 15:40     ` Thomas Gleixner
2020-09-30 16:53       ` Lukas Bulwahn
2020-09-30 17:03       ` Peter Zijlstra
2020-09-30 18:00         ` Thomas Gleixner
2020-09-30 18:35           ` Peter Zijlstra
2020-09-30 21:38             ` Thomas Gleixner
2020-09-30 22:59               ` Singh, Balbir
2020-09-30 23:49               ` Singh, Balbir
2020-10-01  0:48                 ` Singh, Balbir [this message]
2020-10-01  8:17                   ` Thomas Gleixner
2020-10-01  8:19                 ` Peter Zijlstra
2020-09-30 22:46           ` Singh, Balbir

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=044e9835-f4fe-6670-90df-15fe376ecadd@amazon.com \
    --to=sblbir@amazon.com \
    --cc=bp@alien8.de \
    --cc=clang-built-linux@googlegroups.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kernel-janitors@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-safety@lists.elisa.tech \
    --cc=lukas.bulwahn@gmail.com \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=natechancellor@gmail.com \
    --cc=ndesaulniers@google.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox