From: Peter Zijlstra <peterz@infradead.org>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-mm@kvack.org, Anton Blanchard <anton@ozlabs.org>
Subject: Re: [PATCH 6/8] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
Date: Wed, 2 Dec 2020 13:45:19 +0100
Message-ID: <20201202124519.GP3092@hirez.programming.kicks-ass.net>
In-Reply-To: <20201202111731.GA2414@hirez.programming.kicks-ass.net>

On Wed, Dec 02, 2020 at 12:17:31PM +0100, Peter Zijlstra wrote:

> So the obvious 'improvement' here would be something like:
> 
> 	for_each_online_cpu(cpu) {
> 		p = rcu_dereference(cpu_rq(cpu)->curr);
> 		if (p->active_mm != mm)
> 			continue;
> 		__cpumask_set_cpu(cpu, tmpmask);
> 	}
> 	on_each_cpu_mask(tmpmask, ...);
> 
> The remote CPU will never switch _to_ @mm, on account of it being quite
> dead, but it is quite prone to false negatives.
> 
> Consider that __schedule() sets rq->curr *before* context_switch(), this
> means we'll see next->active_mm, even though prev->active_mm might still
> be our @mm.
> 
> Now, because we'll be removing the atomic ops from context_switch()'s
> active_mm swizzling, I think we can change this to something like the
> below. The hope being that the cost of the new barrier can be offset by
> the loss of the atomics.
> 
> Hmm ?
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 41404afb7f4c..2597c5c0ccb0 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4509,7 +4509,6 @@ context_switch(struct rq *rq, struct task_struct *prev,
>  	if (!next->mm) {                                // to kernel
>  		enter_lazy_tlb(prev->active_mm, next);
>  
> -		next->active_mm = prev->active_mm;
>  		if (prev->mm)                           // from user
>  			mmgrab(prev->active_mm);
>  		else
> @@ -4524,6 +4523,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
>  		 * case 'prev->active_mm == next->mm' through
>  		 * finish_task_switch()'s mmdrop().
>  		 */
> +		next->active_mm = next->mm;
>  		switch_mm_irqs_off(prev->active_mm, next->mm, next);

I think that next->active_mm store should be after switch_mm(),
otherwise we still race.
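
To make that window concrete, here is a toy user-space model of a
single CPU switching from a lazy kernel thread to a user task. It is
only a sketch: the struct and function names (struct cpu, loaded,
count_false_negatives(), the step enum) are made up for illustration
and are not kernel API. It replays the stores in a fixed order and
ignores memory ordering entirely, so it says nothing about the barrier
question; it only shows at which points a remote scan of
rq->curr->active_mm would skip a CPU that still has the dead mm loaded.

```c
#include <assert.h>
#include <stddef.h>

/* Toy single-CPU model; every name here is illustrative, not kernel API. */
struct mm { int id; };
struct task { struct mm *mm, *active_mm; };
struct cpu {
	struct task *curr;	/* what a remote scanner reads via rq->curr */
	struct mm *loaded;	/* mm the hardware is really using */
};

enum step { INHERIT, PUBLISH, SET_ACTIVE, SWITCH_MM };

/*
 * Replay the steps of a switch from a lazy kernel thread (prev) to a
 * user task (next). After each step, count a miss if the CPU still has
 * the dead mm loaded but the scanner's test (curr->active_mm == dead)
 * would not select it.
 */
static int count_false_negatives(const enum step *steps, size_t n)
{
	struct mm dead = { 1 }, other = { 2 };
	struct task prev = { NULL, &dead };	/* kernel thread, lazy on dead */
	struct task next = { &other, &other };	/* user task */
	struct cpu cpu = { &prev, &dead };
	int misses = 0;

	assert(steps != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		switch (steps[i]) {
		case INHERIT:    next.active_mm = prev.active_mm; break;
		case PUBLISH:    cpu.curr = &next;                break;
		case SET_ACTIVE: next.active_mm = next.mm;        break;
		case SWITCH_MM:  cpu.loaded = next.mm;            break;
		}
		if (cpu.loaded == &dead && cpu.curr->active_mm != &dead)
			misses++;
	}
	return misses;
}

/* rq->curr is published first; next->active_mm already equals next->mm. */
static const enum step old_order[] = { PUBLISH, SWITCH_MM };
/* The quoted diff: inherit, publish, store active_mm, then switch_mm. */
static const enum step patch_as_posted[] = { INHERIT, PUBLISH, SET_ACTIVE, SWITCH_MM };
/* The suggested fix: store active_mm only after switch_mm. */
static const enum step store_after[] = { INHERIT, PUBLISH, SWITCH_MM, SET_ACTIVE };
```

In this model both old_order and patch_as_posted produce one miss,
while store_after produces none. The price of the last ordering is a
short stretch where the scan still selects the CPU although it has
already switched away, which only costs a spurious IPI.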

>  
>  		if (!prev->mm) {                        // from kernel
> @@ -5713,11 +5713,9 @@ static void __sched notrace __schedule(bool preempt)
>  
>  	if (likely(prev != next)) {
>  		rq->nr_switches++;
> -		/*
> -		 * RCU users of rcu_dereference(rq->curr) may not see
> -		 * changes to task_struct made by pick_next_task().
> -		 */
> -		RCU_INIT_POINTER(rq->curr, next);
> +
> +		next->active_mm = prev->active_mm;
> +		rcu_assign_pointer(rq->curr, next);
>  		/*
>  		 * The membarrier system call requires each architecture
>  		 * to have a full memory barrier after updating
