Re: linux-next: manual merge of the rcu tree with the tip tree

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: linux-next: manual merge of the rcu tree with the tip tree
Date: Tue, 1 Aug 2017 14:02:32 +0000 (UTC)	[thread overview]
Message-ID: <1639218309.1091.1501596152868.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <CALCETrWO5g5HuNpxa4Phxg--fDPWpuCVDTVr-UfuzrK5wn-8dQ@mail.gmail.com>

----- On Aug 1, 2017, at 9:43 AM, Andy Lutomirski luto@kernel.org wrote:

> On Mon, Jul 31, 2017 at 9:03 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
>> On Tue, Aug 01, 2017 at 12:04:05AM +0000, Mathieu Desnoyers wrote:
>>> ----- On Jul 31, 2017, at 12:13 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com
>>> wrote:
>>>
> 
>>                                                         Thanx, Paul
>>
>> ------------------------------------------------------------------------
>>
>> commit fde19879b6bd1abc0c1d4d5f945efed61bf7eb8c
>> Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Date:   Fri Jul 28 16:40:40 2017 -0400
>>
>>     membarrier: Expedited private command
>>
>>     Implement MEMBARRIER_CMD_PRIVATE_EXPEDITED with IPIs using cpumask built
>>     from all runqueues for which current thread's mm is the same as the
>>     thread calling sys_membarrier. It executes faster than the non-expedited
>>     variant (no blocking). It also works on NOHZ_FULL configurations.
>>
>>     Scheduler-wise, it requires a memory barrier before and after context
>>     switching between processes (which have different mm). The memory
>>     barrier before context switch is already present. For the barrier after
>>     context switch:
>>
>>     * Our TSO archs can do RELEASE without being a full barrier. Look at
>>       x86 spin_unlock() being a regular STORE for example.  But for those
>>       archs, all atomics imply smp_mb and all of them have atomic ops in
>>       switch_mm() for mm_cpumask().
> 
> I think that, on x86, context switches, even without mm changes, must
> at least flush the store buffer (maybe SFENCE is okay) to avoid
> visible inconsistency due to store-buffer forwarding.
> 
> Anyway, can you document whatever property you require with a comment
> in switch_mm() or wherever you're finding that property so that future
> arch changes don't break it?

As I asked to Paul in my reply to his proposed manual merge,
we should indeed have a comment in switch_mm() stating something
like this just before the line invoking cpumask_set_cpu():

/*
 * The full memory barrier implied by mm_cpumask update operations
 * is required by the membarrier system call.
 */

What we want to order here is:

prev userspace memory accesses
schedule
  <full mb> (it's already there) [A]
  update to rq->curr changing the rq->curr->mm value
  <full mb> (provided by mm_cpumask updates in switch_mm on x86) [B]
next userspace memory accesses

wrt to:

userspace memory accesses
sys_membarrier
  <full mb> [C]
  iterate on each cpu's rq->curr, compare their "mm" to current->mm
    IPI each CPU that match
  <full mb> [D]
userspace memory accesses

[A] pairs with [D] and [B] pairs with [C].


> 
>> +static void membarrier_private_expedited(void)
>> +{
>> +       int cpu;
>> +       bool fallback = false;
>> +       cpumask_var_t tmpmask;
>> +
>> +       if (num_online_cpus() == 1)
>> +               return;
>> +
>> +       /*
>> +        * Matches memory barriers around rq->curr modification in
>> +        * scheduler.
>> +        */
>> +       smp_mb();       /* system call entry is not a mb. */
>> +
>> +       /*
>> +        * Expedited membarrier commands guarantee that they won't
>> +        * block, hence the GFP_NOWAIT allocation flag and fallback
>> +        * implementation.
>> +        */
>> +       if (!zalloc_cpumask_var(&tmpmask, GFP_NOWAIT)) {
>> +               /* Fallback for OOM. */
>> +               fallback = true;
>> +       }
>> +
>> +       cpus_read_lock();
>> +       for_each_online_cpu(cpu) {
>> +               struct task_struct *p;
>> +
>> +               /*
>> +                * Skipping the current CPU is OK even through we can be
>> +                * migrated at any point. The current CPU, at the point
>> +                * where we read raw_smp_processor_id(), is ensured to
>> +                * be in program order with respect to the caller
>> +                * thread. Therefore, we can skip this CPU from the
>> +                * iteration.
>> +                */
>> +               if (cpu == raw_smp_processor_id())
>> +                       continue;
>> +               rcu_read_lock();
>> +               p = task_rcu_dereference(&cpu_rq(cpu)->curr);
>> +               if (p && p->mm == current->mm) {
> 
> I'm a bit surprised you're iterating all CPUs instead of just CPUs in
> mm_cpumask().

I see two reasons for this. The first is because architectures like
ARM64 don't even bother populating the mm_cpumask. The second reason
is because I don't think all architectures ensure that updates to
mm_cpumask imply full memory barriers. Therefore, we would need to revisit
each architecture switch_mm to ensure mm_cpumask bit set ops either imply
a full memory barrier, or are followed by an explicit one, if we
choose to use this bitmask as an optimization.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

next prev parent reply	other threads:[~2017-08-01 14:02 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-31  3:50 linux-next: manual merge of the rcu tree with the tip tree Stephen Rothwell
2017-07-31 16:13 ` Paul E. McKenney
2017-08-01  0:04   ` Mathieu Desnoyers
2017-08-01  4:03     ` Paul E. McKenney
2017-08-01  4:25       ` Mathieu Desnoyers
2017-08-01 16:31         ` Paul E. McKenney
2017-08-01 13:43       ` Andy Lutomirski
2017-08-01 13:58         ` Peter Zijlstra
2017-08-01 14:15           ` Peter Zijlstra
2017-08-01 14:17             ` Andy Lutomirski
2017-08-01 14:02         ` Mathieu Desnoyers [this message]
2017-08-01 14:15           ` Andy Lutomirski
2017-08-01 15:40             ` Mathieu Desnoyers
2017-08-01 21:36             ` Paul E. McKenney
  -- strict thread matches above, loose matches on Subject: below --
2024-02-27  1:55 Stephen Rothwell
2022-04-06  2:45 Stephen Rothwell
2022-04-06 16:28 ` Paul E. McKenney
2022-02-21 18:17 broonie
2021-10-12  4:48 Stephen Rothwell
2021-10-13 16:31 ` Paul E. McKenney
2021-08-17  7:09 Stephen Rothwell
2021-06-23  5:33 Stephen Rothwell
2021-06-22  4:51 Stephen Rothwell
2021-06-22  4:47 Stephen Rothwell
2021-06-22  5:04 ` Stephen Rothwell
2021-06-22 17:10 ` Paul E. McKenney
2020-10-09  4:59 Stephen Rothwell
2020-07-29  6:23 Stephen Rothwell
2020-06-26  3:14 Stephen Rothwell
2020-06-25  2:44 Stephen Rothwell
2020-06-25  3:44 ` Paul E. McKenney
2020-06-24  3:04 Stephen Rothwell
2020-06-24  4:06 ` Paul E. McKenney
2020-05-29  6:22 Stephen Rothwell
2020-05-29  6:41 ` Stephen Rothwell
2020-05-29 14:15   ` Paul E. McKenney
2020-05-29 23:38     ` Stephen Rothwell
2020-03-25  3:08 Stephen Rothwell
2020-03-25  3:18 ` Paul E. McKenney
2020-03-25 21:31   ` Paul E. McKenney
2019-12-19  0:50 Stephen Rothwell
2019-12-19  1:27 ` Paul E. McKenney
2019-12-19  1:31   ` Paul E. McKenney
2019-12-19  8:41     ` Peter Zijlstra
2019-12-19 13:38       ` Paul E. McKenney
2019-12-16 23:37 Stephen Rothwell
2018-06-22  2:27 Stephen Rothwell
2018-06-26 19:33 ` Paul E. McKenney
2017-11-10  2:14 Stephen Rothwell
2017-08-22  4:13 Stephen Rothwell
2016-07-18  5:26 Stephen Rothwell
2016-07-19  3:00 ` Paul E. McKenney
2016-06-09  5:14 Stephen Rothwell
2016-06-09 15:59 ` Paul E. McKenney
2016-03-04  4:13 Stephen Rothwell
2016-03-04 15:04 ` Paul E. McKenney
2015-07-16  2:57 Stephen Rothwell
2015-05-07  3:56 Stephen Rothwell
2014-02-24  4:18 Stephen Rothwell
2014-02-24  4:42 ` Paul E. McKenney
2012-09-05  3:59 Stephen Rothwell
2012-09-05 16:39 ` Paul E. McKenney
2012-09-05 17:11   ` Peter Zijlstra
2012-08-23  3:01 Stephen Rothwell
2012-08-23  3:22 ` Paul E. McKenney
2012-08-22  4:27 Stephen Rothwell
2012-08-22  5:05 ` Paul E. McKenney
2012-08-22  4:27 Stephen Rothwell
2012-08-22  5:03 ` Paul E. McKenney
2011-06-20  4:47 Stephen Rothwell
2011-06-20 15:17 ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1639218309.1091.1501596152868.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-next@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=sfr@canb.auug.org.au \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.