Re: [PATCH 1/3] sched, timer: Remove usages of ACCESS_ONCE in the scheduler

From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>, Mel Gorman <mel@csn.ul.ie>,
	Rik van Riel <riel@redhat.com>, Jason Low <jason.low2@hp.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Mike Galbraith <umgwanakikbuti@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Mel Gorman <mgorman@suse.de>,
	Preeti U Murthy <preeti@linux.vnet.ibm.com>,
	hideaki.kimura@hp.com, Aswin Chandramouleeswaran <aswin@hp.com>,
	Scott J Norton <scott.norton@hp.com>
Subject: Re: [PATCH 1/3] sched, timer: Remove usages of ACCESS_ONCE in the scheduler
Date: Thu, 16 Apr 2015 20:02:27 +0200
Message-ID: <20150416180227.GB17401@gmail.com>
In-Reply-To: <20150416165224.GD12676@worktop.ger.corp.intel.com>


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, Apr 15, 2015 at 09:46:01AM +0200, Ingo Molnar wrote:
> 
> > @@ -2088,7 +2088,7 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
> >  
> >  static void reset_ptenuma_scan(struct task_struct *p)
> >  {
> > -	ACCESS_ONCE(p->mm->numa_scan_seq)++;
> > +	WRITE_ONCE(p->mm->numa_scan_seq, READ_ONCE(p->mm->numa_scan_seq) + 1);
>  
> vs
> 
> 	seq = ACCESS_ONCE(p->mm->numa_scan_seq);
> 	if (p->numa_scan_seq == seq)
> 		return;
> 	p->numa_scan_seq = seq;
> 
> 
> > So the original ACCESS_ONCE() barriers were misguided to begin with: I 
> > think they tried to handle races with the scheduler balancing softirq 
> > and tried to avoid having to use atomics for the sequence counter 
> > (which would be overkill), but things like ACCESS_ONCE(x)++ never 
> > guaranteed atomicity (or even coherency) of the update.
> > 
> > But since in reality this is only statistical sampling code, all these 
> > compiler barriers can be removed I think. Peter, Mel, Rik, do you 
> > agree?
> 
> ACCESS_ONCE() is not a compiler barrier

It's not a general compiler barrier (and I didn't claim so), but it is 
still a compiler barrier: it's documented as a weak, variable-specific 
barrier in Documentation/memory-barriers.txt:

  COMPILER BARRIER
  ----------------

  The Linux kernel has an explicit compiler barrier function that prevents the
  compiler from moving the memory accesses either side of it to the other side:

        barrier();

  This is a general barrier -- there are no read-read or write-write variants
  of barrier().  However, ACCESS_ONCE() can be thought of as a weak form
  for barrier() that affects only the specific accesses flagged by the
  ACCESS_ONCE().

 [...]
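
To make the distinction concrete, here is a minimal userspace sketch 
of the two forms. The macro bodies are modeled on the kernel's GCC 
definitions; struct mm_sketch, read_seq_once() and 
write_seq_then_fence() are hypothetical names used only for 
illustration, and the sketch assumes GCC or Clang:

```c
/* Userspace sketch, GCC/Clang only (relies on __typeof__ and inline asm). */

/* Per-access form: the volatile cast obliges the compiler to emit this
 * one access as written; it may not merge, refetch or drop it. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* General form: the "memory" clobber stops the compiler from moving any
 * memory access across this point. No CPU instruction is emitted. */
#define barrier() __asm__ __volatile__("" : : : "memory")

struct mm_sketch {		/* hypothetical stand-in for struct mm_struct */
	int numa_scan_seq;
};

/* Read the counter through a single volatile load. */
static int read_seq_once(struct mm_sketch *mm)
{
	return ACCESS_ONCE(mm->numa_scan_seq);
}

/* Store through a single volatile access, then fence all other accesses. */
static void write_seq_then_fence(struct mm_sketch *mm, int val)
{
	ACCESS_ONCE(mm->numa_scan_seq) = val;
	barrier();
}
```

Neither form orders anything at the CPU level and neither makes a 
read-modify-write atomic; they only constrain what the compiler may do.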

> The 'read' side uses ACCESS_ONCE() for two purposes:
>  - to load the value once, we don't want the seq number to change under
>    us for obvious reasons
>  - to avoid load tearing and observe weird seq numbers
> 
> The update side uses ACCESS_ONCE() to avoid write tearing, and 
> strictly speaking it should also worry about read-tearing since it's 
> not hard serialized, although it's very unlikely to actually have 
> concurrency (IIRC).

So what bad effects can there be from the very unlikely read and write 
tearing?

AFAICS nothing particularly bad. On the read side:

        seq = ACCESS_ONCE(p->mm->numa_scan_seq);
        if (p->numa_scan_seq == seq)
                return;
        p->numa_scan_seq = seq;

If p->mm->numa_scan_seq gets loaded twice (very unlikely) and the two 
loads return different values, then we might get a 'double' NUMA 
placement run - i.e. statistical noise.

On the update side:

        ACCESS_ONCE(p->mm->numa_scan_seq)++;
        p->mm->numa_scan_offset = 0;

If the compiler tears that up we might skip an update - again 
statistical noise at worst.

Nor is compiler tearing the only theoretical worry here: in theory, 
with long cache coherency latencies we might get two updates 'mixed 
up', resulting in a (single) missed update.
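
A rough sketch of that interleaving, replayed by hand in a single 
thread so that the outcome is deterministic. This is an illustration, 
not the scheduler code: struct mm_sketch and replay_lost_update() are 
made-up names, and real concurrency is deliberately left out:

```c
struct mm_sketch {		/* hypothetical stand-in for struct mm_struct */
	int numa_scan_seq;
};

/*
 * Replays two unserialized "seq++" updates, each split into its load
 * and its store, in the order that loses one of the increments.
 * Returns the final counter value.
 */
static int replay_lost_update(struct mm_sketch *mm)
{
	int a = mm->numa_scan_seq;	/* updater A loads N */
	int b = mm->numa_scan_seq;	/* updater B loads N as well */

	mm->numa_scan_seq = a + 1;	/* A stores N + 1 */
	mm->numa_scan_seq = b + 1;	/* B stores N + 1 again: one bump lost */

	return mm->numa_scan_seq;
}
```

Two updates ran but the counter advanced by one. No compiler 
annotation on the individual load or store changes that, because the 
load/add/store sequence as a whole is not atomic.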

Only atomics would solve all the races, but I think that would be 
overdoing it.
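
For comparison, a userspace sketch of what the atomic variant could 
look like, using C11 <stdatomic.h> as an analogue (in the kernel this 
would be an atomic_t with atomic_inc() instead). The names 
struct mm_sketch_atomic and bump_seq_atomic() are hypothetical:

```c
#include <stdatomic.h>

struct mm_sketch_atomic {	/* hypothetical stand-in for struct mm_struct */
	atomic_int numa_scan_seq;
};

/*
 * Atomically increments the counter and returns the new value.
 * atomic_fetch_add() returns the value held before the addition,
 * hence the "+ 1". Concurrent callers cannot lose an increment.
 */
static int bump_seq_atomic(struct mm_sketch_atomic *mm)
{
	return atomic_fetch_add(&mm->numa_scan_seq, 1) + 1;
}
```

The price is an atomic read-modify-write on every scan reset, which is 
what makes it overkill for a counter that only feeds statistical 
sampling.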

This is what I meant when I said there's no harm from this race.

Thanks,

	Ingo
