From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Bharata B Rao <bharata.rao@gmail.com>,
	Li Zefan <lizf@cn.fujitsu.com>, Ingo Molnar <mingo@elte.hu>,
	Paul Menage <menage@google.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] cpuacct: add a branch prediction
Date: Thu, 26 Feb 2009 17:29:15 -0800
Message-ID: <20090227012915.GF6634@linux.vnet.ibm.com>
In-Reply-To: <20090227095856.ef8c1c05.kamezawa.hiroyu@jp.fujitsu.com>

On Fri, Feb 27, 2009 at 09:58:56AM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 26 Feb 2009 08:45:09 -0800
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Feb 26, 2009 at 09:06:24PM +0900, KAMEZAWA Hiroyuki wrote:
> > > Peter Zijlstra wrote:
> > > > On Thu, 2009-02-26 at 20:17 +0900, KAMEZAWA Hiroyuki wrote:
> > > >> Peter Zijlstra wrote:
> > > >> > On Thu, 2009-02-26 at 19:28 +0900, KAMEZAWA Hiroyuki wrote:
> > > >> >
> > > > >> >> Taking the hierarchy mutex while reading will make the read side stable.
> > > >> >
> > > > >> > We're talking about scheduling here; taking a mutex to stop scheduling
> > > > >> > won't work, nor will it be acceptable to use anything that will.
> > > >> >
> > > >> No mutex is necessary, anyway.
> > > > >> The hierarchy-walker function works fine under the RCU read lock,
> > > > >> if a small jitter is allowed.
> > > >
> > > > Right, should be doable -- and looking at the code, we have this
> > > > horrible 32 bit exception in there that locks the rq in order to read
> > > > the 64bit value.
> > > >
> > > > > Would be grand to get rid of that. How bad would it be for userspace to
> > > > > get the occasionally fubarred value?
> > > >
> > > From the viewpoint of user support and sales, if a terribly broken value
> > > is reported, it will become a user incident and annoy me (us) ;)
> > > 
> > > I'd like to get rid of rq->lock, too. Hmm, could some routine like
> > > atomic64_read() help here? (But I don't want to use atomic_t here.)
> > 
> > atomic64_read() will not help you on a 32-bit machine.  Here is the
> > sequence of events that will cause the aforementioned user incidents and
> > consequent annoyance:
> > 
> > o	The value of the counter is (2^32)-1, or 0xffffffff.
> > 
> > o	CPU 0 reads the high-order 32 bits of the counter, getting zero.
> > 
> > o	CPU 1 increments the low-order 32 bits of the counter, resulting
> > 	in zero, but notes that there is a carry out of this field.
> > 
> > o	CPU 0 reads the low-order 32 bits of the counter, getting zero.
> > 
> > o	CPU 1 increments the high-order 32 bits of the counter, so that
> > 	the new value of the counter is 2^32, or 0x100000000.
> > 
> > So CPU 0 gets a value that is -way- off.
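The interleaving above can be replayed deterministically on a single
thread. This is a minimal user-space sketch, not kernel code: the
struct and function names are hypothetical, and the two "CPUs" are
just statements placed in the order the list describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit counter held as two 32-bit words,
 * as it would be on a 32-bit machine. */
struct split_u64 {
	uint32_t lo;
	uint32_t hi;
};

/* Replays the five events in order and returns the value that the
 * reader (CPU 0) assembles from its two separate loads. */
static uint64_t replay_torn_read(struct split_u64 *c)
{
	uint32_t seen_hi, seen_lo;

	assert(c);

	c->hi = 0;
	c->lo = 0xffffffffu;   /* counter is (2^32)-1 */

	seen_hi = c->hi;       /* CPU 0: reads high word, gets zero */
	c->lo += 1;            /* CPU 1: low word wraps to zero */
	seen_lo = c->lo;       /* CPU 0: reads low word, gets zero */
	c->hi += 1;            /* CPU 1: carry reaches the high word */

	return ((uint64_t)seen_hi << 32) | seen_lo;
}
```

The reader ends up with zero, although the counter only ever held
0xffffffff and then 0x100000000.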
> > 
> > The usual trick is something like the following for counter read:
> > 
> > 1.	Read the high-order 32 bits of the counter.
> > 
> > 2.	Do a memory barrier, smp_mb().
> > 
> > 3.	Read the low-order 32 bits of the counter.
> > 
> > 4.	Do another memory barrier, again smp_mb().
> > 
> > 5.	Read the high-order 32 bits of the counter again.
> > 
> > 	If it is the same as the value obtained in step 1 (or the previous
> > 	execution of step 5), then we are done.  (This works even in case
> > 	of complete 64-bit overflow, though we should be very lucky to
> > 	live that long!)  Otherwise, go to step 2.
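The read steps above might look roughly like this in user-space C11.
It is a sketch under assumptions: atomic_thread_fence() with
memory_order_seq_cst stands in for the kernel's smp_mb(), and the
struct and function names are made up for illustration. As the reply
further down concedes, this read side is not sufficient unless the
update itself is atomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical split counter; not a kernel type. */
struct split_counter {
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

static uint64_t split_counter_read(struct split_counter *c)
{
	uint32_t hi, lo, hi_again;

	assert(c);

	/* Step 1: read the high-order word. */
	hi_again = atomic_load_explicit(&c->hi, memory_order_relaxed);
	do {
		hi = hi_again;
		/* Step 2: full barrier (smp_mb() in the kernel). */
		atomic_thread_fence(memory_order_seq_cst);
		/* Step 3: read the low-order word. */
		lo = atomic_load_explicit(&c->lo, memory_order_relaxed);
		/* Step 4: another full barrier. */
		atomic_thread_fence(memory_order_seq_cst);
		/* Step 5: re-read the high-order word; retry on change. */
		hi_again = atomic_load_explicit(&c->hi, memory_order_relaxed);
	} while (hi_again != hi);

	return ((uint64_t)hi << 32) | lo;
}
```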
> > 
> > But it is also necessary to modify the counter update:
> > 
> > 1.	Increment the low-order 32 bits of the counter.  If no overflow
> > 	occurred, we are done; otherwise, continue through this sequence
> > 	of steps.
> > 
> > 2.	Do a memory barrier, smp_mb().
> > 
> > 3.	Increment the high-order 32 bits of the counter.
> > 
> > How to detect overflow in step 1?  Well, if we are incrementing, we can
> > just test for the new value being zero.  Otherwise, if we are adding
> > a 32-bit number, overflow occurred if the new value of the low-order
> > 32 bits of the counter is less than the old value (but make sure that
> > the comparison is unsigned!).
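The update steps and the unsigned overflow test can be sketched as
follows. The helper names are hypothetical, the fence is a user-space
stand-in for smp_mb(), and the code is only meant to be run from a
single thread; it illustrates the carry detection, not a complete
concurrent update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Adds delta to *lo and reports whether the 32-bit addition wrapped. */
static int add_low_word(uint32_t *lo, uint32_t delta)
{
	uint32_t old = *lo;

	*lo = old + delta;   /* unsigned arithmetic wraps modulo 2^32 */
	return *lo < old;    /* unsigned compare: smaller means carry out */
}

/* Update steps 1-3: low word first, then barrier, then high word. */
static void split_add(uint32_t *lo, uint32_t *hi, uint32_t delta)
{
	assert(lo && hi);

	if (!add_low_word(lo, delta))
		return;                                /* no overflow: done */
	atomic_thread_fence(memory_order_seq_cst);     /* step 2 */
	*hi += 1;                                      /* step 3 */
}
```

With signed types the comparison would be unreliable, which is why
both words are uint32_t here.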
> > 
> > This all assumes that you are adding a 32-bit quantity to the counter.
> > Adding 64-bit values is not much harder.
> > 
> > Does this approach work for you?
> > 
> 
> Thank you. I'll try this and post a patch if it turns out easy to read and merge.
> Hmm, but with your approach, can't the reader see the counter go backward?
> (If the reader sees only the low 32 bits incremented.)

Ouch, indeed!  The update would need to be atomic for my approach to
work.  My apologies for my confusion!
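The backward step can be reproduced with a single-threaded replay.
This is an illustrative sketch with hypothetical names; barriers are
omitted because there is no concurrency, and the writer is modeled as
two plain stores with a read placed between them.

```c
#include <assert.h>
#include <stdint.h>

struct two_words {
	uint32_t lo;
	uint32_t hi;
};

/* The hi-lo-hi read from the quoted steps, minus the barriers. */
static uint64_t read_hi_lo_hi(const struct two_words *c)
{
	uint32_t hi, lo, hi_again = c->hi;

	do {
		hi = hi_again;
		lo = c->lo;
		hi_again = c->hi;
	} while (hi != hi_again);

	return ((uint64_t)hi << 32) | lo;
}

/* Returns 1 if a read taken between the writer's two stores is
 * smaller than a read taken before the update began. */
static int counter_goes_backward(void)
{
	struct two_words c = { .lo = 0xffffffffu, .hi = 0 };
	uint64_t before, during, after;

	before = read_hi_lo_hi(&c);  /* 0x00000000ffffffff */
	c.lo += 1;                   /* writer: low word wraps to zero */
	during = read_hi_lo_hi(&c);  /* hi matches on both reads: no retry */
	c.hi += 1;                   /* writer: carry lands */
	after = read_hi_lo_hi(&c);

	assert(after == before + 1);
	return during < before;
}
```

The retry loop never fires in the middle read because the high word
has not changed yet, so the reader accepts a value of zero.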

> Can't we use seqcount_t from include/linux/seqlock.h?
> There is only one writer, so we don't need a write-side lock.

Yes, seqlock should work fine, good point!
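For illustration, the sequence-counter idea looks roughly like this
in user-space C11. It is a sketch under assumptions, not the kernel's
implementation: the names are hypothetical, every atomic operation
uses the default sequentially consistent ordering for simplicity, and
exactly one writer is assumed. In the kernel the corresponding
primitives would be write_seqcount_begin()/write_seqcount_end() and
read_seqcount_begin()/read_seqcount_retry().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct seq_u64 {
	atomic_uint seq;        /* odd while an update is in flight */
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Single writer only: no write-side lock is taken. */
static void seq_u64_add(struct seq_u64 *c, uint64_t delta)
{
	unsigned int s = atomic_load(&c->seq);
	uint64_t v = ((uint64_t)atomic_load(&c->hi) << 32) |
		     atomic_load(&c->lo);

	assert((s & 1) == 0);           /* no update may be in flight */
	v += delta;

	atomic_store(&c->seq, s + 1);   /* begin: sequence becomes odd */
	atomic_store(&c->lo, (uint32_t)v);
	atomic_store(&c->hi, (uint32_t)(v >> 32));
	atomic_store(&c->seq, s + 2);   /* end: sequence even again */
}

static uint64_t seq_u64_read(struct seq_u64 *c)
{
	unsigned int s1, s2;
	uint32_t lo, hi;

	do {
		s1 = atomic_load(&c->seq);
		lo = atomic_load(&c->lo);
		hi = atomic_load(&c->hi);
		s2 = atomic_load(&c->seq);
		/* Retry if an update was running or completed meanwhile. */
	} while ((s1 & 1) || s1 != s2);

	return ((uint64_t)hi << 32) | lo;
}
```

Unlike the hi-lo-hi scheme, a reader that overlaps any part of an
update sees the sequence number change and retries, so it cannot
return a half-updated value.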

							Thanx, Paul
