From: Peter Zijlstra <peterz@infradead.org>
To: Christoph Lameter <cl@linux.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
Andrew Morton <akpm@linux-foundation.org>,
Greg Thelen <gthelen@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Balbir Singh <bsingharora@gmail.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
linux-arch@vger.kernel.org
Subject: Re: [PATCH] memcg: remove unneeded preempt_disable
Date: Thu, 25 Aug 2011 20:40:13 +0200 [thread overview]
Message-ID: <1314297613.27911.83.camel@twins> (raw)
In-Reply-To: <alpine.DEB.2.00.1108251128460.27407@router.home>
On Thu, 2011-08-25 at 11:31 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, James Bottomley wrote:
>
> > On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
> > > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> > >
> > > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> > > > >
> > > > > I think I'll apply it, as the call frequency is low (correct?) and the
> > > > > problem will correct itself as other architectures implement their
> > > > > atomic this_cpu_foo() operations.
> > > >
> > > > Which leads me to wonder, can anything but x86 implement that this_cpu_*
> > > > muck? I doubt any of the risk chips can actually do all this.
> > > > Maybe Itanic, but then that seems to be dying fast.
> > >
> > > The cpu needs to have an RMW instruction that does something to a
> > > variable relative to a register that points to the per cpu base.
> > >
> > > Thats generally possible. The problem is how expensive the RMW is going to
> > > be.
> >
> > Risc systems generally don't have a single instruction for this, that's
> > correct. Obviously we can do it as a non atomic sequence: read
> > variable, compute relative, read, modify, write ... but there's
> > absolutely no point hand crafting that in asm since the compiler can
> > usually work it out nicely. And, of course, to have this atomic, we
> > have to use locks, which ends up being very expensive.
>
> ARM seems to have these LDREX/STREX instructions for that purpose which
> seem to be used for generating atomic instructions without lockes. I guess
> other RISC architectures have similar means of doing it?
Even with LL/SC and the CPU base in a register you need to do something
like:
again:
LL $target-reg, $cpubase-reg + offset
<foo>
SC $ret, $target-reg, $cpubase-reg + offset
if !$ret goto again
Its the +offset that's problematic, it either doesn't exist or is very
limited (a quick look at the MIPS instruction set gives a limit of 64k).
Without the +offset you need:
again:
$tmp-reg = $cpubase-reg
$tmp-reg += offset;
LL $target-reg, $tmp-reg
<foo>
SC $ret, $target-reg, $tmp-reg
if !$ret goto again
Which is wide open to migration races. Also, very often there are
constraints on LL/SC that mandate we use preempt_disable/enable around
its use, which pretty much voids the whole purpose, since if we disable
preemption we might as well just use C (ARM belongs in this class).
It does look POWERPC's lwarx/stwcx is sane enough, although the
instruction reference I found doesn't list what happens if the LL/SC
doesn't use the same effective address or has other loads/stores in
between, if its ok with those and simply fails the SC it should be good.
Still, creating atomic ops for per-cpu ops might be more expensive than
simply doing the preempt-disable/rmw/enable dance, dunno don't know
these archs that well.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-08-25 18:40 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-18 6:50 [PATCH] memcg: remove unneeded preempt_disable Greg Thelen
2011-08-18 6:52 ` KAMEZAWA Hiroyuki
2011-08-18 9:38 ` Johannes Weiner
2011-08-18 14:26 ` Valdis.Kletnieks
2011-08-18 14:41 ` Johannes Weiner
2011-08-18 18:27 ` Valdis.Kletnieks
2011-08-18 21:40 ` Andrew Morton
2011-08-19 0:00 ` Greg Thelen
2011-08-25 14:57 ` Peter Zijlstra
2011-08-25 15:11 ` Christoph Lameter
2011-08-25 16:20 ` James Bottomley
2011-08-25 16:31 ` Christoph Lameter
2011-08-25 16:34 ` James Bottomley
2011-08-25 17:07 ` Christoph Lameter
2011-08-25 18:34 ` James Bottomley
2011-08-25 18:46 ` Christoph Lameter
2011-08-25 18:49 ` Peter Zijlstra
2011-08-25 18:53 ` Peter Zijlstra
2011-08-25 19:05 ` Peter Zijlstra
2011-08-25 19:19 ` Christoph Lameter
2011-08-25 22:56 ` Peter Zijlstra
2011-08-25 19:29 ` James Bottomley
2011-08-25 23:01 ` Peter Zijlstra
2011-08-25 18:40 ` Peter Zijlstra [this message]
2011-09-06 9:58 ` Johannes Weiner
2011-09-06 10:04 ` KAMEZAWA Hiroyuki
2011-09-06 10:49 ` Johannes Weiner
2011-09-06 10:45 ` KAMEZAWA Hiroyuki
2011-09-06 18:04 ` Greg Thelen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1314297613.27911.83.camel@twins \
--to=peterz@infradead.org \
--cc=James.Bottomley@HansenPartnership.com \
--cc=akpm@linux-foundation.org \
--cc=bsingharora@gmail.com \
--cc=cl@linux.com \
--cc=gthelen@google.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nishimura@mxp.nes.nec.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).