Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  3:48 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816033343.GA31844@gondor.apana.org.au>

Herbert Xu writes:

> If you're referring to the code in sk_stream_mem_schedule
> then it's working as intended.  The atomicity guarantees

You mean it's intended that *sk->sk_prot->memory_pressure can end up
as 1 when sk->sk_prot->memory_allocated is small (less than
->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
than ->sysctl_mem[2])?  Because that's the effect of the current code.
If so I wonder why you bother computing it.

> that the atomic_add/atomic_sub won't be seen in parts by
> other readers.
> 
> We certainly do not need to see other atomic_add/atomic_sub
> operations immediately.
> 
> If you're referring to another code snippet please cite.
> 
> > I'd go so far as to say that anywhere where you want a non-"volatile"
> > atomic_read, either your code is buggy, or else an int would work just
> > as well.
> 
> An int won't work here because += and -= do not have the
> atomicity guarantees that atomic_add/atomic_sub do.  In
> particular, this may cause an atomic_read on another CPU
> to give a bogus reading.

The point is that guaranteeing the atomicity of the increment or
decrement does not suffice to make the code race-free.  In this case
the race arises from the fact that reading ->memory_allocated and
setting *->memory_pressure are separate operations.  To make that code
work properly you need a lock.  And once you have the lock an ordinary
int would suffice for ->memory_allocated.

Paul.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Herbert Xu @ 2007-08-16  3:43 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <18115.49465.34538.840034@cargo.ozlabs.ibm.com>

On Thu, Aug 16, 2007 at 01:15:05PM +1000, Paul Mackerras wrote:
> 
> But others can also reduce the reservation.  Also, the code sets and
> clears *sk->sk_prot->memory_pressure nonatomically with respect to the
> reads of sk->sk_prot->memory_allocated, so in fact the code doesn't
> guarantee any particular relationship between the two.

Yes others can reduce the reservation, but the point of this
is that the code doesn't care.  We'll either see the value
before or after the reduction and in either case we'll do
something sensible.

The worst that can happen is when we're just below the hard
limit and multiple CPUs fail to allocate but that's not really
a problem because if the machine is making progress at all
then we will eventually scale back and allow these allocations
to succeed.

As to the non-atomic operation on memory_pressue, that's OK
because we only ever assign values to it and never do other
operations such as += or -=.  Remember that int/long assignments
must be atomic or Linux won't run on your architecture.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Bill Fink @ 2007-08-16  3:37 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney
In-Reply-To: <alpine.LFD.0.999.0708151713080.16414@enigma.security.iitk.ac.in>

On Wed, 15 Aug 2007, Satyam Sharma wrote:

> (C)
> $ cat tp3.c
> int a;
> 
> void func(void)
> {
> 	*(volatile int *)&a = 10;
> 	*(volatile int *)&a = 20;
> }
> $ gcc -Os -S tp3.c
> $ cat tp3.s
> ...
> movl    $10, a
> movl    $20, a
> ...

I'm curious about one minor tangential point.  Why, instead of:

	b = *(volatile int *)&a;

why can't this just be expressed as:

	b = (volatile int)a;

Isn't it the contents of a that's volatile, i.e. it's value can change
invisibly to the compiler, and that's why you want to force a read from
memory?  Why do you need the "*(volatile int *)&" construct?

						-Bill

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Herbert Xu @ 2007-08-16  3:33 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <18115.49946.522011.832468@cargo.ozlabs.ibm.com>

On Thu, Aug 16, 2007 at 01:23:06PM +1000, Paul Mackerras wrote:
>
> In particular, atomic_read seems to lend itself to buggy uses.  People
> seem to do things like:
> 
> 	atomic_add(&v, something);
> 	if (atomic_read(&v) > something_else) ...

If you're referring to the code in sk_stream_mem_schedule
then it's working as intended.  The atomicity guarantees
that the atomic_add/atomic_sub won't be seen in parts by
other readers.

We certainly do not need to see other atomic_add/atomic_sub
operations immediately.

If you're referring to another code snippet please cite.

> I'd go so far as to say that anywhere where you want a non-"volatile"
> atomic_read, either your code is buggy, or else an int would work just
> as well.

An int won't work here because += and -= do not have the
atomicity guarantees that atomic_add/atomic_sub do.  In
particular, this may cause an atomic_read on another CPU
to give a bogus reading.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* RE: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP portsfrom the host TCP port space.
From: Sean Hefty @ 2007-08-16  3:27 UTC (permalink / raw)
  To: 'Jeff Garzik', Steve Wise
  Cc: netdev, rdreier, linux-kernel, general, David Miller
In-Reply-To: <46C3B5EF.5060409@garzik.org>

>It's not about being a niche.  It's about creating a maintainable
>software net stack that has predictable behavior.
>
>Needing to reach out of the RDMA sandbox and reserve net stack resources
>away from itself travels a path we've consistently avoided.

We need to ensure that we're also creating a maintainable kernel.  RDMA doesn't
use sockets, but that doesn't mean it's not part of the networking support
provided by the Linux kernel.  Making blanket statements that RDMA should stay
within a sandbox is equivalent to saying that RDMA should duplicate any network
related functionality that it might need.

>>> I will NACK any patch that opens up sockets to eat up ports or
>>> anything stupid like that.
>
>Ditto for me as well.

I agree that using a socket is the wrong approach, but my guess is that it was
suggested as a possibility because of the attempt to keep RDMA in its 'sandbox'.
The iWarp architecture implements RDMA over TCP; it just doesn't use sockets.
The Linux network stack doesn't easily support this possibility.  Are there any
reasonable ways to enable this to the degree necessary for iWarp?

- Sean

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  3:23 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Herbert Xu, Satyam Sharma, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <Pine.LNX.4.64.0708151917210.11736@schroedinger.engr.sgi.com>

Christoph Lameter writes:

> > But I have to say that I still don't know of a single place
> > where one would actually use the volatile variant.
> 
> I suspect that what you say is true after we have looked at all callers.

It seems that there could be a lot of places where atomic_t is used in
a non-atomic fashion, and that those uses are either buggy, or there
is some lock held at the time which guarantees that other CPUs aren't
changing the value.  In both cases there is no point in using
atomic_t; we might as well just use an ordinary int.

In particular, atomic_read seems to lend itself to buggy uses.  People
seem to do things like:

	atomic_add(&v, something);
	if (atomic_read(&v) > something_else) ...

and expect that there is some relationship between the value that the
atomic_add stored and the value that the atomic_read will return,
which there isn't.  People seem to think that using atomic_t magically
gets rid of races.  It doesn't.

I'd go so far as to say that anywhere where you want a non-"volatile"
atomic_read, either your code is buggy, or else an int would work just
as well.

Paul.


^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  3:15 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816021105.GA30883@gondor.apana.org.au>

Herbert Xu writes:

> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> Yes I'm sure, because we don't care if others have increased
> the reservation.

But others can also reduce the reservation.  Also, the code sets and
clears *sk->sk_prot->memory_pressure nonatomically with respect to the
reads of sk->sk_prot->memory_allocated, so in fact the code doesn't
guarantee any particular relationship between the two.

That code looks like a beautiful example of buggy, racy code where
someone has sprinkled magic fix-the-races dust (otherwise known as
atomic_t) around in a vain attempt to fix the races.

That's assuming that all that stuff actually performs any useful
purpose, of course, and that there isn't some lock held by the
callers.  In the latter case it is pointless using atomic_t.

Paul.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  3:05 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <alpine.LFD.0.999.0708160800130.24380@enigma.security.iitk.ac.in>

Satyam Sharma writes:

> I can't speak for this particular case, but there could be similar code
> examples elsewhere, where we do the atomic ops on an atomic_t object
> inside a higher-level locking scheme that would take care of the kind of
> problem you're referring to here. It would be useful for such or similar
> code if the compiler kept the value of that atomic object in a register.

If there is a higher-level locking scheme then there is no point to
using atomic_t variables.  Atomic_t is specifically for the situation
where multiple CPUs are updating a variable without locking.

Paul.

^ permalink raw reply

* Re: [2.6 patch] remove Documentation/networking/net-modules.txt
From: Paul Gortmaker @ 2007-08-16  3:22 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Andrew Morton, jgarzik, netdev, linux-kernel
In-Reply-To: <20070814212604.GW18945@stusta.de>

On 8/14/07, Adrian Bunk <bunk@kernel.org> wrote:
> According to git, the only one who touched this file during the last
> 5 years was me when removing drivers...
>
> modinfo offers less ancient information.
>
> Signed-off-by: Adrian Bunk <bunk@kernel.org>
>

Fine by me for any of the old stuff I was responsible for...

Acked-by: Paul Gortmaker <paul.gortmaker@gmail.com>



> -
> -       8390 based Network Modules              (Paul Gortmaker, Nov 12, 1995)
> -       --------------------------
> -
> -(Includes: smc-ultra, ne, wd, 3c503, hp, hp-plus, e2100 and ac3200)
> -
> -The 8390 series of network drivers now support multiple card systems without
> -reloading the same module multiple times (memory efficient!) This is done by
> -specifying multiple comma separated values, such as:
> -
> -       insmod 3c503.o io=0x280,0x300,0x330,0x350  xcvr=0,1,0,1
> -

^ permalink raw reply

* Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
From: Roland Dreier @ 2007-08-16  3:11 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev, linux-kernel, general, David Miller
In-Reply-To: <46C3B5EF.5060409@garzik.org>

 > Needing to reach out of the RDMA sandbox and reserve net stack
 > resources away from itself travels a path we've consistently avoided.

Where did the idea of an "RDMA sandbox" come from?  Obviously no one
disagrees with keeping things clean and maintainable, but the idea
that RDMA is a second-class citizen that doesn't get any input into
the evolution of the networking code seems kind of offensive to me.

 - R.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma @ 2007-08-16  3:01 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <alpine.LFD.0.999.0708160800130.24380@enigma.security.iitk.ac.in>



On Thu, 16 Aug 2007, Satyam Sharma wrote:

> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> > Herbert Xu writes:
> > 
> > > See sk_stream_mem_schedule in net/core/stream.c:
> > > 
> > >         /* Under limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > >                 if (*sk->sk_prot->memory_pressure)
> > >                         *sk->sk_prot->memory_pressure = 0;
> > >                 return 1;
> > >         }
> > > 
> > >         /* Over hard limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > >                 sk->sk_prot->enter_memory_pressure();
> > >                 goto suppress_allocation;
> > >         }
> > > 
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> > 
> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> I can't speak for this particular case, but there could be similar code
> examples elsewhere, where we do the atomic ops on an atomic_t object
> inside a higher-level locking scheme that would take care of the kind of
> problem you're referring to here. It would be useful for such or similar
> code if the compiler kept the value of that atomic object in a register.

We might not be using atomic_t (and ops) if we already have a higher-level
locking scheme, actually. So as Herbert mentioned, such cases might just
not care. [ Too much of this thread, too little sleep, sorry! ]

Anyway, the problem, of course, is that this conversion to a stronger /
safer-by-default behaviour doesn't happen with zero cost to performance.
Converting atomic ops to "volatile" behaviour did add ~2K to kernel text
for archs such as i386 (possibly to important codepaths) that didn't have
those semantics already so it would be constructive to actually look at
those differences and see if there were really any heisenbugs that got
rectified. Or if there were legitimate optimizations that got wrongly
disabled. Onus lies on those proposing the modifications, I'd say ;-)

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney @ 2007-08-16  2:35 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Christoph Lameter, Satyam Sharma, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <20070816021105.GA30883@gondor.apana.org.au>

On Thu, Aug 16, 2007 at 10:11:05AM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 12:05:56PM +1000, Paul Mackerras wrote:
> > Herbert Xu writes:
> > 
> > > See sk_stream_mem_schedule in net/core/stream.c:
> > > 
> > >         /* Under limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > >                 if (*sk->sk_prot->memory_pressure)
> > >                         *sk->sk_prot->memory_pressure = 0;
> > >                 return 1;
> > >         }
> > > 
> > >         /* Over hard limit. */
> > >         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > >                 sk->sk_prot->enter_memory_pressure();
> > >                 goto suppress_allocation;
> > >         }
> > > 
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> > 
> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> Yes I'm sure, because we don't care if others have increased
> the reservation.
> 
> Note that even if we did we'd be using barriers so volatile
> won't do us any good here.

If the load-coalescing is important to performance, why not load into
a local variable?

						Thanx, Paul

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney @ 2007-08-16  2:32 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu
In-Reply-To: <Pine.LNX.4.64.0708151839140.11520@schroedinger.engr.sgi.com>

On Wed, Aug 15, 2007 at 06:41:40PM -0700, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> 
> > Understood.  My point is not that the impact is precisely zero, but
> > rather that the impact on optimization is much less hurtful than the
> > problems that could arise otherwise, particularly as compilers become
> > more aggressive in their optimizations.
> 
> The problems arise because barriers are not used as required. Volatile 
> has wishy washy semantics and somehow marries memory barriers with data 
> access. It is clearer to separate the two. Conceptual cleanness usually 
> translates into better code. If one really wants the volatile then lets 
> make it explicit and use
> 
> 	atomic_read_volatile()

There are indeed architectures where you can cause gcc to emit memory
barriers in response to volatile.  I am assuming that we are -not-
making gcc do this.  Given this, then volatiles and memory barrier
instructions are orthogonal -- one controls the compiler, the other
controls the CPU.

						Thanx, Paul

^ permalink raw reply

* Re: skb_pull_rcsum - Fatal exception in interrupt
From: Herbert Xu @ 2007-08-16  2:31 UTC (permalink / raw)
  To: Alan J. Wylie; +Cc: netdev
In-Reply-To: <18115.5803.588114.372952@wylie.me.uk>

Alan J. Wylie <alan@wylie.me.uk> wrote:
>
> The photos, along with the following information are available at
> http://wylie.me.uk/skb_pull_rcsum/

The really important bit has scrolled off.  Try booting with
vga=<appropriate setting> to increase the resolution.  Or use
a serial console if you can.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney @ 2007-08-16  2:30 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
In-Reply-To: <0de0c3d833b6f543bd75f74bb17a124b@kernel.crashing.org>

On Thu, Aug 16, 2007 at 03:30:44AM +0200, Segher Boessenkool wrote:
> >>>Part of the motivation here is to fix heisenbugs.  If I knew where 
> >>>they
> >>
> >>By the same token we should probably disable optimisations
> >>altogether since that too can create heisenbugs.
> >
> >Precisely the point -- use of volatile (whether in casts or on asms)
> >in these cases are intended to disable those optimizations likely to
> >result in heisenbugs.
> 
> The only thing volatile on an asm does is create a side effect
> on the asm statement; in effect, it tells the compiler "do not
> remove this asm even if you don't need any of its outputs".
> 
> It's not disabling optimisation likely to result in bugs,
> heisen- or otherwise; _not_ putting the volatile on an asm
> that needs it simply _is_ a bug :-)

Yep.  And the reason it is a bug is that it fails to disable
the relevant compiler optimizations.  So I suspect that we might
actually be saying the same thing here.

						Thanx, Paul

^ permalink raw reply

* Re: [PATCH 6/24] make atomic_read() behave consistently on frv
From: Paul E. McKenney @ 2007-08-16  2:27 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Herbert Xu, csnook, dhowells, linux-kernel, linux-arch, torvalds,
	netdev, akpm, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
	Arnd Bergmann
In-Reply-To: <46C3A3C5.5020103@yahoo.com.au>

On Thu, Aug 16, 2007 at 11:09:25AM +1000, Nick Piggin wrote:
> Paul E. McKenney wrote:
> >On Wed, Aug 15, 2007 at 11:30:05PM +1000, Nick Piggin wrote:
> 
> >>Especially since several big architectures don't have volatile in their
> >>atomic_get and _set, I think it would be a step backwards to add them in
> >>as a "just in case" thin now (unless there is a better reason).
> >
> >Good point, except that I would expect gcc's optimization to continue
> >to improve.  I would like the kernel to be able to take advantage of
> >improved optimization, which means that we are going to have to make
> >a few changes.  Adding volatile to atomic_get() and atomic_set() is
> >IMHO one of those changes.
> 
> What optimisations? gcc already does most of the things you need a
> barrier/volatile for, like reordering non-dependant loads and stores,
> and eliminating mem ops completely by caching in registers.

Yep, most of the currently practiced optimizations.  Given that CPU clock
frequencies are not rising as quickly as they once did, I would expect
that there will be added effort on implementing the known more-aggressive
optimization techniques, and on coming up with new ones as well.

Some of these new optimizations will likely be inappropriate for kernel
code, but I expect that some will be things that we want.

> >>As to your followup question of why to use it over ACCESS_ONCE. I
> >>guess, aside from consistency with the rest of the barrier APIs, you
> >>can use it in other primitives when you don't actually know what the
> >>caller is going to do or if it even will make an access. You could
> >>also use it between calls to _other_ primitives, etc... it just
> >>seems more flexible to me, but I haven't actually used such a thing
> >>in real code...
> >>
> >>ACCESS_ONCE doesn't seem as descriptive. What it results in is the
> >>memory location being loaded or stored (presumably once exactly),
> >>but I think the more general underlying idea is a barrier point.
> >
> >OK, first, I am not arguing that ACCESS_ONCE() can replace all current
> >uses of barrier().
> 
> OK. Well I also wasn't saying that ACCESS_ONCE should not be
> implemented. But if we want something like it, then it would make
> sense to have an equivalent barrier statement as well (ie. order()).

And I am not arguing against use of asms to implement the volatility
in atomic_read() and atomic_set(), and in fact it appears that one
of the architectures will be taking this approach.

Sounds like we might be in violent agreement.  ;-)

						Thanx, Paul

^ permalink raw reply

* Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
From: Jeff Garzik @ 2007-08-16  2:26 UTC (permalink / raw)
  To: Steve Wise; +Cc: David Miller, mshefty, rdreier, netdev, linux-kernel, general
In-Reply-To: <46C310E1.7020503@opengridcomputing.com>

Steve Wise wrote:
> 
> 
> David Miller wrote:
>> From: Sean Hefty <mshefty@ichips.intel.com>
>> Date: Thu, 09 Aug 2007 14:40:16 -0700
>>
>>> Steve Wise wrote:
>>>> Any more comments?
>>> Does anyone have ideas on how to reserve the port space without using 
>>> a struct socket?
>>
>> How about we just remove the RDMA stack altogether?  I am not at all
>> kidding.  If you guys can't stay in your sand box and need to cause
>> problems for the normal network stack, it's unacceptable.  We were
>> told all along the if RDMA went into the tree none of this kind of
>> stuff would be an issue.
> 
> I think removing the RDMA stack is the wrong thing to do, and you 
> shouldn't just threaten to yank entire subsystems because you don't like 
> the technology.  Lets keep this constructive, can we?  RDMA should get 
> the respect of any other technology in Linux.  Maybe its a niche in your 
> opinion, but come on, there's more RDMA users than say, the sparc64 
> port.  Eh?

It's not about being a niche.  It's about creating a maintainable 
software net stack that has predictable behavior.

Needing to reach out of the RDMA sandbox and reserve net stack resources 
away from itself travels a path we've consistently avoided.


>> I will NACK any patch that opens up sockets to eat up ports or
>> anything stupid like that.
> 
> Got it.

Ditto for me as well.

	Jeff



^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Nick Piggin @ 2007-08-16  2:23 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Herbert Xu, heiko.carstens, horms, linux-kernel, rpjday, ak,
	netdev, cfriesen, akpm, torvalds, jesper.juhl, linux-arch, zlynx,
	satyam, clameter, schwidefsky, Chris Snook, davem, wensong,
	wjiang
In-Reply-To: <3694fb2e4ed1e4d9bf873c0d050c911e@kernel.crashing.org>

Segher Boessenkool wrote:
>>> Part of the motivation here is to fix heisenbugs.  If I knew where they
>>
>>
>> By the same token we should probably disable optimisations
>> altogether since that too can create heisenbugs.
> 
> 
> Almost everything is a tradeoff; and so is this.  I don't
> believe most people would find disabling all compiler
> optimisations an acceptable price to pay for some peace
> of mind.

So why is this a good tradeoff?

I also think that just adding things to APIs in the hope it might fix
up some bugs isn't really a good road to go down. Where do you stop?

On the actual proposal to make atomic_operators volatile: I think the
better approach in the long term, for both maintainability of the
code and education of coders, is to make the use of barriers _more_
explicit rather than sprinkling these "just in case" ones around.

You may get rid of a few atomic_read heisenbugs (in noise when
compared to all bugs), but if the coder was using a regular atomic
load, or a test_bit (which is also atomic), etc. then they're going
to have problems.

It would be better for Linux if everyone was to have better awareness
of barriers than to hide some of the cases where they're required.
A pretty large number of bugs I see in lock free code in the VM is
due to memory ordering problems. It's hard to find those bugs, or
even be aware when you're writing buggy code if you don't have some
feel for barriers.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney @ 2007-08-16  2:22 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
	rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
	linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
	Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
In-Reply-To: <f24994ecd2ecfcda6e01b289a25c1e47@kernel.crashing.org>

On Thu, Aug 16, 2007 at 03:23:28AM +0200, Segher Boessenkool wrote:
> >>>>No; compilation units have nothing to do with it, GCC can optimise
> >>>>across compilation unit boundaries just fine, if you tell it to
> >>>>compile more than one compilation unit at once.
> >>>
> >>>Last I checked, the Linux kernel build system did compile each .c 
> >>>file
> >>>as a separate compilation unit.
> >>
> >>I have some patches to use -combine -fwhole-program for Linux.
> >>Highly experimental, you need a patched bleeding edge toolchain.
> >>If there's interest I'll clean it up and put it online.
> >>
> >>David Woodhouse had some similar patches about a year ago.
> >
> >Sounds exciting...  ;-)
> 
> Yeah, the breakage is *quite* spectacular :-)

I bet!!!  ;-)

> >>>>>In many cases, the compiler also has to assume that
> >>>>>msleep_interruptible()
> >>>>>might call back into a function in the current compilation unit, 
> >>>>>thus
> >>>>>possibly modifying global static variables.
> >>>>
> >>>>It most often is smart enough to see what compilation-unit-local
> >>>>variables might be modified that way, though :-)
> >>>
> >>>Yep.  For example, if it knows the current value of a given such 
> >>>local
> >>>variable, and if all code paths that would change some other variable
> >>>cannot be reached given that current value of the first variable.
> >>
> >>Or the most common thing: if neither the address of the translation-
> >>unit local variable nor the address of any function writing to that
> >>variable can "escape" from that translation unit, nothing outside
> >>the translation unit can write to the variable.
> >
> >But there is usually at least one externally callable function in
> >a .c file.
> 
> Of course, but often none of those will (indirectly) write a certain
> static variable.

But there has to be some path to the static functions, assuming that
they are not dead code.  Yes, there can be cases where the compiler
knows enough about the state of the variables to rule out some of code
paths to them, but I have no idea how often this happens in kernel
code.

						Thanx, Paul

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma @ 2007-08-16  2:33 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <18115.45316.702491.681906@cargo.ozlabs.ibm.com>



On Thu, 16 Aug 2007, Paul Mackerras wrote:

> Herbert Xu writes:
> 
> > See sk_stream_mem_schedule in net/core/stream.c:
> > 
> >         /* Under limit. */
> >         if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> >                 if (*sk->sk_prot->memory_pressure)
> >                         *sk->sk_prot->memory_pressure = 0;
> >                 return 1;
> >         }
> > 
> >         /* Over hard limit. */
> >         if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> >                 sk->sk_prot->enter_memory_pressure();
> >                 goto suppress_allocation;
> >         }
> > 
> > We don't need to reload sk->sk_prot->memory_allocated here.
> 
> Are you sure?  How do you know some other CPU hasn't changed the value
> in between?

I can't speak for this particular case, but there could be similar code
examples elsewhere, where we do the atomic ops on an atomic_t object
inside a higher-level locking scheme that would take care of the kind of
problem you're referring to here. It would be useful for such or similar
code if the compiler kept the value of that atomic object in a register.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Chris Friesen @ 2007-08-16  2:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Paul Mackerras, Stefan Richter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816020851.GA30809@gondor.apana.org.au>

Herbert Xu wrote:

> But I have to say that I still don't know of a single place
> where one would actually use the volatile variant.

Given that many of the existing users do currently have "volatile", are 
you comfortable simply removing that behaviour from them?  Are you sure 
that you will not introduce any issues?

Forcing a re-read is only a performance penalty.  Removing it can cause 
behavioural changes.

I would be more comfortable making the default match the majority of the 
current implementations (ie: volatile semantics).  Then, if someone 
cares about performance they can explicitly validate the call path and 
convert it over to the non-volatile version.

Correctness before speed...

Chris

^ permalink raw reply

* Re: drivers/infiniband/mlx/mad.c misplaced ;
From: Kok, Auke @ 2007-08-16  2:19 UTC (permalink / raw)
  To: Joe Perches, Andy Whitcroft, Randy.Dunlap, Joel Schopp
  Cc: Dave Jones, Linux Kernel, rolandd, Chas Williams, Paul Mundt,
	isdn4linux, mikep, netdev, swen, linux390, linux-s390, jdike,
	user-mode-linux-devel, user-mode-linux-user, netfilter-devel,
	coreteam
In-Reply-To: <1187224811.5906.55.camel@localhost>

Joe Perches wrote:
> On Wed, 2007-08-15 at 19:58 -0400, Dave Jones wrote:
>> Signed-off-by: Dave Jones <davej@redhat.com>
>>
>> diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
>> index 3330917..0ed02b7 100644
>> --- a/drivers/infiniband/hw/mlx4/mad.c
>> +++ b/drivers/infiniband/hw/mlx4/mad.c
>> @@ -109,7 +109,7 @@ int mlx4_MAD_IFC(struct mlx4_ib_dev *dev, int ignore_mkey, int ignore_bkey,
>>  			   in_modifier, op_modifier,
>>  			   MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C);
>>  
>> -	if (!err);
>> +	if (!err)
> 
> There's more than a few of these (not inspected).
> 
> $ egrep -r --include=*.c "\bif[[:space:]]*\([^\)]*\)[[:space:]]*\;" * 
> arch/sh/boards/se/7343/io.c:    if (0) ;
> drivers/atm/iphase.c:     if (!desc1) ;
> drivers/infiniband/hw/mlx4/mad.c:       if (!err);
> drivers/isdn/capi/capiutil.c:   else if (c <= 0x0f);
> drivers/net/tokenring/ibmtr.c:  else if (ti->shared_ram_paging == 0xf);  /* No paging in adapter */
> drivers/s390/scsi/zfcp_erp.c:           if (status == ZFCP_ERP_SUCCEEDED) ;     /* no further action */
> fs/hostfs/hostfs_user.c:        if(attrs->ia_valid & HOSTFS_ATTR_CTIME) ;
> net/netfilter/xt_u32.c:         if (skb->len < 4 || pos > skb->len - 4);
> sound/pci/au88x0/au88x0_synth.c:                                if (eax == 0) ;

sounds like an excellent candidate check for checkpatch.pl !!!

Cheers,

Auke

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Christoph Lameter @ 2007-08-16  2:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Paul E. McKenney, Paul Mackerras, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <20070816020851.GA30809@gondor.apana.org.au>

On Thu, 16 Aug 2007, Herbert Xu wrote:

> > Do we have a consensus here? (hoping against hope, probably :-)
> 
> I can certainly agree with this.

I agree too.

> But I have to say that I still don't know of a single place
> where one would actually use the volatile variant.

I suspect that what you say is true after we have looked at all callers.



^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Christoph Lameter @ 2007-08-16  2:17 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Paul E. McKenney, Satyam Sharma, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <Pine.LNX.4.64.0708151915390.11706@schroedinger.engr.sgi.com>

On Wed, 15 Aug 2007, Christoph Lameter wrote:

> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> 
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> > 
> > Are you sure?  How do you know some other CPU hasn't changed the value
> > in between?
> 
> The cpu knows because the cacheline was not invalidated.

Crap my statement above is wrong..... We do not care that the 
value was changed otherwise we would have put a barrier in there.


^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Christoph Lameter @ 2007-08-16  2:15 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Herbert Xu, Paul E. McKenney, Satyam Sharma, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <18115.45316.702491.681906@cargo.ozlabs.ibm.com>

On Thu, 16 Aug 2007, Paul Mackerras wrote:

> > We don't need to reload sk->sk_prot->memory_allocated here.
> 
> Are you sure?  How do you know some other CPU hasn't changed the value
> in between?

The cpu knows because the cacheline was not invalidated.


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox