Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH 24/24] document volatile atomic_read() behavior
From: Chris Friesen @ 2007-08-09 20:10 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Chris Snook, wjiang, wensong, heiko.carstens, linux-kernel, ak,
	netdev, horms, akpm, linux-arch, torvalds, schwidefsky, davem,
	zlynx, rpjday, jesper.juhl
In-Reply-To: <0a08872e608cf5f7a3d9c0fc746a1051@kernel.crashing.org>

Segher Boessenkool wrote:

> Anyway, what's the supposed advantage of *(volatile *) vs. using
> a real volatile object?  That you can access that same object in
> a non-volatile way?

That's my understanding.  That way accesses where you don't care about 
volatility may be optimised.

For instance, in cases where there are already other things controlling 
visibility (as are needed for atomic increment, for example) you don't 
need to make the access itself volatile.

Chris

^ permalink raw reply

* Re: [PATCH 24/24] document volatile atomic_read() behavior
From: Chris Snook @ 2007-08-09 20:05 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: wjiang, wensong, heiko.carstens, linux-kernel, ak, cfriesen,
	netdev, horms, akpm, linux-arch, torvalds, schwidefsky, davem,
	zlynx, rpjday, jesper.juhl
In-Reply-To: <0a08872e608cf5f7a3d9c0fc746a1051@kernel.crashing.org>

Segher Boessenkool wrote:
>>>> Explicit
>>>> +casting in atomic_read() ensures consistent behavior across 
>>>> architectures
>>>> +and compilers.
>>> Even modulo compiler bugs, what makes you believe that?
>>
>> When you declare a variable volatile, you don't actually tell the 
>> compiler where you want to override its default optimization behavior, 
>> giving it some freedom to guess your intentions incorrectly.  When you 
>> put the cast on the data access itself, there is no question about 
>> precisely where in the code you want to override the compiler's 
>> default optimization behavior.
> 
> ...except for the small point that this isn't how volatile works.
> 
> Rule of thumb: even people who know the semantics of volatile
> shouldn't use it.
> 
>> If the compiler doesn't do what you want with a volatile declaration, 
>> it might have a plausible excuse in the ambiguity of the C standard.  
>> If the compiler doesn't do what you want in a cast specific to a 
>> single dereference, it's just plain broken.
> 
> The other way around.  "volatile" has pretty weak semantics, and
> certainly not the semantics you think it has, or that you wish it
> had; but *(volatile XX *) doesn't have *any* semantics.  However
> much you cast that pointer it still doesn't point to a volatile
> object.
> 
> GCC will generally take the path of least surprise and perform a
> volatile access anyway, but this has only be decided recently (a
> year ago or so), and there very likely still are some bugs in
> that area.
> 
>> We try to be compatible with plausibly correct compilers, but if 
>> they're completely broken, we're screwed no matter what.
> 
> If you don't know what to expect, you're screwed for sure.
> 
> 
> Anyway, what's the supposed advantage of *(volatile *) vs. using
> a real volatile object?  That you can access that same object in
> a non-volatile way?

You'll have to take that up with Linus and the minds behind Volatile Considered 
Harmful, but the crux of it is that volatile objects are prone to compiler bugs 
too, and if we have to track down a compiler bug, it's a lot easier when we know 
exactly where the load is supposed to be because we deliberately put it there, 
rather than letting the compiler re-order everything that lacks a strict data 
dependency and trying to figure out where in a thousand lines of assembler the 
compiler should have put the load for the volatile object.

If we're going to assume that the compiler has bugs we'll never be able to find, 
we all need to find new careers.  If we're going to assume that it has bugs we 
*can* find, then let's use code that makes it easier to do that.

I initially proposed a patch that made all the objects volatile, on the grounds 
that this was a special case where there wasn't much room to have a 
misunderstanding that resulted in anything worse than wasted loads.  Linus 
objected, and now that I've seen all the responses to the new patchset, I 
understand exactly why.  If our compilers really suck as much as everyone says 
they do, it'll be much easier to detect that with volatile casts than with 
volatile declarations.

	-- Chris

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Chris Snook @ 2007-08-09 19:47 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Segher Boessenkool, wjiang, cfriesen, wensong, heiko.carstens,
	linux-kernel, ak, netdev, paulmck, horms, akpm, linux-arch,
	jesper.juhl, torvalds, zlynx, rpjday, schwidefsky, davem
In-Reply-To: <Pine.LNX.4.64.0708092123270.22584@anakin>

Geert Uytterhoeven wrote:
> On Thu, 9 Aug 2007, Chris Snook wrote:
>> Segher Boessenkool wrote:
>>>>> The only safe way to get atomic accesses is to write
>>>>> assembler code.  Are there any downsides to that?  I don't
>>>>> see any.
>>>> The assumption that aligned word reads and writes are atomic, and that
>>>> words are aligned unless explicitly packed otherwise, is endemic in the
>>>> kernel.  No sane compiler violates this assumption.  It's true that we're
>>>> not portable to insane compilers after this patch, but we never were in
>>>> the first place.
>>> You didn't answer my question: are there any downsides to using
>>> explicit coded-in-assembler accesses for atomic accesses?  You
>>> can handwave all you want that it should "just work" with
>>> volatile accesses, but volatility != atomicity, volatile in C
>>> is really badly defined, GCC never officially gave stronger
>>> guarantees, and we have a bugzilla full of PRs to show what a
>>> minefield it is.
>>>
>>> So, why not use the well-defined alternative?
>> Because we don't need to, and it hurts performance.
> 
> It hurts performance by implementing 32-bit atomic reads in assembler?

No, I misunderstood the question.  Implementing 32-bit atomic reads in assembler 
is redundant, because any sane compiler, *particularly* and optimizing compiler 
(and we're only in this mess because of optimizing compilers) will give us that 
automatically without the assembler.  Yes, it is legal for a compiler to violate 
this assumption.  It is also legal for us to refuse to maintain compatibility 
with compilers that suck this badly.  That decision was made a very long time 
ago, and I consider it the correct decision.

	-- Chris

^ permalink raw reply

* Re: [PATCH 24/24] document volatile atomic_read() behavior
From: Segher Boessenkool @ 2007-08-09 19:42 UTC (permalink / raw)
  To: Chris Snook
  Cc: wjiang, wensong, heiko.carstens, linux-kernel, ak, cfriesen,
	netdev, horms, akpm, linux-arch, torvalds, schwidefsky, davem,
	zlynx, rpjday, jesper.juhl
In-Reply-To: <46BB403D.10202@redhat.com>

>>> Explicit
>>> +casting in atomic_read() ensures consistent behavior across 
>>> architectures
>>> +and compilers.
>> Even modulo compiler bugs, what makes you believe that?
>
> When you declare a variable volatile, you don't actually tell the 
> compiler where you want to override its default optimization behavior, 
> giving it some freedom to guess your intentions incorrectly.  When you 
> put the cast on the data access itself, there is no question about 
> precisely where in the code you want to override the compiler's 
> default optimization behavior.

...except for the small point that this isn't how volatile works.

Rule of thumb: even people who know the semantics of volatile
shouldn't use it.

> If the compiler doesn't do what you want with a volatile declaration, 
> it might have a plausible excuse in the ambiguity of the C standard.  
> If the compiler doesn't do what you want in a cast specific to a 
> single dereference, it's just plain broken.

The other way around.  "volatile" has pretty weak semantics, and
certainly not the semantics you think it has, or that you wish it
had; but *(volatile XX *) doesn't have *any* semantics.  However
much you cast that pointer it still doesn't point to a volatile
object.

GCC will generally take the path of least surprise and perform a
volatile access anyway, but this has only be decided recently (a
year ago or so), and there very likely still are some bugs in
that area.

> We try to be compatible with plausibly correct compilers, but if 
> they're completely broken, we're screwed no matter what.

If you don't know what to expect, you're screwed for sure.

Anyway, what's the supposed advantage of *(volatile *) vs. using
a real volatile object?  That you can access that same object in
a non-volatile way?

Segher

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Chris Snook @ 2007-08-09 19:30 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: wjiang, rpjday, wensong, heiko.carstens, linux-kernel, ak, netdev,
	paulmck, horms, akpm, linux-arch, jesper.juhl, torvalds,
	schwidefsky, davem, cfriesen, zlynx
In-Reply-To: <06d22e947a5357cd53adb070225bc7c1@kernel.crashing.org>

Segher Boessenkool wrote:
>>> The compiler is within its rights to read a 32-bit quantity 16 bits at
>>> at time, even on a 32-bit machine.  I would be glad to help pummel any
>>> compiler writer that pulls such a dirty trick, but the C standard really
>>> does permit this.
>>
>> Yes, but we don't write code for these compilers.  There are countless 
>> pieces of kernel code which would break in this condition, and there 
>> doesn't seem to be any interest in fixing this.
> 
> "Other things are broken too".  Great argument :-)

We make plenty of practical assumptions in the kernel, and declare incorrect 
things which violate them, even in cases where there's no commandment from the 
heavens forbidding them.  Since the whole point of this exercise is to prevent 
badness with *optimizing* compilers, it's quite reasonable to declare broken any 
so-called optimizer which violates these trivial assumptions.

>>> In short, please retain atomic_set()'s volatility, especially on those
>>> architectures that declared the atomic_t's counter to be volatile.
>>
>> Like i386 and x86_64?  These used to have volatile in the atomic_t 
>> declaration.  We removed it, and the sky did not fall.
> 
> And this proves what?  Lots of stuff "works" by accident.

If something breaks because of this, it was already broken, but hidden a lot 
better.  I don't see much of a downside to exposing and fixing those bugs.

	-- Chris

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Chris Snook @ 2007-08-09 19:24 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl
In-Reply-To: <20070809184531.GH8424@linux.vnet.ibm.com>

Paul E. McKenney wrote:
> On Thu, Aug 09, 2007 at 02:13:52PM -0400, Chris Snook wrote:
>> Paul E. McKenney wrote:
>>> On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
>>>>                                If you're depending on volatile writes 
>>>> being visible to other CPUs, you're screwed either way, because the CPU 
>>>> can hold that data in cache as long as it wants before it writes it to 
>>>> memory.  When this finally does happen, it will happen atomically, which 
>>>> is all that atomic_set guarantees.  If you need to guarantee that the 
>>>> value is written to memory at a particular time in your execution 
>>>> sequence, you either have to read it from memory to force the compiler to 
>>>> store it first (and a volatile cast in atomic_read will suffice for this) 
>>>> or you have to use LOCK_PREFIX instructions which will invalidate remote 
>>>> cache lines containing the same variable.  This patch doesn't change 
>>>> either of these cases.
>>> The case that it -can- change is interactions with interrupt handlers.
>>> And NMI/SMI handlers, for that matter.
>> You have a point here, but only if you can guarantee that the interrupt 
>> handler is running on a processor sharing the cache that has the 
>> not-yet-written volatile value.  That implies a strictly non-SMP 
>> architecture.  At the moment, none of those have volatile in their 
>> declaration of atomic_t, so this patch can't break any of them.
> 
> This can also happen when using per-CPU variables.  And there are a
> number of per-CPU variables that are either atomic themselves or are
> structures containing atomic fields.

Accessing per-CPU variables in this fashion reliably already requires a suitable 
smp/non-smp read/write memory barrier.  I maintain that if we break anything 
with this change, it was really already broken, if less obviously.  Can you give 
a real or synthetic example of legitimate code that could break?

	-- Chris

^ permalink raw reply

* Re: 2.6.23-rc2-mm1
From: Jeff Garzik @ 2007-08-09 19:27 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Piotrowski, linux-kernel, Netdev
In-Reply-To: <20070809121034.3aac9da2.akpm@linux-foundation.org>

Andrew Morton wrote:
> umm, I was hoping to find out which of those two patches was the cuplrit.
> Almost surely it was 9ee6b32a47b9abc565466a9c3b127a5246b452e5?


Highly likely it is my patch in #ALL.

	Jeff



^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Segher Boessenkool @ 2007-08-09 19:17 UTC (permalink / raw)
  To: Chris Snook
  Cc: wjiang, rpjday, wensong, heiko.carstens, linux-kernel, ak, netdev,
	paulmck, horms, akpm, linux-arch, jesper.juhl, torvalds,
	schwidefsky, davem, cfriesen, zlynx
In-Reply-To: <46BB4B7B.4070007@redhat.com>

> If you need to guarantee that the value is written to memory at a 
> particular time in your execution sequence, you either have to read it 
> from memory to force the compiler to store it first

That isn't enough.  The CPU will happily read the datum back from
its own store queue before it ever hit memory or cache or any
external bus.  If you need the value to be globally visible before
another store, you use wmb(); if you need it before some read,
you use mb().

These concepts are completely separate from both atomicity and
volatility (which aren't the same themselves).

> No, but I can reason about it and be confident that this won't break 
> anything that isn't already broken.  At worst, this patch will make 
> any existing subtly incorrect uses of atomic_t much more obvious and 
> easier to track down.

PR22278, PR29753 -- both P2 bugs, and both about basic *(volatile *)
usage.  Yeah, it's definitely not problematic, esp. not if you want
to support older compilers than 4.2 too.</sarcasm>

Segher

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Geert Uytterhoeven @ 2007-08-09 19:25 UTC (permalink / raw)
  To: Chris Snook
  Cc: Segher Boessenkool, wjiang, cfriesen, wensong, heiko.carstens,
	linux-kernel, ak, netdev, paulmck, horms, akpm, linux-arch,
	jesper.juhl, torvalds, zlynx, rpjday, schwidefsky, davem
In-Reply-To: <46BB656D.6090408@redhat.com>

On Thu, 9 Aug 2007, Chris Snook wrote:
> Segher Boessenkool wrote:
> > > > The only safe way to get atomic accesses is to write
> > > > assembler code.  Are there any downsides to that?  I don't
> > > > see any.
> > > 
> > > The assumption that aligned word reads and writes are atomic, and that
> > > words are aligned unless explicitly packed otherwise, is endemic in the
> > > kernel.  No sane compiler violates this assumption.  It's true that we're
> > > not portable to insane compilers after this patch, but we never were in
> > > the first place.
> > 
> > You didn't answer my question: are there any downsides to using
> > explicit coded-in-assembler accesses for atomic accesses?  You
> > can handwave all you want that it should "just work" with
> > volatile accesses, but volatility != atomicity, volatile in C
> > is really badly defined, GCC never officially gave stronger
> > guarantees, and we have a bugzilla full of PRs to show what a
> > minefield it is.
> > 
> > So, why not use the well-defined alternative?
> 
> Because we don't need to, and it hurts performance.

It hurts performance by implementing 32-bit atomic reads in assembler?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Segher Boessenkool @ 2007-08-09 19:19 UTC (permalink / raw)
  To: Chris Snook
  Cc: wjiang, schwidefsky, wensong, heiko.carstens, linux-kernel, ak,
	cfriesen, netdev, paulmck, horms, akpm, linux-arch, jesper.juhl,
	davem, torvalds, zlynx, rpjday
In-Reply-To: <46BB656D.6090408@redhat.com>

>>>> The only safe way to get atomic accesses is to write
>>>> assembler code.  Are there any downsides to that?  I don't
>>>> see any.
>>>
>>> The assumption that aligned word reads and writes are atomic, and 
>>> that words are aligned unless explicitly packed otherwise, is 
>>> endemic in the kernel.  No sane compiler violates this assumption.  
>>> It's true that we're not portable to insane compilers after this 
>>> patch, but we never were in the first place.
>> You didn't answer my question: are there any downsides to using
>> explicit coded-in-assembler accesses for atomic accesses?  You
>> can handwave all you want that it should "just work" with
>> volatile accesses, but volatility != atomicity, volatile in C
>> is really badly defined, GCC never officially gave stronger
>> guarantees, and we have a bugzilla full of PRs to show what a
>> minefield it is.
>> So, why not use the well-defined alternative?
>
> Because we don't need to,

You don't need to use volatile objects, or accesses through
valatile-cast pointers, either.

> and it hurts performance.

Please show how it does this -- one load is one load either way,
and one store is one store.


Segher


^ permalink raw reply

* Re: 2.6.23-rc2-mm1
From: Andrew Morton @ 2007-08-09 19:10 UTC (permalink / raw)
  To: Michal Piotrowski; +Cc: linux-kernel, Netdev, Jeff Garzik
In-Reply-To: <6bffcb0e0708091204h38f137a7w6475a229199608c6@mail.gmail.com>

On Thu, 9 Aug 2007 21:04:35 +0200
"Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote:

> On 09/08/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Thu, 09 Aug 2007 15:23:41 +0200
> > Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> >
> > > Andrew Morton pisze:
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> > >
> > > I am experiencing some problems with 8139too
> > >
> > > [   28.847004] 8139too 0000:02:0d.0: region #0 not a PIO resource, aborting
> > > [   28.854722] Bad IO access at port 0 ()
> > > [   28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 bad_io_access()
> > > [   28.867415]  [<c040536b>] show_trace_log_lvl+0x1a/0x30
> > > [   28.873568]  [<c0405ff3>] show_trace+0x12/0x14
> > > [   28.879015]  [<c0406128>] dump_stack+0x16/0x18
> > > [   28.884451]  [<c052a904>] bad_io_access+0x58/0x5a
> > > [   28.890129]  [<c052a927>] pci_iounmap+0x21/0x2b
> > > [   28.895635]  [<c05b3746>] __rtl8139_cleanup_dev+0x75/0xc6
> > > [   28.902037]  [<c05b42c4>] rtl8139_init_one+0x59b/0xa9f
> > > [   28.908170]  [<c053e344>] pci_device_probe+0x44/0x5f
> > > [   28.914116]  [<c05a7cd0>] driver_probe_device+0xa7/0x19a
> > > [   28.920402]  [<c05a7f1a>] __driver_attach+0xa6/0xa8
> > > [   28.926236]  [<c05a71d3>] bus_for_each_dev+0x43/0x61
> > > [   28.932139]  [<c05a7b51>] driver_attach+0x19/0x1b
> > > [   28.937776]  [<c05a7504>] bus_add_driver+0x7e/0x1a5
> > > [   28.943567]  [<c05a80cf>] driver_register+0x45/0x75
> > > [   28.949358]  [<c053e4a5>] __pci_register_driver+0x56/0x84
> > > [   28.955678]  [<c0819118>] rtl8139_init_module+0x14/0x1c
> > > [   28.961832]  [<c0800521>] kernel_init+0x132/0x306
> > > [   28.967451]  [<c0404fb3>] kernel_thread_helper+0x7/0x14
> > > [   28.973588]  =======================
> > > [   28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned 0.
> > > [   28.986114] initcall 0xc0819104 ran for 161 msecs: rtl8139_init_module+0x0/0x1c()
> > >
> > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg
> > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config
> > >
> >
> > Please try reverting 8139too-force-media-setting-fix.patch, then
> > applying this:
> >
> >
> 
> Problem fixed, thanks!
> 

umm, I was hoping to find out which of those two patches was the cuplrit.
Almost surely it was 9ee6b32a47b9abc565466a9c3b127a5246b452e5?

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Chris Snook @ 2007-08-09 19:05 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: wjiang, cfriesen, wensong, heiko.carstens, linux-kernel, ak,
	netdev, paulmck, horms, akpm, linux-arch, jesper.juhl, torvalds,
	zlynx, rpjday, schwidefsky, davem
In-Reply-To: <1abc2c621d6b62b3ac9f489d4d18806a@kernel.crashing.org>

Segher Boessenkool wrote:
>>> The only safe way to get atomic accesses is to write
>>> assembler code.  Are there any downsides to that?  I don't
>>> see any.
>>
>> The assumption that aligned word reads and writes are atomic, and that 
>> words are aligned unless explicitly packed otherwise, is endemic in 
>> the kernel.  No sane compiler violates this assumption.  It's true 
>> that we're not portable to insane compilers after this patch, but we 
>> never were in the first place.
> 
> You didn't answer my question: are there any downsides to using
> explicit coded-in-assembler accesses for atomic accesses?  You
> can handwave all you want that it should "just work" with
> volatile accesses, but volatility != atomicity, volatile in C
> is really badly defined, GCC never officially gave stronger
> guarantees, and we have a bugzilla full of PRs to show what a
> minefield it is.
> 
> So, why not use the well-defined alternative?

Because we don't need to, and it hurts performance.

^ permalink raw reply

* Re: 2.6.23-rc2-mm1
From: Michal Piotrowski @ 2007-08-09 19:04 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Netdev, Jeff Garzik
In-Reply-To: <20070809113710.bc0e3b80.akpm@linux-foundation.org>

On 09/08/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 09 Aug 2007 15:23:41 +0200
> Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > Andrew Morton pisze:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> >
> > I am experiencing some problems with 8139too
> >
> > [   28.847004] 8139too 0000:02:0d.0: region #0 not a PIO resource, aborting
> > [   28.854722] Bad IO access at port 0 ()
> > [   28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 bad_io_access()
> > [   28.867415]  [<c040536b>] show_trace_log_lvl+0x1a/0x30
> > [   28.873568]  [<c0405ff3>] show_trace+0x12/0x14
> > [   28.879015]  [<c0406128>] dump_stack+0x16/0x18
> > [   28.884451]  [<c052a904>] bad_io_access+0x58/0x5a
> > [   28.890129]  [<c052a927>] pci_iounmap+0x21/0x2b
> > [   28.895635]  [<c05b3746>] __rtl8139_cleanup_dev+0x75/0xc6
> > [   28.902037]  [<c05b42c4>] rtl8139_init_one+0x59b/0xa9f
> > [   28.908170]  [<c053e344>] pci_device_probe+0x44/0x5f
> > [   28.914116]  [<c05a7cd0>] driver_probe_device+0xa7/0x19a
> > [   28.920402]  [<c05a7f1a>] __driver_attach+0xa6/0xa8
> > [   28.926236]  [<c05a71d3>] bus_for_each_dev+0x43/0x61
> > [   28.932139]  [<c05a7b51>] driver_attach+0x19/0x1b
> > [   28.937776]  [<c05a7504>] bus_add_driver+0x7e/0x1a5
> > [   28.943567]  [<c05a80cf>] driver_register+0x45/0x75
> > [   28.949358]  [<c053e4a5>] __pci_register_driver+0x56/0x84
> > [   28.955678]  [<c0819118>] rtl8139_init_module+0x14/0x1c
> > [   28.961832]  [<c0800521>] kernel_init+0x132/0x306
> > [   28.967451]  [<c0404fb3>] kernel_thread_helper+0x7/0x14
> > [   28.973588]  =======================
> > [   28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned 0.
> > [   28.986114] initcall 0xc0819104 ran for 161 msecs: rtl8139_init_module+0x0/0x1c()
> >
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config
> >
>
> Please try reverting 8139too-force-media-setting-fix.patch, then
> applying this:
>
>

Problem fixed, thanks!

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Segher Boessenkool @ 2007-08-09 18:51 UTC (permalink / raw)
  To: Chris Snook
  Cc: wjiang, rpjday, wensong, heiko.carstens, linux-kernel, ak, netdev,
	paulmck, horms, akpm, linux-arch, jesper.juhl, torvalds,
	schwidefsky, davem, cfriesen, zlynx
In-Reply-To: <46BB4281.7010803@redhat.com>

>> The compiler is within its rights to read a 32-bit quantity 16 bits at
>> at time, even on a 32-bit machine.  I would be glad to help pummel any
>> compiler writer that pulls such a dirty trick, but the C standard 
>> really
>> does permit this.
>
> Yes, but we don't write code for these compilers.  There are countless 
> pieces of kernel code which would break in this condition, and there 
> doesn't seem to be any interest in fixing this.

"Other things are broken too".  Great argument :-)

> Sequence points enforce read-after-write ordering, not 
> write-after-write.

Sequence points order *all* side effects; sequence points exist in the
domain of the abstract sequential model of the C language only.  The
compiler translates that to machine code that is equivalent to that C
code under the "as-if" rule; but this is still in that abstract model,
which doesn't include things such as SMP, visibility by I/O devices,
store queues, etc. etc.

> We flush writes with reads for MMIO because of this effect as well as 
> the CPU/bus effects.

You cannot flush all MMIO writes with reads; this is a PCI-specific
thing.  And even then, you need more than just the read itself: you
have to make sure the read completed and returned data.

>> In short, please retain atomic_set()'s volatility, especially on those
>> architectures that declared the atomic_t's counter to be volatile.
>
> Like i386 and x86_64?  These used to have volatile in the atomic_t 
> declaration.  We removed it, and the sky did not fall.

And this proves what?  Lots of stuff "works" by accident.

Segher

^ permalink raw reply

* Re: [PATCH 1/1] af_packet: don't enable timestamps in mmap'ed sockets
From: Unai Uribarri @ 2007-08-09 18:50 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, linux-kernel
In-Reply-To: <20070809181823.GA32449@2ka.mipt.ru>

Do not enable timestamps automatically on mmap'ed AF_PACKET sockets.

---
commit d1d6e6bf196e31b6306fd0fef95f4190983c8a86
tree 22637506c0aafeabfbe05faf5352d0358c4d9460
parent 6a302358d87fedaf7bda12b8e909265ebf1ce674
author Unai Uribarri <unai.uribarri@optenet.com> Tue, 31 Jul 2007 20:38:42 +0200
committer Unai Uribarri <unai.uribarri@optenet.com> Tue, 31 Jul 2007 20:38:42 +0200

 net/packet/af_packet.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 1322d62..a4f2da3 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -640,10 +640,6 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, struct packe
        h->tp_snaplen = snaplen;
        h->tp_mac = macoff;
        h->tp_net = netoff;
-       if (skb->tstamp.tv64 == 0) {
-               __net_timestamp(skb);
-               sock_enable_timestamp(sk);
-       }
        tv = ktime_to_timeval(skb->tstamp);
        h->tp_sec = tv.tv_sec;
        h->tp_usec = tv.tv_usec;




!-------------------------------------------------------------flip-


Effectively disable timestamping when requested by SO_TIMESTAMP

---
commit 1fdf6bb534dfbc6e9bdf8958620b05d5334b15eb
tree ec4b577c1704f178f0f7c5d8d69af41454fc8f14
parent d1d6e6bf196e31b6306fd0fef95f4190983c8a86
author Unai Uribarri <unai.uribarri@optenet.com> Tue, 31 Jul 2007 20:43:00 +0200
committer Unai Uribarri <unai.uribarri@optenet.com> Tue, 31 Jul 2007 20:43:00 +0200

 net/core/sock.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index cfed7d4..3af2322 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -561,6 +561,7 @@ set_rcvbuf:
                } else {
                        sock_reset_flag(sk, SOCK_RCVTSTAMP);
                        sock_reset_flag(sk, SOCK_RCVTSTAMPNS);
+                       sock_disable_timestamp(sk);
                }
                break;





!-------------------------------------------------------------flip-


Automatically enable timestamping on mmap'ed AF_PACKET sockets.

---
commit 4564f367ff054bd8837cc6cb1cfb9a927c57054a
tree 3475d56cc7ea74052677c944ec0a6bf6e9f4817c
parent 1fdf6bb534dfbc6e9bdf8958620b05d5334b15eb
author Unai Uribarri <unai.uribarri@optenet.com> Tue, 31 Jul 2007 20:43:59 +0200
committer Unai Uribarri <unai.uribarri@optenet.com> Tue, 31 Jul 2007 20:43:59 +0200

 net/packet/af_packet.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index a4f2da3..5179daf 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1750,6 +1750,7 @@ static int packet_set_ring(struct sock *sk, struct tpacket_req *req, int closing

                po->pg_vec_pages = req->tp_block_size/PAGE_SIZE;
                po->prot_hook.func = po->pg_vec ? tpacket_rcv : packet_rcv;
+               sock_enable_timestamp(sk);
                skb_queue_purge(&sk->sk_receive_queue);
 #undef XC
                if (atomic_read(&po->mapped))




!-------------------------------------------------------------flip-



On jue, 2007-08-09 at 22:18 +0400, Evgeniy Polyakov wrote:
> On Thu, Aug 09, 2007 at 08:13:54PM +0200, Unai Uribarri (unai.uribarri@optenet.com) wrote:
> > On jue, 2007-08-09 at 18:33 +0400, Evgeniy Polyakov wrote:
> > > On Thu, Aug 09, 2007 at 04:21:54PM +0200, Unai Uribarri (unai.uribarri@optenet.com) wrote:
> > > > The attached patch removes the automatic timestamp activation, that
> > > > only mmap'ed AF_PACKET sockets perform. I known it can break user
> > > > applications, but I believe that it's the correct solution.
> > > 
> > > How tcpdump with mmap libpcap will work with it?
> > 
> > In Linux, you can enable timestamps on any socket executing:
> > 
> > int val = 1;
> > setsockopt(sock, SOL_SOCKET, SO_TIMESTAMP, &val, sizeof(val));
> > 
> > PD: Current release of tcpdump doesn't mmap the reception ring and
> > timestamps packets at user space with gettimeofday. It isn't the best
> > performing alternative, but it's portable.
>  
> IIRC, there was/is a libpcap which worked with mapped sockets, mybe not 
> official though. Any application which depened on having timestamps with
> packets will not work now. Why not to implement an
> absolutely_turn_off_timestamps option instead of breaking compatibility?
> 



^ permalink raw reply related

* [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
From: Steve Wise @ 2007-08-09 18:49 UTC (permalink / raw)
  To: Roland Dreier, David S. Miller; +Cc: netdev, linux-kernel, OpenFabrics General
In-Reply-To: <46B883B5.8040702@opengridcomputing.com>

Any more comments?


Steve Wise wrote:
> Networking experts,
> 
> I'd like input on the patch below, and help in solving this bug 
> properly.  iWARP devices that support both native stack TCP and iWARP 
> (aka RDMA over TCP/IP/Ethernet) connections on the same interface need 
> the fix below or some similar fix to the RDMA connection manager.
> 
> This is a BUG in the Linux RDMA-CMA code as it stands today.
> 
> Here is the issue:
> 
> Consider an mpi cluster running mvapich2.  And the cluster runs 
> MPI/Sockets jobs concurrently with MPI/RDMA jobs.  It is possible, 
> without the patch below, for MPI/Sockets processes to mistakenly get 
> incoming RDMA connections and vice versa.  The way mvapich2 works is 
> that the ranks all bind and listen to a random port (retrying new random 
> ports if the bind fails with "in use").  Once they get a free port and
> bind/listen, they advertise that port number to the peers to do 
> connection setup.  Currently, without the patch below, the mpi/rdma 
> processes can end up binding/listening to the _same_ port number as the 
> mpi/sockets processes running over the native tcp stack.  This is due to 
> duplicate port spaces for native stack TCP and the rdma cm's RDMA_PS_TCP 
> port space.  If this happens, then the connections can get screwed up.
> 
> The correct solution in my mind is to use the host stack's TCP port 
> space for _all_ RDMA_PS_TCP port allocations.   The patch below is a 
> minimal delta to unify the port spaces by using the kernel stack to bind 
> ports.  This is done by allocating a kernel socket and binding to the 
> appropriate local addr/port.  It also allows the kernel stack to pick 
> ephemeral ports by virtue of just passing in port 0 on the kernel bind 
> operation.
> 
> There has been a discussion already on the RDMA list if anyone is 
> interested:
> 
> http://www.mail-archive.com/general@lists.openfabrics.org/msg05162.html
> 
> 
> Thanks,
> 
> Steve.
> 
> 
> ---
> 
> RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
> 
> This is needed for iwarp providers that support native and rdma
> connections over the same interface.
> 
> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> ---
> 
> drivers/infiniband/core/cma.c |   27 ++++++++++++++++++++++++++-
> 1 files changed, 26 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 9e0ab04..e4d2d7f 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -111,6 +111,7 @@ struct rdma_id_private {
>     struct rdma_cm_id    id;
> 
>     struct rdma_bind_list    *bind_list;
> +    struct socket        *sock;
>     struct hlist_node    node;
>     struct list_head    list;
>     struct list_head    listen_list;
> @@ -695,6 +696,8 @@ static void cma_release_port(struct rdma
>         kfree(bind_list);
>     }
>     mutex_unlock(&lock);
> +    if (id_priv->sock)
> +        sock_release(id_priv->sock);
> }
> 
> void rdma_destroy_id(struct rdma_cm_id *id)
> @@ -1790,6 +1793,25 @@ static int cma_use_port(struct idr *ps,
>     return 0;
> }
> 
> +static int cma_get_tcp_port(struct rdma_id_private *id_priv)
> +{
> +    int ret;
> +    struct socket *sock;
> +
> +    ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
> +    if (ret)
> +        return ret;
> +    ret = sock->ops->bind(sock,
> +              (struct sockaddr *)&id_priv->id.route.addr.src_addr,
> +              ip_addr_size(&id_priv->id.route.addr.src_addr));
> +    if (ret) {
> +        sock_release(sock);
> +        return ret;
> +    }
> +    id_priv->sock = sock;
> +    return 0;   
> +}
> +
> static int cma_get_port(struct rdma_id_private *id_priv)
> {
>     struct idr *ps;
> @@ -1801,6 +1823,9 @@ static int cma_get_port(struct rdma_id_p
>         break;
>     case RDMA_PS_TCP:
>         ps = &tcp_ps;
> +        ret = cma_get_tcp_port(id_priv); /* Synch with native stack */
> +        if (ret)
> +            goto out;
>         break;
>     case RDMA_PS_UDP:
>         ps = &udp_ps;
> @@ -1815,7 +1840,7 @@ static int cma_get_port(struct rdma_id_p
>     else
>         ret = cma_use_port(ps, id_priv);
>     mutex_unlock(&lock);
> -
> +out:
>     return ret;
> }
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Paul E. McKenney @ 2007-08-09 18:45 UTC (permalink / raw)
  To: Chris Snook
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl
In-Reply-To: <46BB5960.9040009@redhat.com>

On Thu, Aug 09, 2007 at 02:13:52PM -0400, Chris Snook wrote:
> Paul E. McKenney wrote:
> >On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
> >>                                If you're depending on volatile writes 
> >>being visible to other CPUs, you're screwed either way, because the CPU 
> >>can hold that data in cache as long as it wants before it writes it to 
> >>memory.  When this finally does happen, it will happen atomically, which 
> >>is all that atomic_set guarantees.  If you need to guarantee that the 
> >>value is written to memory at a particular time in your execution 
> >>sequence, you either have to read it from memory to force the compiler to 
> >>store it first (and a volatile cast in atomic_read will suffice for this) 
> >>or you have to use LOCK_PREFIX instructions which will invalidate remote 
> >>cache lines containing the same variable.  This patch doesn't change 
> >>either of these cases.
> >
> >The case that it -can- change is interactions with interrupt handlers.
> >And NMI/SMI handlers, for that matter.
> 
> You have a point here, but only if you can guarantee that the interrupt 
> handler is running on a processor sharing the cache that has the 
> not-yet-written volatile value.  That implies a strictly non-SMP 
> architecture.  At the moment, none of those have volatile in their 
> declaration of atomic_t, so this patch can't break any of them.

This can also happen when using per-CPU variables.  And there are a
number of per-CPU variables that are either atomic themselves or are
structures containing atomic fields.

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Segher Boessenkool @ 2007-08-09 18:38 UTC (permalink / raw)
  To: Chris Snook
  Cc: wjiang, cfriesen, wensong, heiko.carstens, linux-kernel, ak,
	netdev, paulmck, horms, akpm, linux-arch, jesper.juhl, torvalds,
	zlynx, rpjday, schwidefsky, davem
In-Reply-To: <46BB3ECF.2070100@redhat.com>

>> The only safe way to get atomic accesses is to write
>> assembler code.  Are there any downsides to that?  I don't
>> see any.
>
> The assumption that aligned word reads and writes are atomic, and that 
> words are aligned unless explicitly packed otherwise, is endemic in 
> the kernel.  No sane compiler violates this assumption.  It's true 
> that we're not portable to insane compilers after this patch, but we 
> never were in the first place.

You didn't answer my question: are there any downsides to using
explicit coded-in-assembler accesses for atomic accesses?  You
can handwave all you want that it should "just work" with
volatile accesses, but volatility != atomicity, volatile in C
is really badly defined, GCC never officially gave stronger
guarantees, and we have a bugzilla full of PRs to show what a
minefield it is.

So, why not use the well-defined alternative?

Segher

^ permalink raw reply

* Re: [PATCH 1/1] af_packet: don't enable timestamps in mmap'ed sockets
From: Unai Uribarri @ 2007-08-09 18:44 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, linux-kernel
In-Reply-To: <20070809181823.GA32449@2ka.mipt.ru>

There is another option:

1. Move timestampt activation to packet_set_ring(), so it's activated
only once at setup instead of every time a packet arrives.
2. Fix sock_setsockopt() so setting SO_TIMESTAMP to 0 effectively
disables timestamp.

My first patch set did that. I sent that patches in mime multipart
format and they were removed from the list archives. I can resent them
in plain text if needed.


On jue, 2007-08-09 at 22:18 +0400, Evgeniy Polyakov wrote:
> On Thu, Aug 09, 2007 at 08:13:54PM +0200, Unai Uribarri (unai.uribarri@optenet.com) wrote:
> > On jue, 2007-08-09 at 18:33 +0400, Evgeniy Polyakov wrote:
> > > On Thu, Aug 09, 2007 at 04:21:54PM +0200, Unai Uribarri (unai.uribarri@optenet.com) wrote:
> > > > The attached patch removes the automatic timestamp activation, that
> > > > only mmap'ed AF_PACKET sockets perform. I known it can break user
> > > > applications, but I believe that it's the correct solution.
> > > 
> > > How tcpdump with mmap libpcap will work with it?
> > 
> > In Linux, you can enable timestamps on any socket executing:
> > 
> > int val = 1;
> > setsockopt(sock, SOL_SOCKET, SO_TIMESTAMP, &val, sizeof(val));
> > 
> > PD: Current release of tcpdump doesn't mmap the reception ring and
> > timestamps packets at user space with gettimeofday. It isn't the best
> > performing alternative, but it's portable.
>  
> IIRC, there was/is a libpcap which worked with mapped sockets, mybe not 
> official though. Any application which depened on having timestamps with
> packets will not work now. Why not to implement an
> absolutely_turn_off_timestamps option instead of breaking compatibility?
> 



^ permalink raw reply

* Re: 2.6.23-rc2-mm1
From: Andrew Morton @ 2007-08-09 18:37 UTC (permalink / raw)
  To: Michal Piotrowski; +Cc: linux-kernel, Netdev, Jeff Garzik
In-Reply-To: <46BB155D.8050501@googlemail.com>

On Thu, 09 Aug 2007 15:23:41 +0200
Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:

> Andrew Morton pisze:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> 
> I am experiencing some problems with 8139too
> 
> [   28.847004] 8139too 0000:02:0d.0: region #0 not a PIO resource, aborting
> [   28.854722] Bad IO access at port 0 ()
> [   28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 bad_io_access()
> [   28.867415]  [<c040536b>] show_trace_log_lvl+0x1a/0x30
> [   28.873568]  [<c0405ff3>] show_trace+0x12/0x14
> [   28.879015]  [<c0406128>] dump_stack+0x16/0x18
> [   28.884451]  [<c052a904>] bad_io_access+0x58/0x5a
> [   28.890129]  [<c052a927>] pci_iounmap+0x21/0x2b
> [   28.895635]  [<c05b3746>] __rtl8139_cleanup_dev+0x75/0xc6
> [   28.902037]  [<c05b42c4>] rtl8139_init_one+0x59b/0xa9f
> [   28.908170]  [<c053e344>] pci_device_probe+0x44/0x5f
> [   28.914116]  [<c05a7cd0>] driver_probe_device+0xa7/0x19a
> [   28.920402]  [<c05a7f1a>] __driver_attach+0xa6/0xa8
> [   28.926236]  [<c05a71d3>] bus_for_each_dev+0x43/0x61
> [   28.932139]  [<c05a7b51>] driver_attach+0x19/0x1b
> [   28.937776]  [<c05a7504>] bus_add_driver+0x7e/0x1a5
> [   28.943567]  [<c05a80cf>] driver_register+0x45/0x75
> [   28.949358]  [<c053e4a5>] __pci_register_driver+0x56/0x84
> [   28.955678]  [<c0819118>] rtl8139_init_module+0x14/0x1c
> [   28.961832]  [<c0800521>] kernel_init+0x132/0x306
> [   28.967451]  [<c0404fb3>] kernel_thread_helper+0x7/0x14
> [   28.973588]  =======================
> [   28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned 0.
> [   28.986114] initcall 0xc0819104 ran for 161 msecs: rtl8139_init_module+0x0/0x1c()
> 
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config
> 

Please try reverting 8139too-force-media-setting-fix.patch, then
applying this:


From: Andrew Morton <akpm@linux-foundation.org>

Revert git-netdev-all's 9ee6b32a47b9abc565466a9c3b127a5246b452e5

Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/net/8139too.c |   50 +++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff -puN drivers/net/8139too.c~revert-8139too-clean-up-i-o-remapping drivers/net/8139too.c
--- a/drivers/net/8139too.c~revert-8139too-clean-up-i-o-remapping
+++ a/drivers/net/8139too.c
@@ -121,15 +121,8 @@
 
 
 /* enable PIO instead of MMIO, if CONFIG_8139TOO_PIO is selected */
-enum rtl_bar_map_info {
-	rtl_pio_bar	= 0,	/* PCI BAR #0: PIO */
-	rtl_mmio_bar	= 1,	/* PCI BAR #1: MMIO */
-};
-
 #ifdef CONFIG_8139TOO_PIO
-static int use_pio	= 1;
-#else
-static int use_pio;
+#define USE_IO_OPS 1
 #endif
 
 /* define to 1, 2 or 3 to enable copious debugging info */
@@ -620,17 +613,14 @@ MODULE_DESCRIPTION ("RealTek RTL-8139 Fa
 MODULE_LICENSE("GPL");
 MODULE_VERSION(DRV_VERSION);
 
-module_param(multicast_filter_limit, int, 0444);
+module_param(multicast_filter_limit, int, 0);
 module_param_array(media, int, NULL, 0);
 module_param_array(full_duplex, int, NULL, 0);
-module_param(debug, int, 0444);
-module_param(use_pio, int, 0444);
-
+module_param(debug, int, 0);
+MODULE_PARM_DESC (debug, "8139too bitmapped message enable number");
 MODULE_PARM_DESC (multicast_filter_limit, "8139too maximum number of filtered multicast addresses");
 MODULE_PARM_DESC (media, "8139too: Bits 4+9: force full duplex, bit 5: 100Mbps");
 MODULE_PARM_DESC (full_duplex, "8139too: Force full duplex for board(s) (1)");
-MODULE_PARM_DESC (debug, "8139too bitmapped message enable number");
-MODULE_PARM_DESC (use_pio, "Non-zero to enable PIO (rather than MMIO) register mapping");
 
 static int read_eeprom (void __iomem *ioaddr, int location, int addr_len);
 static int rtl8139_open (struct net_device *dev);
@@ -718,7 +708,13 @@ static void __rtl8139_cleanup_dev (struc
 	assert (tp->pci_dev != NULL);
 	pdev = tp->pci_dev;
 
-	pci_iounmap (pdev, tp->mmio_addr);
+#ifdef USE_IO_OPS
+	if (tp->mmio_addr)
+		ioport_unmap (tp->mmio_addr);
+#else
+	if (tp->mmio_addr)
+		pci_iounmap (pdev, tp->mmio_addr);
+#endif /* USE_IO_OPS */
 
 	/* it's ok to call this even if we have no regions to free */
 	pci_release_regions (pdev);
@@ -794,32 +790,32 @@ static int __devinit rtl8139_init_board 
 	DPRINTK("PIO region size == 0x%02X\n", pio_len);
 	DPRINTK("MMIO region size == 0x%02lX\n", mmio_len);
 
+#ifdef USE_IO_OPS
 	/* make sure PCI base addr 0 is PIO */
 	if (!(pio_flags & IORESOURCE_IO)) {
 		dev_err(&pdev->dev, "region #0 not a PIO resource, aborting\n");
 		rc = -ENODEV;
 		goto err_out;
 	}
-
 	/* check for weird/broken PCI region reporting */
 	if (pio_len < RTL_MIN_IO_SIZE) {
 		dev_err(&pdev->dev, "Invalid PCI I/O region size(s), aborting\n");
 		rc = -ENODEV;
 		goto err_out;
 	}
-
+#else
 	/* make sure PCI base addr 1 is MMIO */
 	if (!(mmio_flags & IORESOURCE_MEM)) {
 		dev_err(&pdev->dev, "region #1 not an MMIO resource, aborting\n");
 		rc = -ENODEV;
 		goto err_out;
 	}
-
 	if (mmio_len < RTL_MIN_IO_SIZE) {
 		dev_err(&pdev->dev, "Invalid PCI mem region size(s), aborting\n");
 		rc = -ENODEV;
 		goto err_out;
 	}
+#endif
 
 	rc = pci_request_regions (pdev, DRV_NAME);
 	if (rc)
@@ -829,16 +825,28 @@ static int __devinit rtl8139_init_board 
 	/* enable PCI bus-mastering */
 	pci_set_master (pdev);
 
-	/* iomap MMIO region */
-	ioaddr = pci_iomap(pdev, use_pio ? rtl_pio_bar : rtl_mmio_bar, 0);
+#ifdef USE_IO_OPS
+	ioaddr = ioport_map(pio_start, pio_len);
+	if (!ioaddr) {
+		dev_err(&pdev->dev, "cannot map PIO, aborting\n");
+		rc = -EIO;
+		goto err_out;
+	}
+	dev->base_addr = pio_start;
+	tp->mmio_addr = ioaddr;
+	tp->regs_len = pio_len;
+#else
+	/* ioremap MMIO region */
+	ioaddr = pci_iomap(pdev, 1, 0);
 	if (ioaddr == NULL) {
-		dev_err(&pdev->dev, "cannot map I/O regions, aborting\n");
+		dev_err(&pdev->dev, "cannot remap MMIO, aborting\n");
 		rc = -EIO;
 		goto err_out;
 	}
 	dev->base_addr = (long) ioaddr;
 	tp->mmio_addr = ioaddr;
 	tp->regs_len = mmio_len;
+#endif /* USE_IO_OPS */
 
 	/* Bring old chips out of low-power mode. */
 	RTL_W8 (HltClk, 'R');
_

^ permalink raw reply

* Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
From: Chris Snook @ 2007-08-09 18:13 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl
In-Reply-To: <20070809174150.GE8424@linux.vnet.ibm.com>

Paul E. McKenney wrote:
> On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
>>                                 If you're depending on volatile writes 
>> being visible to other CPUs, you're screwed either way, because the CPU can 
>> hold that data in cache as long as it wants before it writes it to memory.  
>> When this finally does happen, it will happen atomically, which is all that 
>> atomic_set guarantees.  If you need to guarantee that the value is written 
>> to memory at a particular time in your execution sequence, you either have 
>> to read it from memory to force the compiler to store it first (and a 
>> volatile cast in atomic_read will suffice for this) or you have to use 
>> LOCK_PREFIX instructions which will invalidate remote cache lines 
>> containing the same variable.  This patch doesn't change either of these 
>> cases.
> 
> The case that it -can- change is interactions with interrupt handlers.
> And NMI/SMI handlers, for that matter.

You have a point here, but only if you can guarantee that the interrupt handler 
is running on a processor sharing the cache that has the not-yet-written 
volatile value.  That implies a strictly non-SMP architecture.  At the moment, 
none of those have volatile in their declaration of atomic_t, so this patch 
can't break any of them.

	-- Chris

^ permalink raw reply

* Re: [PATCH 1/1] af_packet: don't enable timestamps in mmap'ed sockets
From: Evgeniy Polyakov @ 2007-08-09 18:18 UTC (permalink / raw)
  To: Unai Uribarri; +Cc: netdev, linux-kernel
In-Reply-To: <1186683234.24669.65.camel@localhost.localdomain>

On Thu, Aug 09, 2007 at 08:13:54PM +0200, Unai Uribarri (unai.uribarri@optenet.com) wrote:
> On jue, 2007-08-09 at 18:33 +0400, Evgeniy Polyakov wrote:
> > On Thu, Aug 09, 2007 at 04:21:54PM +0200, Unai Uribarri (unai.uribarri@optenet.com) wrote:
> > > The attached patch removes the automatic timestamp activation, that
> > > only mmap'ed AF_PACKET sockets perform. I known it can break user
> > > applications, but I believe that it's the correct solution.
> > 
> > How tcpdump with mmap libpcap will work with it?
> 
> In Linux, you can enable timestamps on any socket executing:
> 
> int val = 1;
> setsockopt(sock, SOL_SOCKET, SO_TIMESTAMP, &val, sizeof(val));
> 
> PD: Current release of tcpdump doesn't mmap the reception ring and
> timestamps packets at user space with gettimeofday. It isn't the best
> performing alternative, but it's portable.
 
IIRC, there was/is a libpcap which worked with mapped sockets, mybe not 
official though. Any application which depened on having timestamps with
packets will not work now. Why not to implement an
absolutely_turn_off_timestamps option instead of breaking compatibility?

-- 
	Evgeniy Polyakov

^ permalink raw reply

* Re: [PATCH RFC]: napi_struct V5
From: Shirley Ma @ 2007-08-09 18:16 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Miller, hadi, jgarzik, netdev, netdev-owner, rusty,
	shemminger
In-Reply-To: <adawsw4mwcw.fsf@cisco.com>

Hello Roland,

> Shirley, I think it would still be useful to run benchmarks of IPoIB
> on ehca with Dave's NAPI patches, both V5 that changed the "missed
> event" behavior and V6 that didn't.  At least I'm curious to know how
> much the difference is.
> 
>  - R.

The performance difference was huge before. Most of the time the poll more 
method was only polling one packet again and agin. 
Yes, we will definitely run the performance measurement, but I have 
lowered this task priority since Dave has V6 version. Actually IPoIB is 
not the only one device using rescheduling instead of polling more for 
NAPI. I think it's not a bad idea to do so.

Thanks
Shirley 

^ permalink raw reply

* Re: [PATCH] make atomic_t volatile on all architectures
From: Martin Schwidefsky @ 2007-08-09 18:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chuck Ebbert, Chris Snook, akpm, ak, heiko.carstens, davem,
	linux-kernel, netdev, wensong, horms, wjiang, cfriesen, zlynx
In-Reply-To: <alpine.LFD.0.999.0708091055080.25146@woody.linux-foundation.org>

On Thu, 2007-08-09 at 10:55 -0700, Linus Torvalds wrote:
> > You can use this forget() macro to make the compiler reread a variable:
> > 
> > #define forget(var) asm volatile ("" : "=m"(var))
> 
> No. That will also make the compiler "forget" any previous writes to it, 
> so it changes behaviour.
> 
> You'd have to use "+m".

Yes, though I would use "=m" on the output list and "m" on the input
list. The reason is that I've seen gcc fall on its face with an ICE on
s390 due to "+m". The explanation I've got from our compiler people was
quite esoteric, as far as I remember gcc splits "+m" to an input operand
and an output operand. Now it can happen that the compiler chooses two
different registers to access the same memory location. "+m" requires
that the two memory references are identical which causes the ICE if
they are not. I do not know if the current compilers still do this. Has
anyone else seen this happen ?

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply

* Re: [PATCH 1/1] af_packet: don't enable timestamps in mmap'ed sockets
From: Unai Uribarri @ 2007-08-09 18:13 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, linux-kernel
In-Reply-To: <20070809143322.GA5345@2ka.mipt.ru>

On jue, 2007-08-09 at 18:33 +0400, Evgeniy Polyakov wrote:
> On Thu, Aug 09, 2007 at 04:21:54PM +0200, Unai Uribarri (unai.uribarri@optenet.com) wrote:
> > The attached patch removes the automatic timestamp activation, that
> > only mmap'ed AF_PACKET sockets perform. I known it can break user
> > applications, but I believe that it's the correct solution.
> 
> How tcpdump with mmap libpcap will work with it?

In Linux, you can enable timestamps on any socket executing:

int val = 1;
setsockopt(sock, SOL_SOCKET, SO_TIMESTAMP, &val, sizeof(val));

PD: Current release of tcpdump doesn't mmap the reception ring and
timestamps packets at user space with gettimeofday. It isn't the best
performing alternative, but it's portable.




^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox