Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney @ 2007-08-17 19:49 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Chris Friesen, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <1187376873.2615.2.camel@laptopd505.fenrus.org>

On Fri, Aug 17, 2007 at 11:54:33AM -0700, Arjan van de Ven wrote:
> 
> On Fri, 2007-08-17 at 12:50 -0600, Chris Friesen wrote:
> > Linus Torvalds wrote:
> > 
> > >  - in other words, the *only* possible meaning for "volatile" is a purely 
> > >    single-CPU meaning. And if you only have a single CPU involved in the 
> > >    process, the "volatile" is by definition pointless (because even 
> > >    without a volatile, the compiler is required to make the C code appear 
> > >    consistent as far as a single CPU is concerned).
> > 
> > I assume you mean "except for IO-related code and 'random' values like 
> > jiffies" as you mention later on?  I assume other values set in 
> > interrupt handlers would count as "random" from a volatility perspective?
> > 
> > > So anybody who argues for "volatile" fixing bugs is fundamentally 
> > > incorrect. It does NO SUCH THING. By arguing that, such people only show 
> > > that you have no idea what they are talking about.
> > 
> > What about reading values modified in interrupt handlers, as in your 
> > "random" case?  Or is this a bug where the user of atomic_read() is 
> > invalidly expecting a read each time it is called?
> 
> the interrupt handler case is an SMP case since you do not know
> beforehand what cpu your interrupt handler will run on.

With the exception of per-CPU variables, yes.

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Linus Torvalds @ 2007-08-17 19:08 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <46C5EDF9.3090507@nortel.com>



On Fri, 17 Aug 2007, Chris Friesen wrote:
> 
> I assume you mean "except for IO-related code and 'random' values like
> jiffies" as you mention later on?

Yes. There *are* valid uses for "volatile", but they have remained the 
same for the last few years:
 - "jiffies"
 - internal per-architecture IO implementations that can do them as normal 
   stores.

> I assume other values set in interrupt handlers would count as "random" 
> from a volatility perspective?

I don't really see any valid case. I can imagine that you have your own 
"jiffy" counter in a driver, but what's the point, really? I'd suggest not 
using volatile, and using barriers instead.

> 
> > So anybody who argues for "volatile" fixing bugs is fundamentally 
> > incorrect. It does NO SUCH THING. By arguing that, such people only 
> > show that you have no idea what they are talking about.

> What about reading values modified in interrupt handlers, as in your 
> "random" case?  Or is this a bug where the user of atomic_read() is 
> invalidly expecting a read each time it is called?

Quite frankly, the biggest reason for using "volatile" on jiffies was 
really historic. So even the "random" case is not really a very strong 
one. You'll notice that anybody who is actually careful will be using 
sequence locks for the jiffy accesses, if only because the *full* jiffy 
count is actually a 64-bit value, and so you cannot get it atomically on a 
32-bit architecture even on a single CPU (ie a timer interrupt might 
happen in between reading the low and the high word, so "volatile" is only 
used for the low 32 bits).

So even for jiffies, we actually have:

	extern u64 __jiffy_data jiffies_64;
	extern unsigned long volatile __jiffy_data jiffies;

where the *real* jiffies is not volatile: the volatile one is using linker 
tricks to alias the low 32 bits:

 - arch/i386/kernel/vmlinux.lds.S:

	...
	jiffies = jiffies_64;
	...

and the only reason we do all these games is (a) it works and (b) it's 
legacy.

Note how I do *not* say "(c) it's a good idea".

			Linus


^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Arjan van de Ven @ 2007-08-17 18:54 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <46C5EDF9.3090507@nortel.com>


On Fri, 2007-08-17 at 12:50 -0600, Chris Friesen wrote:
> Linus Torvalds wrote:
> 
> >  - in other words, the *only* possible meaning for "volatile" is a purely 
> >    single-CPU meaning. And if you only have a single CPU involved in the 
> >    process, the "volatile" is by definition pointless (because even 
> >    without a volatile, the compiler is required to make the C code appear 
> >    consistent as far as a single CPU is concerned).
> 
> I assume you mean "except for IO-related code and 'random' values like 
> jiffies" as you mention later on?  I assume other values set in 
> interrupt handlers would count as "random" from a volatility perspective?
> 
> > So anybody who argues for "volatile" fixing bugs is fundamentally 
> > incorrect. It does NO SUCH THING. By arguing that, such people only show 
> > that you have no idea what they are talking about.
> 
> What about reading values modified in interrupt handlers, as in your 
> "random" case?  Or is this a bug where the user of atomic_read() is 
> invalidly expecting a read each time it is called?

the interrupt handler case is an SMP case since you do not know
beforehand what cpu your interrupt handler will run on.




^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney @ 2007-08-17 18:56 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <alpine.LFD.0.999.0708172040370.3666@enigma.security.iitk.ac.in>

On Sat, Aug 18, 2007 at 12:01:38AM +0530, Satyam Sharma wrote:
> 
> 
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> 
> > On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> > > 
> > > On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> > > 
> > > > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > > > 
> > > > > First of all, I think this illustrates that what you want
> > > > > here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> > > > > macro occurs a lot more times in your patch than atomic
> > > > > reads/sets.  So *assuming* that it was necessary at all,
> > > > > then having an ordered variant of the atomic_read/atomic_set
> > > > > ops could do just as well.
> > > > 
> > > > Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
> > > > to maintain ordering, then I could just use them instead of having to
> > > > create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
> > > > different patch.)
> > > 
> > > +#define WHATEVER(x)	(*(volatile typeof(x) *)&(x))
> > > [...]
> > > Also, this gives *zero* "re-ordering" guarantees that your code wants
> > > as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> > > probably you don't care about) *nor* w.r.t. compiler re-ordering
> > > (which you definitely _do_ care about).
> > 
> > You are correct about CPU re-ordering (and about the fact that this
> > example doesn't care about it), but not about compiler re-ordering.
> > 
> > The compiler is prohibited from moving a volatile access across a sequence
> > point.  One example of a sequence point is a statement boundary.  Because
> > all of the volatile accesses in this code are separated by statement
> > boundaries, a conforming compiler is prohibited from reordering them.
> 
> Yes, you're right, and I believe precisely this was discussed elsewhere
> as well today.
> 
> But I'd call attention to what Herbert mentioned there. You're using
> ORDERED_WRT_IRQ() on stuff that is _not_ defined to be an atomic_t at all:
> 
> * Member "completed" of struct rcu_ctrlblk is a long.
> * Per-cpu variable rcu_flipctr is an array of ints.
> * Members "rcu_read_lock_nesting" and "rcu_flipctr_idx" of
>   struct task_struct are ints.
> 
> So are you saying you're "having to use" this volatile-access macro
> because you *couldn't* declare all the above as atomic_t and thus just
> expect the right thing to happen by using the atomic ops API by default,
> because it lacks volatile access semantics (on x86)?
> 
> If so, then I wonder if using the volatile access cast is really the
> best way to achieve (at least in terms of code clarity) the kind of
> re-ordering guarantees it wants there. (there could be alternative
> solutions, such as using barrier(), or that at bottom of this mail)
> 
> What I mean is this: If you switch to atomic_t, and x86 switched to
> make atomic_t have "volatile" semantics by default, the statements
> would be simply a string of: atomic_inc(), atomic_add(), atomic_set(),
> and atomic_read() statements, and nothing in there that clearly makes
> it *explicit* that the code is correct (and not buggy) simply because
> of the re-ordering guarantees that the C "volatile" type-qualifier
> keyword gives us as per the standard. But now we're firmly in
> "subjective" territory, so you or anybody could legitimately disagree.

In any case, given Linus's note, it appears that atomic_read() and
atomic_set() won't consistently have volatile semantics, at least
not while the compiler generates such ugly code for volatile accesses.
So I will continue with my current approach.

In any case, I will not be using atomic_inc() or atomic_add() in this
code, as doing so would more than double the overhead, even on machines
that are the most efficient at implementing atomic operations.

> > > > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > > > atomic_read() and atomic_set().  Starting with __rcu_read_lock():
> > > > 
> > > > o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > > > 	was ordered by the compiler after
> > > > 	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > > > 	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > > > 	before the rcu_flipctr.
> > > > 
> > > > 	Then if there was an rcu_read_lock() in the SMI/NMI
> > > > 	handler (which is perfectly legal), the nested rcu_read_lock()
> > > > 	would believe that it could take the then-clause of the
> > > > 	enclosing "if" statement.  But because the rcu_flipctr per-CPU
> > > > 	variable had not yet been incremented, an RCU updater would
> > > > 	be within its rights to assume that there were no RCU reads
> > > > 	in progress, thus possibly yanking a data structure out from
> > > > 	under the reader in the SMI/NMI function.
> > > > 
> > > > 	Fatal outcome.  Note that only one CPU is involved here
> > > > 	because these are all either per-CPU or per-task variables.
> > > 
> > > Ok, so you don't care about CPU re-ordering. Still, I should let you know
> > > that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> > > want is a full compiler optimization barrier().
> > 
> > No.  See above.
> 
> True, *(volatile foo *)& _will_ work for this case.
> 
> But multiple calls to barrier() (granted, would invalidate all other
> optimizations also) would work as well, would it not?

They work, but are a bit slower.  So they do work, but not as well.

> [ Interestingly, if you declared all those objects mentioned earlier as
>   atomic_t, and x86(-64) switched to an __asm__ __volatile__ based variant
>   for atomic_{read,set}_volatile(), the bugs you want to avoid would still
>   be there. "volatile" the C language type-qualifier does have compiler
>   re-ordering semantics you mentioned earlier, but the "volatile" that
>   applies to inline asm()s gives no re-ordering guarantees. ]

Well, that certainly would be a point in favor of "volatile" over inline
asms.  ;-)

> > > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > > > 	was ordered by the compiler to follow the
> > > > 	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > > > 	happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > > > 	would incorrectly take the "else" clause of the enclosing "if"
> > > > 	statement.  If some other CPU flipped the rcu_ctrlblk.completed
> > > > 	in the meantime, then the __rcu_read_lock() would (correctly)
> > > > 	write the new value into rcu_flipctr_idx.
> > > > 
> > > > 	Well and good so far.  But the problem arises in
> > > > 	__rcu_read_unlock(), which then decrements the wrong counter.
> > > > 	Depending on exactly how subsequent events played out, this could
> > > > 	result in either prematurely ending grace periods or never-ending
> > > > 	grace periods, both of which are fatal outcomes.
> > > > 
> > > > And the following are not needed in the current version of the
> > > > patch, but will be in a future version that either avoids disabling
> > > > irqs or that dispenses with the smp_read_barrier_depends() that I
> > > > have 99% convinced myself is unneeded:
> > > > 
> > > > o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > > > 
> > > > o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > > > 
> > > > Furthermore, in that future version, irq handlers can cause the same
> > > > mischief that SMI/NMI handlers can in this version.
> 
> So don't remove the local_irq_save/restore, which is well-established and
> well-understood for such cases (it doesn't help you with SMI/NMI,
> admittedly). This isn't really about RCU or per-cpu vars as such, it's
> just about racy code where you don't want to get hit by a concurrent
> interrupt (it does turn out that doing things in a _particular order_ will
> not cause fatal/buggy behaviour, but it's still a race issue, after all).

The local_irq_save/restore are something like 30% of the overhead of
these two functions, so will be looking hard at getting rid of them.
Doing so allows the scheduling-clock interrupt to get into the mix,
and also allows preemption.  The goal would be to find some trick that
suppresses preemption, fends off the grace-period-computation code
invoked from the the scheduling-clock interrupt, and otherwise keeps
things on an even keel.

> > > > Next, looking at __rcu_read_unlock():
> > > > 
> > > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > > > 	was reordered by the compiler to follow the
> > > > 	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > > > 	then if an NMI/SMI containing an rcu_read_lock() occurs between
> > > > 	the two, this nested rcu_read_lock() would incorrectly believe
> > > > 	that it was protected by an enclosing RCU read-side critical
> > > > 	section as described in the first reversal discussed for
> > > > 	__rcu_read_lock() above.  Again, fatal outcome.
> > > > 
> > > > This is what we have now.  It is not hard to imagine situations that
> > > > interact with -both- interrupt handlers -and- other CPUs, as described
> > > > earlier.
> 
> Unless somebody's going for a lockless implementation, such situations
> normally use spin_lock_irqsave() based locking (or local_irq_save for
> those who care only for current CPU) -- problem with the patch in question,
> is that you want to prevent races with concurrent SMI/NMIs as well, which
> is not something that a lot of code needs to consider.

Or that needs to resolve similar races with IRQs without disabling them.
One reason to avoid disabling IRQs is to avoid degrading scheduling
latency.  In any case, I do agree that the amount of code that must
worry about this is quite small at the moment.  I believe that it
will become more common, but would imagine that this belief might not
be universal.  Yet, anyway.  ;-)

> [ Curiously, another thread is discussing something similar also:
>   http://lkml.org/lkml/2007/8/15/393 "RFC: do get_rtc_time() correctly" ]
> 
> Anyway, I didn't look at the code in that patch very much in detail, but
> why couldn't you implement some kind of synchronization variable that lets
> rcu_read_lock() or rcu_read_unlock() -- when being called from inside an
> NMI or SMI handler -- know that it has concurrently interrupted an ongoing
> rcu_read_{un}lock() and so must do things differently ... (?)

Given some low-level details of the current implementation, I could
imagine manipulating rcu_read_lock_nesting on entry to and exit from
all NMI/SMI handlers, but would like to avoid that kind of architecture
dependency.  I am not confident of locating all of them, for one thing...

> I'm also wondering if there's other code that's not using locking in the
> kernel that faces similar issues, and what they've done to deal with it
> (if anything). Such bugs would be subtle, and difficult to diagnose.

Agreed!

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH] lockdep: annotate rcu_read_{,un}lock()
From: Paul E. McKenney @ 2007-08-17 18:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, herbert, 123.oleg, netdev, linux-kernel,
	David Miller, Daniel Walker, josht, minyard
In-Reply-To: <20070817155357.GD8464@linux.vnet.ibm.com>

On Fri, Aug 17, 2007 at 08:53:57AM -0700, Paul E. McKenney wrote:
> On Fri, Aug 17, 2007 at 09:56:45AM +0200, Peter Zijlstra wrote:
> > On Thu, 2007-08-16 at 09:01 -0700, Paul E. McKenney wrote:
> > > On Thu, Aug 16, 2007 at 04:25:07PM +0200, Peter Zijlstra wrote:
> > > > 
> > > > There seem to be some unbalanced rcu_read_{,un}lock() issues of late,
> > > > how about doing something like this:
> > > 
> > > This will break when rcu_read_lock() and rcu_read_unlock() are invoked
> > > from NMI/SMI handlers -- the raw_local_irq_save() in lock_acquire() will
> > > not mask NMIs or SMIs.
> > > 
> > > One approach would be to check for being in an NMI/SMI handler, and
> > > to avoid calling lock_acquire() and lock_release() in those cases.
> > 
> > It seems:
> > 
> > #define nmi_enter()		do { lockdep_off(); __irq_enter(); } while (0)
> > #define nmi_exit()		do { __irq_exit(); lockdep_on(); } while (0)
> > 
> > Should make it all work out just fine. (for NMIs at least, /me fully
> > ignorant of the workings of SMIs)
> 
> Very good point, at least for NMIs on i386 and x86_64.  Can't say that I
> know much about SMIs myself.  Or about whatever equivalents to NMIs and
> SMIs might exist on other platforms.  :-/  Of course, the other platforms
> could be handled by making the RCU lockdep operate only on i386 and x86_64
> if required.
> 
> Corey, any advice on SMI handlers?  Is there something like nmi_enter()
> and nmi_exit() that allows disabing lockdep?
> 
> > > Another approach would be to use sparse, which has checks for
> > > rcu_read_lock()/rcu_read_unlock() nesting.
> > 
> > Yeah, but one more method can never hurt, no? :-)
> 
> Excellent point!
> 
> I guess the next thing for me is to do a performance check.  Looks like
> CONFIG_DEBUG_LOCK_ALLOC is not on by default, so not violently worried
> about performance, but would be good to know.

4-CPU 1.8GHz Opteron 844 running 2.6.22 CONFIG_PREEMPT_NONE with Peter's
patch running rcu_read_lock()/rcu_read_unlock() in a tight loop:

1 CPU:  91.7ns
2 CPU: 335.2ns
3 CPU: 534.3ns  (But bimodal: two CPUs at 640.7ns, one at 321.6ns)
4 CPU: 784.0ns  (again bimodal: three CPUs at 895.9ns, one at 448.2ns)

Running without Peter's patch measures the loop overhead of about 1.2ns.

So not crippling, but definitely a debug-only option that is not enabled
by default in production.

In summary, seems like an eminently reasonable approach to me, assuming
that the NMI/SMI issues can be resolved across the platforms that have
them.  Good stuff, Peter!!!

						Thanx, Paul

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Chris Friesen @ 2007-08-17 18:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
	Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
	Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
	Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
	wensong, horms, wjiang, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <alpine.LFD.0.999.0708170929580.30176@woody.linux-foundation.org>

Linus Torvalds wrote:

>  - in other words, the *only* possible meaning for "volatile" is a purely 
>    single-CPU meaning. And if you only have a single CPU involved in the 
>    process, the "volatile" is by definition pointless (because even 
>    without a volatile, the compiler is required to make the C code appear 
>    consistent as far as a single CPU is concerned).

I assume you mean "except for IO-related code and 'random' values like 
jiffies" as you mention later on?  I assume other values set in 
interrupt handlers would count as "random" from a volatility perspective?

> So anybody who argues for "volatile" fixing bugs is fundamentally 
> incorrect. It does NO SUCH THING. By arguing that, such people only show 
> that you have no idea what they are talking about.

What about reading values modified in interrupt handlers, as in your 
"random" case?  Or is this a bug where the user of atomic_read() is 
invalidly expecting a read each time it is called?

Chris

^ permalink raw reply

* Re: [PATCH] lockdep: annotate rcu_read_{,un}lock()
From: Corey Minyard @ 2007-08-17 18:48 UTC (permalink / raw)
  To: paulmck
  Cc: Peter Zijlstra, Ingo Molnar, herbert, 123.oleg, netdev,
	linux-kernel, David Miller, Daniel Walker, josht
In-Reply-To: <20070817155357.GD8464@linux.vnet.ibm.com>

Paul E. McKenney wrote:
> On Fri, Aug 17, 2007 at 09:56:45AM +0200, Peter Zijlstra wrote:
>   
>> On Thu, 2007-08-16 at 09:01 -0700, Paul E. McKenney wrote:
>>     
>>> On Thu, Aug 16, 2007 at 04:25:07PM +0200, Peter Zijlstra wrote:
>>>       
>>>> There seem to be some unbalanced rcu_read_{,un}lock() issues of late,
>>>> how about doing something like this:
>>>>         
>>> This will break when rcu_read_lock() and rcu_read_unlock() are invoked
>>> from NMI/SMI handlers -- the raw_local_irq_save() in lock_acquire() will
>>> not mask NMIs or SMIs.
>>>
>>> One approach would be to check for being in an NMI/SMI handler, and
>>> to avoid calling lock_acquire() and lock_release() in those cases.
>>>       
>> It seems:
>>
>> #define nmi_enter()		do { lockdep_off(); __irq_enter(); } while (0)
>> #define nmi_exit()		do { __irq_exit(); lockdep_on(); } while (0)
>>
>> Should make it all work out just fine. (for NMIs at least, /me fully
>> ignorant of the workings of SMIs)
>>     
>
> Very good point, at least for NMIs on i386 and x86_64.  Can't say that I
> know much about SMIs myself.  Or about whatever equivalents to NMIs and
> SMIs might exist on other platforms.  :-/  Of course, the other platforms
> could be handled by making the RCU lockdep operate only on i386 and x86_64
> if required.
>
> Corey, any advice on SMI handlers?  Is there something like nmi_enter()
> and nmi_exit() that allows disabing lockdep?
>   
You will certainly need something like nmi_enter() and nmi_exit() for 
SMIs, since they can occur at any time like NMIs.  As far as anything 
else, you just have to be extremely careful and remember that it can 
occur anyplace.  But you already know that :).

It would be nice if the PowerPC board vendors would tie watchdog 
pretimeouts and some type of timer into the SMI input.  It would make 
debugging certain problems much easier.  And all those Marvell bridge 
chips have a watchdog pretimeout and I haven't seen any board vendor 
wire it up :(.

-corey


^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma @ 2007-08-17 18:38 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
	Stefan Richter, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang
In-Reply-To: <c298f0e3f9e49c9dd2d326115d156084@kernel.crashing.org>



On Fri, 17 Aug 2007, Segher Boessenkool wrote:

> > > atomic_dec() already has volatile behavior everywhere, so this is
> > > semantically
> > > okay, but this code (and any like it) should be calling cpu_relax() each
> > > iteration through the loop, unless there's a compelling reason not to.
> > > I'll
> > > allow that for some hardware drivers (possibly this one) such a compelling
> > > reason may exist, but hardware-independent core subsystems probably have
> > > no
> > > excuse.
> > 
> > No it does not have any volatile semantics. atomic_dec() can be reordered
> > at will by the compiler within the current basic unit if you do not add a
> > barrier.
> 
> "volatile" has nothing to do with reordering.

If you're talking of "volatile" the type-qualifier keyword, then
http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
otherwise.

> atomic_dec() writes
> to memory, so it _does_ have "volatile semantics", implicitly, as
> long as the compiler cannot optimise the atomic variable away
> completely -- any store counts as a side effect.

I don't think an atomic_dec() implemented as an inline "asm volatile"
or one that uses a "forget" macro would have the same re-ordering
guarantees as an atomic_dec() that uses a volatile access cast.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma @ 2007-08-17 18:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <20070817143101.GA8464@linux.vnet.ibm.com>



On Fri, 17 Aug 2007, Paul E. McKenney wrote:

> On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> > 
> > On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> > 
> > > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > > 
> > > > First of all, I think this illustrates that what you want
> > > > here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> > > > macro occurs a lot more times in your patch than atomic
> > > > reads/sets.  So *assuming* that it was necessary at all,
> > > > then having an ordered variant of the atomic_read/atomic_set
> > > > ops could do just as well.
> > > 
> > > Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
> > > to maintain ordering, then I could just use them instead of having to
> > > create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
> > > different patch.)
> > 
> > +#define WHATEVER(x)	(*(volatile typeof(x) *)&(x))
> > [...]
> > Also, this gives *zero* "re-ordering" guarantees that your code wants
> > as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> > probably you don't care about) *nor* w.r.t. compiler re-ordering
> > (which you definitely _do_ care about).
> 
> You are correct about CPU re-ordering (and about the fact that this
> example doesn't care about it), but not about compiler re-ordering.
> 
> The compiler is prohibited from moving a volatile access across a sequence
> point.  One example of a sequence point is a statement boundary.  Because
> all of the volatile accesses in this code are separated by statement
> boundaries, a conforming compiler is prohibited from reordering them.

Yes, you're right, and I believe precisely this was discussed elsewhere
as well today.

But I'd call attention to what Herbert mentioned there. You're using
ORDERED_WRT_IRQ() on stuff that is _not_ defined to be an atomic_t at all:

* Member "completed" of struct rcu_ctrlblk is a long.
* Per-cpu variable rcu_flipctr is an array of ints.
* Members "rcu_read_lock_nesting" and "rcu_flipctr_idx" of
  struct task_struct are ints.

So are you saying you're "having to use" this volatile-access macro
because you *couldn't* declare all the above as atomic_t and thus just
expect the right thing to happen by using the atomic ops API by default,
because it lacks volatile access semantics (on x86)?

If so, then I wonder if using the volatile access cast is really the
best way to achieve (at least in terms of code clarity) the kind of
re-ordering guarantees it wants there. (there could be alternative
solutions, such as using barrier(), or that at bottom of this mail)

What I mean is this: If you switch to atomic_t, and x86 switched to
make atomic_t have "volatile" semantics by default, the statements
would be simply a string of: atomic_inc(), atomic_add(), atomic_set(),
and atomic_read() statements, and nothing in there that clearly makes
it *explicit* that the code is correct (and not buggy) simply because
of the re-ordering guarantees that the C "volatile" type-qualifier
keyword gives us as per the standard. But now we're firmly in
"subjective" territory, so you or anybody could legitimately disagree.


> > > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > > atomic_read() and atomic_set().  Starting with __rcu_read_lock():
> > > 
> > > o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > > 	was ordered by the compiler after
> > > 	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > > 	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > > 	before the rcu_flipctr.
> > > 
> > > 	Then if there was an rcu_read_lock() in the SMI/NMI
> > > 	handler (which is perfectly legal), the nested rcu_read_lock()
> > > 	would believe that it could take the then-clause of the
> > > 	enclosing "if" statement.  But because the rcu_flipctr per-CPU
> > > 	variable had not yet been incremented, an RCU updater would
> > > 	be within its rights to assume that there were no RCU reads
> > > 	in progress, thus possibly yanking a data structure out from
> > > 	under the reader in the SMI/NMI function.
> > > 
> > > 	Fatal outcome.  Note that only one CPU is involved here
> > > 	because these are all either per-CPU or per-task variables.
> > 
> > Ok, so you don't care about CPU re-ordering. Still, I should let you know
> > that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> > want is a full compiler optimization barrier().
> 
> No.  See above.

True, *(volatile foo *)& _will_ work for this case.

But multiple calls to barrier() (granted, would invalidate all other
optimizations also) would work as well, would it not?

[ Interestingly, if you declared all those objects mentioned earlier as
  atomic_t, and x86(-64) switched to an __asm__ __volatile__ based variant
  for atomic_{read,set}_volatile(), the bugs you want to avoid would still
  be there. "volatile" the C language type-qualifier does have compiler
  re-ordering semantics you mentioned earlier, but the "volatile" that
  applies to inline asm()s gives no re-ordering guarantees. ]


> > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > > 	was ordered by the compiler to follow the
> > > 	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > > 	happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > > 	would incorrectly take the "else" clause of the enclosing "if"
> > > 	statement.  If some other CPU flipped the rcu_ctrlblk.completed
> > > 	in the meantime, then the __rcu_read_lock() would (correctly)
> > > 	write the new value into rcu_flipctr_idx.
> > > 
> > > 	Well and good so far.  But the problem arises in
> > > 	__rcu_read_unlock(), which then decrements the wrong counter.
> > > 	Depending on exactly how subsequent events played out, this could
> > > 	result in either prematurely ending grace periods or never-ending
> > > 	grace periods, both of which are fatal outcomes.
> > > 
> > > And the following are not needed in the current version of the
> > > patch, but will be in a future version that either avoids disabling
> > > irqs or that dispenses with the smp_read_barrier_depends() that I
> > > have 99% convinced myself is unneeded:
> > > 
> > > o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > > 
> > > o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > > 
> > > Furthermore, in that future version, irq handlers can cause the same
> > > mischief that SMI/NMI handlers can in this version.

So don't remove the local_irq_save/restore, which is well-established and
well-understood for such cases (it doesn't help you with SMI/NMI,
admittedly). This isn't really about RCU or per-cpu vars as such, it's
just about racy code where you don't want to get hit by a concurrent
interrupt (it does turn out that doing things in a _particular order_ will
not cause fatal/buggy behaviour, but it's still a race issue, after all).


> > > Next, looking at __rcu_read_unlock():
> > > 
> > > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > > 	was reordered by the compiler to follow the
> > > 	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > > 	then if an NMI/SMI containing an rcu_read_lock() occurs between
> > > 	the two, this nested rcu_read_lock() would incorrectly believe
> > > 	that it was protected by an enclosing RCU read-side critical
> > > 	section as described in the first reversal discussed for
> > > 	__rcu_read_lock() above.  Again, fatal outcome.
> > > 
> > > This is what we have now.  It is not hard to imagine situations that
> > > interact with -both- interrupt handlers -and- other CPUs, as described
> > > earlier.

Unless somebody's going for a lockless implementation, such situations
normally use spin_lock_irqsave() based locking (or local_irq_save for
those who care only for current CPU) -- problem with the patch in question,
is that you want to prevent races with concurrent SMI/NMIs as well, which
is not something that a lot of code needs to consider.

[ Curiously, another thread is discussing something similar also:
  http://lkml.org/lkml/2007/8/15/393 "RFC: do get_rtc_time() correctly" ]

Anyway, I didn't look at the code in that patch very much in detail, but
why couldn't you implement some kind of synchronization variable that lets
rcu_read_lock() or rcu_read_unlock() -- when being called from inside an
NMI or SMI handler -- know that it has concurrently interrupted an ongoing
rcu_read_{un}lock() and so must do things differently ... (?)

I'm also wondering if there's other code that's not using locking in the
kernel that faces similar issues, and what they've done to deal with it
(if anything). Such bugs would be subtle, and difficult to diagnose.


Satyam

^ permalink raw reply

* [PATCH 6/7 v2] fs_enet: Be an of_platform device when CONFIG_PPC_CPM_NEW_BINDING is set.
From: Scott Wood @ 2007-08-17 18:17 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev

The existing OF glue code was crufty and broken.  Rather than fix it, it
will be removed, and the ethernet driver now talks to the device tree
directly.

The old, non-CONFIG_PPC_CPM_NEW_BINDING code can go away once CPM
platforms are dropped from arch/ppc (which will hopefully be soon), and
existing arch/powerpc boards that I wasn't able to test on for this
patchset get converted (which should be even sooner).

mii-bitbang.c now also uses the generic bitbang code.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
Sorry, the previous version of this patch had bb_ops in the wrong
place, and wouldn't build when not using the new binding.

 drivers/net/fs_enet/Kconfig        |    1 +
 drivers/net/fs_enet/fs_enet-main.c |  286 ++++++++++++++++++++++--
 drivers/net/fs_enet/fs_enet.h      |   55 +-----
 drivers/net/fs_enet/mac-fcc.c      |   89 ++++++--
 drivers/net/fs_enet/mac-fec.c      |   23 ++-
 drivers/net/fs_enet/mac-scc.c      |   55 +++--
 drivers/net/fs_enet/mii-bitbang.c  |  438 ++++++++++++++++-------------------
 drivers/net/fs_enet/mii-fec.c      |  140 ++++++++++++-
 include/linux/fs_enet_pd.h         |    5 +
 9 files changed, 740 insertions(+), 352 deletions(-)

diff --git a/drivers/net/fs_enet/Kconfig b/drivers/net/fs_enet/Kconfig
index e27ee21..2765e49 100644
--- a/drivers/net/fs_enet/Kconfig
+++ b/drivers/net/fs_enet/Kconfig
@@ -11,6 +11,7 @@ config FS_ENET_HAS_SCC
 config FS_ENET_HAS_FCC
 	bool "Chip has an FCC usable for ethernet"
 	depends on FS_ENET && CPM2
+	select MDIO_BITBANG
 	default y
 
 config FS_ENET_HAS_FEC
diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index 0ec30a8..7bf29a2 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -44,12 +44,18 @@
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
+
 #include "fs_enet.h"
 
 /*************************************************/
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static char version[] __devinitdata =
     DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")" "\n";
+#endif
 
 MODULE_AUTHOR("Pantelis Antoniou <panto@intracom.gr>");
 MODULE_DESCRIPTION("Freescale Ethernet Driver");
@@ -953,6 +959,7 @@ static int fs_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 extern int fs_mii_connect(struct net_device *dev);
 extern void fs_mii_disconnect(struct net_device *dev);
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static struct net_device *fs_init_instance(struct device *dev,
 		struct fs_platform_info *fpi)
 {
@@ -1132,6 +1139,7 @@ static int fs_cleanup_instance(struct net_device *ndev)
 
 	return 0;
 }
+#endif
 
 /**************************************************************************************/
 
@@ -1140,35 +1148,279 @@ void *fs_enet_immap = NULL;
 
 static int setup_immap(void)
 {
-	phys_addr_t paddr = 0;
-	unsigned long size = 0;
-
 #ifdef CONFIG_CPM1
-	paddr = IMAP_ADDR;
-	size = 0x10000;	/* map 64K */
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	fs_enet_immap = mpc8xx_immr;
+#else
+	fs_enet_immap = ioremap(IMAP_ADDR, 0x4000);
+	WARN_ON(!fs_enet_immap);
 #endif
-
-#ifdef CONFIG_CPM2
-	paddr = CPM_MAP_ADDR;
-	size = 0x40000;	/* map 256 K */
+#elif defined(CONFIG_CPM2)
+	fs_enet_immap = cpm2_immr;
 #endif
-	fs_enet_immap = ioremap(paddr, size);
-	if (fs_enet_immap == NULL)
-		return -EBADF;	/* XXX ahem; maybe just BUG_ON? */
 
 	return 0;
 }
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static void cleanup_immap(void)
 {
-	if (fs_enet_immap != NULL) {
-		iounmap(fs_enet_immap);
-		fs_enet_immap = NULL;
-	}
+#if defined(CONFIG_CPM1)
+	iounmap(fs_enet_immap);
+#endif
 }
+#endif
 
 /**************************************************************************************/
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+static int __devinit find_phy(struct device_node *np,
+                              struct fs_platform_info *fpi)
+{
+	struct device_node *phynode, *mdionode;
+	struct resource res;
+	int ret = 0, len;
+
+	const u32 *data = of_get_property(np, "phy-handle", &len);
+	if (!data || len != 4)
+		return -EINVAL;
+
+	phynode = of_find_node_by_phandle(*data);
+	if (!phynode)
+		return -EINVAL;
+
+	mdionode = of_get_parent(phynode);
+	if (!phynode)
+		goto out_put_phy;
+
+	ret = of_address_to_resource(mdionode, 0, &res);
+	if (ret)
+		goto out_put_mdio;
+
+	data = of_get_property(phynode, "reg", &len);
+	if (!data || len != 4)
+		goto out_put_mdio;
+
+	snprintf(fpi->bus_id, 16, PHY_ID_FMT, res.start, *data);
+
+out_put_mdio:
+	of_node_put(mdionode);
+out_put_phy:
+	of_node_put(phynode);
+	return ret;
+}
+
+#ifdef CONFIG_FS_ENET_HAS_FEC
+#define IS_FEC(match) ((match)->data == &fs_fec_ops)
+#else
+#define IS_FEC(match) 0
+#endif
+
+static int __devinit fs_enet_probe(struct of_device *ofdev,
+                                   const struct of_device_id *match)
+{
+	struct net_device *ndev;
+	struct fs_enet_private *fep;
+	struct fs_platform_info *fpi;
+	const u32 *data;
+	const u8 *mac_addr;
+	int privsize, len, ret = -ENODEV;
+
+	fpi = kzalloc(sizeof(*fpi), GFP_KERNEL);
+	if (!fpi)
+		return -ENOMEM;
+
+	if (!IS_FEC(match)) {
+		data = of_get_property(ofdev->node, "fsl,cpm-command", &len);
+		if (!data || len != 4)
+			goto out_free_fpi;
+
+		fpi->cp_command = *data;
+	}
+
+	fpi->rx_ring = 32;
+	fpi->tx_ring = 32;
+	fpi->rx_copybreak = 240;
+	fpi->use_napi = 0;
+	fpi->napi_weight = 17;
+
+	ret = find_phy(ofdev->node, fpi);
+	if (ret)
+		goto out_free_fpi;
+
+	privsize = sizeof(*fep) +
+	           sizeof(struct sk_buff **) *
+	           (fpi->rx_ring + fpi->tx_ring);
+
+	ndev = alloc_etherdev(privsize);
+	if (!ndev) {
+		ret = -ENOMEM;
+		goto out_free_fpi;
+	}
+
+	SET_MODULE_OWNER(ndev);
+	dev_set_drvdata(&ofdev->dev, ndev);
+
+	fep = netdev_priv(ndev);
+	fep->dev = &ofdev->dev;
+	fep->fpi = fpi;
+	fep->ops = match->data;
+
+	ret = fep->ops->setup_data(ndev);
+	if (ret)
+		goto out_free_dev;
+
+	fep->rx_skbuff = (struct sk_buff **)&fep[1];
+	fep->tx_skbuff = fep->rx_skbuff + fpi->rx_ring;
+
+	spin_lock_init(&fep->lock);
+	spin_lock_init(&fep->tx_lock);
+
+	mac_addr = of_get_mac_address(ofdev->node);
+	if (mac_addr)
+		memcpy(ndev->dev_addr, mac_addr, 6);
+
+	ret = fep->ops->allocate_bd(ndev);
+	if (ret)
+		goto out_cleanup_data;
+
+	fep->rx_bd_base = fep->ring_base;
+	fep->tx_bd_base = fep->rx_bd_base + fpi->rx_ring;
+
+	fep->tx_ring = fpi->tx_ring;
+	fep->rx_ring = fpi->rx_ring;
+
+	ndev->open = fs_enet_open;
+	ndev->hard_start_xmit = fs_enet_start_xmit;
+	ndev->tx_timeout = fs_timeout;
+	ndev->watchdog_timeo = 2 * HZ;
+	ndev->stop = fs_enet_close;
+	ndev->get_stats = fs_enet_get_stats;
+	ndev->set_multicast_list = fs_set_multicast_list;
+	if (fpi->use_napi) {
+		ndev->poll = fs_enet_rx_napi;
+		ndev->weight = fpi->napi_weight;
+	}
+	ndev->ethtool_ops = &fs_ethtool_ops;
+	ndev->do_ioctl = fs_ioctl;
+
+	init_timer(&fep->phy_timer_list);
+
+	netif_carrier_off(ndev);
+
+	ret = register_netdev(ndev);
+	if (ret)
+		goto out_free_bd;
+
+	printk(KERN_INFO "%s: fs_enet: %02x:%02x:%02x:%02x:%02x:%02x\n",
+	       ndev->name,
+	       ndev->dev_addr[0], ndev->dev_addr[1], ndev->dev_addr[2],
+	       ndev->dev_addr[3], ndev->dev_addr[4], ndev->dev_addr[5]);
+
+	return 0;
+
+out_free_bd:
+	fep->ops->free_bd(ndev);
+out_cleanup_data:
+	fep->ops->cleanup_data(ndev);
+out_free_dev:
+	free_netdev(ndev);
+	dev_set_drvdata(&ofdev->dev, NULL);
+out_free_fpi:
+	kfree(fpi);
+	return ret;
+}
+
+static int fs_enet_remove(struct of_device *ofdev)
+{
+	struct net_device *ndev = dev_get_drvdata(&ofdev->dev);
+	struct fs_enet_private *fep = netdev_priv(ndev);
+
+	unregister_netdev(ndev);
+
+	fep->ops->free_bd(ndev);
+	fep->ops->cleanup_data(ndev);
+	dev_set_drvdata(fep->dev, NULL);
+
+	free_netdev(ndev);
+	return 0;
+}
+
+static struct of_device_id fs_enet_match[] = {
+#ifdef CONFIG_FS_ENET_HAS_SCC
+	{
+		.compatible = "fsl,cpm1-scc-enet",
+		.data = (void *)&fs_scc_ops,
+	},
+#endif
+#ifdef CONFIG_FS_ENET_HAS_FCC
+	{
+		.compatible = "fsl,cpm2-fcc-enet",
+		.data = (void *)&fs_fcc_ops,
+	},
+#endif
+#ifdef CONFIG_FS_ENET_HAS_FEC
+	{
+		.compatible = "fsl,pq1-fec-enet",
+		.data = (void *)&fs_fec_ops,
+	},
+#endif
+	{}
+};
+
+static struct of_platform_driver fs_enet_driver = {
+	.name	= "fs_enet",
+	.match_table = fs_enet_match,
+	.probe = fs_enet_probe,
+	.remove = fs_enet_remove,
+};
+
+static int __init fs_init(void)
+{
+	int r = setup_immap();
+	if (r != 0)
+		return r;
+
+#ifdef CONFIG_CPM2
+	r = fs_enet_mdio_bb_init();
+	if (r != 0)
+		goto out_mdio_bb;
+#endif
+#ifdef CONFIG_8xx
+	r = fs_enet_mdio_fec_init();
+	if (r != 0)
+		goto out_mdio_fec;
+#endif
+
+	r = of_register_platform_driver(&fs_enet_driver);
+	if (r != 0)
+		goto out_driver;
+
+	return 0;
+
+out_driver:
+#ifdef CONFIG_8xx
+	fs_enet_mdio_fec_exit();
+out_mdio_fec:
+#endif
+#ifdef CONFIG_CPM2
+	fs_enet_mdio_bb_exit();
+out_mdio_bb:
+#endif
+	return r;
+}
+
+static void __exit fs_cleanup(void)
+{
+	of_unregister_platform_driver(&fs_enet_driver);
+#ifdef CONFIG_8xx
+	fs_enet_mdio_fec_exit();
+#endif
+#ifdef CONFIG_CPM2
+	fs_enet_mdio_bb_exit();
+#endif
+}
+#else
 static int __devinit fs_enet_probe(struct device *dev)
 {
 	struct net_device *ndev;
@@ -1282,7 +1534,7 @@ static void __exit fs_cleanup(void)
 	driver_unregister(&fs_enet_scc_driver);
 	cleanup_immap();
 }
-
+#endif
 /**************************************************************************************/
 
 module_init(fs_init);
diff --git a/drivers/net/fs_enet/fs_enet.h b/drivers/net/fs_enet/fs_enet.h
index f8c7ee8..a130332 100644
--- a/drivers/net/fs_enet/fs_enet.h
+++ b/drivers/net/fs_enet/fs_enet.h
@@ -24,19 +24,6 @@ struct fec_info {
 #include <asm/cpm2.h>
 #endif
 
-/* This is used to operate with pins.
-  Note that the actual port size may
-    be different; cpm(s) handle it OK  */
-struct bb_info {
-	u8 mdio_dat_msk;
-	u8 mdio_dir_msk;
-	u8 *mdio_dir;
-	u8 *mdio_dat;
-	u8 mdc_msk;
-	u8 *mdc_dat;
-	int delay;
-};
-
 /* hw driver ops */
 struct fs_ops {
 	int (*setup_data)(struct net_device *dev);
@@ -85,47 +72,11 @@ struct phy_info {
 #define ENET_RX_ALIGN  16
 #define ENET_RX_FRSIZE L1_CACHE_ALIGN(PKT_MAXBUF_SIZE + ENET_RX_ALIGN - 1)
 
-struct fs_enet_mii_bus {
-	struct list_head list;
-	spinlock_t mii_lock;
-	const struct fs_mii_bus_info *bus_info;
-	int refs;
-	u32 usage_map;
-
-	int (*mii_read)(struct fs_enet_mii_bus *bus,
-			int phy_id, int location);
-
-	void (*mii_write)(struct fs_enet_mii_bus *bus,
-			int phy_id, int location, int value);
-
-	union {
-		struct {
-			unsigned int mii_speed;
-			void *fecp;
-		} fec;
-
-		struct {
-			/* note that the actual port size may */
-			/* be different; cpm(s) handle it OK  */
-			u8 mdio_msk;
-			u8 *mdio_dir;
-			u8 *mdio_dat;
-			u8 mdc_msk;
-			u8 *mdc_dir;
-			u8 *mdc_dat;
-		} bitbang;
-
-		struct {
-			u16 lpa;
-		} fixed;
-	};
-};
-
 struct fs_enet_private {
 	struct device *dev;	/* pointer back to the device (must be initialized first) */
 	spinlock_t lock;	/* during all ops except TX pckt processing */
 	spinlock_t tx_lock;	/* during fs_start_xmit and fs_tx         */
-	const struct fs_platform_info *fpi;
+	struct fs_platform_info *fpi;
 	const struct fs_ops *ops;
 	int rx_ring, tx_ring;
 	dma_addr_t ring_mem_addr;
@@ -144,7 +95,6 @@ struct fs_enet_private {
 	u32 msg_enable;
 	struct mii_if_info mii_if;
 	unsigned int last_mii_status;
-	struct fs_enet_mii_bus *mii_bus;
 	int interrupt;
 
 	struct phy_device *phydev;
@@ -187,8 +137,9 @@ struct fs_enet_private {
 
 /***************************************************************************/
 int fs_enet_mdio_bb_init(void);
-int fs_mii_fixed_init(struct fs_enet_mii_bus *bus);
+void fs_enet_mdio_bb_exit(void);
 int fs_enet_mdio_fec_init(void);
+void fs_enet_mdio_fec_exit(void);
 
 void fs_init_bds(struct net_device *dev);
 void fs_cleanup_bds(struct net_device *dev);
diff --git a/drivers/net/fs_enet/mac-fcc.c b/drivers/net/fs_enet/mac-fcc.c
index 8b30361..dc943fd 100644
--- a/drivers/net/fs_enet/mac-fcc.c
+++ b/drivers/net/fs_enet/mac-fcc.c
@@ -42,6 +42,10 @@
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_device.h>
+#endif
+
 #include "fs_enet.h"
 
 /*************************************************/
@@ -74,33 +78,64 @@
 
 #define MAX_CR_CMD_LOOPS	10000
 
-static inline int fcc_cr_cmd(struct fs_enet_private *fep, u32 mcn, u32 op)
+static inline int fcc_cr_cmd(struct fs_enet_private *fep, u32 op)
 {
 	const struct fs_platform_info *fpi = fep->fpi;
 	cpm2_map_t *immap = fs_enet_immap;
 	cpm_cpm2_t *cpmp = &immap->im_cpm;
-	u32 v;
 	int i;
 
-	/* Currently I don't know what feature call will look like. But 
-	   I guess there'd be something like do_cpm_cmd() which will require page & sblock */
-	v = mk_cr_cmd(fpi->cp_page, fpi->cp_block, mcn, op);
-	W32(cpmp, cp_cpcr, v | CPM_CR_FLG);
+	W32(cpmp, cp_cpcr, fpi->cp_command | op | CPM_CR_FLG);
 	for (i = 0; i < MAX_CR_CMD_LOOPS; i++)
 		if ((R32(cpmp, cp_cpcr) & CPM_CR_FLG) == 0)
-			break;
-
-	if (i >= MAX_CR_CMD_LOOPS) {
-		printk(KERN_ERR "%s(): Not able to issue CPM command\n",
-		       __FUNCTION__);
-		return 1;
-	}
+			return 0;
 
-	return 0;
+	printk(KERN_ERR "%s(): Not able to issue CPM command\n",
+	       __FUNCTION__);
+	return 1;
 }
 
 static int do_pd_setup(struct fs_enet_private *fep)
 {
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct of_device *ofdev = to_of_device(fep->dev);
+	struct fs_platform_info *fpi = fep->fpi;
+	int ret = -EINVAL;
+
+	fep->interrupt = of_irq_to_resource(ofdev->node, 0, NULL);
+	if (fep->interrupt == NO_IRQ)
+		goto out;
+
+	fep->fcc.fccp = of_iomap(ofdev->node, 0);
+	if (!fep->fcc.fccp)
+		goto out;
+
+	fep->fcc.ep = of_iomap(ofdev->node, 1);
+	if (!fep->fcc.ep)
+		goto out_fccp;
+
+	fep->fcc.fcccp = of_iomap(ofdev->node, 2);
+	if (!fep->fcc.fcccp)
+		goto out_ep;
+
+	fep->fcc.mem = (void *)cpm_dpalloc(128, 8);
+	fpi->dpram_offset = (u32)cpm2_immr;
+	if (IS_ERR_VALUE(fpi->dpram_offset)) {
+		ret = fpi->dpram_offset;
+		goto out_fcccp;
+	}
+
+	return 0;
+
+out_fcccp:
+	iounmap(fep->fcc.fcccp);
+out_ep:
+	iounmap(fep->fcc.ep);
+out_fccp:
+	iounmap(fep->fcc.fccp);
+out:
+	return ret;
+#else
 	struct platform_device *pdev = to_platform_device(fep->dev);
 	struct resource *r;
 
@@ -138,6 +173,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
 		return -EINVAL;
 
 	return 0;
+#endif
 }
 
 #define FCC_NAPI_RX_EVENT_MSK	(FCC_ENET_RXF | FCC_ENET_RXB)
@@ -148,11 +184,17 @@ static int do_pd_setup(struct fs_enet_private *fep)
 static int setup_data(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	const struct fs_platform_info *fpi = fep->fpi;
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
+	struct fs_platform_info *fpi = fep->fpi;
+
+	fpi->cp_command = (fpi->cp_page << 26) |
+	                  (fpi->cp_block << 21) |
+	                  (12 << 6);
 
 	fep->fcc.idx = fs_get_fcc_index(fpi->fs_no);
 	if ((unsigned int)fep->fcc.idx >= 3)	/* max 3 FCCs */
 		return -EINVAL;
+#endif
 
 	if (do_pd_setup(fep) != 0)
 		return -EINVAL;
@@ -226,7 +268,7 @@ static void set_multicast_one(struct net_device *dev, const u8 *mac)
 	W16(ep, fen_taddrh, taddrh);
 	W16(ep, fen_taddrm, taddrm);
 	W16(ep, fen_taddrl, taddrl);
-	fcc_cr_cmd(fep, 0x0C, CPM_CR_SET_GADDR);
+	fcc_cr_cmd(fep, CPM_CR_SET_GADDR);
 }
 
 static void set_multicast_finish(struct net_device *dev)
@@ -281,7 +323,7 @@ static void restart(struct net_device *dev)
 
 	/* clear everything (slow & steady does it) */
 	for (i = 0; i < sizeof(*ep); i++)
-		out_8((char *)ep + i, 0);
+		out_8((u8 __iomem *)ep + i, 0);
 
 	/* get physical address */
 	rx_bd_base_phys = fep->ring_mem_addr;
@@ -397,7 +439,7 @@ static void restart(struct net_device *dev)
 			S8(fcccp, fcc_gfemr, 0x20);
 	}
 
-	fcc_cr_cmd(fep, 0x0c, CPM_CR_INIT_TRX);
+	fcc_cr_cmd(fep, CPM_CR_INIT_TRX);
 
 	/* clear events */
 	W16(fccp, fcc_fcce, 0xffff);
@@ -515,23 +557,22 @@ int get_regs(struct net_device *dev, void *p, int *sizep)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 
-	if (*sizep < sizeof(fcc_t) + sizeof(fcc_c_t) + sizeof(fcc_enet_t))
+	if (*sizep < sizeof(fcc_t) + sizeof(fcc_enet_t) + 1)
 		return -EINVAL;
 
 	memcpy_fromio(p, fep->fcc.fccp, sizeof(fcc_t));
 	p = (char *)p + sizeof(fcc_t);
 
-	memcpy_fromio(p, fep->fcc.fcccp, sizeof(fcc_c_t));
-	p = (char *)p + sizeof(fcc_c_t);
-
 	memcpy_fromio(p, fep->fcc.ep, sizeof(fcc_enet_t));
+	p = (char *)p + sizeof(fcc_enet_t);
 
+	memcpy_fromio(p, fep->fcc.fcccp, 1);
 	return 0;
 }
 
 int get_regs_len(struct net_device *dev)
 {
-	return sizeof(fcc_t) + sizeof(fcc_c_t) + sizeof(fcc_enet_t);
+	return sizeof(fcc_t) + sizeof(fcc_enet_t) + 1;
 }
 
 /* Some transmit errors cause the transmitter to shut
@@ -551,7 +592,7 @@ void tx_restart(struct net_device *dev)
 	udelay(10);
 	S32(fccp, fcc_gfmr, FCC_GFMR_ENT);
 
-	fcc_cr_cmd(fep, 0x0C, CPM_CR_RESTART_TX);
+	fcc_cr_cmd(fep, CPM_CR_RESTART_TX);
 }
 
 /*************************************************************************/
diff --git a/drivers/net/fs_enet/mac-fec.c b/drivers/net/fs_enet/mac-fec.c
index 04b4f80..d1259b5 100644
--- a/drivers/net/fs_enet/mac-fec.c
+++ b/drivers/net/fs_enet/mac-fec.c
@@ -43,6 +43,10 @@
 #include <asm/commproc.h>
 #endif
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_device.h>
+#endif
+
 #include "fs_enet.h"
 #include "fec.h"
 
@@ -95,6 +99,19 @@ static int whack_reset(fec_t * fecp)
 
 static int do_pd_setup(struct fs_enet_private *fep)
 {
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct of_device *ofdev = to_of_device(fep->dev);
+
+	fep->interrupt = of_irq_to_resource(ofdev->node, 0, NULL);
+	if (fep->interrupt == NO_IRQ)
+		return -EINVAL;
+
+	fep->fec.fecp = of_iomap(ofdev->node, 0);
+	if (!fep->fcc.fccp)
+		return -EINVAL;
+
+	return 0;
+#else
 	struct platform_device *pdev = to_platform_device(fep->dev); 
 	struct resource	*r;
 	
@@ -110,7 +127,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
 		return -EINVAL;
 
 	return 0;
-	
+#endif
 }
 
 #define FEC_NAPI_RX_EVENT_MSK	(FEC_ENET_RXF | FEC_ENET_RXB)
@@ -317,7 +334,7 @@ static void restart(struct net_device *dev)
 	 * Clear any outstanding interrupt.
 	 */
 	FW(fecp, ievent, 0xffc0);
-#ifndef CONFIG_PPC_MERGE
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 	FW(fecp, ivec, (fep->interrupt / 2) << 29);
 #else
 	FW(fecp, ivec, (virq_to_hw(fep->interrupt) / 2) << 29);
@@ -419,7 +436,7 @@ static void stop(struct net_device *dev)
 
 static void pre_request_irq(struct net_device *dev, int irq)
 {
-#ifndef CONFIG_PPC_MERGE
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 	immap_t *immap = fs_enet_immap;
 	u32 siel;
 
diff --git a/drivers/net/fs_enet/mac-scc.c b/drivers/net/fs_enet/mac-scc.c
index 7540966..0cb0b71 100644
--- a/drivers/net/fs_enet/mac-scc.c
+++ b/drivers/net/fs_enet/mac-scc.c
@@ -43,6 +43,10 @@
 #include <asm/commproc.h>
 #endif
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
+
 #include "fs_enet.h"
 
 /*************************************************/
@@ -89,27 +93,38 @@
 
 static inline int scc_cr_cmd(struct fs_enet_private *fep, u32 op)
 {
-	cpm8xx_t *cpmp = &((immap_t *)fs_enet_immap)->im_cpm;
-	u32 v, ch;
-	int i = 0;
+	const struct fs_platform_info *fpi = fep->fpi;
+	int i;
 
-	ch = fep->scc.idx << 2;
-	v = mk_cr_cmd(ch, op);
-	W16(cpmp, cp_cpcr, v | CPM_CR_FLG);
+	W16(cpmp, cp_cpcr, fpi->cp_command | CPM_CR_FLG | (op << 8));
 	for (i = 0; i < MAX_CR_CMD_LOOPS; i++)
 		if ((R16(cpmp, cp_cpcr) & CPM_CR_FLG) == 0)
-			break;
+			return 0;
 
-	if (i >= MAX_CR_CMD_LOOPS) {
-		printk(KERN_ERR "%s(): Not able to issue CPM command\n",
-			__FUNCTION__);
-		return 1;
-	}
-	return 0;
+	printk(KERN_ERR "%s(): Not able to issue CPM command\n",
+		__FUNCTION__);
+	return 1;
 }
 
 static int do_pd_setup(struct fs_enet_private *fep)
 {
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct of_device *ofdev = to_of_device(fep->dev);
+
+	fep->interrupt = of_irq_to_resource(ofdev->node, 0, NULL);
+	if (fep->interrupt == NO_IRQ)
+		return -EINVAL;
+
+	fep->scc.sccp = of_iomap(ofdev->node, 0);
+	if (!fep->scc.sccp)
+		return -EINVAL;
+
+	fep->scc.ep = of_iomap(ofdev->node, 1);
+	if (!fep->scc.ep) {
+		iounmap(fep->scc.sccp);
+		return -EINVAL;
+	}
+#else
 	struct platform_device *pdev = to_platform_device(fep->dev);
 	struct resource *r;
 
@@ -129,6 +144,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
 
 	if (fep->scc.ep == NULL)
 		return -EINVAL;
+#endif
 
 	return 0;
 }
@@ -141,12 +157,17 @@ static int do_pd_setup(struct fs_enet_private *fep)
 static int setup_data(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	const struct fs_platform_info *fpi = fep->fpi;
+
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct fs_platform_info *fpi = fep->fpi;
 
 	fep->scc.idx = fs_get_scc_index(fpi->fs_no);
-	if ((unsigned int)fep->fcc.idx > 4)	/* max 4 SCCs */
+	if ((unsigned int)fep->fcc.idx >= 4) /* max 4 SCCs */
 		return -EINVAL;
 
+	fpi->cp_command = fep->fcc.idx << 6;
+#endif
+
 	do_pd_setup(fep);
 
 	fep->scc.hthi = 0;
@@ -154,7 +175,7 @@ static int setup_data(struct net_device *dev)
 
 	fep->ev_napi_rx = SCC_NAPI_RX_EVENT_MSK;
 	fep->ev_rx = SCC_RX_EVENT;
-	fep->ev_tx = SCC_TX_EVENT;
+	fep->ev_tx = SCC_TX_EVENT | SCCE_ENET_TXE;
 	fep->ev_err = SCC_ERR_EVENT_MSK;
 
 	return 0;
@@ -395,7 +416,7 @@ static void stop(struct net_device *dev)
 
 static void pre_request_irq(struct net_device *dev, int irq)
 {
-#ifndef CONFIG_PPC_MERGE
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 	immap_t *immap = fs_enet_immap;
 	u32 siel;
 
diff --git a/drivers/net/fs_enet/mii-bitbang.c b/drivers/net/fs_enet/mii-bitbang.c
index d384010..9262e31 100644
--- a/drivers/net/fs_enet/mii-bitbang.c
+++ b/drivers/net/fs_enet/mii-bitbang.c
@@ -14,301 +14,267 @@
 
 
 #include <linux/module.h>
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/string.h>
-#include <linux/ptrace.h>
-#include <linux/errno.h>
+#include <linux/init.h>
 #include <linux/ioport.h>
 #include <linux/slab.h>
 #include <linux/interrupt.h>
-#include <linux/init.h>
-#include <linux/delay.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
-#include <linux/skbuff.h>
-#include <linux/spinlock.h>
 #include <linux/mii.h>
-#include <linux/ethtool.h>
-#include <linux/bitops.h>
 #include <linux/platform_device.h>
+#include <linux/mdio-bitbang.h>
 
-#include <asm/pgtable.h>
-#include <asm/irq.h>
-#include <asm/uaccess.h>
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
 
 #include "fs_enet.h"
 
-static int bitbang_prep_bit(u8 **datp, u8 *mskp,
-		struct fs_mii_bit *mii_bit)
-{
-	void *dat;
-	int adv;
-	u8 msk;
-
-	dat = (void*) mii_bit->offset;
-
-	adv = mii_bit->bit >> 3;
-	dat = (char *)dat + adv;
-
-	msk = 1 << (7 - (mii_bit->bit & 7));
-
-	*datp = dat;
-	*mskp = msk;
-
-	return 0;
-}
+struct bb_info {
+	struct mdio_bitbang_ctrl ctrl;
+	__be32 __iomem *dir;
+	__be32 __iomem *dat;
+	u32 mdio_msk;
+	u32 mdc_msk;
+	int delay;
+};
 
-static inline void bb_set(u8 *p, u8 m)
+/* FIXME: If any other users of GPIO crop up, then these will have to
+ * have some sort of global synchronization to avoid races with other
+ * pins on the same port.  The ideal solution would probably be to
+ * bind the ports to a GPIO driver, and have this be a client of it.
+ */
+static inline void bb_set(u32 __iomem *p, u32 m)
 {
-	out_8(p, in_8(p) | m);
+	out_be32(p, in_be32(p) | m);
 }
 
-static inline void bb_clr(u8 *p, u8 m)
+static inline void bb_clr(u32 __iomem *p, u32 m)
 {
-	out_8(p, in_8(p) & ~m);
+	out_be32(p, in_be32(p) & ~m);
 }
 
-static inline int bb_read(u8 *p, u8 m)
+static inline int bb_read(u32 __iomem *p, u32 m)
 {
-	return (in_8(p) & m) != 0;
+	return (in_be32(p) & m) != 0;
 }
 
-static inline void mdio_active(struct bb_info *bitbang)
+static inline void mdio_dir(struct mdio_bitbang_ctrl *ctrl, int dir)
 {
-	bb_set(bitbang->mdio_dir, bitbang->mdio_dir_msk);
-}
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
 
-static inline void mdio_tristate(struct bb_info *bitbang )
-{
-	bb_clr(bitbang->mdio_dir, bitbang->mdio_dir_msk);
+	if (dir)
+		bb_set(bitbang->dir, bitbang->mdio_msk);
+	else
+		bb_clr(bitbang->dir, bitbang->mdio_msk);
 }
 
-static inline int mdio_read(struct bb_info *bitbang )
+static inline int mdio_read(struct mdio_bitbang_ctrl *ctrl)
 {
-	return bb_read(bitbang->mdio_dat, bitbang->mdio_dat_msk);
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
+	return bb_read(bitbang->dat, bitbang->mdio_msk);
 }
 
-static inline void mdio(struct bb_info *bitbang , int what)
+static inline void mdio(struct mdio_bitbang_ctrl *ctrl, int what)
 {
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
+
 	if (what)
-		bb_set(bitbang->mdio_dat, bitbang->mdio_dat_msk);
+		bb_set(bitbang->dat, bitbang->mdio_msk);
 	else
-		bb_clr(bitbang->mdio_dat, bitbang->mdio_dat_msk);
+		bb_clr(bitbang->dat, bitbang->mdio_msk);
 }
 
-static inline void mdc(struct bb_info *bitbang , int what)
+static inline void mdc(struct mdio_bitbang_ctrl *ctrl, int what)
 {
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
+
 	if (what)
-		bb_set(bitbang->mdc_dat, bitbang->mdc_msk);
+		bb_set(bitbang->dat, bitbang->mdc_msk);
 	else
-		bb_clr(bitbang->mdc_dat, bitbang->mdc_msk);
+		bb_clr(bitbang->dat, bitbang->mdc_msk);
 }
 
-static inline void mii_delay(struct bb_info *bitbang )
+static struct mdio_bitbang_ops bb_ops = {
+	.owner = THIS_MODULE,
+	.set_mdc = mdc,
+	.set_mdio_dir = mdio_dir,
+	.set_mdio_data = mdio,
+	.get_mdio_data = mdio_read,
+};
+
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+static int __devinit fs_mii_bitbang_init(struct mii_bus *bus,
+                                         struct device_node *np)
 {
-	udelay(bitbang->delay);
+	struct resource res;
+	const u32 *data;
+	int mdio_pin, mdc_pin, len;
+	struct bb_info *bitbang = bus->priv;
+
+	int ret = of_address_to_resource(np, 0, &res);
+	if (ret)
+		return ret;
+
+	if (res.end - res.start < 13)
+		return -ENODEV;
+
+	/* This should really encode the pin number as well, but all
+	 * we get is an int, and the odds of multiple bitbang mdio buses
+	 * is low enough that it's not worth going too crazy.
+	 */
+	bus->id = res.start;
+
+	data = of_get_property(np, "fsl,mdio-pin", &len);
+	if (!data || len != 4)
+		return -ENODEV;
+	mdio_pin = *data;
+
+	data = of_get_property(np, "fsl,mdc-pin", &len);
+	if (!data || len != 4)
+		return -ENODEV;
+	mdc_pin = *data;
+
+	bitbang->dir = ioremap(res.start, res.end - res.start + 1);
+	if (!bitbang->dir)
+		return -ENOMEM;
+
+	bitbang->dat = bitbang->dir + 4;
+	bitbang->mdio_msk = 1 << (31 - mdio_pin);
+	bitbang->mdc_msk = 1 << (31 - mdc_pin);
+	bitbang->delay = 1; /* 1 us between operations */
+
+	return 0;
 }
 
-/* Utility to send the preamble, address, and register (common to read and write). */
-static void bitbang_pre(struct bb_info *bitbang , int read, u8 addr, u8 reg)
+static void __devinit add_phy(struct mii_bus *bus, struct device_node *np)
 {
-	int j;
-
-	/*
-	 * Send a 32 bit preamble ('1's) with an extra '1' bit for good measure.
-	 * The IEEE spec says this is a PHY optional requirement.  The AMD
-	 * 79C874 requires one after power up and one after a MII communications
-	 * error.  This means that we are doing more preambles than we need,
-	 * but it is safer and will be much more robust.
-	 */
+	const u32 *data;
+	int len, id, irq;
 
-	mdio_active(bitbang);
-	mdio(bitbang, 1);
-	for (j = 0; j < 32; j++) {
-		mdc(bitbang, 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-	}
+	data = of_get_property(np, "reg", &len);
+	if (!data || len != 4)
+		return;
 
-	/* send the start bit (01) and the read opcode (10) or write (10) */
-	mdc(bitbang, 0);
-	mdio(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, read);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, !read);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-
-	/* send the PHY address */
-	for (j = 0; j < 5; j++) {
-		mdc(bitbang, 0);
-		mdio(bitbang, (addr & 0x10) != 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		addr <<= 1;
-	}
+	id = *data;
+	bus->phy_mask &= ~(1 << id);
 
-	/* send the register address */
-	for (j = 0; j < 5; j++) {
-		mdc(bitbang, 0);
-		mdio(bitbang, (reg & 0x10) != 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		reg <<= 1;
-	}
+	irq = of_irq_to_resource(np, 0, NULL);
+	if (irq != NO_IRQ)
+		bus->irq[id] = irq;
 }
 
-static int fs_enet_mii_bb_read(struct mii_bus *bus , int phy_id, int location)
+static int __devinit fs_enet_mdio_probe(struct of_device *ofdev,
+                                        const struct of_device_id *match)
 {
-	u16 rdreg;
-	int ret, j;
-	u8 addr = phy_id & 0xff;
-	u8 reg = location & 0xff;
-	struct bb_info* bitbang = bus->priv;
-
-	bitbang_pre(bitbang, 1, addr, reg);
-
-	/* tri-state our MDIO I/O pin so we can read */
-	mdc(bitbang, 0);
-	mdio_tristate(bitbang);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-
-	/* check the turnaround bit: the PHY should be driving it to zero */
-	if (mdio_read(bitbang) != 0) {
-		/* PHY didn't drive TA low */
-		for (j = 0; j < 32; j++) {
-			mdc(bitbang, 0);
-			mii_delay(bitbang);
-			mdc(bitbang, 1);
-			mii_delay(bitbang);
-		}
-		ret = -1;
+	struct device_node *np = NULL;
+	struct mii_bus *new_bus;
+	struct bb_info *bitbang;
+	int ret = -ENOMEM;
+	int i;
+
+	bitbang = kzalloc(sizeof(struct bb_info), GFP_KERNEL);
+	if (!bitbang)
 		goto out;
-	}
 
-	mdc(bitbang, 0);
-	mii_delay(bitbang);
-
-	/* read 16 bits of register data, MSB first */
-	rdreg = 0;
-	for (j = 0; j < 16; j++) {
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		rdreg <<= 1;
-		rdreg |= mdio_read(bitbang);
-		mdc(bitbang, 0);
-		mii_delay(bitbang);
-	}
+	bitbang->ctrl.ops = &bb_ops;
+
+	new_bus = alloc_mdio_bitbang(&bitbang->ctrl);
+	if (!new_bus)
+		goto out_free_priv;
 
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
+	new_bus->name = "CPM2 Bitbanged MII",
 
-	ret = rdreg;
+	ret = fs_mii_bitbang_init(new_bus, ofdev->node);
+	if (ret)
+		goto out_free_bus;
+
+	new_bus->phy_mask = ~0;
+	new_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
+	if (!new_bus->irq)
+		goto out_unmap_regs;
+
+	for (i = 0; i < PHY_MAX_ADDR; i++)
+		new_bus->irq[i] = -1;
+
+	while ((np = of_get_next_child(ofdev->node, np)))
+		if (!strcmp(np->type, "ethernet-phy"))
+			add_phy(new_bus, np);
+
+	new_bus->dev = &ofdev->dev;
+	dev_set_drvdata(&ofdev->dev, new_bus);
+
+	ret = mdiobus_register(new_bus);
+	if (ret)
+		goto out_free_irqs;
+
+	return 0;
+
+out_free_irqs:
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(new_bus->irq);
+out_unmap_regs:
+	iounmap(bitbang->dir);
+out_free_bus:
+	kfree(new_bus);
+out_free_priv:
+	free_mdio_bitbang(new_bus);
 out:
 	return ret;
 }
 
-static int fs_enet_mii_bb_write(struct mii_bus *bus, int phy_id, int location, u16 val)
+static int fs_enet_mdio_remove(struct of_device *ofdev)
 {
-	int j;
-	struct bb_info* bitbang = bus->priv;
-
-	u8 addr = phy_id & 0xff;
-	u8 reg = location & 0xff;
-	u16 value = val & 0xffff;
-
-	bitbang_pre(bitbang, 0, addr, reg);
-
-	/* send the turnaround (10) */
-	mdc(bitbang, 0);
-	mdio(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-
-	/* write 16 bits of register data, MSB first */
-	for (j = 0; j < 16; j++) {
-		mdc(bitbang, 0);
-		mdio(bitbang, (value & 0x8000) != 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		value <<= 1;
-	}
+	struct mii_bus *bus = dev_get_drvdata(&ofdev->dev);
+	struct bb_info *bitbang = bus->priv;
+
+	mdiobus_unregister(bus);
+	free_mdio_bitbang(bus);
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(bus->irq);
+	iounmap(bitbang->dir);
+	kfree(bitbang);
+	kfree(bus);
 
-	/*
-	 * Tri-state the MDIO line.
-	 */
-	mdio_tristate(bitbang);
-	mdc(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
 	return 0;
 }
 
-static int fs_enet_mii_bb_reset(struct mii_bus *bus)
+static struct of_device_id fs_enet_mdio_bb_match[] = {
+	{
+		.compatible = "fsl,cpm2-mdio-bitbang",
+	},
+	{},
+};
+
+static struct of_platform_driver fs_enet_bb_mdio_driver = {
+	.name = "fsl-bb-mdio",
+	.match_table = fs_enet_mdio_bb_match,
+	.probe = fs_enet_mdio_probe,
+	.remove = fs_enet_mdio_remove,
+};
+
+int fs_enet_mdio_bb_init(void)
 {
-	/*nothing here - dunno how to reset it*/
-	return 0;
+	return of_register_platform_driver(&fs_enet_bb_mdio_driver);
 }
 
-static int fs_mii_bitbang_init(struct bb_info *bitbang, struct fs_mii_bb_platform_info* fmpi)
+void fs_enet_mdio_bb_exit(void)
 {
-	int r;
-
+	of_unregister_platform_driver(&fs_enet_bb_mdio_driver);
+}
+#else
+static int __devinit fs_mii_bitbang_init(struct bb_info *bitbang,
+                                         struct fs_mii_bb_platform_info *fmpi)
+{
+	bitbang->dir = (u32 __iomem *)fmpi->mdio_dir.offset;
+	bitbang->dat = (u32 __iomem *)fmpi->mdio_dat.offset;
+	bitbang->mdio_msk = 1U << (31 - fmpi->mdio_dat.bit);
+	bitbang->mdc_msk = 1U << (31 - fmpi->mdc_dat.bit);
 	bitbang->delay = fmpi->delay;
 
-	r = bitbang_prep_bit(&bitbang->mdio_dir,
-			 &bitbang->mdio_dir_msk,
-			 &fmpi->mdio_dir);
-	if (r != 0)
-		return r;
-
-	r = bitbang_prep_bit(&bitbang->mdio_dat,
-			 &bitbang->mdio_dat_msk,
-			 &fmpi->mdio_dat);
-	if (r != 0)
-		return r;
-
-	r = bitbang_prep_bit(&bitbang->mdc_dat,
-			 &bitbang->mdc_msk,
-			 &fmpi->mdc_dat);
-	if (r != 0)
-		return r;
-
 	return 0;
 }
 
-
 static int __devinit fs_enet_mdio_probe(struct device *dev)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -320,20 +286,19 @@ static int __devinit fs_enet_mdio_probe(struct device *dev)
 	if (NULL == dev)
 		return -EINVAL;
 
-	new_bus = kzalloc(sizeof(struct mii_bus), GFP_KERNEL);
+	bitbang = kzalloc(sizeof(struct bb_info), GFP_KERNEL);
 
-	if (NULL == new_bus)
+	if (NULL == bitbang)
 		return -ENOMEM;
 
-	bitbang = kzalloc(sizeof(struct bb_info), GFP_KERNEL);
+	bitbang->ctrl.ops = &bb_ops;
 
-	if (NULL == bitbang)
+	new_bus = alloc_mdio_bitbang(&bitbang->ctrl);
+
+	if (NULL == new_bus)
 		return -ENOMEM;
 
 	new_bus->name = "BB MII Bus",
-	new_bus->read = &fs_enet_mii_bb_read,
-	new_bus->write = &fs_enet_mii_bb_write,
-	new_bus->reset = &fs_enet_mii_bb_reset,
 	new_bus->id = pdev->id;
 
 	new_bus->phy_mask = ~0x9;
@@ -365,13 +330,12 @@ static int __devinit fs_enet_mdio_probe(struct device *dev)
 	return 0;
 
 bus_register_fail:
+	free_mdio_bitbang(new_bus);
 	kfree(bitbang);
-	kfree(new_bus);
 
 	return err;
 }
 
-
 static int fs_enet_mdio_remove(struct device *dev)
 {
 	struct mii_bus *bus = dev_get_drvdata(dev);
@@ -380,9 +344,7 @@ static int fs_enet_mdio_remove(struct device *dev)
 
 	dev_set_drvdata(dev, NULL);
 
-	iounmap((void *) (&bus->priv));
-	bus->priv = NULL;
-	kfree(bus);
+	free_mdio_bitbang(bus);
 
 	return 0;
 }
@@ -403,4 +365,4 @@ void fs_enet_mdio_bb_exit(void)
 {
 	driver_unregister(&fs_enet_bb_mdio_driver);
 }
-
+#endif
diff --git a/drivers/net/fs_enet/mii-fec.c b/drivers/net/fs_enet/mii-fec.c
index 53db696..7de3d05 100644
--- a/drivers/net/fs_enet/mii-fec.c
+++ b/drivers/net/fs_enet/mii-fec.c
@@ -36,6 +36,10 @@
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
+
 #include "fs_enet.h"
 #include "fec.h"
 
@@ -47,6 +51,7 @@
 
 #define FEC_MII_LOOPS	10000
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static int match_has_phy (struct device *dev, void* data)
 {
 	struct platform_device* pdev = container_of(dev, struct platform_device, dev);
@@ -90,6 +95,7 @@ static int fs_mii_fec_init(struct fec_info* fec, struct fs_mii_fec_platform_info
 
 	return 0;
 }
+#endif
 
 static int fs_enet_fec_mii_read(struct mii_bus *bus , int phy_id, int location)
 {
@@ -145,6 +151,138 @@ static int fs_enet_fec_mii_reset(struct mii_bus *bus)
 	return 0;
 }
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+static void __devinit add_phy(struct mii_bus *bus, struct device_node *np)
+{
+	const u32 *data;
+	int len, id, irq;
+
+	data = of_get_property(np, "reg", &len);
+	if (!data || len != 4)
+		return;
+
+	id = *data;
+	bus->phy_mask &= ~(1 << id);
+
+	irq = of_irq_to_resource(np, 0, NULL);
+	if (irq != NO_IRQ)
+		bus->irq[id] = irq;
+}
+
+static int __devinit fs_enet_mdio_probe(struct of_device *ofdev,
+                                        const struct of_device_id *match)
+{
+	struct device_node *np = NULL;
+	struct resource res;
+	struct mii_bus *new_bus;
+	struct fec_info *fec;
+	int ret = -ENOMEM, i;
+
+	new_bus = kzalloc(sizeof(struct mii_bus), GFP_KERNEL);
+	if (!new_bus)
+		goto out;
+
+	fec = kzalloc(sizeof(struct fec_info), GFP_KERNEL);
+	if (!fec)
+		goto out_mii;
+
+	new_bus->priv = fec;
+	new_bus->name = "FEC MII Bus";
+	new_bus->read = &fs_enet_fec_mii_read;
+	new_bus->write = &fs_enet_fec_mii_write;
+	new_bus->reset = &fs_enet_fec_mii_reset;
+
+	ret = of_address_to_resource(ofdev->node, 0, &res);
+	if (ret)
+		return ret;
+
+	new_bus->id = res.start;
+
+	fec->fecp = ioremap(res.start, res.end - res.start + 1);
+	if (!fec->fecp)
+		goto out_fec;
+
+	fec->mii_speed = ((ppc_proc_freq + 4999999) / 5000000) << 1;
+
+	setbits32(&fec->fecp->fec_r_cntrl, FEC_RCNTRL_MII_MODE);
+	setbits32(&fec->fecp->fec_ecntrl, FEC_ECNTRL_PINMUX |
+	                                  FEC_ECNTRL_ETHER_EN);
+	out_be32(&fec->fecp->fec_ievent, FEC_ENET_MII);
+	out_be32(&fec->fecp->fec_mii_speed, fec->mii_speed);
+
+	new_bus->phy_mask = ~0;
+	new_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
+	if (!new_bus->irq)
+		goto out_unmap_regs;
+
+	for (i = 0; i < PHY_MAX_ADDR; i++)
+		new_bus->irq[i] = -1;
+
+	while ((np = of_get_next_child(ofdev->node, np)))
+		if (!strcmp(np->type, "ethernet-phy"))
+			add_phy(new_bus, np);
+
+	new_bus->dev = &ofdev->dev;
+	dev_set_drvdata(&ofdev->dev, new_bus);
+
+	ret = mdiobus_register(new_bus);
+	if (ret)
+		goto out_free_irqs;
+
+	return 0;
+
+out_free_irqs:
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(new_bus->irq);
+out_unmap_regs:
+	iounmap(fec->fecp);
+out_fec:
+	kfree(fec);
+out_mii:
+	kfree(new_bus);
+out:
+	return ret;
+}
+
+static int fs_enet_mdio_remove(struct of_device *ofdev)
+{
+	struct mii_bus *bus = dev_get_drvdata(&ofdev->dev);
+	struct fec_info *fec = bus->priv;
+
+	mdiobus_unregister(bus);
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(bus->irq);
+	iounmap(fec->fecp);
+	kfree(fec);
+	kfree(bus);
+
+	return 0;
+}
+
+static struct of_device_id fs_enet_mdio_fec_match[] = {
+	{
+		.compatible = "fsl,pq1-fec-mdio",
+	},
+	{},
+};
+
+static struct of_platform_driver fs_enet_fec_mdio_driver = {
+	.name = "fsl-fec-mdio",
+	.match_table = fs_enet_mdio_fec_match,
+	.probe = fs_enet_mdio_probe,
+	.remove = fs_enet_mdio_remove,
+};
+
+int fs_enet_mdio_fec_init(void)
+{
+	return of_register_platform_driver(&fs_enet_fec_mdio_driver);
+}
+
+void fs_enet_mdio_fec_exit(void)
+{
+	of_unregister_platform_driver(&fs_enet_fec_mdio_driver);
+}
+#else
 static int __devinit fs_enet_fec_mdio_probe(struct device *dev)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -235,4 +373,4 @@ void fs_enet_mdio_fec_exit(void)
 {
 	driver_unregister(&fs_enet_fec_mdio_driver);
 }
-
+#endif
diff --git a/include/linux/fs_enet_pd.h b/include/linux/fs_enet_pd.h
index 543cd3c..3b9e718 100644
--- a/include/linux/fs_enet_pd.h
+++ b/include/linux/fs_enet_pd.h
@@ -119,6 +119,7 @@ struct fs_platform_info {
 
 	u32 cp_page;		/* CPM page */
 	u32 cp_block;		/* CPM sblock */
+	u32 cp_command;		/* CPM page/sblock/mcn */
 
 	u32 clk_trx;		/* some stuff for pins & mux configuration*/
 	u32 clk_rx;
@@ -133,7 +134,11 @@ struct fs_platform_info {
 	u32 device_flags;
 
 	int phy_addr;		/* the phy address (-1 no phy) */
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	char bus_id[16];
+#else
 	const char*	bus_id;
+#endif
 	int phy_irq;		/* the phy irq (if it exists)  */
 
 	const struct fs_mii_bus_info *bus_info;
-- 
1.5.0.3


^ permalink raw reply related

* [PATCH 1/7] Generic bitbanged MDIO library
From: Scott Wood @ 2007-08-17 17:53 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev
In-Reply-To: <20070817175320.GA8259@ld0162-tx32.am.freescale.net>

Previously, bitbanged MDIO was only supported in individual
hardware-specific drivers.  This code factors out the higher level
protocol implementation, reducing the hardware-specific portion to
functions setting direction, data, and clock.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 drivers/net/phy/Kconfig        |    9 ++
 drivers/net/phy/Makefile       |    1 +
 drivers/net/phy/mdio-bitbang.c |  187 ++++++++++++++++++++++++++++++++++++++++
 include/linux/mdio-bitbang.h   |   42 +++++++++
 4 files changed, 239 insertions(+), 0 deletions(-)
 create mode 100644 drivers/net/phy/mdio-bitbang.c
 create mode 100644 include/linux/mdio-bitbang.h

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index dd09011..72a98dd 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -76,4 +76,13 @@ config FIXED_MII_100_FDX
 	bool "Emulation for 100M Fdx fixed PHY behavior"
 	depends on FIXED_PHY
 
+config MDIO_BITBANG
+	tristate "Support for bitbanged MDIO buses"
+	help
+	  This module implements the MDIO bus protocol in software,
+	  for use by low level drivers that export the ability to
+	  drive the relevant pins.
+
+	  If in doubt, say N.
+
 endif # PHYLIB
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 8885650..3d6cc7b 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_VITESSE_PHY)	+= vitesse.o
 obj-$(CONFIG_BROADCOM_PHY)	+= broadcom.o
 obj-$(CONFIG_ICPLUS_PHY)	+= icplus.o
 obj-$(CONFIG_FIXED_PHY)		+= fixed.o
+obj-$(CONFIG_MDIO_BITBANG)	+= mdio-bitbang.o
diff --git a/drivers/net/phy/mdio-bitbang.c b/drivers/net/phy/mdio-bitbang.c
new file mode 100644
index 0000000..9bfc9ce
--- /dev/null
+++ b/drivers/net/phy/mdio-bitbang.c
@@ -0,0 +1,187 @@
+/*
+ * Bitbanged MDIO support.
+ *
+ * Author: Scott Wood <scottwood@freescale.com>
+ * Copyright (c) 2007 Freescale Semiconductor
+ *
+ * Based on CPM2 MDIO code which is:
+ *
+ * Copyright (c) 2003 Intracom S.A.
+ *  by Pantelis Antoniou <panto@intracom.gr>
+ *
+ * 2005 (c) MontaVista Software, Inc.
+ * Vitaly Bordug <vbordug@ru.mvista.com>
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2. This program is licensed "as is" without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#include <linux/module.h>
+#include <linux/mdio-bitbang.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/delay.h>
+
+#define MDIO_READ 1
+#define MDIO_WRITE 0
+
+#define MDIO_SETUP_TIME 10
+#define MDIO_HOLD_TIME 10
+
+/* Minimum MDC period is 400 ns, plus some margin for error */
+#define MDIO_DELAY 250
+
+/* The PHY may take up to 300 ns to produce data, plus some margin
+ * for error.
+ */
+#define MDIO_READ_DELAY 350
+
+/* MDIO must already be configured as output. */
+static void mdio_bitbang_send_bit(struct mdio_bitbang_ctrl *ctrl, int val)
+{
+	const struct mdio_bitbang_ops *ops = ctrl->ops;
+
+	ops->set_mdio_data(ctrl, val);
+	ndelay(MDIO_DELAY);
+	ops->set_mdc(ctrl, 1);
+	ndelay(MDIO_DELAY);
+	ops->set_mdc(ctrl, 0);
+}
+
+/* MDIO must already be configured as input. */
+static int mdio_bitbang_get_bit(struct mdio_bitbang_ctrl *ctrl)
+{
+	const struct mdio_bitbang_ops *ops = ctrl->ops;
+
+	ndelay(MDIO_DELAY);
+	ops->set_mdc(ctrl, 1);
+	ndelay(MDIO_READ_DELAY);
+	ops->set_mdc(ctrl, 0);
+
+	return ops->get_mdio_data(ctrl);
+}
+
+/* MDIO must already be configured as output. */
+static void mdio_bitbang_send_num(struct mdio_bitbang_ctrl *ctrl,
+                                  u16 val, int bits)
+{
+	int i;
+
+	for (i = bits - 1; i >= 0; i--)
+		mdio_bitbang_send_bit(ctrl, (val >> i) & 1);
+}
+
+/* MDIO must already be configured as input. */
+static u16 mdio_bitbang_get_num(struct mdio_bitbang_ctrl *ctrl, int bits)
+{
+	int i;
+	u16 ret = 0;
+
+	for (i = bits - 1; i >= 0; i--) {
+		ret <<= 1;
+		ret |= mdio_bitbang_get_bit(ctrl);
+	}
+
+	return ret;
+}
+
+/* Utility to send the preamble, address, and
+ * register (common to read and write).
+ */
+static void mdio_bitbang_cmd(struct mdio_bitbang_ctrl *ctrl,
+                             int read, u8 phy, u8 reg)
+{
+	const struct mdio_bitbang_ops *ops = ctrl->ops;
+	int i;
+
+	ops->set_mdio_dir(ctrl, 1);
+
+	/*
+	 * Send a 32 bit preamble ('1's) with an extra '1' bit for good
+	 * measure.  The IEEE spec says this is a PHY optional
+	 * requirement.  The AMD 79C874 requires one after power up and
+	 * one after a MII communications error.  This means that we are
+	 * doing more preambles than we need, but it is safer and will be
+	 * much more robust.
+	 */
+
+	for (i = 0; i < 32; i++)
+		mdio_bitbang_send_bit(ctrl, 1);
+
+	/* send the start bit (01) and the read opcode (10) or write (10) */
+	mdio_bitbang_send_bit(ctrl, 0);
+	mdio_bitbang_send_bit(ctrl, 1);
+	mdio_bitbang_send_bit(ctrl, read);
+	mdio_bitbang_send_bit(ctrl, !read);
+
+	mdio_bitbang_send_num(ctrl, phy, 5);
+	mdio_bitbang_send_num(ctrl, reg, 5);
+}
+
+
+static int mdio_bitbang_read(struct mii_bus *bus, int phy, int reg)
+{
+	struct mdio_bitbang_ctrl *ctrl = bus->priv;
+	int ret, i;
+
+	mdio_bitbang_cmd(ctrl, MDIO_READ, phy, reg);
+	ctrl->ops->set_mdio_dir(ctrl, 0);
+
+	/* check the turnaround bit: the PHY should be driving it to zero */
+	if (mdio_bitbang_get_bit(ctrl) != 0) {
+		/* PHY didn't drive TA low -- flush any bits it
+		 * may be trying to send.
+		 */
+		for (i = 0; i < 32; i++)
+			mdio_bitbang_get_bit(ctrl);
+
+		return 0xffff;
+	}
+
+	ret = mdio_bitbang_get_num(ctrl, 16);
+	mdio_bitbang_get_bit(ctrl);
+	return ret;
+}
+
+static int mdio_bitbang_write(struct mii_bus *bus, int phy, int reg, u16 val)
+{
+	struct mdio_bitbang_ctrl *ctrl = bus->priv;
+
+	mdio_bitbang_cmd(ctrl, MDIO_WRITE, phy, reg);
+
+	/* send the turnaround (10) */
+	mdio_bitbang_send_bit(ctrl, 1);
+	mdio_bitbang_send_bit(ctrl, 0);
+
+	mdio_bitbang_send_num(ctrl, val, 16);
+
+	ctrl->ops->set_mdio_dir(ctrl, 0);
+	mdio_bitbang_get_bit(ctrl);
+	return 0;
+}
+
+struct mii_bus *alloc_mdio_bitbang(struct mdio_bitbang_ctrl *ctrl)
+{
+	struct mii_bus *bus;
+
+	bus = kzalloc(sizeof(struct mii_bus), GFP_KERNEL);
+	if (!bus)
+		return NULL;
+
+	__module_get(ctrl->ops->owner);
+
+	bus->read = mdio_bitbang_read;
+	bus->write = mdio_bitbang_write;
+	bus->priv = ctrl;
+
+	return bus;
+}
+
+void free_mdio_bitbang(struct mii_bus *bus)
+{
+	struct mdio_bitbang_ctrl *ctrl = bus->priv;
+
+	module_put(ctrl->ops->owner);
+	kfree(bus);
+}
diff --git a/include/linux/mdio-bitbang.h b/include/linux/mdio-bitbang.h
new file mode 100644
index 0000000..63a11dd
--- /dev/null
+++ b/include/linux/mdio-bitbang.h
@@ -0,0 +1,42 @@
+#ifndef __LINUX_MDIO_BITBANG_H
+#define __LINUX_MDIO_BITBANG_H
+
+#include <linux/phy.h>
+#include <linux/module.h>
+
+struct mdio_bitbang_ctrl;
+
+struct mdio_bitbang_ops {
+	struct module *owner;
+
+	/* Set the Management Data Clock high if level is one,
+	 * low if level is zero.
+	 */
+	void (*set_mdc)(struct mdio_bitbang_ctrl *ctrl, int level);
+
+	/* Configure the Management Data I/O pin as an input if
+	 * "output" is zero, or an output if "output" is one.
+	 */
+	void (*set_mdio_dir)(struct mdio_bitbang_ctrl *ctrl, int output);
+
+	/* Set the Management Data I/O pin high if value is one,
+	 * low if "value" is zero.  This may only be called
+	 * when the MDIO pin is configured as an output.
+	 */
+	void (*set_mdio_data)(struct mdio_bitbang_ctrl *ctrl, int value);
+
+	/* Retrieve the state Management Data I/O pin. */
+	int (*get_mdio_data)(struct mdio_bitbang_ctrl *ctrl);
+};
+
+struct mdio_bitbang_ctrl {
+	const struct mdio_bitbang_ops *ops;
+};
+
+/* The returned bus is not yet registered with the phy layer. */
+struct mii_bus *alloc_mdio_bitbang(struct mdio_bitbang_ctrl *ctrl);
+
+/* The bus must already have been unregistered. */
+void free_mdio_bitbang(struct mii_bus *bus);
+
+#endif
-- 
1.5.0.3


^ permalink raw reply related

* [PATCH 6/7] fs_enet: Be an of_platform device when CONFIG_PPC_CPM_NEW_BINDING is set.
From: Scott Wood @ 2007-08-17 17:54 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev
In-Reply-To: <20070817175320.GA8259@ld0162-tx32.am.freescale.net>

The existing OF glue code was crufty and broken.  Rather than fix it, it
will be removed, and the ethernet driver now talks to the device tree
directly.

The old, non-CONFIG_PPC_CPM_NEW_BINDING code can go away once CPM
platforms are dropped from arch/ppc (which will hopefully be soon), and
existing arch/powerpc boards that I wasn't able to test on for this
patchset get converted (which should be even sooner).

mii-bitbang.c now also uses the generic bitbang code.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 drivers/net/fs_enet/Kconfig        |    1 +
 drivers/net/fs_enet/fs_enet-main.c |  286 ++++++++++++++++++++++--
 drivers/net/fs_enet/fs_enet.h      |   55 +-----
 drivers/net/fs_enet/mac-fcc.c      |   89 ++++++--
 drivers/net/fs_enet/mac-fec.c      |   23 ++-
 drivers/net/fs_enet/mac-scc.c      |   55 +++--
 drivers/net/fs_enet/mii-bitbang.c  |  438 ++++++++++++++++-------------------
 drivers/net/fs_enet/mii-fec.c      |  140 ++++++++++++-
 include/linux/fs_enet_pd.h         |    5 +
 9 files changed, 740 insertions(+), 352 deletions(-)

diff --git a/drivers/net/fs_enet/Kconfig b/drivers/net/fs_enet/Kconfig
index e27ee21..2765e49 100644
--- a/drivers/net/fs_enet/Kconfig
+++ b/drivers/net/fs_enet/Kconfig
@@ -11,6 +11,7 @@ config FS_ENET_HAS_SCC
 config FS_ENET_HAS_FCC
 	bool "Chip has an FCC usable for ethernet"
 	depends on FS_ENET && CPM2
+	select MDIO_BITBANG
 	default y
 
 config FS_ENET_HAS_FEC
diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index a4b76cd..33cdabf 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -44,12 +44,18 @@
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
+
 #include "fs_enet.h"
 
 /*************************************************/
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static char version[] __devinitdata =
     DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")" "\n";
+#endif
 
 MODULE_AUTHOR("Pantelis Antoniou <panto@intracom.gr>");
 MODULE_DESCRIPTION("Freescale Ethernet Driver");
@@ -953,6 +959,7 @@ static int fs_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 extern int fs_mii_connect(struct net_device *dev);
 extern void fs_mii_disconnect(struct net_device *dev);
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static struct net_device *fs_init_instance(struct device *dev,
 		struct fs_platform_info *fpi)
 {
@@ -1132,6 +1139,7 @@ static int fs_cleanup_instance(struct net_device *ndev)
 
 	return 0;
 }
+#endif
 
 /**************************************************************************************/
 
@@ -1140,35 +1148,279 @@ void *fs_enet_immap = NULL;
 
 static int setup_immap(void)
 {
-	phys_addr_t paddr = 0;
-	unsigned long size = 0;
-
 #ifdef CONFIG_CPM1
-	paddr = IMAP_ADDR;
-	size = 0x10000;	/* map 64K */
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	fs_enet_immap = mpc8xx_immr;
+#else
+	fs_enet_immap = ioremap(IMAP_ADDR, 0x4000);
+	WARN_ON(!fs_enet_immap);
 #endif
-
-#ifdef CONFIG_CPM2
-	paddr = CPM_MAP_ADDR;
-	size = 0x40000;	/* map 256 K */
+#elif defined(CONFIG_CPM2)
+	fs_enet_immap = cpm2_immr;
 #endif
-	fs_enet_immap = ioremap(paddr, size);
-	if (fs_enet_immap == NULL)
-		return -EBADF;	/* XXX ahem; maybe just BUG_ON? */
 
 	return 0;
 }
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static void cleanup_immap(void)
 {
-	if (fs_enet_immap != NULL) {
-		iounmap(fs_enet_immap);
-		fs_enet_immap = NULL;
-	}
+#if defined(CONFIG_CPM1)
+	iounmap(fs_enet_immap);
+#endif
 }
+#endif
 
 /**************************************************************************************/
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+static int __devinit find_phy(struct device_node *np,
+                              struct fs_platform_info *fpi)
+{
+	struct device_node *phynode, *mdionode;
+	struct resource res;
+	int ret = 0, len;
+
+	const u32 *data = of_get_property(np, "phy-handle", &len);
+	if (!data || len != 4)
+		return -EINVAL;
+
+	phynode = of_find_node_by_phandle(*data);
+	if (!phynode)
+		return -EINVAL;
+
+	mdionode = of_get_parent(phynode);
+	if (!phynode)
+		goto out_put_phy;
+
+	ret = of_address_to_resource(mdionode, 0, &res);
+	if (ret)
+		goto out_put_mdio;
+
+	data = of_get_property(phynode, "reg", &len);
+	if (!data || len != 4)
+		goto out_put_mdio;
+
+	snprintf(fpi->bus_id, 16, PHY_ID_FMT, res.start, *data);
+
+out_put_mdio:
+	of_node_put(mdionode);
+out_put_phy:
+	of_node_put(phynode);
+	return ret;
+}
+
+#ifdef CONFIG_FS_ENET_HAS_FEC
+#define IS_FEC(match) ((match)->data == &fs_fec_ops)
+#else
+#define IS_FEC(match) 0
+#endif
+
+static int __devinit fs_enet_probe(struct of_device *ofdev,
+                                   const struct of_device_id *match)
+{
+	struct net_device *ndev;
+	struct fs_enet_private *fep;
+	struct fs_platform_info *fpi;
+	const u32 *data;
+	const u8 *mac_addr;
+	int privsize, len, ret = -ENODEV;
+
+	fpi = kzalloc(sizeof(*fpi), GFP_KERNEL);
+	if (!fpi)
+		return -ENOMEM;
+
+	if (!IS_FEC(match)) {
+		data = of_get_property(ofdev->node, "fsl,cpm-command", &len);
+		if (!data || len != 4)
+			goto out_free_fpi;
+
+		fpi->cp_command = *data;
+	}
+
+	fpi->rx_ring = 32;
+	fpi->tx_ring = 32;
+	fpi->rx_copybreak = 240;
+	fpi->use_napi = 0;
+	fpi->napi_weight = 17;
+
+	ret = find_phy(ofdev->node, fpi);
+	if (ret)
+		goto out_free_fpi;
+
+	privsize = sizeof(*fep) +
+	           sizeof(struct sk_buff **) *
+	           (fpi->rx_ring + fpi->tx_ring);
+
+	ndev = alloc_etherdev(privsize);
+	if (!ndev) {
+		ret = -ENOMEM;
+		goto out_free_fpi;
+	}
+
+	SET_MODULE_OWNER(ndev);
+	dev_set_drvdata(&ofdev->dev, ndev);
+
+	fep = netdev_priv(ndev);
+	fep->dev = &ofdev->dev;
+	fep->fpi = fpi;
+	fep->ops = match->data;
+
+	ret = fep->ops->setup_data(ndev);
+	if (ret)
+		goto out_free_dev;
+
+	fep->rx_skbuff = (struct sk_buff **)&fep[1];
+	fep->tx_skbuff = fep->rx_skbuff + fpi->rx_ring;
+
+	spin_lock_init(&fep->lock);
+	spin_lock_init(&fep->tx_lock);
+
+	mac_addr = of_get_mac_address(ofdev->node);
+	if (mac_addr)
+		memcpy(ndev->dev_addr, mac_addr, 6);
+
+	ret = fep->ops->allocate_bd(ndev);
+	if (ret)
+		goto out_cleanup_data;
+
+	fep->rx_bd_base = fep->ring_base;
+	fep->tx_bd_base = fep->rx_bd_base + fpi->rx_ring;
+
+	fep->tx_ring = fpi->tx_ring;
+	fep->rx_ring = fpi->rx_ring;
+
+	ndev->open = fs_enet_open;
+	ndev->hard_start_xmit = fs_enet_start_xmit;
+	ndev->tx_timeout = fs_timeout;
+	ndev->watchdog_timeo = 2 * HZ;
+	ndev->stop = fs_enet_close;
+	ndev->get_stats = fs_enet_get_stats;
+	ndev->set_multicast_list = fs_set_multicast_list;
+	if (fpi->use_napi) {
+		ndev->poll = fs_enet_rx_napi;
+		ndev->weight = fpi->napi_weight;
+	}
+	ndev->ethtool_ops = &fs_ethtool_ops;
+	ndev->do_ioctl = fs_ioctl;
+
+	init_timer(&fep->phy_timer_list);
+
+	netif_carrier_off(ndev);
+
+	ret = register_netdev(ndev);
+	if (ret)
+		goto out_free_bd;
+
+	printk(KERN_INFO "%s: fs_enet: %02x:%02x:%02x:%02x:%02x:%02x\n",
+	       ndev->name,
+	       ndev->dev_addr[0], ndev->dev_addr[1], ndev->dev_addr[2],
+	       ndev->dev_addr[3], ndev->dev_addr[4], ndev->dev_addr[5]);
+
+	return 0;
+
+out_free_bd:
+	fep->ops->free_bd(ndev);
+out_cleanup_data:
+	fep->ops->cleanup_data(ndev);
+out_free_dev:
+	free_netdev(ndev);
+	dev_set_drvdata(&ofdev->dev, NULL);
+out_free_fpi:
+	kfree(fpi);
+	return ret;
+}
+
+static int fs_enet_remove(struct of_device *ofdev)
+{
+	struct net_device *ndev = dev_get_drvdata(&ofdev->dev);
+	struct fs_enet_private *fep = netdev_priv(ndev);
+
+	unregister_netdev(ndev);
+
+	fep->ops->free_bd(ndev);
+	fep->ops->cleanup_data(ndev);
+	dev_set_drvdata(fep->dev, NULL);
+
+	free_netdev(ndev);
+	return 0;
+}
+
+static struct of_device_id fs_enet_match[] = {
+#ifdef CONFIG_FS_ENET_HAS_SCC
+	{
+		.compatible = "fsl,cpm1-scc-enet",
+		.data = (void *)&fs_scc_ops,
+	},
+#endif
+#ifdef CONFIG_FS_ENET_HAS_FCC
+	{
+		.compatible = "fsl,cpm2-fcc-enet",
+		.data = (void *)&fs_fcc_ops,
+	},
+#endif
+#ifdef CONFIG_FS_ENET_HAS_FEC
+	{
+		.compatible = "fsl,pq1-fec-enet",
+		.data = (void *)&fs_fec_ops,
+	},
+#endif
+	{}
+};
+
+static struct of_platform_driver fs_enet_driver = {
+	.name	= "fs_enet",
+	.match_table = fs_enet_match,
+	.probe = fs_enet_probe,
+	.remove = fs_enet_remove,
+};
+
+static int __init fs_init(void)
+{
+	int r = setup_immap();
+	if (r != 0)
+		return r;
+
+#ifdef CONFIG_CPM2
+	r = fs_enet_mdio_bb_init();
+	if (r != 0)
+		goto out_mdio_bb;
+#endif
+#ifdef CONFIG_8xx
+	r = fs_enet_mdio_fec_init();
+	if (r != 0)
+		goto out_mdio_fec;
+#endif
+
+	r = of_register_platform_driver(&fs_enet_driver);
+	if (r != 0)
+		goto out_driver;
+
+	return 0;
+
+out_driver:
+#ifdef CONFIG_8xx
+	fs_enet_mdio_fec_exit();
+out_mdio_fec:
+#endif
+#ifdef CONFIG_CPM2
+	fs_enet_mdio_bb_exit();
+out_mdio_bb:
+#endif
+	return r;
+}
+
+static void __exit fs_cleanup(void)
+{
+	of_unregister_platform_driver(&fs_enet_driver);
+#ifdef CONFIG_8xx
+	fs_enet_mdio_fec_exit();
+#endif
+#ifdef CONFIG_CPM2
+	fs_enet_mdio_bb_exit();
+#endif
+}
+#else
 static int __devinit fs_enet_probe(struct device *dev)
 {
 	struct net_device *ndev;
@@ -1282,7 +1534,7 @@ static void __exit fs_cleanup(void)
 	driver_unregister(&fs_enet_scc_driver);
 	cleanup_immap();
 }
-
+#endif
 /**************************************************************************************/
 
 module_init(fs_init);
diff --git a/drivers/net/fs_enet/fs_enet.h b/drivers/net/fs_enet/fs_enet.h
index f8c7ee8..a130332 100644
--- a/drivers/net/fs_enet/fs_enet.h
+++ b/drivers/net/fs_enet/fs_enet.h
@@ -24,19 +24,6 @@ struct fec_info {
 #include <asm/cpm2.h>
 #endif
 
-/* This is used to operate with pins.
-  Note that the actual port size may
-    be different; cpm(s) handle it OK  */
-struct bb_info {
-	u8 mdio_dat_msk;
-	u8 mdio_dir_msk;
-	u8 *mdio_dir;
-	u8 *mdio_dat;
-	u8 mdc_msk;
-	u8 *mdc_dat;
-	int delay;
-};
-
 /* hw driver ops */
 struct fs_ops {
 	int (*setup_data)(struct net_device *dev);
@@ -85,47 +72,11 @@ struct phy_info {
 #define ENET_RX_ALIGN  16
 #define ENET_RX_FRSIZE L1_CACHE_ALIGN(PKT_MAXBUF_SIZE + ENET_RX_ALIGN - 1)
 
-struct fs_enet_mii_bus {
-	struct list_head list;
-	spinlock_t mii_lock;
-	const struct fs_mii_bus_info *bus_info;
-	int refs;
-	u32 usage_map;
-
-	int (*mii_read)(struct fs_enet_mii_bus *bus,
-			int phy_id, int location);
-
-	void (*mii_write)(struct fs_enet_mii_bus *bus,
-			int phy_id, int location, int value);
-
-	union {
-		struct {
-			unsigned int mii_speed;
-			void *fecp;
-		} fec;
-
-		struct {
-			/* note that the actual port size may */
-			/* be different; cpm(s) handle it OK  */
-			u8 mdio_msk;
-			u8 *mdio_dir;
-			u8 *mdio_dat;
-			u8 mdc_msk;
-			u8 *mdc_dir;
-			u8 *mdc_dat;
-		} bitbang;
-
-		struct {
-			u16 lpa;
-		} fixed;
-	};
-};
-
 struct fs_enet_private {
 	struct device *dev;	/* pointer back to the device (must be initialized first) */
 	spinlock_t lock;	/* during all ops except TX pckt processing */
 	spinlock_t tx_lock;	/* during fs_start_xmit and fs_tx         */
-	const struct fs_platform_info *fpi;
+	struct fs_platform_info *fpi;
 	const struct fs_ops *ops;
 	int rx_ring, tx_ring;
 	dma_addr_t ring_mem_addr;
@@ -144,7 +95,6 @@ struct fs_enet_private {
 	u32 msg_enable;
 	struct mii_if_info mii_if;
 	unsigned int last_mii_status;
-	struct fs_enet_mii_bus *mii_bus;
 	int interrupt;
 
 	struct phy_device *phydev;
@@ -187,8 +137,9 @@ struct fs_enet_private {
 
 /***************************************************************************/
 int fs_enet_mdio_bb_init(void);
-int fs_mii_fixed_init(struct fs_enet_mii_bus *bus);
+void fs_enet_mdio_bb_exit(void);
 int fs_enet_mdio_fec_init(void);
+void fs_enet_mdio_fec_exit(void);
 
 void fs_init_bds(struct net_device *dev);
 void fs_cleanup_bds(struct net_device *dev);
diff --git a/drivers/net/fs_enet/mac-fcc.c b/drivers/net/fs_enet/mac-fcc.c
index 8b30361..10e2a2b 100644
--- a/drivers/net/fs_enet/mac-fcc.c
+++ b/drivers/net/fs_enet/mac-fcc.c
@@ -42,6 +42,10 @@
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_device.h>
+#endif
+
 #include "fs_enet.h"
 
 /*************************************************/
@@ -74,33 +78,64 @@
 
 #define MAX_CR_CMD_LOOPS	10000
 
-static inline int fcc_cr_cmd(struct fs_enet_private *fep, u32 mcn, u32 op)
+static inline int fcc_cr_cmd(struct fs_enet_private *fep, u32 op)
 {
 	const struct fs_platform_info *fpi = fep->fpi;
 	cpm2_map_t *immap = fs_enet_immap;
 	cpm_cpm2_t *cpmp = &immap->im_cpm;
-	u32 v;
 	int i;
 
-	/* Currently I don't know what feature call will look like. But 
-	   I guess there'd be something like do_cpm_cmd() which will require page & sblock */
-	v = mk_cr_cmd(fpi->cp_page, fpi->cp_block, mcn, op);
-	W32(cpmp, cp_cpcr, v | CPM_CR_FLG);
+	W32(cpmp, cp_cpcr, fpi->cp_command | op | CPM_CR_FLG);
 	for (i = 0; i < MAX_CR_CMD_LOOPS; i++)
 		if ((R32(cpmp, cp_cpcr) & CPM_CR_FLG) == 0)
-			break;
-
-	if (i >= MAX_CR_CMD_LOOPS) {
-		printk(KERN_ERR "%s(): Not able to issue CPM command\n",
-		       __FUNCTION__);
-		return 1;
-	}
+			return 0;
 
-	return 0;
+	printk(KERN_ERR "%s(): Not able to issue CPM command\n",
+	       __FUNCTION__);
+	return 1;
 }
 
 static int do_pd_setup(struct fs_enet_private *fep)
 {
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct of_device *ofdev = to_of_device(fep->dev);
+	struct fs_platform_info *fpi = fep->fpi;
+	int ret = -EINVAL;
+
+	fep->interrupt = of_irq_to_resource(ofdev->node, 0, NULL);
+	if (fep->interrupt == NO_IRQ)
+		goto out;
+
+	fep->fcc.fccp = of_iomap(ofdev->node, 0);
+	if (!fep->fcc.fccp)
+		goto out;
+
+	fep->fcc.ep = of_iomap(ofdev->node, 1);
+	if (!fep->fcc.ep)
+		goto out_fccp;
+
+	fep->fcc.fcccp = of_iomap(ofdev->node, 2);
+	if (!fep->fcc.fcccp)
+		goto out_ep;
+
+	fep->fcc.mem = (void *)cpm_dpalloc(128, 8);
+	fpi->dpram_offset = (u32)cpm2_immr;
+	if (IS_ERR_VALUE(fpi->dpram_offset)) {
+		ret = fpi->dpram_offset;
+		goto out_fcccp;
+	}
+
+	return 0;
+
+out_fcccp:
+	iounmap(fep->fcc.fcccp);
+out_ep:
+	iounmap(fep->fcc.ep);
+out_fccp:
+	iounmap(fep->fcc.fccp);
+out:
+	return ret;
+#else
 	struct platform_device *pdev = to_platform_device(fep->dev);
 	struct resource *r;
 
@@ -138,6 +173,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
 		return -EINVAL;
 
 	return 0;
+#endif
 }
 
 #define FCC_NAPI_RX_EVENT_MSK	(FCC_ENET_RXF | FCC_ENET_RXB)
@@ -148,11 +184,17 @@ static int do_pd_setup(struct fs_enet_private *fep)
 static int setup_data(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	const struct fs_platform_info *fpi = fep->fpi;
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
+	struct fs_platform_info *fpi = fep->fpi;
+
+	fpi->cp_command = (fpi->cp_page << 26) |
+	                  (fpi->cp_block << 21) |
+	                  (12 << 6);
 
 	fep->fcc.idx = fs_get_fcc_index(fpi->fs_no);
 	if ((unsigned int)fep->fcc.idx >= 3)	/* max 3 FCCs */
 		return -EINVAL;
+#endif
 
 	if (do_pd_setup(fep) != 0)
 		return -EINVAL;
@@ -226,7 +268,7 @@ static void set_multicast_one(struct net_device *dev, const u8 *mac)
 	W16(ep, fen_taddrh, taddrh);
 	W16(ep, fen_taddrm, taddrm);
 	W16(ep, fen_taddrl, taddrl);
-	fcc_cr_cmd(fep, 0x0C, CPM_CR_SET_GADDR);
+	fcc_cr_cmd(fep, CPM_CR_SET_GADDR);
 }
 
 static void set_multicast_finish(struct net_device *dev)
@@ -281,7 +323,7 @@ static void restart(struct net_device *dev)
 
 	/* clear everything (slow & steady does it) */
 	for (i = 0; i < sizeof(*ep); i++)
-		out_8((char *)ep + i, 0);
+		out_8((u8 __iomem *)ep + i, 0);
 
 	/* get physical address */
 	rx_bd_base_phys = fep->ring_mem_addr;
@@ -397,7 +439,7 @@ static void restart(struct net_device *dev)
 			S8(fcccp, fcc_gfemr, 0x20);
 	}
 
-	fcc_cr_cmd(fep, 0x0c, CPM_CR_INIT_TRX);
+	fcc_cr_cmd(fep, CPM_CR_INIT_TRX);
 
 	/* clear events */
 	W16(fccp, fcc_fcce, 0xffff);
@@ -515,23 +557,22 @@ int get_regs(struct net_device *dev, void *p, int *sizep)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 
-	if (*sizep < sizeof(fcc_t) + sizeof(fcc_c_t) + sizeof(fcc_enet_t))
+	if (*sizep < sizeof(fcc_t) + sizeof(fcc_enet_t) + 1)
 		return -EINVAL;
 
 	memcpy_fromio(p, fep->fcc.fccp, sizeof(fcc_t));
 	p = (char *)p + sizeof(fcc_t);
 
-	memcpy_fromio(p, fep->fcc.fcccp, sizeof(fcc_c_t));
-	p = (char *)p + sizeof(fcc_c_t);
-
 	memcpy_fromio(p, fep->fcc.ep, sizeof(fcc_enet_t));
+	p = (char *)p + sizeof(fcc_enet_t);
 
+	memcpy_fromio(p, fep->fcc.fcccp, 1);
 	return 0;
 }
 
 int get_regs_len(struct net_device *dev)
 {
-	return sizeof(fcc_t) + sizeof(fcc_c_t) + sizeof(fcc_enet_t);
+	return sizeof(fcc_t) + sizeof(fcc_enet_t) + 1;
 }
 
 /* Some transmit errors cause the transmitter to shut
@@ -551,7 +592,7 @@ void tx_restart(struct net_device *dev)
 	udelay(10);
 	S32(fccp, fcc_gfmr, FCC_GFMR_ENT);
 
-	fcc_cr_cmd(fep, 0x0C, CPM_CR_RESTART_TX);
+	fcc_cr_cmd(fep, CPM_CR_RESTART_TX);
 }
 
 /*************************************************************************/
diff --git a/drivers/net/fs_enet/mac-fec.c b/drivers/net/fs_enet/mac-fec.c
index 04b4f80..ef305e2 100644
--- a/drivers/net/fs_enet/mac-fec.c
+++ b/drivers/net/fs_enet/mac-fec.c
@@ -43,6 +43,10 @@
 #include <asm/commproc.h>
 #endif
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_device.h>
+#endif
+
 #include "fs_enet.h"
 #include "fec.h"
 
@@ -95,6 +99,19 @@ static int whack_reset(fec_t * fecp)
 
 static int do_pd_setup(struct fs_enet_private *fep)
 {
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct of_device *ofdev = to_of_device(fep->dev);
+
+	fep->interrupt = of_irq_to_resource(ofdev->node, 0, NULL);
+	if (fep->interrupt == NO_IRQ)
+		return -EINVAL;
+
+	fep->fec.fecp = of_iomap(ofdev->node, 0);
+	if (!fep->fcc.fccp)
+		return -EINVAL;
+
+	return 0;
+#else
 	struct platform_device *pdev = to_platform_device(fep->dev); 
 	struct resource	*r;
 	
@@ -110,7 +127,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
 		return -EINVAL;
 
 	return 0;
-	
+#endif
 }
 
 #define FEC_NAPI_RX_EVENT_MSK	(FEC_ENET_RXF | FEC_ENET_RXB)
@@ -317,7 +334,7 @@ static void restart(struct net_device *dev)
 	 * Clear any outstanding interrupt.
 	 */
 	FW(fecp, ievent, 0xffc0);
-#ifndef CONFIG_PPC_MERGE
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 	FW(fecp, ivec, (fep->interrupt / 2) << 29);
 #else
 	FW(fecp, ivec, (virq_to_hw(fep->interrupt) / 2) << 29);
@@ -419,7 +436,7 @@ static void stop(struct net_device *dev)
 
 static void pre_request_irq(struct net_device *dev, int irq)
 {
-#ifndef CONFIG_PPC_MERGE
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 	immap_t *immap = fs_enet_immap;
 	u32 siel;
 
diff --git a/drivers/net/fs_enet/mac-scc.c b/drivers/net/fs_enet/mac-scc.c
index 7540966..6ec6b7d 100644
--- a/drivers/net/fs_enet/mac-scc.c
+++ b/drivers/net/fs_enet/mac-scc.c
@@ -43,6 +43,10 @@
 #include <asm/commproc.h>
 #endif
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
+
 #include "fs_enet.h"
 
 /*************************************************/
@@ -89,27 +93,38 @@
 
 static inline int scc_cr_cmd(struct fs_enet_private *fep, u32 op)
 {
-	cpm8xx_t *cpmp = &((immap_t *)fs_enet_immap)->im_cpm;
-	u32 v, ch;
-	int i = 0;
+	const struct fs_platform_info *fpi = fep->fpi;
+	int i;
 
-	ch = fep->scc.idx << 2;
-	v = mk_cr_cmd(ch, op);
-	W16(cpmp, cp_cpcr, v | CPM_CR_FLG);
+	W16(cpmp, cp_cpcr, fpi->cp_command | CPM_CR_FLG | (op << 8));
 	for (i = 0; i < MAX_CR_CMD_LOOPS; i++)
 		if ((R16(cpmp, cp_cpcr) & CPM_CR_FLG) == 0)
-			break;
+			return 0;
 
-	if (i >= MAX_CR_CMD_LOOPS) {
-		printk(KERN_ERR "%s(): Not able to issue CPM command\n",
-			__FUNCTION__);
-		return 1;
-	}
-	return 0;
+	printk(KERN_ERR "%s(): Not able to issue CPM command\n",
+		__FUNCTION__);
+	return 1;
 }
 
 static int do_pd_setup(struct fs_enet_private *fep)
 {
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct of_device *ofdev = to_of_device(fep->dev);
+
+	fep->interrupt = of_irq_to_resource(ofdev->node, 0, NULL);
+	if (fep->interrupt == NO_IRQ)
+		return -EINVAL;
+
+	fep->scc.sccp = of_iomap(ofdev->node, 0);
+	if (!fep->scc.sccp)
+		return -EINVAL;
+
+	fep->scc.ep = of_iomap(ofdev->node, 1);
+	if (!fep->scc.ep) {
+		iounmap(fep->scc.sccp);
+		return -EINVAL;
+	}
+#else
 	struct platform_device *pdev = to_platform_device(fep->dev);
 	struct resource *r;
 
@@ -129,6 +144,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
 
 	if (fep->scc.ep == NULL)
 		return -EINVAL;
+#endif
 
 	return 0;
 }
@@ -141,12 +157,17 @@ static int do_pd_setup(struct fs_enet_private *fep)
 static int setup_data(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	const struct fs_platform_info *fpi = fep->fpi;
+
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	struct fs_platform_info *fpi = fep->fpi;
 
 	fep->scc.idx = fs_get_scc_index(fpi->fs_no);
-	if ((unsigned int)fep->fcc.idx > 4)	/* max 4 SCCs */
+	if ((unsigned int)fep->fcc.idx >= 4) /* max 4 SCCs */
 		return -EINVAL;
 
+	fpi->cp_command = fep->fcc.idx << 6;
+#endif
+
 	do_pd_setup(fep);
 
 	fep->scc.hthi = 0;
@@ -154,7 +175,7 @@ static int setup_data(struct net_device *dev)
 
 	fep->ev_napi_rx = SCC_NAPI_RX_EVENT_MSK;
 	fep->ev_rx = SCC_RX_EVENT;
-	fep->ev_tx = SCC_TX_EVENT;
+	fep->ev_tx = SCC_TX_EVENT | SCCE_ENET_TXE;
 	fep->ev_err = SCC_ERR_EVENT_MSK;
 
 	return 0;
@@ -395,7 +416,7 @@ static void stop(struct net_device *dev)
 
 static void pre_request_irq(struct net_device *dev, int irq)
 {
-#ifndef CONFIG_PPC_MERGE
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 	immap_t *immap = fs_enet_immap;
 	u32 siel;
 
diff --git a/drivers/net/fs_enet/mii-bitbang.c b/drivers/net/fs_enet/mii-bitbang.c
index d384010..fc92ec7 100644
--- a/drivers/net/fs_enet/mii-bitbang.c
+++ b/drivers/net/fs_enet/mii-bitbang.c
@@ -14,301 +14,267 @@
 
 
 #include <linux/module.h>
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/string.h>
-#include <linux/ptrace.h>
-#include <linux/errno.h>
+#include <linux/init.h>
 #include <linux/ioport.h>
 #include <linux/slab.h>
 #include <linux/interrupt.h>
-#include <linux/init.h>
-#include <linux/delay.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
-#include <linux/skbuff.h>
-#include <linux/spinlock.h>
 #include <linux/mii.h>
-#include <linux/ethtool.h>
-#include <linux/bitops.h>
 #include <linux/platform_device.h>
+#include <linux/mdio-bitbang.h>
 
-#include <asm/pgtable.h>
-#include <asm/irq.h>
-#include <asm/uaccess.h>
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
 
 #include "fs_enet.h"
 
-static int bitbang_prep_bit(u8 **datp, u8 *mskp,
-		struct fs_mii_bit *mii_bit)
-{
-	void *dat;
-	int adv;
-	u8 msk;
-
-	dat = (void*) mii_bit->offset;
-
-	adv = mii_bit->bit >> 3;
-	dat = (char *)dat + adv;
-
-	msk = 1 << (7 - (mii_bit->bit & 7));
-
-	*datp = dat;
-	*mskp = msk;
-
-	return 0;
-}
+struct bb_info {
+	struct mdio_bitbang_ctrl ctrl;
+	__be32 __iomem *dir;
+	__be32 __iomem *dat;
+	u32 mdio_msk;
+	u32 mdc_msk;
+	int delay;
+};
 
-static inline void bb_set(u8 *p, u8 m)
+/* FIXME: If any other users of GPIO crop up, then these will have to
+ * have some sort of global synchronization to avoid races with other
+ * pins on the same port.  The ideal solution would probably be to
+ * bind the ports to a GPIO driver, and have this be a client of it.
+ */
+static inline void bb_set(u32 __iomem *p, u32 m)
 {
-	out_8(p, in_8(p) | m);
+	out_be32(p, in_be32(p) | m);
 }
 
-static inline void bb_clr(u8 *p, u8 m)
+static inline void bb_clr(u32 __iomem *p, u32 m)
 {
-	out_8(p, in_8(p) & ~m);
+	out_be32(p, in_be32(p) & ~m);
 }
 
-static inline int bb_read(u8 *p, u8 m)
+static inline int bb_read(u32 __iomem *p, u32 m)
 {
-	return (in_8(p) & m) != 0;
+	return (in_be32(p) & m) != 0;
 }
 
-static inline void mdio_active(struct bb_info *bitbang)
+static inline void mdio_dir(struct mdio_bitbang_ctrl *ctrl, int dir)
 {
-	bb_set(bitbang->mdio_dir, bitbang->mdio_dir_msk);
-}
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
 
-static inline void mdio_tristate(struct bb_info *bitbang )
-{
-	bb_clr(bitbang->mdio_dir, bitbang->mdio_dir_msk);
+	if (dir)
+		bb_set(bitbang->dir, bitbang->mdio_msk);
+	else
+		bb_clr(bitbang->dir, bitbang->mdio_msk);
 }
 
-static inline int mdio_read(struct bb_info *bitbang )
+static inline int mdio_read(struct mdio_bitbang_ctrl *ctrl)
 {
-	return bb_read(bitbang->mdio_dat, bitbang->mdio_dat_msk);
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
+	return bb_read(bitbang->dat, bitbang->mdio_msk);
 }
 
-static inline void mdio(struct bb_info *bitbang , int what)
+static inline void mdio(struct mdio_bitbang_ctrl *ctrl, int what)
 {
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
+
 	if (what)
-		bb_set(bitbang->mdio_dat, bitbang->mdio_dat_msk);
+		bb_set(bitbang->dat, bitbang->mdio_msk);
 	else
-		bb_clr(bitbang->mdio_dat, bitbang->mdio_dat_msk);
+		bb_clr(bitbang->dat, bitbang->mdio_msk);
 }
 
-static inline void mdc(struct bb_info *bitbang , int what)
+static inline void mdc(struct mdio_bitbang_ctrl *ctrl, int what)
 {
+	struct bb_info *bitbang = container_of(ctrl, struct bb_info, ctrl);
+
 	if (what)
-		bb_set(bitbang->mdc_dat, bitbang->mdc_msk);
+		bb_set(bitbang->dat, bitbang->mdc_msk);
 	else
-		bb_clr(bitbang->mdc_dat, bitbang->mdc_msk);
+		bb_clr(bitbang->dat, bitbang->mdc_msk);
 }
 
-static inline void mii_delay(struct bb_info *bitbang )
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+static int __devinit fs_mii_bitbang_init(struct mii_bus *bus,
+                                         struct device_node *np)
 {
-	udelay(bitbang->delay);
+	struct resource res;
+	const u32 *data;
+	int mdio_pin, mdc_pin, len;
+	struct bb_info *bitbang = bus->priv;
+
+	int ret = of_address_to_resource(np, 0, &res);
+	if (ret)
+		return ret;
+
+	if (res.end - res.start < 13)
+		return -ENODEV;
+
+	/* This should really encode the pin number as well, but all
+	 * we get is an int, and the odds of multiple bitbang mdio buses
+	 * is low enough that it's not worth going too crazy.
+	 */
+	bus->id = res.start;
+
+	data = of_get_property(np, "fsl,mdio-pin", &len);
+	if (!data || len != 4)
+		return -ENODEV;
+	mdio_pin = *data;
+
+	data = of_get_property(np, "fsl,mdc-pin", &len);
+	if (!data || len != 4)
+		return -ENODEV;
+	mdc_pin = *data;
+
+	bitbang->dir = ioremap(res.start, res.end - res.start + 1);
+	if (!bitbang->dir)
+		return -ENOMEM;
+
+	bitbang->dat = bitbang->dir + 4;
+	bitbang->mdio_msk = 1 << (31 - mdio_pin);
+	bitbang->mdc_msk = 1 << (31 - mdc_pin);
+	bitbang->delay = 1; /* 1 us between operations */
+
+	return 0;
 }
 
-/* Utility to send the preamble, address, and register (common to read and write). */
-static void bitbang_pre(struct bb_info *bitbang , int read, u8 addr, u8 reg)
+static void __devinit add_phy(struct mii_bus *bus, struct device_node *np)
 {
-	int j;
-
-	/*
-	 * Send a 32 bit preamble ('1's) with an extra '1' bit for good measure.
-	 * The IEEE spec says this is a PHY optional requirement.  The AMD
-	 * 79C874 requires one after power up and one after a MII communications
-	 * error.  This means that we are doing more preambles than we need,
-	 * but it is safer and will be much more robust.
-	 */
+	const u32 *data;
+	int len, id, irq;
 
-	mdio_active(bitbang);
-	mdio(bitbang, 1);
-	for (j = 0; j < 32; j++) {
-		mdc(bitbang, 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-	}
+	data = of_get_property(np, "reg", &len);
+	if (!data || len != 4)
+		return;
 
-	/* send the start bit (01) and the read opcode (10) or write (10) */
-	mdc(bitbang, 0);
-	mdio(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, read);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, !read);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-
-	/* send the PHY address */
-	for (j = 0; j < 5; j++) {
-		mdc(bitbang, 0);
-		mdio(bitbang, (addr & 0x10) != 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		addr <<= 1;
-	}
+	id = *data;
+	bus->phy_mask &= ~(1 << id);
 
-	/* send the register address */
-	for (j = 0; j < 5; j++) {
-		mdc(bitbang, 0);
-		mdio(bitbang, (reg & 0x10) != 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		reg <<= 1;
-	}
+	irq = of_irq_to_resource(np, 0, NULL);
+	if (irq != NO_IRQ)
+		bus->irq[id] = irq;
 }
 
-static int fs_enet_mii_bb_read(struct mii_bus *bus , int phy_id, int location)
+static struct mdio_bitbang_ops bb_ops = {
+	.owner = THIS_MODULE,
+	.set_mdc = mdc,
+	.set_mdio_dir = mdio_dir,
+	.set_mdio_data = mdio,
+	.get_mdio_data = mdio_read,
+};
+
+static int __devinit fs_enet_mdio_probe(struct of_device *ofdev,
+                                        const struct of_device_id *match)
 {
-	u16 rdreg;
-	int ret, j;
-	u8 addr = phy_id & 0xff;
-	u8 reg = location & 0xff;
-	struct bb_info* bitbang = bus->priv;
-
-	bitbang_pre(bitbang, 1, addr, reg);
-
-	/* tri-state our MDIO I/O pin so we can read */
-	mdc(bitbang, 0);
-	mdio_tristate(bitbang);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-
-	/* check the turnaround bit: the PHY should be driving it to zero */
-	if (mdio_read(bitbang) != 0) {
-		/* PHY didn't drive TA low */
-		for (j = 0; j < 32; j++) {
-			mdc(bitbang, 0);
-			mii_delay(bitbang);
-			mdc(bitbang, 1);
-			mii_delay(bitbang);
-		}
-		ret = -1;
+	struct device_node *np = NULL;
+	struct mii_bus *new_bus;
+	struct bb_info *bitbang;
+	int ret = -ENOMEM;
+	int i;
+
+	bitbang = kzalloc(sizeof(struct bb_info), GFP_KERNEL);
+	if (!bitbang)
 		goto out;
-	}
 
-	mdc(bitbang, 0);
-	mii_delay(bitbang);
-
-	/* read 16 bits of register data, MSB first */
-	rdreg = 0;
-	for (j = 0; j < 16; j++) {
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		rdreg <<= 1;
-		rdreg |= mdio_read(bitbang);
-		mdc(bitbang, 0);
-		mii_delay(bitbang);
-	}
+	bitbang->ctrl.ops = &bb_ops;
+
+	new_bus = alloc_mdio_bitbang(&bitbang->ctrl);
+	if (!new_bus)
+		goto out_free_priv;
 
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
+	new_bus->name = "CPM2 Bitbanged MII",
 
-	ret = rdreg;
+	ret = fs_mii_bitbang_init(new_bus, ofdev->node);
+	if (ret)
+		goto out_free_bus;
+
+	new_bus->phy_mask = ~0;
+	new_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
+	if (!new_bus->irq)
+		goto out_unmap_regs;
+
+	for (i = 0; i < PHY_MAX_ADDR; i++)
+		new_bus->irq[i] = -1;
+
+	while ((np = of_get_next_child(ofdev->node, np)))
+		if (!strcmp(np->type, "ethernet-phy"))
+			add_phy(new_bus, np);
+
+	new_bus->dev = &ofdev->dev;
+	dev_set_drvdata(&ofdev->dev, new_bus);
+
+	ret = mdiobus_register(new_bus);
+	if (ret)
+		goto out_free_irqs;
+
+	return 0;
+
+out_free_irqs:
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(new_bus->irq);
+out_unmap_regs:
+	iounmap(bitbang->dir);
+out_free_bus:
+	kfree(new_bus);
+out_free_priv:
+	free_mdio_bitbang(new_bus);
 out:
 	return ret;
 }
 
-static int fs_enet_mii_bb_write(struct mii_bus *bus, int phy_id, int location, u16 val)
+static int fs_enet_mdio_remove(struct of_device *ofdev)
 {
-	int j;
-	struct bb_info* bitbang = bus->priv;
-
-	u8 addr = phy_id & 0xff;
-	u8 reg = location & 0xff;
-	u16 value = val & 0xffff;
-
-	bitbang_pre(bitbang, 0, addr, reg);
-
-	/* send the turnaround (10) */
-	mdc(bitbang, 0);
-	mdio(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-	mdc(bitbang, 0);
-	mdio(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
-
-	/* write 16 bits of register data, MSB first */
-	for (j = 0; j < 16; j++) {
-		mdc(bitbang, 0);
-		mdio(bitbang, (value & 0x8000) != 0);
-		mii_delay(bitbang);
-		mdc(bitbang, 1);
-		mii_delay(bitbang);
-		value <<= 1;
-	}
+	struct mii_bus *bus = dev_get_drvdata(&ofdev->dev);
+	struct bb_info *bitbang = bus->priv;
+
+	mdiobus_unregister(bus);
+	free_mdio_bitbang(bus);
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(bus->irq);
+	iounmap(bitbang->dir);
+	kfree(bitbang);
+	kfree(bus);
 
-	/*
-	 * Tri-state the MDIO line.
-	 */
-	mdio_tristate(bitbang);
-	mdc(bitbang, 0);
-	mii_delay(bitbang);
-	mdc(bitbang, 1);
-	mii_delay(bitbang);
 	return 0;
 }
 
-static int fs_enet_mii_bb_reset(struct mii_bus *bus)
+static struct of_device_id fs_enet_mdio_bb_match[] = {
+	{
+		.compatible = "fsl,cpm2-mdio-bitbang",
+	},
+	{},
+};
+
+static struct of_platform_driver fs_enet_bb_mdio_driver = {
+	.name = "fsl-bb-mdio",
+	.match_table = fs_enet_mdio_bb_match,
+	.probe = fs_enet_mdio_probe,
+	.remove = fs_enet_mdio_remove,
+};
+
+int fs_enet_mdio_bb_init(void)
 {
-	/*nothing here - dunno how to reset it*/
-	return 0;
+	return of_register_platform_driver(&fs_enet_bb_mdio_driver);
 }
 
-static int fs_mii_bitbang_init(struct bb_info *bitbang, struct fs_mii_bb_platform_info* fmpi)
+void fs_enet_mdio_bb_exit(void)
 {
-	int r;
-
+	of_unregister_platform_driver(&fs_enet_bb_mdio_driver);
+}
+#else
+static int __devinit fs_mii_bitbang_init(struct bb_info *bitbang,
+                                         struct fs_mii_bb_platform_info *fmpi)
+{
+	bitbang->dir = (u32 __iomem *)fmpi->mdio_dir.offset;
+	bitbang->dat = (u32 __iomem *)fmpi->mdio_dat.offset;
+	bitbang->mdio_msk = 1U << (31 - fmpi->mdio_dat.bit);
+	bitbang->mdc_msk = 1U << (31 - fmpi->mdc_dat.bit);
 	bitbang->delay = fmpi->delay;
 
-	r = bitbang_prep_bit(&bitbang->mdio_dir,
-			 &bitbang->mdio_dir_msk,
-			 &fmpi->mdio_dir);
-	if (r != 0)
-		return r;
-
-	r = bitbang_prep_bit(&bitbang->mdio_dat,
-			 &bitbang->mdio_dat_msk,
-			 &fmpi->mdio_dat);
-	if (r != 0)
-		return r;
-
-	r = bitbang_prep_bit(&bitbang->mdc_dat,
-			 &bitbang->mdc_msk,
-			 &fmpi->mdc_dat);
-	if (r != 0)
-		return r;
-
 	return 0;
 }
 
-
 static int __devinit fs_enet_mdio_probe(struct device *dev)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -320,20 +286,19 @@ static int __devinit fs_enet_mdio_probe(struct device *dev)
 	if (NULL == dev)
 		return -EINVAL;
 
-	new_bus = kzalloc(sizeof(struct mii_bus), GFP_KERNEL);
+	bitbang = kzalloc(sizeof(struct bb_info), GFP_KERNEL);
 
-	if (NULL == new_bus)
+	if (NULL == bitbang)
 		return -ENOMEM;
 
-	bitbang = kzalloc(sizeof(struct bb_info), GFP_KERNEL);
+	bitbang->ctrl.ops = &bb_ops;
 
-	if (NULL == bitbang)
+	new_bus = alloc_mdio_bitbang(&bitbang->ctrl);
+
+	if (NULL == new_bus)
 		return -ENOMEM;
 
 	new_bus->name = "BB MII Bus",
-	new_bus->read = &fs_enet_mii_bb_read,
-	new_bus->write = &fs_enet_mii_bb_write,
-	new_bus->reset = &fs_enet_mii_bb_reset,
 	new_bus->id = pdev->id;
 
 	new_bus->phy_mask = ~0x9;
@@ -365,13 +330,12 @@ static int __devinit fs_enet_mdio_probe(struct device *dev)
 	return 0;
 
 bus_register_fail:
+	free_mdio_bitbang(new_bus);
 	kfree(bitbang);
-	kfree(new_bus);
 
 	return err;
 }
 
-
 static int fs_enet_mdio_remove(struct device *dev)
 {
 	struct mii_bus *bus = dev_get_drvdata(dev);
@@ -380,9 +344,7 @@ static int fs_enet_mdio_remove(struct device *dev)
 
 	dev_set_drvdata(dev, NULL);
 
-	iounmap((void *) (&bus->priv));
-	bus->priv = NULL;
-	kfree(bus);
+	free_mdio_bitbang(bus);
 
 	return 0;
 }
@@ -403,4 +365,4 @@ void fs_enet_mdio_bb_exit(void)
 {
 	driver_unregister(&fs_enet_bb_mdio_driver);
 }
-
+#endif
diff --git a/drivers/net/fs_enet/mii-fec.c b/drivers/net/fs_enet/mii-fec.c
index 53db696..71e07b0 100644
--- a/drivers/net/fs_enet/mii-fec.c
+++ b/drivers/net/fs_enet/mii-fec.c
@@ -36,6 +36,10 @@
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#include <linux/of_platform.h>
+#endif
+
 #include "fs_enet.h"
 #include "fec.h"
 
@@ -47,6 +51,7 @@
 
 #define FEC_MII_LOOPS	10000
 
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 static int match_has_phy (struct device *dev, void* data)
 {
 	struct platform_device* pdev = container_of(dev, struct platform_device, dev);
@@ -90,6 +95,7 @@ static int fs_mii_fec_init(struct fec_info* fec, struct fs_mii_fec_platform_info
 
 	return 0;
 }
+#endif
 
 static int fs_enet_fec_mii_read(struct mii_bus *bus , int phy_id, int location)
 {
@@ -145,6 +151,138 @@ static int fs_enet_fec_mii_reset(struct mii_bus *bus)
 	return 0;
 }
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+static void __devinit add_phy(struct mii_bus *bus, struct device_node *np)
+{
+	const u32 *data;
+	int len, id, irq;
+
+	data = of_get_property(np, "reg", &len);
+	if (!data || len != 4)
+		return;
+
+	id = *data;
+	bus->phy_mask &= ~(1 << id);
+
+	irq = of_irq_to_resource(np, 0, NULL);
+	if (irq != NO_IRQ)
+		bus->irq[id] = irq;
+}
+
+static int __devinit fs_enet_mdio_probe(struct of_device *ofdev,
+                                        const struct of_device_id *match)
+{
+	struct device_node *np = NULL;
+	struct resource res;
+	struct mii_bus *new_bus;
+	struct fec_info *fec;
+	int ret = -ENOMEM, i;
+
+	new_bus = kzalloc(sizeof(struct mii_bus), GFP_KERNEL);
+	if (!new_bus)
+		goto out;
+
+	fec = kzalloc(sizeof(struct fec_info), GFP_KERNEL);
+	if (!fec)
+		goto out_mii;
+
+	new_bus->priv = fec;
+	new_bus->name = "FEC MII Bus";
+	new_bus->read = &fs_enet_fec_mii_read;
+	new_bus->write = &fs_enet_fec_mii_write;
+	new_bus->reset = &fs_enet_fec_mii_reset;
+
+	ret = of_address_to_resource(ofdev->node, 0, &res);
+	if (ret)
+		return ret;
+
+	new_bus->id = res.start;
+
+	fec->fecp = ioremap(res.start, res.end - res.start + 1);
+	if (!fec->fecp)
+		goto out_fec;
+
+	fec->mii_speed = ((ppc_proc_freq + 4999999) / 5000000) << 1;
+
+	setbits32(&fec->fecp->fec_r_cntrl, FEC_RCNTRL_MII_MODE);
+	setbits32(&fec->fecp->fec_ecntrl, FEC_ECNTRL_PINMUX |
+	                                  FEC_ECNTRL_ETHER_EN);
+	out_be32(&fec->fecp->fec_ievent, FEC_ENET_MII);
+	out_be32(&fec->fecp->fec_mii_speed, fec->mii_speed);
+
+	new_bus->phy_mask = ~0;
+	new_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
+	if (!new_bus->irq)
+		goto out_unmap_regs;
+
+	for (i = 0; i < PHY_MAX_ADDR; i++)
+		new_bus->irq[i] = -1;
+
+	while ((np = of_get_next_child(ofdev->node, np)))
+		if (!strcmp(np->type, "ethernet-phy"))
+			add_phy(new_bus, np);
+
+	new_bus->dev = &ofdev->dev;
+	dev_set_drvdata(&ofdev->dev, new_bus);
+
+	ret = mdiobus_register(new_bus);
+	if (ret)
+		goto out_free_irqs;
+
+	return 0;
+
+out_free_irqs:
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(new_bus->irq);
+out_unmap_regs:
+	iounmap(fec->fecp);
+out_fec:
+	kfree(fec);
+out_mii:
+	kfree(new_bus);
+out:
+	return ret;
+}
+
+static int fs_enet_mdio_remove(struct of_device *ofdev)
+{
+	struct mii_bus *bus = dev_get_drvdata(&ofdev->dev);
+	struct fec_info *fec = bus->priv;
+
+	mdiobus_unregister(bus);
+	dev_set_drvdata(&ofdev->dev, NULL);
+	kfree(bus->irq);
+	iounmap(fec->fecp);
+	kfree(fec);
+	kfree(bus);
+
+	return 0;
+}
+
+static struct of_device_id fs_enet_mdio_fec_match[] = {
+	{
+		.compatible = "fsl,pq1-fec-mdio",
+	},
+	{},
+};
+
+static struct of_platform_driver fs_enet_fec_mdio_driver = {
+	.name = "fsl-fec-mdio",
+	.match_table = fs_enet_mdio_fec_match,
+	.probe = fs_enet_mdio_probe,
+	.remove = fs_enet_mdio_remove,
+};
+
+int fs_enet_mdio_fec_init(void)
+{
+	return of_register_platform_driver(&fs_enet_fec_mdio_driver);
+}
+
+void fs_enet_mdio_fec_exit(void)
+{
+	of_unregister_platform_driver(&fs_enet_fec_mdio_driver);
+}
+#else
 static int __devinit fs_enet_fec_mdio_probe(struct device *dev)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -235,4 +373,4 @@ void fs_enet_mdio_fec_exit(void)
 {
 	driver_unregister(&fs_enet_fec_mdio_driver);
 }
-
+#endif
diff --git a/include/linux/fs_enet_pd.h b/include/linux/fs_enet_pd.h
index 543cd3c..3b9e718 100644
--- a/include/linux/fs_enet_pd.h
+++ b/include/linux/fs_enet_pd.h
@@ -119,6 +119,7 @@ struct fs_platform_info {
 
 	u32 cp_page;		/* CPM page */
 	u32 cp_block;		/* CPM sblock */
+	u32 cp_command;		/* CPM page/sblock/mcn */
 
 	u32 clk_trx;		/* some stuff for pins & mux configuration*/
 	u32 clk_rx;
@@ -133,7 +134,11 @@ struct fs_platform_info {
 	u32 device_flags;
 
 	int phy_addr;		/* the phy address (-1 no phy) */
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	char bus_id[16];
+#else
 	const char*	bus_id;
+#endif
 	int phy_irq;		/* the phy irq (if it exists)  */
 
 	const struct fs_mii_bus_info *bus_info;
-- 
1.5.0.3


^ permalink raw reply related

* [PATCH 7/7] fs_enet: sparse fixes
From: Scott Wood @ 2007-08-17 17:54 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev
In-Reply-To: <20070817175320.GA8259@ld0162-tx32.am.freescale.net>

Mostly a bunch of __iomem annotations.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 drivers/net/fs_enet/fs_enet-main.c |   16 +++++-----
 drivers/net/fs_enet/fs_enet.h      |   30 +++++++++---------
 drivers/net/fs_enet/mac-fcc.c      |   60 ++++++++++++++++++++---------------
 drivers/net/fs_enet/mac-fec.c      |   34 ++++++++++----------
 drivers/net/fs_enet/mac-scc.c      |   37 +++++++++++-----------
 drivers/net/fs_enet/mii-fec.c      |    6 ++--
 6 files changed, 96 insertions(+), 87 deletions(-)

diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index 33cdabf..9de653a 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -62,7 +62,7 @@ MODULE_DESCRIPTION("Freescale Ethernet Driver");
 MODULE_LICENSE("GPL");
 MODULE_VERSION(DRV_MODULE_VERSION);
 
-int fs_enet_debug = -1;		/* -1 == use FS_ENET_DEF_MSG_ENABLE as value */
+static int fs_enet_debug = -1; /* -1 == use FS_ENET_DEF_MSG_ENABLE as value */
 module_param(fs_enet_debug, int, 0);
 MODULE_PARM_DESC(fs_enet_debug,
 		 "Freescale bitmapped debugging message enable value");
@@ -88,7 +88,7 @@ static int fs_enet_rx_napi(struct net_device *dev, int *budget)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 	const struct fs_platform_info *fpi = fep->fpi;
-	cbd_t *bdp;
+	cbd_t __iomem *bdp;
 	struct sk_buff *skb, *skbn, *skbt;
 	int received = 0;
 	u16 pkt_len, sc;
@@ -240,7 +240,7 @@ static int fs_enet_rx_non_napi(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 	const struct fs_platform_info *fpi = fep->fpi;
-	cbd_t *bdp;
+	cbd_t __iomem *bdp;
 	struct sk_buff *skb, *skbn, *skbt;
 	int received = 0;
 	u16 pkt_len, sc;
@@ -365,7 +365,7 @@ static int fs_enet_rx_non_napi(struct net_device *dev)
 static void fs_enet_tx(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	cbd_t *bdp;
+	cbd_t __iomem *bdp;
 	struct sk_buff *skb;
 	int dirtyidx, do_wake, do_restart;
 	u16 sc;
@@ -513,7 +513,7 @@ fs_enet_interrupt(int irq, void *dev_id)
 void fs_init_bds(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	cbd_t *bdp;
+	cbd_t __iomem *bdp;
 	struct sk_buff *skb;
 	int i;
 
@@ -567,7 +567,7 @@ void fs_cleanup_bds(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 	struct sk_buff *skb;
-	cbd_t *bdp;
+	cbd_t __iomem *bdp;
 	int i;
 
 	/*
@@ -608,7 +608,7 @@ void fs_cleanup_bds(struct net_device *dev)
 static int fs_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	cbd_t *bdp;
+	cbd_t __iomem *bdp;
 	int curidx;
 	u16 sc;
 	unsigned long flags;
@@ -1144,7 +1144,7 @@ static int fs_cleanup_instance(struct net_device *ndev)
 /**************************************************************************************/
 
 /* handy pointer to the immap */
-void *fs_enet_immap = NULL;
+void __iomem *fs_enet_immap;
 
 static int setup_immap(void)
 {
diff --git a/drivers/net/fs_enet/fs_enet.h b/drivers/net/fs_enet/fs_enet.h
index a130332..12594f8 100644
--- a/drivers/net/fs_enet/fs_enet.h
+++ b/drivers/net/fs_enet/fs_enet.h
@@ -15,7 +15,7 @@
 #include <asm/commproc.h>
 
 struct fec_info {
-	fec_t *fecp;
+	fec_t __iomem *fecp;
 	u32 mii_speed;
 };
 #endif
@@ -80,14 +80,14 @@ struct fs_enet_private {
 	const struct fs_ops *ops;
 	int rx_ring, tx_ring;
 	dma_addr_t ring_mem_addr;
-	void *ring_base;
+	void __iomem *ring_base;
 	struct sk_buff **rx_skbuff;
 	struct sk_buff **tx_skbuff;
-	cbd_t *rx_bd_base;	/* Address of Rx and Tx buffers.    */
-	cbd_t *tx_bd_base;
-	cbd_t *dirty_tx;	/* ring entries to be free()ed.     */
-	cbd_t *cur_rx;
-	cbd_t *cur_tx;
+	cbd_t __iomem *rx_bd_base;	/* Address of Rx and Tx buffers.    */
+	cbd_t __iomem *tx_bd_base;
+	cbd_t __iomem *dirty_tx;	/* ring entries to be free()ed.     */
+	cbd_t __iomem *cur_rx;
+	cbd_t __iomem *cur_tx;
 	int tx_free;
 	struct net_device_stats stats;
 	struct timer_list phy_timer_list;
@@ -112,23 +112,23 @@ struct fs_enet_private {
 	union {
 		struct {
 			int idx;		/* FEC1 = 0, FEC2 = 1  */
-			void *fecp;		/* hw registers        */
+			void __iomem *fecp;	/* hw registers        */
 			u32 hthi, htlo;		/* state for multicast */
 		} fec;
 
 		struct {
 			int idx;		/* FCC1-3 = 0-2	       */
-			void *fccp;		/* hw registers	       */
-			void *ep;		/* parameter ram       */
-			void *fcccp;		/* hw registers cont.  */
-			void *mem;		/* FCC DPRAM */
+			void __iomem *fccp;	/* hw registers	       */
+			void __iomem *ep;	/* parameter ram       */
+			void __iomem *fcccp;	/* hw registers cont.  */
+			void __iomem *mem;	/* FCC DPRAM */
 			u32 gaddrh, gaddrl;	/* group address       */
 		} fcc;
 
 		struct {
 			int idx;		/* FEC1 = 0, FEC2 = 1  */
-			void *sccp;		/* hw registers        */
-			void *ep;		/* parameter ram       */
+			void __iomem *sccp;	/* hw registers        */
+			void __iomem *ep;	/* parameter ram       */
 			u32 hthi, htlo;		/* state for multicast */
 		} scc;
 
@@ -199,7 +199,7 @@ extern const struct fs_ops fs_scc_ops;
 /*******************************************************************/
 
 /* handy pointer to the immap */
-extern void *fs_enet_immap;
+extern void __iomem *fs_enet_immap;
 
 /*******************************************************************/
 
diff --git a/drivers/net/fs_enet/mac-fcc.c b/drivers/net/fs_enet/mac-fcc.c
index 10e2a2b..fe3e4c3 100644
--- a/drivers/net/fs_enet/mac-fcc.c
+++ b/drivers/net/fs_enet/mac-fcc.c
@@ -81,8 +81,6 @@
 static inline int fcc_cr_cmd(struct fs_enet_private *fep, u32 op)
 {
 	const struct fs_platform_info *fpi = fep->fpi;
-	cpm2_map_t *immap = fs_enet_immap;
-	cpm_cpm2_t *cpmp = &immap->im_cpm;
 	int i;
 
 	W32(cpmp, cp_cpcr, fpi->cp_command | op | CPM_CR_FLG);
@@ -118,8 +116,8 @@ static int do_pd_setup(struct fs_enet_private *fep)
 	if (!fep->fcc.fcccp)
 		goto out_ep;
 
-	fep->fcc.mem = (void *)cpm_dpalloc(128, 8);
-	fpi->dpram_offset = (u32)cpm2_immr;
+	fep->fcc.mem = (void __iomem *)cpm2_immr;
+	fpi->dpram_offset = cpm_dpalloc(128, 8);
 	if (IS_ERR_VALUE(fpi->dpram_offset)) {
 		ret = fpi->dpram_offset;
 		goto out_fcccp;
@@ -212,7 +210,7 @@ static int allocate_bd(struct net_device *dev)
 	struct fs_enet_private *fep = netdev_priv(dev);
 	const struct fs_platform_info *fpi = fep->fpi;
 
-	fep->ring_base = dma_alloc_coherent(fep->dev,
+	fep->ring_base = (void __iomem __force *)dma_alloc_coherent(fep->dev,
 					    (fpi->tx_ring + fpi->rx_ring) *
 					    sizeof(cbd_t), &fep->ring_mem_addr,
 					    GFP_KERNEL);
@@ -230,7 +228,7 @@ static void free_bd(struct net_device *dev)
 	if (fep->ring_base)
 		dma_free_coherent(fep->dev,
 			(fpi->tx_ring + fpi->rx_ring) * sizeof(cbd_t),
-			fep->ring_base, fep->ring_mem_addr);
+			(void __force *)fep->ring_base, fep->ring_mem_addr);
 }
 
 static void cleanup_data(struct net_device *dev)
@@ -241,7 +239,7 @@ static void cleanup_data(struct net_device *dev)
 static void set_promiscuous_mode(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	S32(fccp, fcc_fpsmr, FCC_PSMR_PRO);
 }
@@ -249,7 +247,7 @@ static void set_promiscuous_mode(struct net_device *dev)
 static void set_multicast_start(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_enet_t *ep = fep->fcc.ep;
+	fcc_enet_t __iomem *ep = fep->fcc.ep;
 
 	W32(ep, fen_gaddrh, 0);
 	W32(ep, fen_gaddrl, 0);
@@ -258,7 +256,7 @@ static void set_multicast_start(struct net_device *dev)
 static void set_multicast_one(struct net_device *dev, const u8 *mac)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_enet_t *ep = fep->fcc.ep;
+	fcc_enet_t __iomem *ep = fep->fcc.ep;
 	u16 taddrh, taddrm, taddrl;
 
 	taddrh = ((u16)mac[5] << 8) | mac[4];
@@ -274,8 +272,8 @@ static void set_multicast_one(struct net_device *dev, const u8 *mac)
 static void set_multicast_finish(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
-	fcc_enet_t *ep = fep->fcc.ep;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
+	fcc_enet_t __iomem *ep = fep->fcc.ep;
 
 	/* clear promiscuous always */
 	C32(fccp, fcc_fpsmr, FCC_PSMR_PRO);
@@ -310,12 +308,14 @@ static void restart(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 	const struct fs_platform_info *fpi = fep->fpi;
-	fcc_t *fccp = fep->fcc.fccp;
-	fcc_c_t *fcccp = fep->fcc.fcccp;
-	fcc_enet_t *ep = fep->fcc.ep;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
+	fcc_c_t __iomem *fcccp = fep->fcc.fcccp;
+	fcc_enet_t __iomem *ep = fep->fcc.ep;
 	dma_addr_t rx_bd_base_phys, tx_bd_base_phys;
 	u16 paddrh, paddrm, paddrl;
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
 	u16 mem_addr;
+#endif
 	const unsigned char *mac;
 	int i;
 
@@ -347,14 +347,22 @@ static void restart(struct net_device *dev)
 	 * this area.
 	 */
 
+#ifdef CONFIG_PPC_CPM_NEW_BINDING
+	W16(ep, fen_genfcc.fcc_riptr, fpi->dpram_offset);
+	W16(ep, fen_genfcc.fcc_tiptr, fpi->dpram_offset + 32);
+
+	W16(ep, fen_padptr, fpi->dpram_offset + 64);
+#else
 	mem_addr = (u32) fep->fcc.mem;	/* de-fixup dpram offset */
 
 	W16(ep, fen_genfcc.fcc_riptr, (mem_addr & 0xffff));
 	W16(ep, fen_genfcc.fcc_tiptr, ((mem_addr + 32) & 0xffff));
+
 	W16(ep, fen_padptr, mem_addr + 64);
+#endif
 
 	/* fill with special symbol...  */
-	memset(fep->fcc.mem + fpi->dpram_offset + 64, 0x88, 32);
+	memset_io(fep->fcc.mem + fpi->dpram_offset + 64, 0x88, 32);
 
 	W32(ep, fen_genfcc.fcc_rbptr, 0);
 	W32(ep, fen_genfcc.fcc_tbptr, 0);
@@ -470,7 +478,7 @@ static void restart(struct net_device *dev)
 static void stop(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	/* stop ethernet */
 	C32(fccp, fcc_gfmr, FCC_GFMR_ENR | FCC_GFMR_ENT);
@@ -497,7 +505,7 @@ static void post_free_irq(struct net_device *dev, int irq)
 static void napi_clear_rx_event(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	W16(fccp, fcc_fcce, FCC_NAPI_RX_EVENT_MSK);
 }
@@ -505,7 +513,7 @@ static void napi_clear_rx_event(struct net_device *dev)
 static void napi_enable_rx(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	S16(fccp, fcc_fccm, FCC_NAPI_RX_EVENT_MSK);
 }
@@ -513,7 +521,7 @@ static void napi_enable_rx(struct net_device *dev)
 static void napi_disable_rx(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	C16(fccp, fcc_fccm, FCC_NAPI_RX_EVENT_MSK);
 }
@@ -526,7 +534,7 @@ static void rx_bd_done(struct net_device *dev)
 static void tx_kickstart(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	S16(fccp, fcc_ftodr, 0x8000);
 }
@@ -534,7 +542,7 @@ static void tx_kickstart(struct net_device *dev)
 static u32 get_int_events(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	return (u32)R16(fccp, fcc_fcce);
 }
@@ -542,7 +550,7 @@ static u32 get_int_events(struct net_device *dev)
 static void clear_int_events(struct net_device *dev, u32 int_events)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	W16(fccp, fcc_fcce, int_events & 0xffff);
 }
@@ -553,7 +561,7 @@ static void ev_error(struct net_device *dev, u32 int_events)
 	       ": %s FS_ENET ERROR(s) 0x%x\n", dev->name, int_events);
 }
 
-int get_regs(struct net_device *dev, void *p, int *sizep)
+static int get_regs(struct net_device *dev, void *p, int *sizep)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 
@@ -570,7 +578,7 @@ int get_regs(struct net_device *dev, void *p, int *sizep)
 	return 0;
 }
 
-int get_regs_len(struct net_device *dev)
+static int get_regs_len(struct net_device *dev)
 {
 	return sizeof(fcc_t) + sizeof(fcc_enet_t) + 1;
 }
@@ -583,10 +591,10 @@ int get_regs_len(struct net_device *dev)
  * CPM37, we must disable and then re-enable the transmitter
  * following a Late Collision, Underrun, or Retry Limit error.
  */
-void tx_restart(struct net_device *dev)
+static void tx_restart(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fcc_t *fccp = fep->fcc.fccp;
+	fcc_t __iomem *fccp = fep->fcc.fccp;
 
 	C32(fccp, fcc_gfmr, FCC_GFMR_ENT);
 	udelay(10);
diff --git a/drivers/net/fs_enet/mac-fec.c b/drivers/net/fs_enet/mac-fec.c
index ef305e2..c96df19 100644
--- a/drivers/net/fs_enet/mac-fec.c
+++ b/drivers/net/fs_enet/mac-fec.c
@@ -83,7 +83,7 @@
  */
 #define FEC_RESET_DELAY		50
 
-static int whack_reset(fec_t * fecp)
+static int whack_reset(fec_t __iomem *fecp)
 {
 	int i;
 
@@ -159,7 +159,7 @@ static int allocate_bd(struct net_device *dev)
 	struct fs_enet_private *fep = netdev_priv(dev);
 	const struct fs_platform_info *fpi = fep->fpi;
 	
-	fep->ring_base = dma_alloc_coherent(fep->dev,
+	fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev,
 					    (fpi->tx_ring + fpi->rx_ring) *
 					    sizeof(cbd_t), &fep->ring_mem_addr,
 					    GFP_KERNEL);
@@ -177,7 +177,7 @@ static void free_bd(struct net_device *dev)
 	if(fep->ring_base)
 		dma_free_coherent(fep->dev, (fpi->tx_ring + fpi->rx_ring)
 					* sizeof(cbd_t),
-					fep->ring_base,
+					(void __force *)fep->ring_base,
 					fep->ring_mem_addr);
 }
 
@@ -189,7 +189,7 @@ static void cleanup_data(struct net_device *dev)
 static void set_promiscuous_mode(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	FS(fecp, r_cntrl, FEC_RCNTRL_PROM);
 }
@@ -237,7 +237,7 @@ static void set_multicast_one(struct net_device *dev, const u8 *mac)
 static void set_multicast_finish(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	/* if all multi or too many multicasts; just enable all */
 	if ((dev->flags & IFF_ALLMULTI) != 0 ||
@@ -271,7 +271,7 @@ static void restart(struct net_device *dev)
 	u32 cptr;
 #endif
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 	const struct fs_platform_info *fpi = fep->fpi;
 	dma_addr_t rx_bd_base_phys, tx_bd_base_phys;
 	int r;
@@ -399,7 +399,7 @@ static void stop(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 	const struct fs_platform_info *fpi = fep->fpi;
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	struct fec_info* feci= fep->phydev->bus->priv;
 
@@ -461,7 +461,7 @@ static void post_free_irq(struct net_device *dev, int irq)
 static void napi_clear_rx_event(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	FW(fecp, ievent, FEC_NAPI_RX_EVENT_MSK);
 }
@@ -469,7 +469,7 @@ static void napi_clear_rx_event(struct net_device *dev)
 static void napi_enable_rx(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	FS(fecp, imask, FEC_NAPI_RX_EVENT_MSK);
 }
@@ -477,7 +477,7 @@ static void napi_enable_rx(struct net_device *dev)
 static void napi_disable_rx(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	FC(fecp, imask, FEC_NAPI_RX_EVENT_MSK);
 }
@@ -485,7 +485,7 @@ static void napi_disable_rx(struct net_device *dev)
 static void rx_bd_done(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	FW(fecp, r_des_active, 0x01000000);
 }
@@ -493,7 +493,7 @@ static void rx_bd_done(struct net_device *dev)
 static void tx_kickstart(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	FW(fecp, x_des_active, 0x01000000);
 }
@@ -501,7 +501,7 @@ static void tx_kickstart(struct net_device *dev)
 static u32 get_int_events(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	return FR(fecp, ievent) & FR(fecp, imask);
 }
@@ -509,7 +509,7 @@ static u32 get_int_events(struct net_device *dev)
 static void clear_int_events(struct net_device *dev, u32 int_events)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	fec_t *fecp = fep->fec.fecp;
+	fec_t __iomem *fecp = fep->fec.fecp;
 
 	FW(fecp, ievent, int_events);
 }
@@ -520,7 +520,7 @@ static void ev_error(struct net_device *dev, u32 int_events)
 	       ": %s FEC ERROR(s) 0x%x\n", dev->name, int_events);
 }
 
-int get_regs(struct net_device *dev, void *p, int *sizep)
+static int get_regs(struct net_device *dev, void *p, int *sizep)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 
@@ -532,12 +532,12 @@ int get_regs(struct net_device *dev, void *p, int *sizep)
 	return 0;
 }
 
-int get_regs_len(struct net_device *dev)
+static int get_regs_len(struct net_device *dev)
 {
 	return sizeof(fec_t);
 }
 
-void tx_restart(struct net_device *dev)
+static void tx_restart(struct net_device *dev)
 {
 	/* nothing */
 }
diff --git a/drivers/net/fs_enet/mac-scc.c b/drivers/net/fs_enet/mac-scc.c
index 6ec6b7d..3d094fe 100644
--- a/drivers/net/fs_enet/mac-scc.c
+++ b/drivers/net/fs_enet/mac-scc.c
@@ -191,7 +191,8 @@ static int allocate_bd(struct net_device *dev)
 	if (IS_ERR_VALUE(fep->ring_mem_addr))
 		return -ENOMEM;
 
-	fep->ring_base = cpm_dpram_addr(fep->ring_mem_addr);
+	fep->ring_base = (void __iomem __force *)
+		cpm_dpram_addr(fep->ring_mem_addr);
 
 	return 0;
 }
@@ -212,7 +213,7 @@ static void cleanup_data(struct net_device *dev)
 static void set_promiscuous_mode(struct net_device *dev)
 {				
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
+	scc_t __iomem *sccp = fep->scc.sccp;
 
 	S16(sccp, scc_psmr, SCC_PSMR_PRO);
 }
@@ -220,7 +221,7 @@ static void set_promiscuous_mode(struct net_device *dev)
 static void set_multicast_start(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_enet_t *ep = fep->scc.ep;
+	scc_enet_t __iomem *ep = fep->scc.ep;
 
 	W16(ep, sen_gaddr1, 0);
 	W16(ep, sen_gaddr2, 0);
@@ -231,7 +232,7 @@ static void set_multicast_start(struct net_device *dev)
 static void set_multicast_one(struct net_device *dev, const u8 * mac)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_enet_t *ep = fep->scc.ep;
+	scc_enet_t __iomem *ep = fep->scc.ep;
 	u16 taddrh, taddrm, taddrl;
 
 	taddrh = ((u16) mac[5] << 8) | mac[4];
@@ -247,8 +248,8 @@ static void set_multicast_one(struct net_device *dev, const u8 * mac)
 static void set_multicast_finish(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
-	scc_enet_t *ep = fep->scc.ep;
+	scc_t __iomem *sccp = fep->scc.sccp;
+	scc_enet_t __iomem *ep = fep->scc.ep;
 
 	/* clear promiscuous always */
 	C16(sccp, scc_psmr, SCC_PSMR_PRO);
@@ -285,8 +286,8 @@ static void set_multicast_list(struct net_device *dev)
 static void restart(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
-	scc_enet_t *ep = fep->scc.ep;
+	scc_t __iomem *sccp = fep->scc.sccp;
+	scc_enet_t __iomem *ep = fep->scc.ep;
 	const struct fs_platform_info *fpi = fep->fpi;
 	u16 paddrh, paddrm, paddrl;
 	const unsigned char *mac;
@@ -296,7 +297,7 @@ static void restart(struct net_device *dev)
 
 	/* clear everything (slow & steady does it) */
 	for (i = 0; i < sizeof(*ep); i++)
-		__fs_out8((char *)ep + i, 0);
+		__fs_out8((u8 __iomem *)ep + i, 0);
 
 	/* point to bds */
 	W16(ep, sen_genscc.scc_rbase, fep->ring_mem_addr);
@@ -397,7 +398,7 @@ static void restart(struct net_device *dev)
 static void stop(struct net_device *dev)	
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
+	scc_t __iomem *sccp = fep->scc.sccp;
 	int i;
 
 	for (i = 0; (R16(sccp, scc_sccm) == 0) && i < SCC_RESET_DELAY; i++)
@@ -441,7 +442,7 @@ static void post_free_irq(struct net_device *dev, int irq)
 static void napi_clear_rx_event(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
+	scc_t __iomem *sccp = fep->scc.sccp;
 
 	W16(sccp, scc_scce, SCC_NAPI_RX_EVENT_MSK);
 }
@@ -449,7 +450,7 @@ static void napi_clear_rx_event(struct net_device *dev)
 static void napi_enable_rx(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
+	scc_t __iomem *sccp = fep->scc.sccp;
 
 	S16(sccp, scc_sccm, SCC_NAPI_RX_EVENT_MSK);
 }
@@ -457,7 +458,7 @@ static void napi_enable_rx(struct net_device *dev)
 static void napi_disable_rx(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
+	scc_t __iomem *sccp = fep->scc.sccp;
 
 	C16(sccp, scc_sccm, SCC_NAPI_RX_EVENT_MSK);
 }
@@ -475,7 +476,7 @@ static void tx_kickstart(struct net_device *dev)
 static u32 get_int_events(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
+	scc_t __iomem *sccp = fep->scc.sccp;
 
 	return (u32) R16(sccp, scc_scce);
 }
@@ -483,7 +484,7 @@ static u32 get_int_events(struct net_device *dev)
 static void clear_int_events(struct net_device *dev, u32 int_events)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	scc_t *sccp = fep->scc.sccp;
+	scc_t __iomem *sccp = fep->scc.sccp;
 
 	W16(sccp, scc_scce, int_events & 0xffff);
 }
@@ -498,20 +499,20 @@ static int get_regs(struct net_device *dev, void *p, int *sizep)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
 
-	if (*sizep < sizeof(scc_t) + sizeof(scc_enet_t))
+	if (*sizep < sizeof(scc_t) + sizeof(scc_enet_t __iomem *))
 		return -EINVAL;
 
 	memcpy_fromio(p, fep->scc.sccp, sizeof(scc_t));
 	p = (char *)p + sizeof(scc_t);
 
-	memcpy_fromio(p, fep->scc.ep, sizeof(scc_enet_t));
+	memcpy_fromio(p, fep->scc.ep, sizeof(scc_enet_t __iomem *));
 
 	return 0;
 }
 
 static int get_regs_len(struct net_device *dev)
 {
-	return sizeof(scc_t) + sizeof(scc_enet_t);
+	return sizeof(scc_t) + sizeof(scc_enet_t __iomem *);
 }
 
 static void tx_restart(struct net_device *dev)
diff --git a/drivers/net/fs_enet/mii-fec.c b/drivers/net/fs_enet/mii-fec.c
index 71e07b0..d350ae9 100644
--- a/drivers/net/fs_enet/mii-fec.c
+++ b/drivers/net/fs_enet/mii-fec.c
@@ -70,7 +70,7 @@ static int match_has_phy (struct device *dev, void* data)
 static int fs_mii_fec_init(struct fec_info* fec, struct fs_mii_fec_platform_info *fmpi)
 {
 	struct resource *r;
-	fec_t *fecp;
+	fec_t __iomem *fecp;
 	char* name = "fsl-cpm-fec";
 
 	/* we need fec in order to be useful */
@@ -100,7 +100,7 @@ static int fs_mii_fec_init(struct fec_info* fec, struct fs_mii_fec_platform_info
 static int fs_enet_fec_mii_read(struct mii_bus *bus , int phy_id, int location)
 {
 	struct fec_info* fec = bus->priv;
-	fec_t *fecp = fec->fecp;
+	fec_t __iomem *fecp = fec->fecp;
 	int i, ret = -1;
 
 	if ((in_be32(&fecp->fec_r_cntrl) & FEC_RCNTRL_MII_MODE) == 0)
@@ -124,7 +124,7 @@ static int fs_enet_fec_mii_read(struct mii_bus *bus , int phy_id, int location)
 static int fs_enet_fec_mii_write(struct mii_bus *bus, int phy_id, int location, u16 val)
 {
 	struct fec_info* fec = bus->priv;
-	fec_t *fecp = fec->fecp;
+	fec_t __iomem *fecp = fec->fecp;
 	int i;
 
 	/* this must never happen */
-- 
1.5.0.3

^ permalink raw reply related

* [PATCH 5/7] fs_enet: Align receive buffers.
From: Scott Wood @ 2007-08-17 17:54 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev
In-Reply-To: <20070817175320.GA8259@ld0162-tx32.am.freescale.net>

At least some hardware driven by this driver needs receive buffers
to be aligned on a 16-byte boundary.  This usually happens by chance,
but it breaks if slab debugging is enabled.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 drivers/net/fs_enet/fs_enet-main.c |   21 +++++++++++++++++++--
 drivers/net/fs_enet/fs_enet.h      |    3 ++-
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index da79a45..a4b76cd 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -69,6 +69,14 @@ static void fs_set_multicast_list(struct net_device *dev)
 	(*fep->ops->set_multicast_list)(dev);
 }
 
+static void skb_align(struct sk_buff *skb, int align)
+{
+	int off = ((unsigned long)skb->data) & (align - 1);
+
+	if (off)
+		skb_reserve(skb, align - off);
+}
+
 /* NAPI receive function */
 static int fs_enet_rx_napi(struct net_device *dev, int *budget)
 {
@@ -166,9 +174,13 @@ static int fs_enet_rx_napi(struct net_device *dev, int *budget)
 					skb = skbn;
 					skbn = skbt;
 				}
-			} else
+			} else {
 				skbn = dev_alloc_skb(ENET_RX_FRSIZE);
 
+				if (skbn)
+					skb_align(skbn, ENET_RX_ALIGN);
+			}
+
 			if (skbn != NULL) {
 				skb_put(skb, pkt_len);	/* Make room */
 				skb->protocol = eth_type_trans(skb, dev);
@@ -300,9 +312,13 @@ static int fs_enet_rx_non_napi(struct net_device *dev)
 					skb = skbn;
 					skbn = skbt;
 				}
-			} else
+			} else {
 				skbn = dev_alloc_skb(ENET_RX_FRSIZE);
 
+				if (skbn)
+					skb_align(skbn, ENET_RX_ALIGN);
+			}
+
 			if (skbn != NULL) {
 				skb_put(skb, pkt_len);	/* Make room */
 				skb->protocol = eth_type_trans(skb, dev);
@@ -512,6 +528,7 @@ void fs_init_bds(struct net_device *dev)
 			       dev->name);
 			break;
 		}
+		skb_align(skb, ENET_RX_ALIGN);
 		fep->rx_skbuff[i] = skb;
 		CBDW_BUFADDR(bdp,
 			dma_map_single(fep->dev, skb->data,
diff --git a/drivers/net/fs_enet/fs_enet.h b/drivers/net/fs_enet/fs_enet.h
index 72a61e9..f8c7ee8 100644
--- a/drivers/net/fs_enet/fs_enet.h
+++ b/drivers/net/fs_enet/fs_enet.h
@@ -82,7 +82,8 @@ struct phy_info {
 /* Must be a multiple of 32 (to cover both FEC & FCC) */
 #define PKT_MAXBLR_SIZE		((PKT_MAXBUF_SIZE + 31) & ~31)
 /* This is needed so that invalidate_xxx wont invalidate too much */
-#define ENET_RX_FRSIZE		L1_CACHE_ALIGN(PKT_MAXBUF_SIZE)
+#define ENET_RX_ALIGN  16
+#define ENET_RX_FRSIZE L1_CACHE_ALIGN(PKT_MAXBUF_SIZE + ENET_RX_ALIGN - 1)
 
 struct fs_enet_mii_bus {
 	struct list_head list;
-- 
1.5.0.3


^ permalink raw reply related

* [PATCH 4/7] fs_enet: mac-fcc: Eliminate __fcc-* macros.
From: Scott Wood @ 2007-08-17 17:54 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev
In-Reply-To: <20070817175320.GA8259@ld0162-tx32.am.freescale.net>

These macros accomplish nothing other than defeating type checking.

This patch also fixes one instance of the wrong register size being
used that was revealed by enabling type checking.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 drivers/net/fs_enet/mac-fcc.c |   25 ++++++++-----------------
 1 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/net/fs_enet/mac-fcc.c b/drivers/net/fs_enet/mac-fcc.c
index ad3c5fa..8b30361 100644
--- a/drivers/net/fs_enet/mac-fcc.c
+++ b/drivers/net/fs_enet/mac-fcc.c
@@ -48,28 +48,19 @@
 
 /* FCC access macros */
 
-#define __fcc_out32(addr, x)	out_be32((unsigned *)addr, x)
-#define __fcc_out16(addr, x)	out_be16((unsigned short *)addr, x)
-#define __fcc_out8(addr, x)	out_8((unsigned char *)addr, x)
-#define __fcc_in32(addr)	in_be32((unsigned *)addr)
-#define __fcc_in16(addr)	in_be16((unsigned short *)addr)
-#define __fcc_in8(addr)		in_8((unsigned char *)addr)
-
-/* parameter space */
-
 /* write, read, set bits, clear bits */
-#define W32(_p, _m, _v)	__fcc_out32(&(_p)->_m, (_v))
-#define R32(_p, _m)	__fcc_in32(&(_p)->_m)
+#define W32(_p, _m, _v)	out_be32(&(_p)->_m, (_v))
+#define R32(_p, _m)	in_be32(&(_p)->_m)
 #define S32(_p, _m, _v)	W32(_p, _m, R32(_p, _m) | (_v))
 #define C32(_p, _m, _v)	W32(_p, _m, R32(_p, _m) & ~(_v))
 
-#define W16(_p, _m, _v)	__fcc_out16(&(_p)->_m, (_v))
-#define R16(_p, _m)	__fcc_in16(&(_p)->_m)
+#define W16(_p, _m, _v)	out_be16(&(_p)->_m, (_v))
+#define R16(_p, _m)	in_be16(&(_p)->_m)
 #define S16(_p, _m, _v)	W16(_p, _m, R16(_p, _m) | (_v))
 #define C16(_p, _m, _v)	W16(_p, _m, R16(_p, _m) & ~(_v))
 
-#define W8(_p, _m, _v)	__fcc_out8(&(_p)->_m, (_v))
-#define R8(_p, _m)	__fcc_in8(&(_p)->_m)
+#define W8(_p, _m, _v)	out_8(&(_p)->_m, (_v))
+#define R8(_p, _m)	in_8(&(_p)->_m)
 #define S8(_p, _m, _v)	W8(_p, _m, R8(_p, _m) | (_v))
 #define C8(_p, _m, _v)	W8(_p, _m, R8(_p, _m) & ~(_v))
 
@@ -290,7 +281,7 @@ static void restart(struct net_device *dev)
 
 	/* clear everything (slow & steady does it) */
 	for (i = 0; i < sizeof(*ep); i++)
-		__fcc_out8((char *)ep + i, 0);
+		out_8((char *)ep + i, 0);
 
 	/* get physical address */
 	rx_bd_base_phys = fep->ring_mem_addr;
@@ -495,7 +486,7 @@ static void tx_kickstart(struct net_device *dev)
 	struct fs_enet_private *fep = netdev_priv(dev);
 	fcc_t *fccp = fep->fcc.fccp;
 
-	S32(fccp, fcc_ftodr, 0x80);
+	S16(fccp, fcc_ftodr, 0x8000);
 }
 
 static u32 get_int_events(struct net_device *dev)
-- 
1.5.0.3


^ permalink raw reply related

* [PATCH 2/7] fs_enet: Whitespace cleanup.
From: Scott Wood @ 2007-08-17 17:53 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev
In-Reply-To: <20070817175320.GA8259@ld0162-tx32.am.freescale.net>

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 drivers/net/fs_enet/fs_enet-main.c |   85 ++++++++++++++++-------------------
 drivers/net/fs_enet/fs_enet.h      |    4 +-
 drivers/net/fs_enet/mac-fcc.c      |    1 -
 drivers/net/fs_enet/mii-fec.c      |    1 -
 4 files changed, 41 insertions(+), 50 deletions(-)

diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index a4a2a0e..f261b90 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -353,7 +353,6 @@ static void fs_enet_tx(struct net_device *dev)
 
 	do_wake = do_restart = 0;
 	while (((sc = CBDR_SC(bdp)) & BD_ENET_TX_READY) == 0) {
-
 		dirtyidx = bdp - fep->tx_bd_base;
 
 		if (fep->tx_free == fep->tx_ring)
@@ -454,7 +453,6 @@ fs_enet_interrupt(int irq, void *dev_id)
 
 	nr = 0;
 	while ((int_events = (*fep->ops->get_int_events)(dev)) != 0) {
-
 		nr++;
 
 		int_clr_events = int_events;
@@ -710,45 +708,43 @@ static void fs_timeout(struct net_device *dev)
  *-----------------------------------------------------------------------------*/
 static void generic_adjust_link(struct  net_device *dev)
 {
-       struct fs_enet_private *fep = netdev_priv(dev);
-       struct phy_device *phydev = fep->phydev;
-       int new_state = 0;
-
-       if (phydev->link) {
-
-               /* adjust to duplex mode */
-               if (phydev->duplex != fep->oldduplex){
-                       new_state = 1;
-                       fep->oldduplex = phydev->duplex;
-               }
-
-               if (phydev->speed != fep->oldspeed) {
-                       new_state = 1;
-                       fep->oldspeed = phydev->speed;
-               }
-
-               if (!fep->oldlink) {
-                       new_state = 1;
-                       fep->oldlink = 1;
-                       netif_schedule(dev);
-                       netif_carrier_on(dev);
-                       netif_start_queue(dev);
-               }
-
-               if (new_state)
-                       fep->ops->restart(dev);
-
-       } else if (fep->oldlink) {
-               new_state = 1;
-               fep->oldlink = 0;
-               fep->oldspeed = 0;
-               fep->oldduplex = -1;
-               netif_carrier_off(dev);
-               netif_stop_queue(dev);
-       }
-
-       if (new_state && netif_msg_link(fep))
-               phy_print_status(phydev);
+	struct fs_enet_private *fep = netdev_priv(dev);
+	struct phy_device *phydev = fep->phydev;
+	int new_state = 0;
+
+	if (phydev->link) {
+		/* adjust to duplex mode */
+		if (phydev->duplex != fep->oldduplex) {
+			new_state = 1;
+			fep->oldduplex = phydev->duplex;
+		}
+
+		if (phydev->speed != fep->oldspeed) {
+			new_state = 1;
+			fep->oldspeed = phydev->speed;
+		}
+
+		if (!fep->oldlink) {
+			new_state = 1;
+			fep->oldlink = 1;
+			netif_schedule(dev);
+			netif_carrier_on(dev);
+			netif_start_queue(dev);
+		}
+
+		if (new_state)
+			fep->ops->restart(dev);
+	} else if (fep->oldlink) {
+		new_state = 1;
+		fep->oldlink = 0;
+		fep->oldspeed = 0;
+		fep->oldduplex = -1;
+		netif_carrier_off(dev);
+		netif_stop_queue(dev);
+	}
+
+	if (new_state && netif_msg_link(fep))
+		phy_print_status(phydev);
 }
 
 
@@ -792,7 +788,6 @@ static int fs_init_phy(struct net_device *dev)
 	return 0;
 }
 
-
 static int fs_enet_open(struct net_device *dev)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
@@ -978,7 +973,7 @@ static struct net_device *fs_init_instance(struct device *dev,
 #endif
 
 #ifdef CONFIG_FS_ENET_HAS_SCC
-	if (fs_get_scc_index(fpi->fs_no) >=0 )
+	if (fs_get_scc_index(fpi->fs_no) >= 0)
 		fep->ops = &fs_scc_ops;
 #endif
 
@@ -1069,9 +1064,8 @@ static struct net_device *fs_init_instance(struct device *dev,
 
 	return ndev;
 
-      err:
+err:
 	if (ndev != NULL) {
-
 		if (registered)
 			unregister_netdev(ndev);
 
@@ -1262,7 +1256,6 @@ static int __init fs_init(void)
 err:
 	cleanup_immap();
 	return r;
-	
 }
 
 static void __exit fs_cleanup(void)
diff --git a/drivers/net/fs_enet/fs_enet.h b/drivers/net/fs_enet/fs_enet.h
index 569be22..72a61e9 100644
--- a/drivers/net/fs_enet/fs_enet.h
+++ b/drivers/net/fs_enet/fs_enet.h
@@ -15,8 +15,8 @@
 #include <asm/commproc.h>
 
 struct fec_info {
-        fec_t*  fecp;
-	u32     mii_speed;
+	fec_t *fecp;
+	u32 mii_speed;
 };
 #endif
 
diff --git a/drivers/net/fs_enet/mac-fcc.c b/drivers/net/fs_enet/mac-fcc.c
index 5603121..ad3c5fa 100644
--- a/drivers/net/fs_enet/mac-fcc.c
+++ b/drivers/net/fs_enet/mac-fcc.c
@@ -86,7 +86,6 @@
 static inline int fcc_cr_cmd(struct fs_enet_private *fep, u32 mcn, u32 op)
 {
 	const struct fs_platform_info *fpi = fep->fpi;
-
 	cpm2_map_t *immap = fs_enet_immap;
 	cpm_cpm2_t *cpmp = &immap->im_cpm;
 	u32 v;
diff --git a/drivers/net/fs_enet/mii-fec.c b/drivers/net/fs_enet/mii-fec.c
index 0a563a8..53db696 100644
--- a/drivers/net/fs_enet/mii-fec.c
+++ b/drivers/net/fs_enet/mii-fec.c
@@ -113,7 +113,6 @@ static int fs_enet_fec_mii_read(struct mii_bus *bus , int phy_id, int location)
 	}
 
 	return ret;
-
 }
 
 static int fs_enet_fec_mii_write(struct mii_bus *bus, int phy_id, int location, u16 val)
-- 
1.5.0.3


^ permalink raw reply related

* [PATCH 3/7] fs_enet: Don't share the interrupt.
From: Scott Wood @ 2007-08-17 17:54 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev
In-Reply-To: <20070817175320.GA8259@ld0162-tx32.am.freescale.net>

This driver can't handle an interrupt immediately after request_irq
(making it fail with CONFIG_DEBUG_SHIRQ), and has unshared interrupts
on all hardware I'm aware of.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 drivers/net/fs_enet/fs_enet-main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index f261b90..da79a45 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -667,7 +667,7 @@ static int fs_request_irq(struct net_device *dev, int irq, const char *name,
 	struct fs_enet_private *fep = netdev_priv(dev);
 
 	(*fep->ops->pre_request_irq)(dev, irq);
-	return request_irq(irq, irqf, IRQF_SHARED, name, dev);
+	return request_irq(irq, irqf, 0, name, dev);
 }
 
 static void fs_free_irq(struct net_device *dev, int irq)
-- 
1.5.0.3


^ permalink raw reply related

* [PATCH 0/7] fs_enet patches
From: Scott Wood @ 2007-08-17 17:53 UTC (permalink / raw)
  To: jgarzik; +Cc: netdev, linuxppc-dev

The following patchset includes several updates for the fs_enet driver,
the most prominent being conversion to an of_platform device (with
platform_device code remaining until arch/ppc goes away).  It also
includes a generic MDIO bitbang library, and converts fs_enet to use it.

I have a powerpc patch (ep8248e board support) that depends on the MDIO
bitbang library, so it may be better to ack them for Paul to pick up --
but if not, I can postpone the ep8248e patch until the merge window.

-Scott

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Segher Boessenkool @ 2007-08-17 17:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul Mackerras, heiko.carstens, horms, Stefan Richter,
	Satyam Sharma, Linux Kernel Mailing List, David Miller,
	Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
	Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
	schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
	wjiang
In-Reply-To: <Pine.LNX.4.64.0708161319000.17777@schroedinger.engr.sgi.com>

>> atomic_dec() already has volatile behavior everywhere, so this is 
>> semantically
>> okay, but this code (and any like it) should be calling cpu_relax() 
>> each
>> iteration through the loop, unless there's a compelling reason not 
>> to.  I'll
>> allow that for some hardware drivers (possibly this one) such a 
>> compelling
>> reason may exist, but hardware-independent core subsystems probably 
>> have no
>> excuse.
>
> No it does not have any volatile semantics. atomic_dec() can be 
> reordered
> at will by the compiler within the current basic unit if you do not 
> add a
> barrier.

"volatile" has nothing to do with reordering.  atomic_dec() writes
to memory, so it _does_ have "volatile semantics", implicitly, as
long as the compiler cannot optimise the atomic variable away
completely -- any store counts as a side effect.


Segher


^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Segher Boessenkool @ 2007-08-17 17:37 UTC (permalink / raw)
  To: Nick Piggin
  Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
	akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
	schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
In-Reply-To: <46C505B2.6030704@yahoo.com.au>

>>>>>> Part of the motivation here is to fix heisenbugs.  If I knew 
>>>>>> where they
>>>>>
>>>>> By the same token we should probably disable optimisations
>>>>> altogether since that too can create heisenbugs.
>>>>
>>>> Almost everything is a tradeoff; and so is this.  I don't
>>>> believe most people would find disabling all compiler
>>>> optimisations an acceptable price to pay for some peace
>>>> of mind.
>>>
>>>
>>> So why is this a good tradeoff?
>> It certainly is better than disabling all compiler optimisations!
>
> It's easy to be better than something really stupid :)

Sure, it wasn't me who made the comparison though.

> So i386 and x86-64 don't have volatiles there, and it saves them a
> few K of kernel text.

Which has to be investigated.  A few kB is a lot more than expected.

> What you need to justify is why it is a good
> tradeoff to make them volatile (which btw, is much harder to go
> the other way after we let people make those assumptions).

My point is that people *already* made those assumptions.  There
are two ways to clean up this mess:

1) Have the "volatile" semantics by default, change the users
    that don't need it;
2) Have "non-volatile" semantics by default, change the users
    that do need it.

Option 2) randomly breaks stuff all over the place, option 1)
doesn't.  Yeah 1) could cause some extremely minor speed or
code size regression, but only temporarily until everything has
been audited.

>>> I also think that just adding things to APIs in the hope it might fix
>>> up some bugs isn't really a good road to go down. Where do you stop?
>> I look at it the other way: keeping the "volatile" semantics in
>> atomic_XXX() (or adding them to it, whatever) helps _prevent_ bugs;
>
> Yeah, but we could add lots of things to help prevent bugs and
> would never be included. I would also contend that it helps _hide_
> bugs and encourages people to be lazy when thinking about these
> things.

Sure.  We aren't _adding_ anything here though, not on the platforms
where it is most likely to show up, anyway.

> Also, you dismiss the fact that we'd actually be *adding* volatile
> semantics back to the 2 most widely tested architectures (in terms
> of test time, number of testers, variety of configurations, and
> coverage of driver code).

I'm not dismissing that.  x86 however is one of the few architectures
where mistakenly leaving out a "volatile" will not easily show up on
user testing, since the compiler will very often produce a memory
reference anyway because it has no registers to play with.

> This is a very important different from
> just keeping volatile semantics because it is basically a one-way
> API change.

That's a good point.  Maybe we should create _two_ new APIs, one
explicitly going each way.

>> certainly most people expect that behaviour, and also that behaviour
>> is *needed* in some places and no other interface provides that
>> functionality.
>
> I don't know that most people would expect that behaviour.

I didn't conduct any formal poll either :-)

> Is there any documentation anywhere that would suggest this?

Not really I think, no.  But not the other way around, either.
Most uses of it seem to expect it though.

>> [some confusion about barriers wrt atomics snipped]
>
> What were you confused about?

Me?  Not much.


Segher


^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Linus Torvalds @ 2007-08-17 16:48 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Satyam Sharma, Herbert Xu, Paul Mackerras, Christoph Lameter,
	Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <46C59717.4020108@cyberone.com.au>



On Fri, 17 Aug 2007, Nick Piggin wrote:
> 
> That's not obviously just taste to me. Not when the primitive has many
> (perhaps, the majority) of uses that do not require said barriers. And
> this is not solely about the code generation (which, as Paul says, is
> relatively minor even on x86). I prefer people to think explicitly
> about barriers in their lockless code.

Indeed.

I think the important issues are:

 - "volatile" itself is simply a badly/weakly defined issue. The semantics 
   of it as far as the compiler is concerned are really not very good, and 
   in practice tends to boil down to "I will generate so bad code that 
   nobody can accuse me of optimizing anything away".

 - "volatile" - regardless of how well or badly defined it is - is purely 
   a compiler thing. It has absolutely no meaning for the CPU itself, so 
   it at no point implies any CPU barriers. As a result, even if the 
   compiler generates crap code and doesn't re-order anything, there's 
   nothing that says what the CPU will do.

 - in other words, the *only* possible meaning for "volatile" is a purely 
   single-CPU meaning. And if you only have a single CPU involved in the 
   process, the "volatile" is by definition pointless (because even 
   without a volatile, the compiler is required to make the C code appear 
   consistent as far as a single CPU is concerned).

So, let's take the example *buggy* code where we use "volatile" to wait 
for other CPU's:

	atomic_set(&var, 0);
	while (!atomic_read(&var))
		/* nothing */;


which generates an endless loop if we don't have atomic_read() imply 
volatile.

The point here is that it's buggy whether the volatile is there or not! 
Exactly because the user expects multi-processing behaviour, but 
"volatile" doesn't actually give any real guarantees about it. Another CPU 
may have done:

	external_ptr = kmalloc(..);
	/* Setup is now complete, inform the waiter */
	atomic_inc(&var);

but the fact is, since the other CPU isn't serialized in any way, the 
"while-loop" (even in the presense of "volatile") doesn't actually work 
right! Whatever the "atomic_read()" was waiting for may not have 
completed, because we have no barriers!

So if "volatile" makes a difference, it is invariably a sign of a bug in 
serialization (the one exception is for IO - we use "volatile" to avoid 
having to use inline asm for IO on x86) - and for "random values" like 
jiffies).

So the question should *not* be whether "volatile" actually fixes bugs. It 
*never* fixes a bug. But what it can do is to hide the obvious ones. In 
other words, adding a volaile in the above kind of situation of 
"atomic_read()" will certainly turn an obvious bug into something that 
works "practically all of the time).

So anybody who argues for "volatile" fixing bugs is fundamentally 
incorrect. It does NO SUCH THING. By arguing that, such people only show 
that you have no idea what they are talking about.

So the only reason to add back "volatile" to the atomic_read() sequence is 
not to fix bugs, but to _hide_ the bugs better. They're still there, they 
are just a lot harder to trigger, and tend to be a lot subtler.

And hey, sometimes "hiding bugs well enough" is ok. In this case, I'd 
argue that we've successfully *not* had the volatile there for eight 
months on x86-64, and that should tell people something. 

(Does _removing_ the volatile fix bugs? No - callers still need to think 
about barriers etc, and lots of people don't. So I'm not claiming that 
removing volatile fixes any bugs either, but I *am* claiming that:

 - removing volatile makes some bugs easier to see (which is mostly a good 
   thing: they were there before, anyway).

 - removing volatile generates better code (which is a good thing, even if 
   it's just 0.1%)

 - removing volatile removes a huge mental *bug* that lots of people seem 
   to have, as shown by this whole thread. Anybody who thinks that 
   "volatile" actually fixes anything has a gaping hole in their head, and 
   we should remove volatile just to make sure that nobody thinks that it 
   means soemthign that it doesn't mean!

In other words, this whole discussion has just convinced me that we should 
*not* add back "volatile" to "atomic_read()" - I was willing to do it for 
practical and "hide the bugs" reasons, but having seen people argue for 
it, thinking that it actually fixes something, I'm now convinced that the 
*last* thing we should do is to encourage that kind of superstitious 
thinking.

"volatile" is like a black cat crossing the road. Sure, it affects 
*something* (at a minimum: before, the black cat was on one side of the 
road, afterwards it is on the other side of the road), but it has no 
bigger and longer-lasting direct affects. 

People who think "volatile" really matters are just fooling themselves.

		Linus

^ permalink raw reply

* Re: [PATCH] lockdep: annotate rcu_read_{,un}lock()
From: Paul E. McKenney @ 2007-08-17 15:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, herbert, 123.oleg, netdev, linux-kernel,
	David Miller, Daniel Walker, josht, minyard
In-Reply-To: <1187337405.6114.123.camel@twins>

On Fri, Aug 17, 2007 at 09:56:45AM +0200, Peter Zijlstra wrote:
> On Thu, 2007-08-16 at 09:01 -0700, Paul E. McKenney wrote:
> > On Thu, Aug 16, 2007 at 04:25:07PM +0200, Peter Zijlstra wrote:
> > > 
> > > There seem to be some unbalanced rcu_read_{,un}lock() issues of late,
> > > how about doing something like this:
> > 
> > This will break when rcu_read_lock() and rcu_read_unlock() are invoked
> > from NMI/SMI handlers -- the raw_local_irq_save() in lock_acquire() will
> > not mask NMIs or SMIs.
> > 
> > One approach would be to check for being in an NMI/SMI handler, and
> > to avoid calling lock_acquire() and lock_release() in those cases.
> 
> It seems:
> 
> #define nmi_enter()		do { lockdep_off(); __irq_enter(); } while (0)
> #define nmi_exit()		do { __irq_exit(); lockdep_on(); } while (0)
> 
> Should make it all work out just fine. (for NMIs at least, /me fully
> ignorant of the workings of SMIs)

Very good point, at least for NMIs on i386 and x86_64.  Can't say that I
know much about SMIs myself.  Or about whatever equivalents to NMIs and
SMIs might exist on other platforms.  :-/  Of course, the other platforms
could be handled by making the RCU lockdep operate only on i386 and x86_64
if required.

Corey, any advice on SMI handlers?  Is there something like nmi_enter()
and nmi_exit() that allows disabing lockdep?

> > Another approach would be to use sparse, which has checks for
> > rcu_read_lock()/rcu_read_unlock() nesting.
> 
> Yeah, but one more method can never hurt, no? :-)

Excellent point!

I guess the next thing for me is to do a performance check.  Looks like
CONFIG_DEBUG_LOCK_ALLOC is not on by default, so not violently worried
about performance, but would be good to know.

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney @ 2007-08-17 14:31 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <alpine.LFD.0.999.0708171232380.3666@enigma.security.iitk.ac.in>

On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> 
> 
> On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> 
> > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
> > > >
> > > > The compiler can also reorder non-volatile accesses.  For an example
> > > > patch that cares about this, please see:
> > > > 
> > > > 	http://lkml.org/lkml/2007/8/7/280
> > > > 
> > > > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> > > > rcu_read_unlock() to ensure that accesses aren't reordered with respect
> > > > to interrupt handlers and NMIs/SMIs running on that same CPU.
> > > 
> > > Good, finally we have some code to discuss (even though it's
> > > not actually in the kernel yet).
> > 
> > There was some earlier in this thread as well.
> 
> Hmm, I never quite got what all this interrupt/NMI/SMI handling and
> RCU business you mentioned earlier was all about, but now that you've
> pointed to the actual code and issues with it ...

Glad to help...

> > > First of all, I think this illustrates that what you want
> > > here has nothing to do with atomic ops.  The ORDERED_WRT_IRQ
> > > macro occurs a lot more times in your patch than atomic
> > > reads/sets.  So *assuming* that it was necessary at all,
> > > then having an ordered variant of the atomic_read/atomic_set
> > > ops could do just as well.
> > 
> > Indeed.  If I could trust atomic_read()/atomic_set() to cause the compiler
> > to maintain ordering, then I could just use them instead of having to
> > create an  ORDERED_WRT_IRQ().  (Or ACCESS_ONCE(), as it is called in a
> > different patch.)
> 
> +#define WHATEVER(x)	(*(volatile typeof(x) *)&(x))
> 
> I suppose one could want volatile access semantics for stuff that's
> a bit-field too, no?

One could, but this is not supported in general.  So if you want that,
you need to use the usual bit-mask tricks and (for setting) atomic
operations.

> Also, this gives *zero* "re-ordering" guarantees that your code wants
> as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> probably you don't care about) *nor* w.r.t. compiler re-ordering
> (which you definitely _do_ care about).

You are correct about CPU re-ordering (and about the fact that this
example doesn't care about it), but not about compiler re-ordering.

The compiler is prohibited from moving a volatile access across a sequence
point.  One example of a sequence point is a statement boundary.  Because
all of the volatile accesses in this code are separated by statement
boundaries, a conforming compiler is prohibited from reordering them.

> > > However, I still don't know which atomic_read/atomic_set in
> > > your patch would be broken if there were no volatile.  Could
> > > you please point them out?
> > 
> > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > atomic_read() and atomic_set().  Starting with __rcu_read_lock():
> > 
> > o	If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > 	was ordered by the compiler after
> > 	"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > 	suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > 	before the rcu_flipctr.
> > 
> > 	Then if there was an rcu_read_lock() in the SMI/NMI
> > 	handler (which is perfectly legal), the nested rcu_read_lock()
> > 	would believe that it could take the then-clause of the
> > 	enclosing "if" statement.  But because the rcu_flipctr per-CPU
> > 	variable had not yet been incremented, an RCU updater would
> > 	be within its rights to assume that there were no RCU reads
> > 	in progress, thus possibly yanking a data structure out from
> > 	under the reader in the SMI/NMI function.
> > 
> > 	Fatal outcome.  Note that only one CPU is involved here
> > 	because these are all either per-CPU or per-task variables.
> 
> Ok, so you don't care about CPU re-ordering. Still, I should let you know
> that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> want is a full compiler optimization barrier().

No.  See above.

> [ Your code probably works now, and emits correct code, but that's
>   just because of gcc did what it did. Nothing in any standard,
>   or in any documented behaviour of gcc, or anything about the real
>   (or expected) semantics of "volatile" is protecting the code here. ]

Really?  Why doesn't the prohibition against moving volatile accesses
across sequence points take care of this?

> > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > 	was ordered by the compiler to follow the
> > 	"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > 	happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > 	would incorrectly take the "else" clause of the enclosing "if"
> > 	statement.  If some other CPU flipped the rcu_ctrlblk.completed
> > 	in the meantime, then the __rcu_read_lock() would (correctly)
> > 	write the new value into rcu_flipctr_idx.
> > 
> > 	Well and good so far.  But the problem arises in
> > 	__rcu_read_unlock(), which then decrements the wrong counter.
> > 	Depending on exactly how subsequent events played out, this could
> > 	result in either prematurely ending grace periods or never-ending
> > 	grace periods, both of which are fatal outcomes.
> > 
> > And the following are not needed in the current version of the
> > patch, but will be in a future version that either avoids disabling
> > irqs or that dispenses with the smp_read_barrier_depends() that I
> > have 99% convinced myself is unneeded:
> > 
> > o	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > 
> > o	idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > 
> > Furthermore, in that future version, irq handlers can cause the same
> > mischief that SMI/NMI handlers can in this version.
> > 
> > Next, looking at __rcu_read_unlock():
> > 
> > o	If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > 	was reordered by the compiler to follow the
> > 	"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > 	then if an NMI/SMI containing an rcu_read_lock() occurs between
> > 	the two, this nested rcu_read_lock() would incorrectly believe
> > 	that it was protected by an enclosing RCU read-side critical
> > 	section as described in the first reversal discussed for
> > 	__rcu_read_lock() above.  Again, fatal outcome.
> > 
> > This is what we have now.  It is not hard to imagine situations that
> > interact with -both- interrupt handlers -and- other CPUs, as described
> > earlier.
> 
> It's not about interrupt/SMI/NMI handlers at all! What you clearly want,
> simply put, is that a certain stream of C statements must be emitted
> by the compiler _as they are_ with no re-ordering optimizations! You must
> *definitely* use barrier(), IMHO.

Almost.  I don't care about most of the operations, only about the loads
and stores marked volatile.  Again, although the compiler is free to
reorder volatile accesses that occur -within- a single statement, it
is prohibited by the standard from moving volatile accesses from one
statement to another.  Therefore, this code can legitimately use volatile.

Or am I missing something subtle?

						Thanx, Paul

^ permalink raw reply

* [PATCH 6/6] ibmveth: Remove use of bitfields
From: Brian King @ 2007-08-17 14:16 UTC (permalink / raw)
  To: jeff; +Cc: santil, rcjenn, netdev, linuxppc-dev, brking
In-Reply-To: <11873601831813-patch-mail.ibm.com>


Removes the use of bitfields from the ibmveth driver. This results
in slightly smaller object code.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---

 linux-2.6-bjking1/drivers/net/ibmveth.c |   90 ++++++++++++++++----------------
 linux-2.6-bjking1/drivers/net/ibmveth.h |   56 ++++++++-----------
 2 files changed, 68 insertions(+), 78 deletions(-)

diff -puN drivers/net/ibmveth.h~ibmveth_nobitfields drivers/net/ibmveth.h
--- linux-2.6/drivers/net/ibmveth.h~ibmveth_nobitfields	2007-08-09 15:15:27.000000000 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.h	2007-08-09 15:15:27.000000000 -0500
@@ -39,6 +39,12 @@
 #define IbmVethMcastRemoveFilter     0x2UL
 #define IbmVethMcastClearFilterTable 0x3UL
 
+#define IBMVETH_ILLAN_PADDED_PKT_CSUM	0x0000000000002000ULL
+#define IBMVETH_ILLAN_TRUNK_PRI_MASK	0x0000000000000F00ULL
+#define IBMVETH_ILLAN_IPV6_TCP_CSUM		0x0000000000000004ULL
+#define IBMVETH_ILLAN_IPV4_TCP_CSUM		0x0000000000000002ULL
+#define IBMVETH_ILLAN_ACTIVE_TRUNK		0x0000000000000001ULL
+
 /* hcall macros */
 #define h_register_logical_lan(ua, buflst, rxq, fltlst, mac) \
   plpar_hcall_norets(H_REGISTER_LOGICAL_LAN, ua, buflst, rxq, fltlst, mac)
@@ -150,13 +156,13 @@ struct ibmveth_adapter {
 };
 
 struct ibmveth_buf_desc_fields {
-    u32 valid : 1;
-    u32 toggle : 1;
-    u32 reserved : 4;
-    u32 no_csum : 1;
-    u32 csum_good : 1;
-    u32 length : 24;
-    u32 address;
+	u32 flags_len;
+#define IBMVETH_BUF_VALID	0x80000000
+#define IBMVETH_BUF_TOGGLE	0x40000000
+#define IBMVETH_BUF_NO_CSUM	0x02000000
+#define IBMVETH_BUF_CSUM_GOOD	0x01000000
+#define IBMVETH_BUF_LEN_MASK	0x00FFFFFF
+	u32 address;
 };
 
 union ibmveth_buf_desc {
@@ -164,33 +170,17 @@ union ibmveth_buf_desc {
     struct ibmveth_buf_desc_fields fields;
 };
 
-struct ibmveth_illan_attributes_fields {
-	u32 reserved;
-	u32 reserved2 : 18;
-	u32 csum_offload_padded_pkt_support : 1;
-	u32 reserved3 : 1;
-	u32 trunk_priority : 4;
-	u32 reserved4 : 5;
-	u32 tcp_csum_offload_ipv6 : 1;
-	u32 tcp_csum_offload_ipv4 : 1;
-	u32 active_trunk : 1;
-};
-
-union ibmveth_illan_attributes {
-	u64 desc;
-	struct ibmveth_illan_attributes_fields fields;
-};
-
 struct ibmveth_rx_q_entry {
-    u16 toggle : 1;
-    u16 valid : 1;
-    u16 reserved : 4;
-    u16 no_csum : 1;
-    u16 csum_good : 1;
-    u16 reserved2 : 8;
-    u16 offset;
-    u32 length;
-    u64 correlator;
+	u32 flags_off;
+#define IBMVETH_RXQ_TOGGLE		0x80000000
+#define IBMVETH_RXQ_TOGGLE_SHIFT	31
+#define IBMVETH_RXQ_VALID		0x40000000
+#define IBMVETH_RXQ_NO_CSUM		0x02000000
+#define IBMVETH_RXQ_CSUM_GOOD		0x01000000
+#define IBMVETH_RXQ_OFF_MASK		0x0000FFFF
+
+	u32 length;
+	u64 correlator;
 };
 
 #endif /* _IBMVETH_H */
diff -puN drivers/net/ibmveth.c~ibmveth_nobitfields drivers/net/ibmveth.c
--- linux-2.6/drivers/net/ibmveth.c~ibmveth_nobitfields	2007-08-09 15:15:27.000000000 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.c	2007-08-09 15:15:27.000000000 -0500
@@ -131,19 +131,29 @@ struct ibmveth_stat ibmveth_stats[] = {
 };
 
 /* simple methods of getting data from the current rxq entry */
+static inline u32 ibmveth_rxq_flags(struct ibmveth_adapter *adapter)
+{
+	return adapter->rx_queue.queue_addr[adapter->rx_queue.index].flags_off;
+}
+
+static inline int ibmveth_rxq_toggle(struct ibmveth_adapter *adapter)
+{
+	return (ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_TOGGLE) >> IBMVETH_RXQ_TOGGLE_SHIFT;
+}
+
 static inline int ibmveth_rxq_pending_buffer(struct ibmveth_adapter *adapter)
 {
-	return (adapter->rx_queue.queue_addr[adapter->rx_queue.index].toggle == adapter->rx_queue.toggle);
+	return (ibmveth_rxq_toggle(adapter) == adapter->rx_queue.toggle);
 }
 
 static inline int ibmveth_rxq_buffer_valid(struct ibmveth_adapter *adapter)
 {
-	return (adapter->rx_queue.queue_addr[adapter->rx_queue.index].valid);
+	return (ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_VALID);
 }
 
 static inline int ibmveth_rxq_frame_offset(struct ibmveth_adapter *adapter)
 {
-	return (adapter->rx_queue.queue_addr[adapter->rx_queue.index].offset);
+	return (ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_OFF_MASK);
 }
 
 static inline int ibmveth_rxq_frame_length(struct ibmveth_adapter *adapter)
@@ -153,7 +163,7 @@ static inline int ibmveth_rxq_frame_leng
 
 static inline int ibmveth_rxq_csum_good(struct ibmveth_adapter *adapter)
 {
-	return (adapter->rx_queue.queue_addr[adapter->rx_queue.index].csum_good);
+	return (ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_CSUM_GOOD);
 }
 
 /* setup the initial settings for a buffer pool */
@@ -253,9 +263,7 @@ static void ibmveth_replenish_buffer_poo
 		correlator = ((u64)pool->index << 32) | index;
 		*(u64*)skb->data = correlator;
 
-		desc.desc = 0;
-		desc.fields.valid = 1;
-		desc.fields.length = pool->buff_size;
+		desc.fields.flags_len = IBMVETH_BUF_VALID | pool->buff_size;
 		desc.fields.address = dma_addr;
 
 		lpar_rc = h_add_logical_lan_buffer(adapter->vdev->unit_address, desc.desc);
@@ -396,9 +404,8 @@ static void ibmveth_rxq_recycle_buffer(s
 		return;
 	}
 
-	desc.desc = 0;
-	desc.fields.valid = 1;
-	desc.fields.length = adapter->rx_buff_pool[pool].buff_size;
+	desc.fields.flags_len = IBMVETH_BUF_VALID |
+		adapter->rx_buff_pool[pool].buff_size;
 	desc.fields.address = adapter->rx_buff_pool[pool].dma_addr[index];
 
 	lpar_rc = h_add_logical_lan_buffer(adapter->vdev->unit_address, desc.desc);
@@ -549,9 +556,7 @@ static int ibmveth_open(struct net_devic
 	memcpy(&mac_address, netdev->dev_addr, netdev->addr_len);
 	mac_address = mac_address >> 16;
 
-	rxq_desc.desc = 0;
-	rxq_desc.fields.valid = 1;
-	rxq_desc.fields.length = adapter->rx_queue.queue_len;
+	rxq_desc.fields.flags_len = IBMVETH_BUF_VALID | adapter->rx_queue.queue_len;
 	rxq_desc.fields.address = adapter->rx_queue.queue_dma;
 
 	ibmveth_debug_printk("buffer list @ 0x%p\n", adapter->buffer_list_addr);
@@ -693,7 +698,7 @@ static int ibmveth_set_csum_offload(stru
 				    void (*done) (struct net_device *, u32))
 {
 	struct ibmveth_adapter *adapter = dev->priv;
-	union ibmveth_illan_attributes set_attr, clr_attr, ret_attr;
+	u64 set_attr, clr_attr, ret_attr;
 	long ret;
 	int rc1 = 0, rc2 = 0;
 	int restart = 0;
@@ -705,21 +710,21 @@ static int ibmveth_set_csum_offload(stru
 		adapter->pool_config = 0;
 	}
 
-	set_attr.desc = 0;
-	clr_attr.desc = 0;
+	set_attr = 0;
+	clr_attr = 0;
 
 	if (data)
-		set_attr.fields.tcp_csum_offload_ipv4 = 1;
+		set_attr = IBMVETH_ILLAN_IPV4_TCP_CSUM;
 	else
-		clr_attr.fields.tcp_csum_offload_ipv4 = 1;
+		clr_attr = IBMVETH_ILLAN_IPV4_TCP_CSUM;
 
-	ret = h_illan_attributes(adapter->vdev->unit_address, 0, 0, &ret_attr.desc);
+	ret = h_illan_attributes(adapter->vdev->unit_address, 0, 0, &ret_attr);
 
-	if (ret == H_SUCCESS && !ret_attr.fields.active_trunk &&
-	    !ret_attr.fields.trunk_priority &&
-	    ret_attr.fields.csum_offload_padded_pkt_support) {
-		ret = h_illan_attributes(adapter->vdev->unit_address, clr_attr.desc,
-					 set_attr.desc, &ret_attr.desc);
+	if (ret == H_SUCCESS && !(ret_attr & IBMVETH_ILLAN_ACTIVE_TRUNK) &&
+	    !(ret_attr & IBMVETH_ILLAN_TRUNK_PRI_MASK) &&
+	    (ret_attr & IBMVETH_ILLAN_PADDED_PKT_CSUM)) {
+		ret = h_illan_attributes(adapter->vdev->unit_address, clr_attr,
+					 set_attr, &ret_attr);
 
 		if (ret != H_SUCCESS) {
 			rc1 = -EIO;
@@ -727,13 +732,13 @@ static int ibmveth_set_csum_offload(stru
 					     " %d rc=%ld\n", data, ret);
 
 			ret = h_illan_attributes(adapter->vdev->unit_address,
-						 set_attr.desc, clr_attr.desc, &ret_attr.desc);
+						 set_attr, clr_attr, &ret_attr);
 		} else
 			done(dev, data);
 	} else {
 		rc1 = -EIO;
 		ibmveth_error_printk("unable to change checksum offload settings."
-				     " %d rc=%ld ret_attr=%lx\n", data, ret, ret_attr.desc);
+				     " %d rc=%ld ret_attr=%lx\n", data, ret, ret_attr);
 	}
 
 	if (restart)
@@ -839,11 +844,9 @@ static int ibmveth_start_xmit(struct sk_
 	unsigned int tx_send_failed = 0;
 	unsigned int tx_map_failed = 0;
 
-	desc.desc = 0;
-	desc.fields.length  = skb->len;
+	desc.fields.flags_len = IBMVETH_BUF_VALID | skb->len;
 	desc.fields.address = dma_map_single(&adapter->vdev->dev, skb->data,
-					     desc.fields.length, DMA_TO_DEVICE);
-	desc.fields.valid   = 1;
+					     skb->len, DMA_TO_DEVICE);
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL &&
 	    ip_hdr(skb)->protocol != IPPROTO_TCP && skb_checksum_help(skb)) {
@@ -855,8 +858,7 @@ static int ibmveth_start_xmit(struct sk_
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		unsigned char *buf = skb_transport_header(skb) + skb->csum_offset;
 
-		desc.fields.no_csum = 1;
-		desc.fields.csum_good = 1;
+		desc.fields.flags_len |= (IBMVETH_BUF_NO_CSUM | IBMVETH_BUF_CSUM_GOOD);
 
 		/* Need to zero out the checksum */
 		buf[0] = 0;
@@ -882,7 +884,8 @@ static int ibmveth_start_xmit(struct sk_
 	if(lpar_rc != H_SUCCESS && lpar_rc != H_DROPPED) {
 		ibmveth_error_printk("tx: h_send_logical_lan failed with rc=%ld\n", lpar_rc);
 		ibmveth_error_printk("tx: valid=%d, len=%d, address=0x%08x\n",
-				     desc.fields.valid, desc.fields.length, desc.fields.address);
+				     (desc.fields.flags_len & IBMVETH_BUF_VALID) ? 1 : 0,
+				     skb->len, desc.fields.address);
 		tx_send_failed++;
 		tx_dropped++;
 	} else {
@@ -892,7 +895,7 @@ static int ibmveth_start_xmit(struct sk_
 	}
 
 	dma_unmap_single(&adapter->vdev->dev, desc.fields.address,
-			 desc.fields.length, DMA_TO_DEVICE);
+			 skb->len, DMA_TO_DEVICE);
 
 out:	spin_lock_irqsave(&adapter->stats_lock, flags);
 	adapter->stats.tx_dropped += tx_dropped;
@@ -1108,7 +1111,7 @@ static int __devinit ibmveth_probe(struc
 	long ret;
 	struct net_device *netdev;
 	struct ibmveth_adapter *adapter;
-	union ibmveth_illan_attributes set_attr, ret_attr;
+	u64 set_attr, ret_attr;
 
 	unsigned char *mac_addr_p;
 	unsigned int *mcastFilterSize_p;
@@ -1202,23 +1205,20 @@ static int __devinit ibmveth_probe(struc
 
 	ibmveth_debug_printk("registering netdev...\n");
 
-	ret = h_illan_attributes(dev->unit_address, 0, 0, &ret_attr.desc);
+	ret = h_illan_attributes(dev->unit_address, 0, 0, &ret_attr);
 
-	if (ret == H_SUCCESS && !ret_attr.fields.active_trunk &&
-	    !ret_attr.fields.trunk_priority &&
-	    ret_attr.fields.csum_offload_padded_pkt_support) {
-		set_attr.desc = 0;
-		set_attr.fields.tcp_csum_offload_ipv4 = 1;
+	if (ret == H_SUCCESS && !(ret_attr & IBMVETH_ILLAN_ACTIVE_TRUNK) &&
+	    !(ret_attr & IBMVETH_ILLAN_TRUNK_PRI_MASK) &&
+	    (ret_attr & IBMVETH_ILLAN_PADDED_PKT_CSUM)) {
+		set_attr = IBMVETH_ILLAN_IPV4_TCP_CSUM;
 
-		ret = h_illan_attributes(dev->unit_address, 0, set_attr.desc,
-					 &ret_attr.desc);
+		ret = h_illan_attributes(dev->unit_address, 0, set_attr, &ret_attr);
 
 		if (ret == H_SUCCESS) {
 			adapter->rx_csum = 1;
 			netdev->features |= NETIF_F_IP_CSUM;
 		} else
-			ret = h_illan_attributes(dev->unit_address, set_attr.desc,
-						 0, &ret_attr.desc);
+			ret = h_illan_attributes(dev->unit_address, set_attr, 0, &ret_attr);
 	}
 
 	rc = register_netdev(netdev);
_

^ permalink raw reply

* [PATCH 5/6] ibmveth: Remove dead frag processing code
From: Brian King @ 2007-08-17 14:16 UTC (permalink / raw)
  To: jeff; +Cc: santil, rcjenn, netdev, linuxppc-dev, brking
In-Reply-To: <11873601831813-patch-mail.ibm.com>


Removes dead frag processing code from ibmveth. Since NETIF_F_SG was
not set, this code was never executed. Also, since the ibmveth
interface can only handle 6 fragments, core networking code would need
to be modified in order to efficiently enable this support.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---

 linux-2.6-bjking1/drivers/net/ibmveth.c |  100 +++++---------------------------
 linux-2.6-bjking1/drivers/net/ibmveth.h |    5 -
 2 files changed, 17 insertions(+), 88 deletions(-)

diff -puN drivers/net/ibmveth.c~ibmveth_remove_frag drivers/net/ibmveth.c
--- linux-2.6/drivers/net/ibmveth.c~ibmveth_remove_frag	2007-08-09 15:15:18.000000000 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.c	2007-08-09 15:15:18.000000000 -0500
@@ -28,7 +28,6 @@
 /**************************************************************************/
 /*
   TODO:
-  - remove frag processing code - no longer needed
   - add support for sysfs
   - possibly remove procfs support
 */
@@ -127,9 +126,6 @@ struct ibmveth_stat ibmveth_stats[] = {
 	{ "replenish_add_buff_success", IBMVETH_STAT_OFF(replenish_add_buff_success) },
 	{ "rx_invalid_buffer", IBMVETH_STAT_OFF(rx_invalid_buffer) },
 	{ "rx_no_buffer", IBMVETH_STAT_OFF(rx_no_buffer) },
-	{ "tx_multidesc_send", IBMVETH_STAT_OFF(tx_multidesc_send) },
-	{ "tx_linearized", IBMVETH_STAT_OFF(tx_linearized) },
-	{ "tx_linearize_failed", IBMVETH_STAT_OFF(tx_linearize_failed) },
 	{ "tx_map_failed", IBMVETH_STAT_OFF(tx_map_failed) },
 	{ "tx_send_failed", IBMVETH_STAT_OFF(tx_send_failed) },
 };
@@ -832,9 +828,8 @@ static int ibmveth_ioctl(struct net_devi
 static int ibmveth_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 {
 	struct ibmveth_adapter *adapter = netdev->priv;
-	union ibmveth_buf_desc desc[IbmVethMaxSendFrags];
+	union ibmveth_buf_desc desc;
 	unsigned long lpar_rc;
-	int nfrags = 0, curfrag;
 	unsigned long correlator;
 	unsigned long flags;
 	unsigned int retry_count;
@@ -844,25 +839,11 @@ static int ibmveth_start_xmit(struct sk_
 	unsigned int tx_send_failed = 0;
 	unsigned int tx_map_failed = 0;
 
-
-	if ((skb_shinfo(skb)->nr_frags + 1) > IbmVethMaxSendFrags) {
-		tx_dropped++;
-		goto out;
-	}
-
-	memset(&desc, 0, sizeof(desc));
-
-	/* nfrags = number of frags after the initial fragment */
-	nfrags = skb_shinfo(skb)->nr_frags;
-
-	if(nfrags)
-		adapter->tx_multidesc_send++;
-
-	/* map the initial fragment */
-	desc[0].fields.length  = nfrags ? skb->len - skb->data_len : skb->len;
-	desc[0].fields.address = dma_map_single(&adapter->vdev->dev, skb->data,
-					desc[0].fields.length, DMA_TO_DEVICE);
-	desc[0].fields.valid   = 1;
+	desc.desc = 0;
+	desc.fields.length  = skb->len;
+	desc.fields.address = dma_map_single(&adapter->vdev->dev, skb->data,
+					     desc.fields.length, DMA_TO_DEVICE);
+	desc.fields.valid   = 1;
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL &&
 	    ip_hdr(skb)->protocol != IPPROTO_TCP && skb_checksum_help(skb)) {
@@ -874,75 +855,34 @@ static int ibmveth_start_xmit(struct sk_
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		unsigned char *buf = skb_transport_header(skb) + skb->csum_offset;
 
-		desc[0].fields.no_csum = 1;
-		desc[0].fields.csum_good = 1;
+		desc.fields.no_csum = 1;
+		desc.fields.csum_good = 1;
 
 		/* Need to zero out the checksum */
 		buf[0] = 0;
 		buf[1] = 0;
 	}
 
-	if(dma_mapping_error(desc[0].fields.address)) {
-		ibmveth_error_printk("tx: unable to map initial fragment\n");
+	if (dma_mapping_error(desc.fields.address)) {
+		ibmveth_error_printk("tx: unable to map xmit buffer\n");
 		tx_map_failed++;
 		tx_dropped++;
 		goto out;
 	}
 
-	curfrag = nfrags;
-
-	/* map fragments past the initial portion if there are any */
-	while(curfrag--) {
-		skb_frag_t *frag = &skb_shinfo(skb)->frags[curfrag];
-		desc[curfrag+1].fields.address
-			= dma_map_single(&adapter->vdev->dev,
-				page_address(frag->page) + frag->page_offset,
-				frag->size, DMA_TO_DEVICE);
-		desc[curfrag+1].fields.length = frag->size;
-		desc[curfrag+1].fields.valid  = 1;
-		if (skb->ip_summed == CHECKSUM_PARTIAL) {
-			desc[curfrag+1].fields.no_csum = 1;
-			desc[curfrag+1].fields.csum_good = 1;
-		}
-
-		if(dma_mapping_error(desc[curfrag+1].fields.address)) {
-			ibmveth_error_printk("tx: unable to map fragment %d\n", curfrag);
-			tx_map_failed++;
-			tx_dropped++;
-			/* Free all the mappings we just created */
-			while(curfrag < nfrags) {
-				dma_unmap_single(&adapter->vdev->dev,
-						 desc[curfrag+1].fields.address,
-						 desc[curfrag+1].fields.length,
-						 DMA_TO_DEVICE);
-				curfrag++;
-			}
-			goto out;
-		}
-	}
-
 	/* send the frame. Arbitrarily set retrycount to 1024 */
 	correlator = 0;
 	retry_count = 1024;
 	do {
 		lpar_rc = h_send_logical_lan(adapter->vdev->unit_address,
-					     desc[0].desc,
-					     desc[1].desc,
-					     desc[2].desc,
-					     desc[3].desc,
-					     desc[4].desc,
-					     desc[5].desc,
-					     correlator,
-					     &correlator);
+					     desc.desc, 0, 0, 0, 0, 0,
+					     correlator, &correlator);
 	} while ((lpar_rc == H_BUSY) && (retry_count--));
 
 	if(lpar_rc != H_SUCCESS && lpar_rc != H_DROPPED) {
-		int i;
 		ibmveth_error_printk("tx: h_send_logical_lan failed with rc=%ld\n", lpar_rc);
-		for(i = 0; i < 6; i++) {
-			ibmveth_error_printk("tx: desc[%i] valid=%d, len=%d, address=0x%d\n", i,
-					     desc[i].fields.valid, desc[i].fields.length, desc[i].fields.address);
-		}
+		ibmveth_error_printk("tx: valid=%d, len=%d, address=0x%08x\n",
+				     desc.fields.valid, desc.fields.length, desc.fields.address);
 		tx_send_failed++;
 		tx_dropped++;
 	} else {
@@ -951,11 +891,8 @@ static int ibmveth_start_xmit(struct sk_
 		netdev->trans_start = jiffies;
 	}
 
-	do {
-		dma_unmap_single(&adapter->vdev->dev,
-				desc[nfrags].fields.address,
-				desc[nfrags].fields.length, DMA_TO_DEVICE);
-	} while(--nfrags >= 0);
+	dma_unmap_single(&adapter->vdev->dev, desc.fields.address,
+			 desc.fields.length, DMA_TO_DEVICE);
 
 out:	spin_lock_irqsave(&adapter->stats_lock, flags);
 	adapter->stats.tx_dropped += tx_dropped;
@@ -1366,10 +1303,7 @@ static int ibmveth_seq_show(struct seq_f
 		   firmware_mac[3], firmware_mac[4], firmware_mac[5]);
 
 	seq_printf(seq, "\nAdapter Statistics:\n");
-	seq_printf(seq, "  TX:  skbuffs linearized:          %ld\n", adapter->tx_linearized);
-	seq_printf(seq, "       multi-descriptor sends:      %ld\n", adapter->tx_multidesc_send);
-	seq_printf(seq, "       skb_linearize failures:      %ld\n", adapter->tx_linearize_failed);
-	seq_printf(seq, "       vio_map_single failres:      %ld\n", adapter->tx_map_failed);
+	seq_printf(seq, "  TX:  vio_map_single failres:      %ld\n", adapter->tx_map_failed);
 	seq_printf(seq, "       send failures:               %ld\n", adapter->tx_send_failed);
 	seq_printf(seq, "  RX:  replenish task cycles:       %ld\n", adapter->replenish_task_cycles);
 	seq_printf(seq, "       alloc_skb_failures:          %ld\n", adapter->replenish_no_mem);
diff -puN drivers/net/ibmveth.h~ibmveth_remove_frag drivers/net/ibmveth.h
--- linux-2.6/drivers/net/ibmveth.h~ibmveth_remove_frag	2007-08-09 15:15:18.000000000 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.h	2007-08-09 15:15:18.000000000 -0500
@@ -25,8 +25,6 @@
 #ifndef _IBMVETH_H
 #define _IBMVETH_H
 
-#define IbmVethMaxSendFrags 6
-
 /* constants for H_MULTICAST_CTRL */
 #define IbmVethMcastReceptionModifyBit     0x80000UL
 #define IbmVethMcastReceptionEnableBit     0x20000UL
@@ -146,9 +144,6 @@ struct ibmveth_adapter {
     u64 replenish_add_buff_success;
     u64 rx_invalid_buffer;
     u64 rx_no_buffer;
-    u64 tx_multidesc_send;
-    u64 tx_linearized;
-    u64 tx_linearize_failed;
     u64 tx_map_failed;
     u64 tx_send_failed;
     spinlock_t stats_lock;
_

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox