Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 03/16] net: Basic network namespace infrastructure.
From: Eric W. Biederman @ 2007-09-10 15:53 UTC (permalink / raw)
  To: Pavel Emelyanov
  Cc: Linux Containers, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller
In-Reply-To: <46E543A0.7010104-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>

Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> writes:

> Eric W. Biederman wrote:
>
> [snip]
>
>> --- /dev/null
>> +++ b/include/net/net_namespace.h
>> @@ -0,0 +1,68 @@
>> +/*
>> + * Operations on the network namespace
>> + */
>> +#ifndef __NET_NET_NAMESPACE_H
>> +#define __NET_NET_NAMESPACE_H
>> +
>> +#include <asm/atomic.h>
>> +#include <linux/workqueue.h>
>> +#include <linux/list.h>
>> +
>> +struct net {
>
> Isn't this name is too generic? Why not net_namespace?

My fingers went on strike.  struct netns probably wouldn't be bad.
The very long and spelled out version seemed painful.  I don't really
care except that not changing it is easier for me.  I'm happy to do
whatever is considered most maintainable.

>> +	/* Walk through the list backwards calling the exit functions
>> +	 * for the pernet modules whose init functions did not fail.
>> +	 */
>> +	for (ptr = ptr->prev; ptr != &pernet_list; ptr = ptr->prev) {
>
> Good reason for adding list_for_each_continue_reverse :)

Sounds reasonable.

>> +static int register_pernet_operations(struct list_head *list,
>> +				      struct pernet_operations *ops)
>> +{
>> +	struct net *net, *undo_net;
>> +	int error;
>> +
>> +	error = 0;
>> +	list_add_tail(&ops->list, list);
>> +	for_each_net(net) {
>> +		if (ops->init) {
>
> Maybe it's better to do it vice-versa?
> if (ops->init)
>    for_each_net(net)
>        ops->init(net);
> ...

My gut feel says it is more readable with the test on the inside.
Although something like
if (ops->init)
   goto out;

might be more readable.

>> +int register_pernet_device(struct pernet_operations *ops)
>> +{
>> +	int error;
>> +	mutex_lock(&net_mutex);
>> +	error = register_pernet_operations(&pernet_list, ops);
>> +	if (!error && (first_device == &pernet_list))
>
> Very minor: why do you give the name "device" to some pernet_operations?

Subsystems need to be initialized before and cleaned up after network
devices.  We don't have much in the way that needs this distinction,
but we have just enough that it is useful to make this distinction.

Looking at my complete patchset all I have in this category is the
loopback device, and it is important on the teardown side of things
that the loopback device be unregistered before I clean up the
protocols or else I get weird leaks.

Reflecting on it I'm not quite certain about the setup side of things.
I'm on the fence if I need to completely dynamically allocate the
loopback device or if I need to embed it struct net.

There also may be a call for some other special network devices
to have one off instances in each network namespace.    With the
new netlink creation API it isn't certain we need that idiom anymore
but before that point I was certain we would have other network
devices besides the loopback that would care.

Eric

^ permalink raw reply

* Re: [PATCH 17/16] net: Disable netfilter sockopts when not in the initial network namespace
From: Eric W. Biederman @ 2007-09-10 15:27 UTC (permalink / raw)
  To: Pavel Emelyanov
  Cc: Linux Containers, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller
In-Reply-To: <46E54B96.8060105-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>

Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> writes:

> Eric W. Biederman wrote:
>> Until we support multiple network namespaces with netfilter only allow
>> netfilter configuration in the initial network namespace.
>
> PATCH 17/16? :)

Exactly!

If my target was the core of the networking stack I figured I better
include the change that keeps netfilter commands isolated to the
initial network namespace, and in my review of completeness I had
missed that in my first pass through my patches.

Eric

^ permalink raw reply

* Re: [PATCH 12/16] net: Support multiple network namespaces with netlink
From: Eric W. Biederman @ 2007-09-10 15:24 UTC (permalink / raw)
  To: Pavel Emelyanov; +Cc: David Miller, netdev, Linux Containers
In-Reply-To: <46E54AC8.7000609@openvz.org>

Pavel Emelyanov <xemul@openvz.org> writes:
>
> Rrrrrr. This is the 5th or even the 6th patch that changes tens of files
> but (!) most of these changes are just propagating some core thing into
> protocols, drivers, etc. E.g. you add an argument to some function and
> then make all the rest use it, but the chunk adding the argument itself
> is buried in these changes.
>
> Why not make a reviewers' lifes easier and make (with hands) the core 
> hunks go first and the "propagation" ones at the end? For RFC purpose 
> I would even break the git-bisect safeness and splitted these patches 
> into 2 parts: those with the core and those with the propagation.

Agreed, this is an issue for easy review. My apologies.

I guess it was a failure in my imagination on how to prepare these
patches for review, in a way that was both reviewer and preparer friendly.

There is at least one /proc idiom change that needs to be made so I
will keep this in mind for the resend.

Eric

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Linus Torvalds @ 2007-09-10 15:09 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Kyle Moffett, Arjan van de Ven, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <200709101516.03234.vda.linux@googlemail.com>


On Mon, 10 Sep 2007, Denys Vlasenko wrote:
> 
> static inline int
> qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
> {
>         int      return_status = QLA_SUCCESS;
>         unsigned long loop_timeout ;
>         scsi_qla_host_t *pha = to_qla_parent(ha);
> 
>         /* wait for 5 min at the max for loop to be ready */
>         loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);
> 
>         while ((!atomic_read(&pha->loop_down_timer) &&
>             atomic_read(&pha->loop_state) == LOOP_DOWN) ||
>             atomic_read(&pha->loop_state) != LOOP_READY) {
>                 if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
...
> Is above correct or buggy? Correct, because msleep is a barrier.
> Is it obvious? No.

It's *buggy*. But it has nothing to do with any msleep() in the loop, or 
anything else.

And more importantly, it would be equally buggy even *with* a "volatile" 
atomic_read().

Why is this so hard for people to understand? You're all acting like 
morons.

The reason it is buggy has absolutely nothing to do with whether the read 
is done or not, it has to do with the fact that the CPU may re-order the 
reads *regardless* of whether the read is done in some specific order by 
the compiler ot not! In effect, there is zero ordering between all those 
three reads, and if you don't have memory barriers (or a lock or other 
serialization), that code is buggy.

So stop this idiotic discussion thread already. The above kind of code 
needs memory barriers to be non-buggy. The whole "volatile or not" 
discussion is totally idiotic, and pointless, and anybody who doesn't 
understand that by now needs to just shut up and think about it more, 
rather than make this discussion drag out even further.

The fact is, "volatile" *only* makes things worse. It generates worse 
code, and never fixes any real bugs. This is a *fact*.

			Linus

^ permalink raw reply

* sky2.c: length mismatch errors with vlan frames
From: Pierre-Yves Ritschard @ 2007-09-10 14:35 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger

Hi list,

I have been running recent linux kernel on nexcom NSA 1086's equipped
with sysconnect NICs.

Like some people previously have on this list I am running into
problems with these NICs and seeing frequent errors in my dmesg:

sky2 eth4: rx error, status 0x402300 length 60
printk: 17 messages suppressed.
sky2 eth4: rx error, status 0x402300 length 60
printk: 32 messages suppressed.
sky2 eth4: rx error, status 0x602300 length 92
printk: 25 messages suppressed.
sky2 eth4: rx error, status 0x6e2300 length 106
printk: 16 messages suppressed.
sky2 eth4: rx error, status 0x402300 length 60
printk: 10 messages suppressed.
sky2 eth4: rx error, status 0x402300 length 60
printk: 17 messages suppressed.
sky2 eth4: rx error, status 0x402300 length 60


I have investigated a bit and status doesn't match any of the errors in
GMR_FS_ANY_ERR.
The block generating the error is now len_mismatch, due to the fact
that on packets having the GMR_FS_VLAN bit set, the length argument to
sky2_receive is 4 bytes shorter that the 16 bit value in status (status
``right shift'' 16).

Since both these values are read from the card I don't know how to
solve this other than by incrementing the length arg by 4 when
GMR_FS_VLAN is set in status. So here's a diff that does this, although
I'm not sure its an elegant solution:

--- sky2.c.orig	2007-09-10 15:34:15.000000000 +0200
+++ sky2.c	2007-09-10 16:20:28.000000000 +0200
@@ -2059,13 +2059,16 @@ static struct sk_buff *sky2_receive(stru
 	sky2->rx_next = (sky2->rx_next + 1) % sky2->rx_pending;
 	prefetch(sky2->rx_ring + sky2->rx_next);
 
+	if (status & GMR_FS_VLAN)
+		length += 4;
+
 	if (status & GMR_FS_ANY_ERR)
 		goto error;
 
 	if (!(status & GMR_FS_RX_OK))
 		goto resubmit;
 
-	if (status >> 16 != length)
+	if ((status >> 16) != length)
 		goto len_mismatch;
 
 	if (length < copybreak)
@@ -2081,6 +2084,8 @@ len_mismatch:
 	/* Truncation of overlength packets
 	   causes PHY length to not match MAC length */
 	++sky2->net_stats.rx_length_errors;
+	printk(KERN_INFO PFX "%s: rx length mismatch: length %d != %d\n",
+	       dev->name, length, status >> 16);
 
 error:
 	++sky2->net_stats.rx_errors;

Any thoughts on how to solve this ?
Further testing shows that sometimes there are packets without the
GMR_FS_VLAN bit set which have a 4 bytes length difference, has shown
by this log excerpt:

sky2 eth4: rx length mismatch: length 243 != 247
sky2 eth4: rx error, status 0xf70300 length 243

For debugging purposes, I here is the little program with
excerpts from sky2.c I use to see what bits are set in the status field:

#include <sys/types.h>
#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

enum {
        GMR_FS_LEN      = 0xffff<<16, /* Bit 31..16:    Rx Frame Length */
        GMR_FS_VLAN     = 1<<13, /* VLAN Packet */
        GMR_FS_JABBER   = 1<<12, /* Jabber Packet */
        GMR_FS_UN_SIZE  = 1<<11, /* Undersize Packet */
        GMR_FS_MC       = 1<<10, /* Multicast Packet */
        GMR_FS_BC       = 1<<9,  /* Broadcast Packet */
        GMR_FS_RX_OK    = 1<<8,  /* Receive OK (Good Packet) */
        GMR_FS_GOOD_FC  = 1<<7,  /* Good Flow-Control Packet */
        GMR_FS_BAD_FC   = 1<<6,  /* Bad  Flow-Control Packet */
        GMR_FS_MII_ERR  = 1<<5,  /* MII Error */
        GMR_FS_LONG_ERR = 1<<4,  /* Too Long Packet */
        GMR_FS_FRAGMENT = 1<<3,  /* Fragment */

        GMR_FS_CRC_ERR  = 1<<1,  /* CRC Error */
        GMR_FS_RX_FF_OV = 1<<0,  /* Rx FIFO Overflow */

        GMR_FS_ANY_ERR  = GMR_FS_RX_FF_OV | GMR_FS_CRC_ERR |
                          GMR_FS_FRAGMENT | GMR_FS_LONG_ERR |
                          GMR_FS_MII_ERR | GMR_FS_BAD_FC |
                          GMR_FS_UN_SIZE | GMR_FS_JABBER,
};

struct bitdesc {
	const char 	*name;
	u_int32_t 	 field;
};

struct bitdesc bits[] = {
        { "GMR_FS_VLAN", 1 << 13 },
        { "GMR_FS_JABBER", 1 << 12 },
        { "GMR_FS_UN_SIZE", 1 << 11 },
        { "GMR_FS_MC", 1 << 10 },
        { "GMR_FS_BC", 1 << 9 },
        { "GMR_FS_RX_OK", 1 << 8 },
        { "GMR_FS_GOOD_FC", 1 << 7 },
        { "GMR_FS_BAD_FC", 1 << 6 },
        { "GMR_FS_MII_ERR", 1 << 5 },
        { "GMR_FS_LONG_ERR", 1 << 4 },
        { "GMR_FS_FRAGMENT", 1 << 3 },
        { "GMR_FS_CRC_ERR", 1 << 1 },
        { "GMR_FS_RX_FF_OV", 1 << 0 }
};

int
main(int argc, const char *argv[])
{
	int	status;
	int	i;

	if (argc < 2)
		return (1);

	status = strtol(argv[1], NULL, 16);
	printf("status: 0x%08x\n", status);
	if (status & GMR_FS_ANY_ERR)
		printf("packet has an error\n");
	printf("length: %u\n", status >> 16);

	for (i = 0; i < (sizeof(bits) / sizeof(bits[0])); i++) {
		if (status & bits[i].field) {
			printf("has bit %s\n", bits[i].name);
		}
	}
	return (0);
}

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Denys Vlasenko @ 2007-09-10 14:38 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070910155156.2c1453fc@laptopd505.fenrus.org>

On Monday 10 September 2007 15:51, Arjan van de Ven wrote:
> On Mon, 10 Sep 2007 11:56:29 +0100
> Denys Vlasenko <vda.linux@googlemail.com> wrote:
> 
> > 
> > Well, if you insist on having it again:
> > 
> > Waiting for atomic value to be zero:
> > 
> >         while (atomic_read(&x))
> >                 continue;
> > 
> 
> and this I would say is buggy code all the way.
>
> Not from a pure C level semantics, but from a "busy waiting is buggy"
> semantics level and a "I'm inventing my own locking" semantics level.

After inspecting arch/*, I cannot agree with you.
Otherwise almost all major architectures use
"conceptually buggy busy-waiting":

arch/alpha
arch/i386
arch/ia64
arch/m32r
arch/mips
arch/parisc
arch/powerpc
arch/sh
arch/sparc64
arch/um
arch/x86_64

All of the above contain busy-waiting on atomic_read.

Including these loops without barriers:

arch/mips/kernel/smtc.c
			while (atomic_read(&idle_hook_initialized) < 1000)
				;
arch/mips/sgi-ip27/ip27-nmi.c
	while (atomic_read(&nmied_cpus) != num_online_cpus());

[Well maybe num_online_cpus() is a barrier, I didn't check]

arch/sh/kernel/smp.c
	if (wait)
		while (atomic_read(&smp_fn_call.finished) != (nr_cpus - 1));

Bugs?
--
vda

^ permalink raw reply

* Re: [ofa-general] [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver
From: Moni Shoua @ 2007-09-10 14:31 UTC (permalink / raw)
  To: Michael S. Tsirkin, rdreier, davem, fubar; +Cc: netdev, general
In-Reply-To: <46C9B474.5020202@voltaire.com>

Hi all,
This patch series is a bit neglected.
Since our goal is to have bonding support for IPoIB in kernel 2.6.24 it is 
very important for us to get comments soon.

We would appreciate if you take some time to look at this and help us push this code upstream.

thanks

 MoniS  



^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Denys Vlasenko @ 2007-09-10 14:16 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <200709101438.36710.vda.linux@googlemail.com>

On Monday 10 September 2007 14:38, Denys Vlasenko wrote:
> You are basically trying to educate me how to use atomic properly.
> You don't need to do it, as I am (currently) not a driver author.
> 
> I am saying that people who are already using atomic_read()
> (and who unfortunately did not read your explanation above)
> will still sometimes use atomic_read() as a way to read atomic value
> *from memory*, and will create nasty heisenbugs for you to debug.

static inline int
qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
{
        int      return_status = QLA_SUCCESS;
        unsigned long loop_timeout ;
        scsi_qla_host_t *pha = to_qla_parent(ha);

        /* wait for 5 min at the max for loop to be ready */
        loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);

        while ((!atomic_read(&pha->loop_down_timer) &&
            atomic_read(&pha->loop_state) == LOOP_DOWN) ||
            atomic_read(&pha->loop_state) != LOOP_READY) {
                if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
                        return_status = QLA_FUNCTION_FAILED;
                        break;
                }
                msleep(1000);
                if (time_after_eq(jiffies, loop_timeout)) {
                        return_status = QLA_FUNCTION_FAILED;
                        break;
                }
        }
        return (return_status);
}

Is above correct or buggy? Correct, because msleep is a barrier.
Is it obvious? No.

static void
qla2x00_rst_aen(scsi_qla_host_t *ha)
{
        if (ha->flags.online && !ha->flags.reset_active &&
            !atomic_read(&ha->loop_down_timer) &&
            !(test_bit(ABORT_ISP_ACTIVE, &ha->dpc_flags))) {
                do {
                        clear_bit(RESET_MARKER_NEEDED, &ha->dpc_flags);

                        /*
                         * Issue marker command only when we are going to start
                         * the I/O.
                         */
                        ha->marker_needed = 1;
                } while (!atomic_read(&ha->loop_down_timer) &&
                    (test_bit(RESET_MARKER_NEEDED, &ha->dpc_flags)));
        }
}

Is above correct? I honestly don't know. Correct, because set_bit is
a barrier on _all _memory_? Will it break if set_bit will be changed
to be a barrier only on its operand? Probably yes.

drivers/kvm/kvm_main.c

        while (atomic_read(&completed) != needed) {
                cpu_relax();
                barrier();
        }

Obviously author did not know that cpu_relax is already a barrier.
See why I think driver authors will be confused?

arch/x86_64/kernel/crash.c

static void nmi_shootdown_cpus(void)
{
...
        msecs = 1000; /* Wait at most a second for the other cpus to stop */
        while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
                mdelay(1);
                msecs--;
        }
...
}

Is mdelay(1) a barrier? Yes, because it is a function on x86_64.
Absolutely the same code will be buggy on an arch where
mdelay(1) == udelay(1000), and udelay is implemented
as inline busy-wait.

arch/sparc64/kernel/smp.c

        /* Wait for response */
        while (atomic_read(&data.finished) != cpus)
                cpu_relax();
...later in the same file...
                while (atomic_read(&smp_capture_registry) != ncpus)
                        rmb();

I'm confused. Do we need cpu_relax() or rmb()? Does cpu_relax() imply rmb()?
(No it doesn't). Which of those two while loops needs correcting?
--
vda

^ permalink raw reply

* Re: [Lksctp-developers] [-mm patch] net/sctp/socket.c: make 3 variables static
From: Neil Horman @ 2007-09-10 14:05 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Andrew Morton, vladislav.yasevich, sri, davem, netdev,
	linux-kernel, lksctp-developers
In-Reply-To: <20070909202554.GY3563@stusta.de>

On Sun, Sep 09, 2007 at 10:25:54PM +0200, Adrian Bunk wrote:
> On Fri, Aug 31, 2007 at 09:58:22PM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc3-mm1:
> >...
> >  git-net.patch
> >...
> >  git trees
> >...
> 
> This patch makes the following needlessly globalvariables static:
> - sctp_memory_pressure
> - sctp_memory_allocated
> - sctp_sockets_allocated
> 
> Signed-off-by: Adrian Bunk <bunk@kernel.org>
> 
Looks fine to me
Acked-by: Neil Horman <nhorman@tuxdriver.com>

Neil

-- 
/***************************************************
 *Neil Horman
 *nhorman@tuxdriver.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/

^ permalink raw reply

* Re: [PATCH 17/16] net: Disable netfilter sockopts when not in the initial network namespace
From: Pavel Emelyanov @ 2007-09-10 13:50 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: David Miller, netdev, Linux Containers
In-Reply-To: <m1ps0su8wv.fsf_-_@ebiederm.dsl.xmission.com>

Eric W. Biederman wrote:
> Until we support multiple network namespaces with netfilter only allow
> netfilter configuration in the initial network namespace.

PATCH 17/16? :)

Sorry,
Pavel

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Arjan van de Ven @ 2007-09-10 14:51 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
	Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
	Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
	linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
	David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
	zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <200709101156.30010.vda.linux@googlemail.com>

On Mon, 10 Sep 2007 11:56:29 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:

> 
> Well, if you insist on having it again:
> 
> Waiting for atomic value to be zero:
> 
>         while (atomic_read(&x))
>                 continue;
> 

and this I would say is buggy code all the way.

Not from a pure C level semantics, but from a "busy waiting is buggy"
semantics level and a "I'm inventing my own locking" semantics level.


^ permalink raw reply

* Re: [PATCH 12/16] net: Support multiple network namespaces with netlink
From: Pavel Emelyanov @ 2007-09-10 13:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux Containers, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller
In-Reply-To: <m1bqccvock.fsf_-_-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>

Eric W. Biederman wrote:
> Each netlink socket will live in exactly one network namespace,
> this includes the controlling kernel sockets.
> 
> This patch updates all of the existing netlink protocols
> to only support the initial network namespace.  Request
> by clients in other namespaces will get -ECONREFUSED.
> As they would if the kernel did not have the support for
> that netlink protocol compiled in.
> 
> As each netlink protocol is updated to be multiple network
> namespace safe it can register multiple kernel sockets
> to acquire a presence in the rest of the network namespaces.
> 
> The implementation in af_netlink is a simple filter implementation
> at hash table insertion and hash table look up time.
> 
> Signed-off-by: Eric W. Biederman <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/connector/connector.c       |    2 +-
>  drivers/scsi/scsi_netlink.c         |    2 +-
>  drivers/scsi/scsi_transport_iscsi.c |    2 +-
>  fs/ecryptfs/netlink.c               |    2 +-
>  include/linux/netlink.h             |    6 ++-
>  kernel/audit.c                      |    4 +-
>  lib/kobject_uevent.c                |    5 +-
>  net/bridge/netfilter/ebt_ulog.c     |    5 +-
>  net/core/rtnetlink.c                |    4 +-
>  net/decnet/netfilter/dn_rtmsg.c     |    3 +-
>  net/ipv4/fib_frontend.c             |    4 +-
>  net/ipv4/inet_diag.c                |    4 +-
>  net/ipv4/netfilter/ip_queue.c       |    6 +-
>  net/ipv4/netfilter/ipt_ULOG.c       |    3 +-
>  net/ipv6/netfilter/ip6_queue.c      |    6 +-
>  net/netfilter/nfnetlink.c           |    2 +-
>  net/netfilter/nfnetlink_log.c       |    3 +-
>  net/netfilter/nfnetlink_queue.c     |    3 +-
>  net/netlink/af_netlink.c            |  106 ++++++++++++++++++++++++++---------
>  net/netlink/genetlink.c             |    4 +-
>  net/xfrm/xfrm_user.c                |    2 +-
>  security/selinux/netlink.c          |    5 +-
>  22 files changed, 122 insertions(+), 61 deletions(-)

Rrrrrr. This is the 5th or even the 6th patch that changes tens of files
but (!) most of these changes are just propagating some core thing into
protocols, drivers, etc. E.g. you add an argument to some function and
then make all the rest use it, but the chunk adding the argument itself
is buried in these changes.

Why not make a reviewers' lifes easier and make (with hands) the core 
hunks go first and the "propagation" ones at the end? For RFC purpose 
I would even break the git-bisect safeness and splitted these patches 
into 2 parts: those with the core and those with the propagation.

Thanks,
Pavel

^ permalink raw reply

* Re: [Lksctp-developers] [2.6 patch] make sctp_addto_param() static
From: Neil Horman @ 2007-09-10 13:42 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Wei Yongjun, Vlad Yasevich, sri, davem, netdev, linux-kernel,
	lksctp-developers
In-Reply-To: <20070909202550.GX3563@stusta.de>

On Sun, Sep 09, 2007 at 10:25:50PM +0200, Adrian Bunk wrote:
> sctp_addto_param() can become static.
> 
> Signed-off-by: Adrian Bunk <bunk@kernel.org>
> 
ACK, seems reasonable to me.
Neil


-- 
/***************************************************
 *Neil Horman
 *nhorman@tuxdriver.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/

^ permalink raw reply

* Re: [2.6 patch] make sctp_addto_param() static
From: Vlad Yasevich @ 2007-09-10 13:40 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Wei Yongjun, sri, davem, linux-kernel, lksctp-developers, netdev
In-Reply-To: <20070909202550.GX3563@stusta.de>

Adrian Bunk wrote:
> sctp_addto_param() can become static.
> 
> Signed-off-by: Adrian Bunk <bunk@kernel.org>

Ack

-vlad

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Denys Vlasenko @ 2007-09-10 13:38 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <6370BBDF-0C79-41EB-BD2A-02AA0D216924@mac.com>

On Monday 10 September 2007 13:22, Kyle Moffett wrote:
> On Sep 10, 2007, at 06:56:29, Denys Vlasenko wrote:
> > On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
> >> On Sun, 9 Sep 2007 19:02:54 +0100
> >> Denys Vlasenko <vda.linux@googlemail.com> wrote:
> >>
> >>> Why is all this fixation on "volatile"? I don't think people want  
> >>> "volatile" keyword per se, they want atomic_read(&x) to _always_  
> >>> compile into an memory-accessing instruction, not register access.
> >>
> >> and ... why is that?  is there any valid, non-buggy code sequence  
> >> that makes that a reasonable requirement?
> >
> > Well, if you insist on having it again:
> >
> > Waiting for atomic value to be zero:
> >
> >         while (atomic_read(&x))
> >                 continue;
> >
> > gcc may happily convert it into:
> >
> >         reg = atomic_read(&x);
> >         while (reg)
> >                 continue;
> 
> Bzzt.  Even if you fixed gcc to actually convert it to a busy loop on  
> a memory variable, you STILL HAVE A BUG as it may *NOT* be gcc that  
> does the conversion, it may be that the CPU does the caching of the  
> memory value.  GCC has no mechanism to do cache-flushes or memory- 
> barriers except through our custom inline assembly.

CPU can cache the value all right, but it cannot use that cached value
*forever*, it has to react to invalidate cycles on the shared bus
and re-fetch new data.

IOW: atomic_read(&x) which compiles down to memory accessor
will work properly.

> the CPU.  Thirdly, on a large system it may take some arbitrarily  
> large amount of time for cache-propagation to update the value of the  
> variable in your local CPU cache.

Yes, but "arbitrarily large amount of time" is actually measured
in nanoseconds here. Let's say 1000ns max for hundreds of CPUs?

> Also, you   
> probably want a cpu_relax() in there somewhere to avoid overheating  
> the CPU.

Yes, but 
1. CPU shouldn't overheat (in a sense that it gets damaged),
   it will only use more power than needed.
2. cpu_relax() just throttles down my CPU, so it's performance
   optimization only. Wait, it isn't, it's a barrier too.
   Wow, "cpu_relax" is a barrier? How am I supposed to know
   that without reading lkml flamewars and/or header files?

Let's try reading headers. asm-x86_64/processor.h:

#define cpu_relax()   rep_nop()

So, is it a barrier? No clue yet.

/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
static inline void rep_nop(void)
{
        __asm__ __volatile__("rep;nop": : :"memory");
}

Comment explicitly says that it is "a good thing" (doesn't say
that it is mandatory) and says NOTHING about barriers!

Barrier-ness is not mentioned and is hidden in "memory" clobber.

Do you think it's obvious enough for average driver writer?
I think not, especially that it's unlikely for him to even start
suspecting that it is a memory barrier based on the "cpu_relax"
name.

> You simply CANNOT use an atomic_t as your sole synchronizing
> primitive, it doesn't work!  You virtually ALWAYS want to use an  
> atomic_t in the following types of situations:
> 
> (A) As an object refcount.  The value is never read except as part of  
> an atomic_dec_return().  Why aren't you using "struct kref"?
> 
> (B) As an atomic value counter (number of processes, for example).   
> Just "reading" the value is racy anyways, if you want to enforce a  
> limit or something then use atomic_inc_return(), check the result,  
> and use atomic_dec() if it's too big.  If you just want to return the  
> statistics then you are going to be instantaneous-point-in-time anyways.
> 
> (C) As an optimization value (statistics-like, but exact accuracy  
> isn't important).
> 
> Atomics are NOT A REPLACEMENT for the proper kernel subsystem, like  
> completions, mutexes, semaphores, spinlocks, krefs, etc.  It's not  
> useful for synchronization, only for keeping track of simple integer  
> RMW values.  Note that atomic_read() and atomic_set() aren't very  
> useful RMW primitives (read-nomodify-nowrite and read-set-zero- 
> write).  Code which assumes anything else is probably buggy in other  
> ways too.

You are basically trying to educate me how to use atomic properly.
You don't need to do it, as I am (currently) not a driver author.

I am saying that people who are already using atomic_read()
(and who unfortunately did not read your explanation above)
will still sometimes use atomic_read() as a way to read atomic value
*from memory*, and will create nasty heisenbugs for you to debug.
--
vda

^ permalink raw reply

* Re: [PATCH 03/16] net: Basic network namespace infrastructure.
From: Pavel Emelyanov @ 2007-09-10 13:16 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux Containers, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller
In-Reply-To: <m1ejh8x3ih.fsf_-_-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>

Eric W. Biederman wrote:

[snip]

> --- /dev/null
> +++ b/include/net/net_namespace.h
> @@ -0,0 +1,68 @@
> +/*
> + * Operations on the network namespace
> + */
> +#ifndef __NET_NET_NAMESPACE_H
> +#define __NET_NET_NAMESPACE_H
> +
> +#include <asm/atomic.h>
> +#include <linux/workqueue.h>
> +#include <linux/list.h>
> +
> +struct net {

Isn't this name is too generic? Why not net_namespace?

> +	atomic_t		count;		/* To decided when the network
> +						 *  namespace should be freed.
> +						 */
> +	atomic_t		use_count;	/* To track references we
> +						 * destroy on demand
> +						 */
> +	struct list_head	list;		/* list of network namespaces */
> +	struct work_struct	work;		/* work struct for freeing */
> +};

[snip]

> --- /dev/null
> +++ b/net/core/net_namespace.c

[snip]

> +static int setup_net(struct net *net)
> +{
> +	/* Must be called with net_mutex held */
> +	struct pernet_operations *ops;
> +	struct list_head *ptr;
> +	int error;
> +
> +	memset(net, 0, sizeof(struct net));
> +	atomic_set(&net->count, 1);
> +	atomic_set(&net->use_count, 0);
> +
> +	error = 0;
> +	list_for_each(ptr, &pernet_list) {
> +		ops = list_entry(ptr, struct pernet_operations, list);
> +		if (ops->init) {
> +			error = ops->init(net);
> +			if (error < 0)
> +				goto out_undo;
> +		}
> +	}
> +out:
> +	return error;
> +out_undo:
> +	/* Walk through the list backwards calling the exit functions
> +	 * for the pernet modules whose init functions did not fail.
> +	 */
> +	for (ptr = ptr->prev; ptr != &pernet_list; ptr = ptr->prev) {

Good reason for adding list_for_each_continue_reverse :)

> +		ops = list_entry(ptr, struct pernet_operations, list);
> +		if (ops->exit)
> +			ops->exit(net);
> +	}
> +	goto out;
> +}
> +
> +static int __init net_ns_init(void)
> +{
> +	int err;
> +
> +	printk(KERN_INFO "net_namespace: %zd bytes\n", sizeof(struct net));
> +	net_cachep = kmem_cache_create("net_namespace", sizeof(struct net),
> +					SMP_CACHE_BYTES,
> +					SLAB_PANIC, NULL);
> +	mutex_lock(&net_mutex);
> +	err = setup_net(&init_net);
> +
> +	net_lock();
> +	list_add_tail(&init_net.list, &net_namespace_list);
> +	net_unlock();
> +
> +	mutex_unlock(&net_mutex);
> +	if (err)
> +		panic("Could not setup the initial network namespace");
> +
> +	return 0;
> +}
> +
> +pure_initcall(net_ns_init);
> +
> +static int register_pernet_operations(struct list_head *list,
> +				      struct pernet_operations *ops)
> +{
> +	struct net *net, *undo_net;
> +	int error;
> +
> +	error = 0;
> +	list_add_tail(&ops->list, list);
> +	for_each_net(net) {
> +		if (ops->init) {

Maybe it's better to do it vice-versa?
if (ops->init)
   for_each_net(net)
       ops->init(net);
...

> +			error = ops->init(net);
> +			if (error)
> +				goto out_undo;
> +		}
> +	}
> +out:
> +	return error;
> +
> +out_undo:
> +	/* If I have an error cleanup all namespaces I initialized */
> +	list_del(&ops->list);
> +	for_each_net(undo_net) {
> +		if (undo_net == net)
> +			goto undone;
> +		if (ops->exit)
> +			ops->exit(undo_net);
> +	}
> +undone:
> +	goto out;
> +}
> +
> +static void unregister_pernet_operations(struct pernet_operations *ops)
> +{
> +	struct net *net;
> +
> +	list_del(&ops->list);
> +	for_each_net(net)
> +		if (ops->exit)

The same here.

> +			ops->exit(net);
> +}
> +
> +/**
> + *      register_pernet_subsys - register a network namespace subsystem
> + *	@ops:  pernet operations structure for the subsystem
> + *
> + *	Register a subsystem which has init and exit functions
> + *	that are called when network namespaces are created and
> + *	destroyed respectively.
> + *
> + *	When registered all network namespace init functions are
> + *	called for every existing network namespace.  Allowing kernel
> + *	modules to have a race free view of the set of network namespaces.
> + *
> + *	When a new network namespace is created all of the init
> + *	methods are called in the order in which they were registered.
> + *
> + *	When a network namespace is destroyed all of the exit methods
> + *	are called in the reverse of the order with which they were
> + *	registered.
> + */
> +int register_pernet_subsys(struct pernet_operations *ops)
> +{
> +	int error;
> +	mutex_lock(&net_mutex);
> +	error =  register_pernet_operations(first_device, ops);
> +	mutex_unlock(&net_mutex);
> +	return error;
> +}
> +EXPORT_SYMBOL_GPL(register_pernet_subsys);
> +
> +/**
> + *      unregister_pernet_subsys - unregister a network namespace subsystem
> + *	@ops: pernet operations structure to manipulate
> + *
> + *	Remove the pernet operations structure from the list to be
> + *	used when network namespaces are created or destoryed.  In
> + *	addition run the exit method for all existing network
> + *	namespaces.
> + */
> +void unregister_pernet_subsys(struct pernet_operations *module)
> +{
> +	mutex_lock(&net_mutex);
> +	unregister_pernet_operations(module);
> +	mutex_unlock(&net_mutex);
> +}
> +EXPORT_SYMBOL_GPL(unregister_pernet_subsys);
> +
> +/**
> + *      register_pernet_device - register a network namespace device
> + *	@ops:  pernet operations structure for the subsystem
> + *
> + *	Register a device which has init and exit functions
> + *	that are called when network namespaces are created and
> + *	destroyed respectively.
> + *
> + *	When registered all network namespace init functions are
> + *	called for every existing network namespace.  Allowing kernel
> + *	modules to have a race free view of the set of network namespaces.
> + *
> + *	When a new network namespace is created all of the init
> + *	methods are called in the order in which they were registered.
> + *
> + *	When a network namespace is destroyed all of the exit methods
> + *	are called in the reverse of the order with which they were
> + *	registered.
> + */
> +int register_pernet_device(struct pernet_operations *ops)
> +{
> +	int error;
> +	mutex_lock(&net_mutex);
> +	error = register_pernet_operations(&pernet_list, ops);
> +	if (!error && (first_device == &pernet_list))

Very minor: why do you give the name "device" to some pernet_operations?

Thanks,
Pavel

^ permalink raw reply

* Re: 2.6.23-rc5: possible irq lock inversion dependency detected
From: Herbert Xu @ 2007-09-10 13:00 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-kernel, netdev, jamal
In-Reply-To: <alpine.DEB.0.999.0709021455510.7103@sheep.housecafe.de>

On Sun, Sep 02, 2007 at 01:11:29PM +0000, Christian Kujau wrote:
> 
> after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep 
> was quite noisy when I tried to shape my external (wireless) interface:
> 
> [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock:
> [ 6400.534713]  (&dev->ingress_lock){-+..}, at: [<c038d595>] 
> netif_receive_skb+0x2d5/0x3c0
> [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the 
> past:
> [ 6400.535145]  (police_lock){-.--}

This is a genuine dead-lock.  The police lock can be taken
for reading with softirqs on.  If a second CPU tries to take
the police lock for writing, while holding the ingress lock,
then a softirq on the first CPU can dead-lock when it tries
to get the ingress lock.

The minimal fix would be to make sure that we disable BH on
the first CPU.  Jamal, could you take a look at this please?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 1/2] remove asm/bitops.h includes
From: Ralf Baechle @ 2007-09-10 12:30 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Andrew Morton, linux-kernel, Adrian Bunk, netdev, rth, dhowells,
	linux-mips
In-Reply-To: <30483262301654323266@pripojeni.net>

On Sat, Sep 08, 2007 at 09:00:08PM +0100, Jiri Slaby wrote:

> 
> remove asm/bitops.h includes
> 
> including asm/bitops directly may cause compile errors. don't include it
> and include linux/bitops instead. next patch will deny including asm header
> directly.
> 
> Cc: Adrian Bunk <bunk@kernel.org>
> Signed-off-by: Jiri Slaby <jirislaby@gmail.com>

For the MIPS and hamradio bits:

Acked-by: Ralf Baechle <ralf@linux-mips.org>

  Ralf

^ permalink raw reply

* Re: RFC: possible NAPI improvements to reduce interrupt rates for low traffic rates
From: jamal @ 2007-09-10 12:27 UTC (permalink / raw)
  To: James Chapman
  Cc: netdev, davem, jeff, mandeep.baines, ossthema, Stephen Hemminger
In-Reply-To: <46E50C4F.1070506@katalix.com>

On Mon, 2007-10-09 at 10:20 +0100, James Chapman wrote:
> jamal wrote:
> 
> > If the problem i am trying to solve is "reduce cpu use at lower rate",
> > then this is not the right answer because your cpu use has gone up.
> 
> The problem I'm trying to solve is "reduce the max interrupt rate from 
> NAPI drivers while minimizing latency".

As long as what you are saying above translates to "there is one
interupt per packet per napi poll" then we are saying the same thing.

>  In modern systems, the interrupt 
> rate can be so high that the CPU spends too much time processing 
> interrupts, resulting in the system's behavior seen by the user being 
> degraded.

modern systems also can handle interupts a lot better. 
If you can amortize two packets per interupt per napi poll then you
have done better than the breakeven point; however, I think it is fair
to also disprove that in modern hardware the breakeven point is met with
amortizing two packets.

> Having the poll() called when idle will always increase CPU usage. But 
> the feedback you and others are giving encourages me to find a better 
> compromise. 

i dont mean in any way to discourage you - just making you work
better ;-> It is very refreshing to see that you understand the scope is
performance not vomiting endless versions of code - and for this i feel
obligated to help.

> I'll go away and do some tests. I'll post results here for discussion later.

way to go.

cheers,
jamal


^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Kyle Moffett @ 2007-09-10 12:22 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
	Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
	Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
	Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
	heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <200709101156.30010.vda.linux@googlemail.com>

On Sep 10, 2007, at 06:56:29, Denys Vlasenko wrote:
> On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
>> On Sun, 9 Sep 2007 19:02:54 +0100
>> Denys Vlasenko <vda.linux@googlemail.com> wrote:
>>
>>> Why is all this fixation on "volatile"? I don't think people want  
>>> "volatile" keyword per se, they want atomic_read(&x) to _always_  
>>> compile into an memory-accessing instruction, not register access.
>>
>> and ... why is that?  is there any valid, non-buggy code sequence  
>> that makes that a reasonable requirement?
>
> Well, if you insist on having it again:
>
> Waiting for atomic value to be zero:
>
>         while (atomic_read(&x))
>                 continue;
>
> gcc may happily convert it into:
>
>         reg = atomic_read(&x);
>         while (reg)
>                 continue;

Bzzt.  Even if you fixed gcc to actually convert it to a busy loop on  
a memory variable, you STILL HAVE A BUG as it may *NOT* be gcc that  
does the conversion, it may be that the CPU does the caching of the  
memory value.  GCC has no mechanism to do cache-flushes or memory- 
barriers except through our custom inline assembly.  Also, you  
probably want a cpu_relax() in there somewhere to avoid overheating  
the CPU.  Thirdly, on a large system it may take some arbitrarily  
large amount of time for cache-propagation to update the value of the  
variable in your local CPU cache.  Finally, if atomics are based on  
based on spinlock+interrupt-disable then you will sit in a tight busy- 
loop of spin_lock_irqsave()->spin_unlock_irqrestore().  Depending on  
your system's internal model this may practically lock up your core  
because the spin_lock() will take the cacheline for exclusive access  
and doing that in a loop can prevent any other CPU from doing any  
operation on it!  Since your IRQs are disabled you even have a very  
small window that an IRQ will come along and free it up long enough  
for the update to take place.

The earlier code segment of:
> while(atomic_read(&x) > 0)
> 	atomic_dec(&x);
is *completely* buggy because you could very easily have 4 CPUs doing  
this on an atomic variable with a value of 1 and end up with it at  
negative 3 by the time you are done.  Moreover all the alternatives  
are also buggy, with the sole exception of this rather obvious- 
seeming one:
> atomic_set(&x, 0);

You simply CANNOT use an atomic_t as your sole synchronizing  
primitive, it doesn't work!  You virtually ALWAYS want to use an  
atomic_t in the following types of situations:

(A) As an object refcount.  The value is never read except as part of  
an atomic_dec_return().  Why aren't you using "struct kref"?

(B) As an atomic value counter (number of processes, for example).   
Just "reading" the value is racy anyways, if you want to enforce a  
limit or something then use atomic_inc_return(), check the result,  
and use atomic_dec() if it's too big.  If you just want to return the  
statistics then you are going to be instantaneous-point-in-time anyways.

(C) As an optimization value (statistics-like, but exact accuracy  
isn't important).

Atomics are NOT A REPLACEMENT for the proper kernel subsystem, like  
completions, mutexes, semaphores, spinlocks, krefs, etc.  It's not  
useful for synchronization, only for keeping track of simple integer  
RMW values.  Note that atomic_read() and atomic_set() aren't very  
useful RMW primitives (read-nomodify-nowrite and read-set-zero- 
write).  Code which assumes anything else is probably buggy in other  
ways too.

So while I see no real reason for the "volatile" on the atomics, I  
also see no real reason why it's terribly harmful.  Regardless of the  
"volatile" on the operation the CPU is perfectly happy to cache it  
anyways so it doesn't buy you any actual "always-access-memory"  
guarantees.  If you are just interested in it as an optimization you  
could probably just read the properly-aligned integer counter  
directly, an atomic read on most CPUs.

If you really need it to hit main memory *every* *single* *time*  
(Why?  Are you using it instead of the proper kernel subsystem?)   
then you probably need a custom inline assembly helper anyways.

Cheers,
Kyle Moffett

^ permalink raw reply

* [PATCH] sb1250-mac.c: De-typedef, de-volatile, de-etc...
From: Maciej W. Rozycki @ 2007-09-10 12:20 UTC (permalink / raw)
  To: Andrew Morton, Jeff Garzik; +Cc: netdev, linux-mips, linux-kernel

 Remove typedefs, volatiles and convert kmalloc()/memset() pairs to
kcalloc().  Also reformat the surrounding clutter.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
 Per your request, Andrew, a while ago.  It builds, runs, passes 
checkpatch.pl and sparse.  No semantic changes.

 Please apply,

  Maciej

patch-mips-2.6.23-rc5-20070904-sb1250-mac-typedef-7
diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/drivers/net/sb1250-mac.c linux-mips-2.6.23-rc5-20070904/drivers/net/sb1250-mac.c
--- linux-mips-2.6.23-rc5-20070904.macro/drivers/net/sb1250-mac.c	2007-07-10 04:57:46.000000000 +0000
+++ linux-mips-2.6.23-rc5-20070904/drivers/net/sb1250-mac.c	2007-09-10 01:01:09.000000000 +0000
@@ -140,17 +140,17 @@ MODULE_PARM_DESC(int_timeout_rx, "RX tim
  ********************************************************************* */
 
 
-typedef enum { sbmac_speed_auto, sbmac_speed_10,
-	       sbmac_speed_100, sbmac_speed_1000 } sbmac_speed_t;
+enum sbmac_speed { sbmac_speed_auto, sbmac_speed_10,
+		   sbmac_speed_100, sbmac_speed_1000 };
 
-typedef enum { sbmac_duplex_auto, sbmac_duplex_half,
-	       sbmac_duplex_full } sbmac_duplex_t;
+enum sbmac_duplex { sbmac_duplex_auto, sbmac_duplex_half,
+		    sbmac_duplex_full };
 
-typedef enum { sbmac_fc_auto, sbmac_fc_disabled, sbmac_fc_frame,
-	       sbmac_fc_collision, sbmac_fc_carrier } sbmac_fc_t;
+enum sbmac_fc { sbmac_fc_auto, sbmac_fc_disabled, sbmac_fc_frame,
+		sbmac_fc_collision, sbmac_fc_carrier } sbmac_fc_t;
 
-typedef enum { sbmac_state_uninit, sbmac_state_off, sbmac_state_on,
-	       sbmac_state_broken } sbmac_state_t;
+enum sbmac_state { sbmac_state_uninit, sbmac_state_off, sbmac_state_on,
+		   sbmac_state_broken };
 
 
 /**********************************************************************
@@ -176,55 +176,61 @@ typedef enum { sbmac_state_uninit, sbmac
  *  DMA Descriptor structure
  ********************************************************************* */
 
-typedef struct sbdmadscr_s {
+struct sbdmadscr {
 	uint64_t  dscr_a;
 	uint64_t  dscr_b;
-} sbdmadscr_t;
-
-typedef unsigned long paddr_t;
+};
 
 /**********************************************************************
  *  DMA Controller structure
  ********************************************************************* */
 
-typedef struct sbmacdma_s {
+struct sbmacdma {
 
 	/*
 	 * This stuff is used to identify the channel and the registers
 	 * associated with it.
 	 */
-
-	struct sbmac_softc *sbdma_eth;	    /* back pointer to associated MAC */
-	int              sbdma_channel;	    /* channel number */
-	int		 sbdma_txdir;       /* direction (1=transmit) */
-	int		 sbdma_maxdescr;    /* total # of descriptors in ring */
+	struct sbmac_softc	*sbdma_eth;	/* back pointer to associated
+						   MAC */
+	int			sbdma_channel;	/* channel number */
+	int			sbdma_txdir;	/* direction (1=transmit) */
+	int			sbdma_maxdescr;	/* total # of descriptors
+						   in ring */
 #ifdef CONFIG_SBMAC_COALESCE
-	int		 sbdma_int_pktcnt;  /* # descriptors rx/tx before interrupt*/
-	int		 sbdma_int_timeout; /* # usec rx/tx interrupt */
-#endif
-
-	volatile void __iomem *sbdma_config0;	/* DMA config register 0 */
-	volatile void __iomem *sbdma_config1;	/* DMA config register 1 */
-	volatile void __iomem *sbdma_dscrbase;	/* Descriptor base address */
-	volatile void __iomem *sbdma_dscrcnt;   /* Descriptor count register */
-	volatile void __iomem *sbdma_curdscr;	/* current descriptor address */
-	volatile void __iomem *sbdma_oodpktlost;/* pkt drop (rx only) */
-
+	int			sbdma_int_pktcnt;
+						/* # descriptors rx/tx
+						   before interrupt */
+	int			sbdma_int_timeout;
+						/* # usec rx/tx interrupt */
+#endif
+	void __iomem		*sbdma_config0;	/* DMA config register 0 */
+	void __iomem		*sbdma_config1;	/* DMA config register 1 */
+	void __iomem		*sbdma_dscrbase;
+						/* descriptor base address */
+	void __iomem		*sbdma_dscrcnt;	/* descriptor count register */
+	void __iomem		*sbdma_curdscr;	/* current descriptor
+						   address */
+	void __iomem		*sbdma_oodpktlost;
+						/* pkt drop (rx only) */
 
 	/*
 	 * This stuff is for maintenance of the ring
 	 */
-
-	sbdmadscr_t     *sbdma_dscrtable_unaligned;
-	sbdmadscr_t     *sbdma_dscrtable;	/* base of descriptor table */
-	sbdmadscr_t     *sbdma_dscrtable_end; /* end of descriptor table */
-
-	struct sk_buff **sbdma_ctxtable;    /* context table, one per descr */
-
-	paddr_t          sbdma_dscrtable_phys; /* and also the phys addr */
-	sbdmadscr_t     *sbdma_addptr;	/* next dscr for sw to add */
-	sbdmadscr_t     *sbdma_remptr;	/* next dscr for sw to remove */
-} sbmacdma_t;
+	void			*sbdma_dscrtable_unaligned;
+	struct sbdmadscr	*sbdma_dscrtable;
+						/* base of descriptor table */
+	struct sbdmadscr	*sbdma_dscrtable_end;
+						/* end of descriptor table */
+	struct sk_buff		**sbdma_ctxtable;
+						/* context table, one
+						   per descr */
+	dma_addr_t		sbdma_dscrtable_phys;
+						/* and also the phys addr */
+	struct sbdmadscr	*sbdma_addptr;	/* next dscr for sw to add */
+	struct sbdmadscr	*sbdma_remptr;	/* next dscr for sw
+						   to remove */
+};
 
 
 /**********************************************************************
@@ -236,47 +242,45 @@ struct sbmac_softc {
 	/*
 	 * Linux-specific things
 	 */
+	struct net_device	*sbm_dev;	/* pointer to linux device */
+	spinlock_t sbm_lock;			/* spin lock */
+	struct timer_list	sbm_timer;	/* for monitoring MII */
+	struct net_device_stats	sbm_stats;
+	int			sbm_devflags;	/* current device flags */
+
+	int			sbm_phy_oldbmsr;
+	int			sbm_phy_oldanlpar;
+	int			sbm_phy_oldk1stsr;
+	int			sbm_phy_oldlinkstat;
+	int			sbm_buffersize;
 
-	struct net_device *sbm_dev;		/* pointer to linux device */
-	spinlock_t sbm_lock;		/* spin lock */
-	struct timer_list sbm_timer;     	/* for monitoring MII */
-	struct net_device_stats sbm_stats;
-	int sbm_devflags;			/* current device flags */
-
-	int	     sbm_phy_oldbmsr;
-	int	     sbm_phy_oldanlpar;
-	int	     sbm_phy_oldk1stsr;
-	int	     sbm_phy_oldlinkstat;
-	int sbm_buffersize;
-
-	unsigned char sbm_phys[2];
+	unsigned char		sbm_phys[2];
 
 	/*
 	 * Controller-specific things
 	 */
+	void __iomem		*sbm_base;	/* MAC's base address */
+	enum sbmac_state	sbm_state;	/* current state */
 
-	void __iomem		*sbm_base;          /* MAC's base address */
-	sbmac_state_t    sbm_state;         /* current state */
-
-	volatile void __iomem	*sbm_macenable;	/* MAC Enable Register */
-	volatile void __iomem	*sbm_maccfg;	/* MAC Configuration Register */
-	volatile void __iomem	*sbm_fifocfg;	/* FIFO configuration register */
-	volatile void __iomem	*sbm_framecfg;	/* Frame configuration register */
-	volatile void __iomem	*sbm_rxfilter;	/* receive filter register */
-	volatile void __iomem	*sbm_isr;	/* Interrupt status register */
-	volatile void __iomem	*sbm_imr;	/* Interrupt mask register */
-	volatile void __iomem	*sbm_mdio;	/* MDIO register */
-
-	sbmac_speed_t    sbm_speed;		/* current speed */
-	sbmac_duplex_t   sbm_duplex;	/* current duplex */
-	sbmac_fc_t       sbm_fc;		/* current flow control setting */
-
-	unsigned char    sbm_hwaddr[ETHER_ADDR_LEN];
-
-	sbmacdma_t       sbm_txdma;		/* for now, only use channel 0 */
-	sbmacdma_t       sbm_rxdma;
-	int              rx_hw_checksum;
-	int 		 sbe_idx;
+	void __iomem		*sbm_macenable;	/* MAC Enable Register */
+	void __iomem		*sbm_maccfg;	/* MAC Config Register */
+	void __iomem		*sbm_fifocfg;	/* FIFO Config Register */
+	void __iomem		*sbm_framecfg;	/* Frame Config Register */
+	void __iomem		*sbm_rxfilter;	/* Receive Filter Register */
+	void __iomem		*sbm_isr;	/* Interrupt Status Register */
+	void __iomem		*sbm_imr;	/* Interrupt Mask Register */
+	void __iomem		*sbm_mdio;	/* MDIO Register */
+
+	enum sbmac_speed	sbm_speed;	/* current speed */
+	enum sbmac_duplex	sbm_duplex;	/* current duplex */
+	enum sbmac_fc		sbm_fc;		/* cur. flow control setting */
+
+	unsigned char		sbm_hwaddr[ETHER_ADDR_LEN];
+
+	struct sbmacdma		sbm_txdma;	/* only channel 0 for now */
+	struct sbmacdma		sbm_rxdma;
+	int			rx_hw_checksum;
+	int			sbe_idx;
 };
 
 
@@ -288,30 +292,31 @@ struct sbmac_softc {
  *  Prototypes
  ********************************************************************* */
 
-static void sbdma_initctx(sbmacdma_t *d,
-			  struct sbmac_softc *s,
-			  int chan,
-			  int txrx,
-			  int maxdescr);
-static void sbdma_channel_start(sbmacdma_t *d, int rxtx);
-static int sbdma_add_rcvbuffer(sbmacdma_t *d,struct sk_buff *m);
-static int sbdma_add_txbuffer(sbmacdma_t *d,struct sk_buff *m);
-static void sbdma_emptyring(sbmacdma_t *d);
-static void sbdma_fillring(sbmacdma_t *d);
-static int sbdma_rx_process(struct sbmac_softc *sc,sbmacdma_t *d, int work_to_do, int poll);
-static void sbdma_tx_process(struct sbmac_softc *sc,sbmacdma_t *d, int poll);
+static void sbdma_initctx(struct sbmacdma *d, struct sbmac_softc *s, int chan,
+			  int txrx, int maxdescr);
+static void sbdma_channel_start(struct sbmacdma *d, int rxtx);
+static int sbdma_add_rcvbuffer(struct sbmacdma *d, struct sk_buff *m);
+static int sbdma_add_txbuffer(struct sbmacdma *d, struct sk_buff *m);
+static void sbdma_emptyring(struct sbmacdma *d);
+static void sbdma_fillring(struct sbmacdma *d);
+static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d,
+			    int work_to_do, int poll);
+static void sbdma_tx_process(struct sbmac_softc *sc, struct sbmacdma *d,
+			     int poll);
 static int sbmac_initctx(struct sbmac_softc *s);
 static void sbmac_channel_start(struct sbmac_softc *s);
 static void sbmac_channel_stop(struct sbmac_softc *s);
-static sbmac_state_t sbmac_set_channel_state(struct sbmac_softc *,sbmac_state_t);
-static void sbmac_promiscuous_mode(struct sbmac_softc *sc,int onoff);
+static enum sbmac_state sbmac_set_channel_state(struct sbmac_softc *,
+						enum sbmac_state);
+static void sbmac_promiscuous_mode(struct sbmac_softc *sc, int onoff);
 static uint64_t sbmac_addr2reg(unsigned char *ptr);
-static irqreturn_t sbmac_intr(int irq,void *dev_instance);
+static irqreturn_t sbmac_intr(int irq, void *dev_instance);
 static int sbmac_start_tx(struct sk_buff *skb, struct net_device *dev);
 static void sbmac_setmulti(struct sbmac_softc *sc);
 static int sbmac_init(struct net_device *dev, int idx);
-static int sbmac_set_speed(struct sbmac_softc *s,sbmac_speed_t speed);
-static int sbmac_set_duplex(struct sbmac_softc *s,sbmac_duplex_t duplex,sbmac_fc_t fc);
+static int sbmac_set_speed(struct sbmac_softc *s, enum sbmac_speed speed);
+static int sbmac_set_duplex(struct sbmac_softc *s, enum sbmac_duplex duplex,
+			    enum sbmac_fc fc);
 
 static int sbmac_open(struct net_device *dev);
 static void sbmac_timer(unsigned long data);
@@ -322,13 +327,15 @@ static int sbmac_mii_ioctl(struct net_de
 static int sbmac_close(struct net_device *dev);
 static int sbmac_poll(struct net_device *poll_dev, int *budget);
 
-static int sbmac_mii_poll(struct sbmac_softc *s,int noisy);
+static int sbmac_mii_poll(struct sbmac_softc *s, int noisy);
 static int sbmac_mii_probe(struct net_device *dev);
 
 static void sbmac_mii_sync(struct sbmac_softc *s);
-static void sbmac_mii_senddata(struct sbmac_softc *s,unsigned int data, int bitcnt);
-static unsigned int sbmac_mii_read(struct sbmac_softc *s,int phyaddr,int regidx);
-static void sbmac_mii_write(struct sbmac_softc *s,int phyaddr,int regidx,
+static void sbmac_mii_senddata(struct sbmac_softc *s, unsigned int data,
+			       int bitcnt);
+static unsigned int sbmac_mii_read(struct sbmac_softc *s, int phyaddr,
+				   int regidx);
+static void sbmac_mii_write(struct sbmac_softc *s, int phyaddr, int regidx,
 			    unsigned int regval);
 
 
@@ -677,8 +684,8 @@ static void sbmac_mii_write(struct sbmac
  *  way.
  *
  *  Input parameters:
- *  	   d - sbmacdma_t structure (DMA channel context)
- *  	   s - sbmac_softc structure (pointer to a MAC)
+ *  	   d - struct sbmacdma (DMA channel context)
+ *  	   s - struct sbmac_softc (pointer to a MAC)
  *  	   chan - channel number (0..1 right now)
  *  	   txrx - Identifies DMA_TX or DMA_RX for channel direction
  *      maxdescr - number of descriptors
@@ -687,11 +694,8 @@ static void sbmac_mii_write(struct sbmac
  *  	   nothing
  ********************************************************************* */
 
-static void sbdma_initctx(sbmacdma_t *d,
-			  struct sbmac_softc *s,
-			  int chan,
-			  int txrx,
-			  int maxdescr)
+static void sbdma_initctx(struct sbmacdma *d, struct sbmac_softc *s, int chan,
+			  int txrx, int maxdescr)
 {
 #ifdef CONFIG_SBMAC_COALESCE
 	int int_pktcnt, int_timeout;
@@ -758,18 +762,17 @@ static void sbdma_initctx(sbmacdma_t *d,
 
 	d->sbdma_maxdescr = maxdescr;
 
-	d->sbdma_dscrtable_unaligned =
-	d->sbdma_dscrtable = (sbdmadscr_t *)
-		kmalloc((d->sbdma_maxdescr+1)*sizeof(sbdmadscr_t), GFP_KERNEL);
+	d->sbdma_dscrtable_unaligned = kcalloc(d->sbdma_maxdescr + 1,
+					       sizeof(*d->sbdma_dscrtable),
+					       GFP_KERNEL);
 
 	/*
 	 * The descriptor table must be aligned to at least 16 bytes or the
 	 * MAC will corrupt it.
 	 */
-	d->sbdma_dscrtable = (sbdmadscr_t *)
-		ALIGN((unsigned long)d->sbdma_dscrtable, sizeof(sbdmadscr_t));
-
-	memset(d->sbdma_dscrtable,0,d->sbdma_maxdescr*sizeof(sbdmadscr_t));
+	d->sbdma_dscrtable = (struct sbdmadscr *)
+			     ALIGN((unsigned long)d->sbdma_dscrtable_unaligned,
+				   sizeof(*d->sbdma_dscrtable));
 
 	d->sbdma_dscrtable_end = d->sbdma_dscrtable + d->sbdma_maxdescr;
 
@@ -779,10 +782,8 @@ static void sbdma_initctx(sbmacdma_t *d,
 	 * And context table
 	 */
 
-	d->sbdma_ctxtable = (struct sk_buff **)
-		kmalloc(d->sbdma_maxdescr*sizeof(struct sk_buff *), GFP_KERNEL);
-
-	memset(d->sbdma_ctxtable,0,d->sbdma_maxdescr*sizeof(struct sk_buff *));
+	d->sbdma_ctxtable = kcalloc(d->sbdma_maxdescr,
+				    sizeof(*d->sbdma_ctxtable), GFP_KERNEL);
 
 #ifdef CONFIG_SBMAC_COALESCE
 	/*
@@ -819,7 +820,7 @@ static void sbdma_initctx(sbmacdma_t *d,
  *  	   nothing
  ********************************************************************* */
 
-static void sbdma_channel_start(sbmacdma_t *d, int rxtx )
+static void sbdma_channel_start(struct sbmacdma *d, int rxtx)
 {
 	/*
 	 * Turn on the DMA channel
@@ -860,7 +861,7 @@ static void sbdma_channel_start(sbmacdma
  *  	   nothing
  ********************************************************************* */
 
-static void sbdma_channel_stop(sbmacdma_t *d)
+static void sbdma_channel_stop(struct sbmacdma *d)
 {
 	/*
 	 * Turn off the DMA channel
@@ -909,10 +910,10 @@ static void sbdma_align_skb(struct sk_bu
  ********************************************************************* */
 
 
-static int sbdma_add_rcvbuffer(sbmacdma_t *d,struct sk_buff *sb)
+static int sbdma_add_rcvbuffer(struct sbmacdma *d, struct sk_buff *sb)
 {
-	sbdmadscr_t *dsc;
-	sbdmadscr_t *nextdsc;
+	struct sbdmadscr *dsc;
+	struct sbdmadscr *nextdsc;
 	struct sk_buff *sb_new = NULL;
 	int pktsize = ENET_PACKET_SIZE;
 
@@ -1024,10 +1025,10 @@ static int sbdma_add_rcvbuffer(sbmacdma_
  ********************************************************************* */
 
 
-static int sbdma_add_txbuffer(sbmacdma_t *d,struct sk_buff *sb)
+static int sbdma_add_txbuffer(struct sbmacdma *d, struct sk_buff *sb)
 {
-	sbdmadscr_t *dsc;
-	sbdmadscr_t *nextdsc;
+	struct sbdmadscr *dsc;
+	struct sbdmadscr *nextdsc;
 	uint64_t phys;
 	uint64_t ncb;
 	int length;
@@ -1113,7 +1114,7 @@ static int sbdma_add_txbuffer(sbmacdma_t
  *  	   nothing
  ********************************************************************* */
 
-static void sbdma_emptyring(sbmacdma_t *d)
+static void sbdma_emptyring(struct sbmacdma *d)
 {
 	int idx;
 	struct sk_buff *sb;
@@ -1141,7 +1142,7 @@ static void sbdma_emptyring(sbmacdma_t *
  *  	   nothing
  ********************************************************************* */
 
-static void sbdma_fillring(sbmacdma_t *d)
+static void sbdma_fillring(struct sbmacdma *d)
 {
 	int idx;
 
@@ -1188,12 +1189,12 @@ static void sbmac_netpoll(struct net_dev
  *  	   nothing
  ********************************************************************* */
 
-static int sbdma_rx_process(struct sbmac_softc *sc,sbmacdma_t *d,
-                             int work_to_do, int poll)
+static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d,
+			    int work_to_do, int poll)
 {
 	int curidx;
 	int hwidx;
-	sbdmadscr_t *dsc;
+	struct sbdmadscr *dsc;
 	struct sk_buff *sb;
 	int len;
 	int work_done = 0;
@@ -1225,8 +1226,9 @@ again:
 		prefetch(dsc);
 		prefetch(&d->sbdma_ctxtable[curidx]);
 
-		hwidx = (int) (((__raw_readq(d->sbdma_curdscr) & M_DMA_CURDSCR_ADDR) -
-				d->sbdma_dscrtable_phys) / sizeof(sbdmadscr_t));
+		hwidx = ((__raw_readq(d->sbdma_curdscr) & M_DMA_CURDSCR_ADDR) -
+			 d->sbdma_dscrtable_phys) /
+			sizeof(*d->sbdma_dscrtable);
 
 		/*
 		 * If they're the same, that means we've processed all
@@ -1350,11 +1352,12 @@ done:
  *  	   nothing
  ********************************************************************* */
 
-static void sbdma_tx_process(struct sbmac_softc *sc,sbmacdma_t *d, int poll)
+static void sbdma_tx_process(struct sbmac_softc *sc, struct sbmacdma *d,
+			     int poll)
 {
 	int curidx;
 	int hwidx;
-	sbdmadscr_t *dsc;
+	struct sbdmadscr *dsc;
 	struct sk_buff *sb;
 	unsigned long flags;
 	int packets_handled = 0;
@@ -1364,8 +1367,8 @@ static void sbdma_tx_process(struct sbma
 	if (d->sbdma_remptr == d->sbdma_addptr)
 	  goto end_unlock;
 
-	hwidx = (int) (((__raw_readq(d->sbdma_curdscr) & M_DMA_CURDSCR_ADDR) -
-			d->sbdma_dscrtable_phys) / sizeof(sbdmadscr_t));
+	hwidx = ((__raw_readq(d->sbdma_curdscr) & M_DMA_CURDSCR_ADDR) -
+		 d->sbdma_dscrtable_phys) / sizeof(*d->sbdma_dscrtable);
 
 	for (;;) {
 		/*
@@ -1502,7 +1505,7 @@ static int sbmac_initctx(struct sbmac_so
 }
 
 
-static void sbdma_uninitctx(struct sbmacdma_s *d)
+static void sbdma_uninitctx(struct sbmacdma *d)
 {
 	if (d->sbdma_dscrtable_unaligned) {
 		kfree(d->sbdma_dscrtable_unaligned);
@@ -1538,7 +1541,7 @@ static void sbmac_uninitctx(struct sbmac
 static void sbmac_channel_start(struct sbmac_softc *s)
 {
 	uint64_t reg;
-	volatile void __iomem *port;
+	void __iomem *port;
 	uint64_t cfg,fifo,framecfg;
 	int idx, th_value;
 
@@ -1801,10 +1804,10 @@ static void sbmac_channel_stop(struct sb
  *  Return value:
  *  	   old state
  ********************************************************************* */
-static sbmac_state_t sbmac_set_channel_state(struct sbmac_softc *sc,
-					     sbmac_state_t state)
+static enum sbmac_state sbmac_set_channel_state(struct sbmac_softc *sc,
+						enum sbmac_state state)
 {
-	sbmac_state_t oldstate = sc->sbm_state;
+	enum sbmac_state oldstate = sc->sbm_state;
 
 	/*
 	 * If same as previous state, return
@@ -1939,14 +1942,14 @@ static uint64_t sbmac_addr2reg(unsigned 
  *
  *  Input parameters:
  *  	   s - sbmac structure
- *  	   speed - speed to set MAC to (see sbmac_speed_t enum)
+ *  	   speed - speed to set MAC to (see enum sbmac_speed)
  *
  *  Return value:
  *  	   1 if successful
  *      0 indicates invalid parameters
  ********************************************************************* */
 
-static int sbmac_set_speed(struct sbmac_softc *s,sbmac_speed_t speed)
+static int sbmac_set_speed(struct sbmac_softc *s, enum sbmac_speed speed)
 {
 	uint64_t cfg;
 	uint64_t framecfg;
@@ -2028,15 +2031,16 @@ static int sbmac_set_speed(struct sbmac_
  *
  *  Input parameters:
  *  	   s - sbmac structure
- *  	   duplex - duplex setting (see sbmac_duplex_t)
- *  	   fc - flow control setting (see sbmac_fc_t)
+ *  	   duplex - duplex setting (see enum sbmac_duplex)
+ *  	   fc - flow control setting (see enum sbmac_fc)
  *
  *  Return value:
  *  	   1 if ok
  *  	   0 if an invalid parameter combination was specified
  ********************************************************************* */
 
-static int sbmac_set_duplex(struct sbmac_softc *s,sbmac_duplex_t duplex,sbmac_fc_t fc)
+static int sbmac_set_duplex(struct sbmac_softc *s, enum sbmac_duplex duplex,
+			    enum sbmac_fc fc)
 {
 	uint64_t cfg;
 
@@ -2236,7 +2240,7 @@ static int sbmac_start_tx(struct sk_buff
 static void sbmac_setmulti(struct sbmac_softc *sc)
 {
 	uint64_t reg;
-	volatile void __iomem *port;
+	void __iomem *port;
 	int idx;
 	struct dev_mc_list *mclist;
 	struct net_device *dev = sc->sbm_dev;

^ permalink raw reply

* Re: RFC: possible NAPI improvements to reduce interrupt rates for low traffic rates
From: jamal @ 2007-09-10 12:12 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: James Chapman, netdev, davem, jeff, ossthema, Stephen Hemminger
In-Reply-To: <20070908164222.GB3765@ludhiana>

On Sat, 2007-08-09 at 09:42 -0700, Mandeep Singh Baines wrote:

> Reading the "interrupt pending" register would require an MMIO read.
> MMIO reads are very expensive. In some systems the latency of an MMIO
> read can be 1000x that of an L1 cache access.

Indeed.

> However, work_done() doesn't have to be inefficient. For newer
> devices you can implement work_done() without an MMIO read by polling
> the next ring entry status in memory or some other mechanism. Since
> PCI is coherent, acceses to this memory location could be cached
> after the first miss. For architectures where PCI is not coherent you'd 
> have to go to memory for every poll. So for these architectures has_work()
> will be moderately expensive (memory access) even when has_work() does
> not require an MMIO read. This might affect home routers: not sure if MIPS or
> ARM have coherent PCI.

I think the effect would be clearly experimentally observable in smaller
devices e.g the Geode you seem to be experimenting on.
One other suggestion i made in the paper is to something along the lines
of cached_irq_mask for the i8259

cheers,
jamal


^ permalink raw reply

* [PATCH] ipconfig.c: De-clutter IP configuration report
From: Maciej W. Rozycki @ 2007-09-10 12:09 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, linux-kernel

 Reformat the printk() calls removing leading new-line characters, making 
output being done line-by-line rather than partially and defining the log 
level used.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
 The new code builds fine; no semantic changes.

 Please apply,

  Maciej

patch-mips-2.6.23-rc5-20070904-ipconfig-printk-2
diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/net/ipv4/ipconfig.c linux-mips-2.6.23-rc5-20070904/net/ipv4/ipconfig.c
--- linux-mips-2.6.23-rc5-20070904.macro/net/ipv4/ipconfig.c	2007-09-04 04:56:22.000000000 +0000
+++ linux-mips-2.6.23-rc5-20070904/net/ipv4/ipconfig.c	2007-09-10 11:53:19.000000000 +0000
@@ -1364,17 +1364,17 @@ static int __init ip_auto_config(void)
 	/*
 	 * Clue in the operator.
 	 */
-	printk("IP-Config: Complete:");
-	printk("\n      device=%s", ic_dev->name);
-	printk(", addr=%u.%u.%u.%u", NIPQUAD(ic_myaddr));
-	printk(", mask=%u.%u.%u.%u", NIPQUAD(ic_netmask));
-	printk(", gw=%u.%u.%u.%u", NIPQUAD(ic_gateway));
-	printk(",\n     host=%s, domain=%s, nis-domain=%s",
-	       utsname()->nodename, ic_domain, utsname()->domainname);
-	printk(",\n     bootserver=%u.%u.%u.%u", NIPQUAD(ic_servaddr));
-	printk(", rootserver=%u.%u.%u.%u", NIPQUAD(root_server_addr));
-	printk(", rootpath=%s", root_server_path);
-	printk("\n");
+	pr_info("IP-Config: Complete:\n");
+	pr_info("      device=%s, addr=%u.%u.%u.%u, "
+		"mask=%u.%u.%u.%u, gw=%u.%u.%u.%u,\n",
+		ic_dev->name, NIPQUAD(ic_myaddr),
+		NIPQUAD(ic_netmask), NIPQUAD(ic_gateway));
+	pr_info("      host=%s, domain=%s, nis-domain=%s,\n",
+		utsname()->nodename, ic_domain, utsname()->domainname);
+	pr_info("      bootserver=%u.%u.%u.%u, "
+		"rootserver=%u.%u.%u.%u, rootpath=%s\n",
+		NIPQUAD(ic_servaddr),
+		NIPQUAD(root_server_addr), root_server_path);
 #endif /* !SILENT */
 
 	return 0;

^ permalink raw reply

* Re: 2.6.23-rc5: possible irq lock inversion dependency detected
From: Peter Zijlstra @ 2007-09-10 12:03 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-kernel, netdev
In-Reply-To: <alpine.DEB.0.999.0709021455510.7103@sheep.housecafe.de>

On Sun, 2007-09-02 at 15:11 +0200, Christian Kujau wrote:
> Hi,
> 
> after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep 
> was quite noisy when I tried to shape my external (wireless) interface:
> 
> [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock:
> [ 6400.534713]  (&dev->ingress_lock){-+..}, at: [<c038d595>] netif_receive_skb+0x2d5/0x3c0
> [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the past:
> [ 6400.535145]  (police_lock){-.--}
> 
> This happened when I executed: http://nerdbynature.de/bits/2.6.23-rc5/qos.sh.txt
> (using iproute2-ss070313). The is still running, I just noticed a short 
> hickup, probably when it was busy writing the warning to the disk.
> 
> More details and .config: http://nerdbynature.de/bits/2.6.23-rc5/

seems unavailable at this time, please submit the whole lockdep report
if possible.

^ permalink raw reply

* [PATCH] cfg80211: fix initialisation if built-in
From: Johannes Berg @ 2007-09-10 11:44 UTC (permalink / raw)
  To: John W. Linville; +Cc: linux-wireless, netdev, Rob Hussey, stable

When cfg80211 is built into the kernel it needs to init earlier
so that device registrations are run after it has initialised.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>

---
 net/wireless/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- wireless-dev.orig/net/wireless/core.c	2007-09-10 13:00:18.567257487 +0200
+++ wireless-dev/net/wireless/core.c	2007-09-10 13:00:59.837219347 +0200
@@ -363,7 +363,7 @@ out_fail_notifier:
 out_fail_sysfs:
 	return err;
 }
-module_init(cfg80211_init);
+subsys_initcall(cfg80211_init);
 
 static void cfg80211_exit(void)
 {



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox