Netdev List
* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Jiri Slaby @ 2011-04-14 21:30 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Bryan Schumaker, Jiri Slaby, linux-kernel, akpm, mm-commits,
	ML netdev, linux-nfs
In-Reply-To: <1302816095.24028.87.camel-SyLVLa/KEI9HwK5hSS5vWB2eb7JE58TQ@public.gmane.org>

On 04/14/2011 11:21 PM, Trond Myklebust wrote:
> On Thu, 2011-04-14 at 22:37 +0200, Jiri Slaby wrote:
>> On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
>>> On 04/12/2011 02:52 PM, Jiri Slaby wrote:
>>>> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>>>>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>>>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>>>>> unacceptable for automounted NFS dirs.
>>>>>>>
>>>>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>>>>> and then trying rpcsec_gss if and only if that fails.
>>>>>>>
>>>>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>>>>> supported flavour?
>>>>>>
>>>>>> I don't know, I connect to an NFS server which is not maintained by me.
>>>>>> It looks like that. How can I find out?
>>>>>
>>>>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>>>>
>>>> I don't have NFS in modules. It's all built-in. And this one is
>>>> unconditionally selected because of CONFIG_NFS_V4.
>>>
>>> Does this patch help?
>>
>> Nope, it makes things even worse:
>> # mount -oro,intr XXX:/yyy /mnt/c/
>> <15s delay here>
>> mount.nfs: access denied by server while mounting XXX:/yyy
>>
>> So in nfs4_proc_get_root I do:
>>   printk("%s: %d %u\n", __func__, i, flav_array[i]);
>>   status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
>>   printk("%s: res=%d\n", __func__, status);
>> and get:
>> [   18.159818] nfs4_proc_get_root: 0 1
>> [   18.214872] nfs4_proc_get_root: res=-1
>> [   18.214875] nfs4_proc_get_root: 1 0
>> [   18.254636] nfs4_proc_get_root: res=-1
>> [   18.254639] nfs4_proc_get_root: 2 390003
>> [   33.252174] RPC: AUTH_GSS upcall timed out.
>> [   33.252177] Please check user daemon is running.
>> [   33.252192] nfs4_proc_get_root: res=-13
>>
>> If I revert that back and do the same:
>> [   28.275569] nfs4_proc_get_root: 0 1
>> [   28.296545] nfs4_proc_get_root: res=-1
>> [   28.296548] nfs4_proc_get_root: 1 390003
>> [   43.296107] RPC: AUTH_GSS upcall timed out.
>> [   43.296108] Please check user daemon is running.
>> [   43.296121] nfs4_proc_get_root: res=-13
>> [   43.296122] nfs4_proc_get_root: 2 0
>> [   43.318201] nfs4_proc_get_root: res=-1
>>
>> I.e. all methods fail. And what matters is the last retval. From NULL it
>> is EPERM, from GSS it is EACCES. For EPERM, mount(8) falls back to
>> nfs3, for EACCES it dies a terrible death.
> 
> OK. That's good information. Thanks for testing!
> 
> I'm still curious as to why that NFS server is refusing all NFSv4 mounts
> with NFS4ERR_WRONGSEC. Unless NFSv4 really is configured only to export
> the root filesystem with RPCSEC_GSS, then that definitely sounds like a
> bug...

With gssd running if that helps:
[  229.806528] nfs4_proc_get_root: 0 1
[  229.828491] nfs4_proc_get_root: res=-1
[  229.828494] nfs4_proc_get_root: 1 390003
[  229.896994] nfs4_proc_get_root: res=-13
[  229.896997] nfs4_proc_get_root: 2 0
[  229.920344] nfs4_proc_get_root: res=-1

thanks,
-- 
js
suse labs
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Trond Myklebust @ 2011-04-14 21:21 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Bryan Schumaker, Jiri Slaby, linux-kernel, akpm, mm-commits,
	ML netdev, linux-nfs
In-Reply-To: <4DA75AFC.3040000@suse.cz>

On Thu, 2011-04-14 at 22:37 +0200, Jiri Slaby wrote:
> On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
> > On 04/12/2011 02:52 PM, Jiri Slaby wrote:
> >> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
> >>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
> >>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
> >>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
> >>>>>> unacceptable for automounted NFS dirs.
> >>>>>
> >>>>> I'm still confused as to why you are hitting it at all. In the normal
> >>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
> >>>>> and then trying rpcsec_gss if and only if that fails.
> >>>>>
> >>>>> Are you really exporting a filesystem using AUTH_NULL as the only
> >>>>> supported flavour?
> >>>>
> >>>> I don't know, I connect to an NFS server which is not maintained by me.
> >>>> It looks like that. How can I find out?
> >>>
> >>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
> >>
> >> I don't have NFS in modules. It's all built-in. And this one is
> >> unconditionally selected because of CONFIG_NFS_V4.
> > 
> > Does this patch help?
> 
> Nope, it makes things even worse:
> # mount -oro,intr XXX:/yyy /mnt/c/
> <15s delay here>
> mount.nfs: access denied by server while mounting XXX:/yyy
> 
> So in nfs4_proc_get_root I do:
>   printk("%s: %d %u\n", __func__, i, flav_array[i]);
>   status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
>   printk("%s: res=%d\n", __func__, status);
> and get:
> [   18.159818] nfs4_proc_get_root: 0 1
> [   18.214872] nfs4_proc_get_root: res=-1
> [   18.214875] nfs4_proc_get_root: 1 0
> [   18.254636] nfs4_proc_get_root: res=-1
> [   18.254639] nfs4_proc_get_root: 2 390003
> [   33.252174] RPC: AUTH_GSS upcall timed out.
> [   33.252177] Please check user daemon is running.
> [   33.252192] nfs4_proc_get_root: res=-13
> 
> If I revert that back and do the same:
> [   28.275569] nfs4_proc_get_root: 0 1
> [   28.296545] nfs4_proc_get_root: res=-1
> [   28.296548] nfs4_proc_get_root: 1 390003
> [   43.296107] RPC: AUTH_GSS upcall timed out.
> [   43.296108] Please check user daemon is running.
> [   43.296121] nfs4_proc_get_root: res=-13
> [   43.296122] nfs4_proc_get_root: 2 0
> [   43.318201] nfs4_proc_get_root: res=-1
> 
> I.e. all methods fail. And what matters is the last retval. From NULL it
> is EPERM, from GSS it is EACCES. For EPERM, mount(8) falls back to
> nfs3, for EACCES it dies a terrible death.

OK. That's good information. Thanks for testing!

I'm still curious as to why that NFS server is refusing all NFSv4 mounts
with NFS4ERR_WRONGSEC. Unless NFSv4 really is configured only to export
the root filesystem with RPCSEC_GSS, then that definitely sounds like a
bug...

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: Race condition when creating multiple namespaces?
From: Hans Schillstrom @ 2011-04-14 20:46 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, Daniel Lezcano
In-Reply-To: <m1ei58co08.fsf@fess.ebiederm.org>

Hello
I thought this might have been a kvm bug, but now I've got it in net-next 2.6.39-rc2 too

On Tuesday, April 12, 2011 02:27:35 Eric W. Biederman wrote:
> Hans Schillstrom <hans@schillstrom.com> writes:
> 
> > Hello
> > I've been struggling with this for some time now
> >
> > When creating multiple namespaces using lxc-start, uninitialized network namespace parts will be called by the new process in the namespace,
> > e.g. when using conntrack or ipvsadm too quickly (a sleep 2 "solves" the problem).
> > (From what I can see, syscall clone() is used in lxc-start, i.e. do_fork will be called later on.)
> > Actually I was debugging ip_vs when closing multiple ns when I fell into this one.
> >
> > I have a loop that creates 33 containers with lxc-start ... -- test.sh
> > The first thing the new container does in test.sh is
> > #!/bin/bash
> > iptables -t mangle -A PREROUTING -m conntrack --ctstate RELATED,ESTABLISHED -j CONNMARK --restore-mark
> > nc -l -p1234
> >
> > This results in a NULL ptr in ip_conntrack_net_init(struct net *net)
> 
> Ouch!
> 
> > and in another test, test.sh looks like this
> > #!/bin/bash
> > ipvsadm --start-daemon=master --mcast-interface=lo
> > nc -l -p1234
> >
> > And this results in an uninitialized spinlock in ip_vs_sync
> >
> > I put a printk in nsproxy: copy_namespaces() and could see dozens of them
> > before anything appears from ipvs or conntrack.
> >
> > My feeling is that when you start up user processes in a new namespace,
> > all kernel-related init should have been done (you should not need to add a sleep to get it working).
> >
> > All tests were made using today's net-next-2.6 (2.6.39-rc1)
> >

Same problem in rc2 from today

> > Note:
> > Neither the conntrack nor the ip_vs modules were loaded;
> > if the modules were loaded before creating new namespaces it all works...
> >
> > Finally the question:
> > Should it really work to load modules within a namespace
> > that is a part of netns?
> 
> From an implementation point of view kernel modules are not in a
> namespace, so there should be no difference between being in a namespace
> and loading a kernel networking module and not being in a namespace and
> loading a kernel module.
> 
> It does sound like you have hit a module loading race, and perhaps
> a race that is confined to network namespaces.
> 

When the namespace was created I had a bunch of IPv4 & IPv6 tunnels and eth0 & eth1


[ 1114.323402] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 1114.330293] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 1114.331002] IP: [<ffffffff8104de50>] __sysctl_head_next+0x70/0x130
[ 1114.331002] PGD 169693067 PUD 16bfce067 PMD 0 
[ 1114.331002] Oops: 0000 [#1] PREEMPT SMP 
[ 1114.331002] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/scsi_generic/sg0/dev
[ 1114.331002] CPU 1 
[ 1114.331002] Modules linked in: nf_conntrack(+) macvlan arptable_filter arp_tables 3c59x nouveau ttm drm_kms_helper
[ 1114.331002] 
[ 1114.331002] Pid: 936, comm: modprobe Not tainted 2.6.39-rc2+ #21 System manufacturer System Product Name/P5B
[ 1114.331002] RIP: 0010:[<ffffffff8104de50>]  [<ffffffff8104de50>] __sysctl_head_next+0x70/0x130
[ 1114.331002] RSP: 0018:ffff880169c1bb98  EFLAGS: 00010286
[ 1114.331002] RAX: ffff88016bdb1530 RBX: fffffffffffffff8 RCX: 0000000000000000
[ 1114.331002] RDX: 000000000000e901 RSI: ffff880169c1bda8 RDI: ffffffff816b94a0
[ 1114.331002] RBP: ffff880169c1bbb8 R08: 0000000000000000 R09: ffff880169eee2b0
[ 1114.331002] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
[ 1114.331002] R13: ffff880169c1bda8 R14: ffffffffa0103300 R15: 0000000000000001
[ 1114.331002] FS:  00007f6039af3700(0000) GS:ffff88017fc80000(0000) knlGS:0000000000000000
[ 1114.331002] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1114.331002] CR2: 0000000000000018 CR3: 000000016968d000 CR4: 00000000000006e0
[ 1114.331002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1114.331002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1114.331002] Process modprobe (pid: 936, threadinfo ffff880169c1a000, task ffff88016bcc16c0)
[ 1114.331002] Stack:
[ 1114.331002]  ffff88017fffcc00 ffff880169eec9c8 ffff880169c1bbf0 ffff880169eee388
[ 1114.331002]  ffff880169c1bc28 ffffffff8106fba5 000000007fffde48 ffff880169c1bda8
[ 1114.331002]  0000000201c94f80 ffff88016958f818 ffff880169c1bc38 0000000000000000
[ 1114.331002] Call Trace:
[ 1114.331002]  [<ffffffff8106fba5>] sysctl_check_table+0x2b5/0x3f0
[ 1114.331002]  [<ffffffff8106f955>] sysctl_check_table+0x65/0x3f0
[ 1114.331002]  [<ffffffff8106f955>] sysctl_check_table+0x65/0x3f0
[ 1114.331002]  [<ffffffff8104dadc>] __register_sysctl_paths+0xfc/0x320
[ 1114.331002]  [<ffffffff810fd85a>] ? cache_alloc_debugcheck_after+0xea/0x220
[ 1114.331002]  [<ffffffffa01006ce>] ? nf_conntrack_acct_init+0x3e/0xe0 [nf_conntrack]
[ 1114.331002]  [<ffffffff811007ef>] ? __kmalloc_track_caller+0x11f/0x2a0
[ 1114.331002]  [<ffffffff814534f1>] register_net_sysctl_table+0x61/0x70
[ 1114.331002]  [<ffffffffa01006f4>] nf_conntrack_acct_init+0x64/0xe0 [nf_conntrack]
[ 1114.331002]  [<ffffffffa00f8604>] nf_conntrack_init+0xf4/0x350 [nf_conntrack]
[ 1114.331002]  [<ffffffffa00fb614>] nf_conntrack_net_init+0x14/0x1a0 [nf_conntrack]
[ 1114.331002]  [<ffffffff813718d7>] ops_init+0x47/0x130
[ 1114.331002]  [<ffffffff81371de3>] register_pernet_operations+0xa3/0x180
[ 1114.331002]  [<ffffffffa010c000>] ? 0xffffffffa010bfff
[ 1114.331002]  [<ffffffffa010c000>] ? 0xffffffffa010bfff
[ 1114.331002]  [<ffffffff81371fec>] register_pernet_subsys+0x2c/0x50
[ 1114.331002]  [<ffffffffa010c010>] nf_conntrack_standalone_init+0x10/0x12 [nf_conntrack]
[ 1114.331002]  [<ffffffff810001d3>] do_one_initcall+0x43/0x170
[ 1114.331002]  [<ffffffff8108393b>] sys_init_module+0xbb/0x200
[ 1114.331002]  [<ffffffff81469beb>] system_call_fastpath+0x16/0x1b
[ 1114.331002] Code: 87 00 00 00 48 8b 5b 30 4d 8b 24 24 48 8b 43 30 48 85 c0 0f 84 92 00 00 00 4c 89 ee 48 89 df ff d0 49 39 c4 74 45 49 8d 5c 24 f8 
[ 1114.331002]  83 7b 20 00 75 d2 83 43 18 01 48 c7 c7 60 9a 67 81 e8 a9 b2 
[ 1114.331002] RIP  [<ffffffff8104de50>] __sysctl_head_next+0x70/0x130
[ 1114.331002]  RSP <ffff880169c1bb98>
[ 1114.331002] CR2: 0000000000000018
[ 1114.691196] ---[ end trace b3f24866c78b4f05 ]---
[ 1114.696485] note: modprobe[936] exited with preempt_count 1
[ 1114.702440] BUG: sleeping function called from invalid context at /opt/src/ericsson/kvm/net-next-2.6/kernel/rwsem.c:21


> My head is in another problem so I won't be able to look at this for
> a bit.  But if you are getting into ip_conntrack_net_init with
> a NULL network namespace something spectacularly bad is happening.
> 
> In particular it looks like you must be hitting a bug in for_each_net.
> Which would pretty much have to be a race in adding or removing from
> net_namespace_list.
> 
> I took a quick skim through the code and whenever we modify the
> net_namespace we hold both the net_mutex and inside it the rtnl_lock so I
> don't immediately see how you could be getting a NULL net into
> ip_conntrack_net_init.
> 
> Is there a codepath besides register_pernet_subsys that is calling
> ip_conntrack_net_init?
> 
In this case it's ip_vs that tries to load nf_conntrack

> Do you have any local modifications that could be messing up register_pernet_subsys?

Nope.
> 
> Eric
> 
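
The locking argument in Eric's quoted reply can be sketched in user space. This is only an illustration under stated assumptions, not the kernel implementation: `struct ns`, `struct ops`, `ns_mutex`, `register_ops()`, `add_namespace()` and `mark_init()` are hypothetical stand-ins for `struct net`, `pernet_operations`, `net_mutex`, `register_pernet_subsys()` and namespace creation. Both paths take the same mutex, and a namespace is published on the list only after every registered init hook has run on it, so an init hook should never be handed a NULL or half-built namespace. Rollback on init failure is left out to keep the sketch short.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_NS  8
#define MAX_OPS 4

/* Hypothetical stand-in for struct net. */
struct ns {
	int id;
	int inited[MAX_OPS];	/* inited[slot] is set once ops in that slot ran */
};

/* Hypothetical stand-in for struct pernet_operations. */
struct ops {
	int (*init)(struct ns *ns, int slot);
};

/* Plays the role of net_mutex: guards both lists below. */
static pthread_mutex_t ns_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct ns *ns_list[MAX_NS];
static int ns_count;
static struct ops *ops_list[MAX_OPS];
static int ops_count;

/* Example init hook: records that it ran for this namespace. */
static int mark_init(struct ns *ns, int slot)
{
	assert(ns != NULL);	/* the condition the thread is worried about */
	ns->inited[slot] = 1;
	return 0;
}

/* Register ops and run init on every namespace that already exists. */
int register_ops(struct ops *ops)
{
	int err = 0;

	pthread_mutex_lock(&ns_mutex);
	if (ops_count == MAX_OPS) {
		err = -1;
		goto out;
	}
	for (int i = 0; i < ns_count; i++) {
		err = ops->init(ns_list[i], ops_count);
		if (err)
			goto out;	/* a real implementation would undo earlier inits */
	}
	ops_list[ops_count++] = ops;
out:
	pthread_mutex_unlock(&ns_mutex);
	return err;
}

/* Create a namespace: run every registered init, then publish it. */
int add_namespace(struct ns *ns)
{
	int err = 0;

	pthread_mutex_lock(&ns_mutex);
	if (ns_count == MAX_NS) {
		err = -1;
		goto out;
	}
	for (int i = 0; i < ops_count; i++) {
		err = ops_list[i]->init(ns, i);
		if (err)
			goto out;
	}
	ns_list[ns_count++] = ns;	/* visible to register_ops() only now */
out:
	pthread_mutex_unlock(&ns_mutex);
	return err;
}
```

If the two paths did not share one lock, a module load could walk the list while a namespace was being added, which is the kind of race suspected above.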


* Re: [PATCH 2/2] bna: fix memory leak during RX path cleanup
From: David Miller @ 2011-04-14 20:40 UTC (permalink / raw)
  To: rmody; +Cc: netdev, ddutt
In-Reply-To: <1302804319-15677-2-git-send-email-rmody@brocade.com>

From: Rasesh Mody <rmody@brocade.com>
Date: Thu, 14 Apr 2011 11:05:19 -0700

> The memory leak was caused by unintentional assignment of the Rx path
> destroy callback function pointer to NULL just after correct
> initialization.
> 
> Signed-off-by: Debashis Dutt <ddutt@brocade.com>
> Signed-off-by: Rasesh Mody <rmody@brocade.com>

Applied.


* Re: [PATCH 1/2] bna: fix for clean fw re-initialization
From: David Miller @ 2011-04-14 20:39 UTC (permalink / raw)
  To: rmody; +Cc: netdev, ddutt
In-Reply-To: <1302804319-15677-1-git-send-email-rmody@brocade.com>

From: Rasesh Mody <rmody@brocade.com>
Date: Thu, 14 Apr 2011 11:05:18 -0700

> During a kernel crash, bna control path state machine and firmware do not
> get a notification and hence are not cleanly shutdown. The registers
> holding driver/IOC state information are not reset back to valid
> disabled/parking values. This causes subsequent driver initialization
> to hang during kdump kernel boot. This patch, during the initialization
> of the first PCI function, resets the corresponding register when an unclean shutdown
> is detected by reading chip registers. This will make sure that ioc/fw
> gets clean re-initialization.
> 
> Signed-off-by: Debashis Dutt <ddutt@brocade.com>
> Signed-off-by: Rasesh Mody <rmody@brocade.com>

Applied.


* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Jiri Slaby @ 2011-04-14 20:37 UTC (permalink / raw)
  To: Bryan Schumaker
  Cc: Trond Myklebust, Jiri Slaby, linux-kernel, akpm, mm-commits,
	ML netdev, linux-nfs
In-Reply-To: <4DA60AB9.1050104-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>

On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
> On 04/12/2011 02:52 PM, Jiri Slaby wrote:
>> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>>> unacceptable for automounted NFS dirs.
>>>>>
>>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>>> and then trying rpcsec_gss if and only if that fails.
>>>>>
>>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>>> supported flavour?
>>>>
>>>> I don't know, I connect to an NFS server which is not maintained by me.
>>>> It looks like that. How can I find out?
>>>
>>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>>
>> I don't have NFS in modules. It's all built-in. And this one is
>> unconditionally selected because of CONFIG_NFS_V4.
> 
> Does this patch help?

Nope, it makes things even worse:
# mount -oro,intr XXX:/yyy /mnt/c/
<15s delay here>
mount.nfs: access denied by server while mounting XXX:/yyy

So in nfs4_proc_get_root I do:
  printk("%s: %d %u\n", __func__, i, flav_array[i]);
  status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
  printk("%s: res=%d\n", __func__, status);
and get:
[   18.159818] nfs4_proc_get_root: 0 1
[   18.214872] nfs4_proc_get_root: res=-1
[   18.214875] nfs4_proc_get_root: 1 0
[   18.254636] nfs4_proc_get_root: res=-1
[   18.254639] nfs4_proc_get_root: 2 390003
[   33.252174] RPC: AUTH_GSS upcall timed out.
[   33.252177] Please check user daemon is running.
[   33.252192] nfs4_proc_get_root: res=-13

If I revert that back and do the same:
[   28.275569] nfs4_proc_get_root: 0 1
[   28.296545] nfs4_proc_get_root: res=-1
[   28.296548] nfs4_proc_get_root: 1 390003
[   43.296107] RPC: AUTH_GSS upcall timed out.
[   43.296108] Please check user daemon is running.
[   43.296121] nfs4_proc_get_root: res=-13
[   43.296122] nfs4_proc_get_root: 2 0
[   43.318201] nfs4_proc_get_root: res=-1

I.e. all methods fail. And what matters is the last retval. From NULL it
is EPERM, from GSS it is EACCES. For EPERM, mount(8) falls back to
nfs3, for EACCES it dies a terrible death.

linux-b984:~ # strace -fe mount -s 1000 mount -oro,intr XXX:/yyy /mnt/c/
Process 2396 attached
Process 2395 suspended
[pid  2396] mount("XXX:/yyy", "/mnt/c", "nfs", MS_RDONLY,
"intr,vers=4,addr=10.20.3.2,clientaddr=10.0.2.15") = -1 EPERM (Operation
not permitted)
[pid  2396] mount("XXX:/yyy", "/mnt/c", "nfs", MS_RDONLY,
"intr,addr=10.20.3.2,vers=3,proto=tcp,mountvers=3,mountproto=udp,mountport=709")
= 0
Process 2395 resumed
Process 2396 detached
--- SIGCHLD (Child exited) @ 0 (0) ---
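
The "last retval wins" behaviour described above can be sketched in plain C. This is a minimal user-space illustration, assuming the statuses seen in the logs (-1 for flavors 0 and 1, -13 for 390003); `try_flavor()` and `probe_root()` are hypothetical stand-ins for nfs4_lookup_root_sec() and the loop in nfs4_proc_get_root, not the kernel code itself.

```c
#include <assert.h>
#include <errno.h>

/* Flavor numbers as printed by the debug printk above. */
enum {
	FLAVOR_NULL     = 0,
	FLAVOR_UNIX     = 1,
	FLAVOR_GSS_KRB5 = 390003
};

/* Hypothetical stand-in for nfs4_lookup_root_sec(): replays the logged results. */
static int try_flavor(unsigned int flavor)
{
	return flavor == FLAVOR_GSS_KRB5 ? -EACCES : -EPERM;
}

/* Try each flavor in order; stop on success, otherwise keep the last status. */
int probe_root(const unsigned int *flavors, int n)
{
	int status = -EPERM;

	assert(n > 0);
	for (int i = 0; i < n; i++) {
		status = try_flavor(flavors[i]);
		if (status == 0)
			break;
	}
	return status;
}
```

With the order from the patched run (1, 0, 390003) the caller sees -EACCES and mount(8) gives up; with the reverted order (1, 390003, 0) it sees -EPERM and falls back to NFSv3, which matches the strace output above.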

thanks,
-- 
js
suse labs


* Re: pull request: wireless-2.6 2011-04-14
From: David Miller @ 2011-04-14 20:18 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20110414200858.GC2652@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Thu, 14 Apr 2011 16:08:58 -0400

> Another small round of fixes intended for 2.6.39...
> 
> Included are a fix for a WARNING from ath9k regarding a DMA failure, a
> fix for iwlegacy not initializing its Tx power correctly, and a small
> ath9k fix to report the driver name correctly to ethtool.

Pulled, thanks a lot John.


* Re: [PATCH NET-2.6 1/1] qlcnic: limit skb frags for non tso packet
From: Greg KH @ 2011-04-14 20:09 UTC (permalink / raw)
  To: Amit Salecha
  Cc: netdev, Anirban Chakraborty, David Miller, Ameen Rahman, stable
In-Reply-To: <99737F4847ED0A48AECC9F4A1974A4B80FD13840FE@MNEXMB2.qlogic.org>

On Thu, Apr 14, 2011 at 12:22:35AM -0500, Amit Salecha wrote:
> > > The footer will be present in my reply to this email. But the footer should not
> > > be there in patches sent by me.
> > > Can you verify patch version 2 again? Here
> > > http://patchwork.ozlabs.org/patch/90938/ I don't see any footer.
> > > If you see the footer with patch version 2, please send me that.
> >
> > Your footer was not in your patch, correct.  But it was in this email.
> >
> > And that's the issue, you can't have that footer on emails you send to a
> > public list where you are going to be collaborating on a public project,
> > otherwise no one can use anything you say.
> >
> > Now if you only think that people will just accept your patches, without
> > being able to have you participate in development and maintenance of those
> > patches (which is required to be done through email), you are mistaken.
> >
> > So please fix your email issue, otherwise it is not going to work.
> >
> > Note, other people at qualcomm have fixed this, so you are not alone.
> >
> Ok.
> 
> Our IT has fixed the footer issue. Sending this email to verify.
> 
> -Amit
> 
> This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.

Nope, still shows up :(


_______________________________________________
stable mailing list
stable@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/stable


* pull request: wireless-2.6 2011-04-14
From: John W. Linville @ 2011-04-14 20:08 UTC (permalink / raw)
  To: davem
  Cc: linux-wireless, netdev, linux-kernel

Dave,

Another small round of fixes intended for 2.6.39...

Included are a fix for a WARNING from ath9k regarding a DMA failure, a
fix for iwlegacy not initializing its Tx power correctly, and a small
ath9k fix to report the driver name correctly to ethtool.

Please let me know if there are problems!

Thanks,

John

---

The following changes since commit 38a2f37258f9e2ae3f6e4241e01088be8dfaf4e9:

  usbnet: Fix up 'FLAG_POINTTOPOINT' and 'FLAG_MULTI_PACKET' overlaps. (2011-04-14 00:22:27 -0700)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

Felix Fietkau (1):
      ath9k_hw: fix stopping rx DMA during resets

Stanislaw Gruszka (1):
      iwlegacy: fix tx_power initialization

Sujith Manoharan (1):
      ath9k_htc: Fix ethtool reporting

 drivers/net/wireless/ath/ath9k/hif_usb.c     |    4 ++--
 drivers/net/wireless/ath/ath9k/hw.c          |    9 ---------
 drivers/net/wireless/ath/ath9k/mac.c         |   25 ++++++++++++++++++++++---
 drivers/net/wireless/ath/ath9k/mac.h         |    2 +-
 drivers/net/wireless/ath/ath9k/recv.c        |    6 +++---
 drivers/net/wireless/iwlegacy/iwl-3945-hw.h  |    2 --
 drivers/net/wireless/iwlegacy/iwl-4965-hw.h  |    3 ---
 drivers/net/wireless/iwlegacy/iwl-core.c     |   17 +++++++++++------
 drivers/net/wireless/iwlegacy/iwl-eeprom.c   |    7 -------
 drivers/net/wireless/iwlegacy/iwl3945-base.c |    4 ----
 drivers/net/wireless/iwlegacy/iwl4965-base.c |    6 ------
 11 files changed, 39 insertions(+), 46 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/hif_usb.c b/drivers/net/wireless/ath/ath9k/hif_usb.c
index f1b8af6..2d10239 100644
--- a/drivers/net/wireless/ath/ath9k/hif_usb.c
+++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
@@ -1040,7 +1040,7 @@ static int ath9k_hif_usb_probe(struct usb_interface *interface,
 	}
 
 	ret = ath9k_htc_hw_init(hif_dev->htc_handle,
-				&hif_dev->udev->dev, hif_dev->device_id,
+				&interface->dev, hif_dev->device_id,
 				hif_dev->udev->product, id->driver_info);
 	if (ret) {
 		ret = -EINVAL;
@@ -1158,7 +1158,7 @@ fail_resume:
 #endif
 
 static struct usb_driver ath9k_hif_usb_driver = {
-	.name = "ath9k_hif_usb",
+	.name = KBUILD_MODNAME,
 	.probe = ath9k_hif_usb_probe,
 	.disconnect = ath9k_hif_usb_disconnect,
 #ifdef CONFIG_PM
diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c
index 1ec9bcd..c95bc5c 100644
--- a/drivers/net/wireless/ath/ath9k/hw.c
+++ b/drivers/net/wireless/ath/ath9k/hw.c
@@ -1254,15 +1254,6 @@ int ath9k_hw_reset(struct ath_hw *ah, struct ath9k_channel *chan,
 	ah->txchainmask = common->tx_chainmask;
 	ah->rxchainmask = common->rx_chainmask;
 
-	if ((common->bus_ops->ath_bus_type != ATH_USB) && !ah->chip_fullsleep) {
-		ath9k_hw_abortpcurecv(ah);
-		if (!ath9k_hw_stopdmarecv(ah)) {
-			ath_dbg(common, ATH_DBG_XMIT,
-				"Failed to stop receive dma\n");
-			bChannelChange = false;
-		}
-	}
-
 	if (!ath9k_hw_setpower(ah, ATH9K_PM_AWAKE))
 		return -EIO;
 
diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index 562257a..edc1cbb 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -751,28 +751,47 @@ void ath9k_hw_abortpcurecv(struct ath_hw *ah)
 }
 EXPORT_SYMBOL(ath9k_hw_abortpcurecv);
 
-bool ath9k_hw_stopdmarecv(struct ath_hw *ah)
+bool ath9k_hw_stopdmarecv(struct ath_hw *ah, bool *reset)
 {
 #define AH_RX_STOP_DMA_TIMEOUT 10000   /* usec */
 #define AH_RX_TIME_QUANTUM     100     /* usec */
 	struct ath_common *common = ath9k_hw_common(ah);
+	u32 mac_status, last_mac_status = 0;
 	int i;
 
+	/* Enable access to the DMA observation bus */
+	REG_WRITE(ah, AR_MACMISC,
+		  ((AR_MACMISC_DMA_OBS_LINE_8 << AR_MACMISC_DMA_OBS_S) |
+		   (AR_MACMISC_MISC_OBS_BUS_1 <<
+		    AR_MACMISC_MISC_OBS_BUS_MSB_S)));
+
 	REG_WRITE(ah, AR_CR, AR_CR_RXD);
 
 	/* Wait for rx enable bit to go low */
 	for (i = AH_RX_STOP_DMA_TIMEOUT / AH_TIME_QUANTUM; i != 0; i--) {
 		if ((REG_READ(ah, AR_CR) & AR_CR_RXE) == 0)
 			break;
+
+		if (!AR_SREV_9300_20_OR_LATER(ah)) {
+			mac_status = REG_READ(ah, AR_DMADBG_7) & 0x7f0;
+			if (mac_status == 0x1c0 && mac_status == last_mac_status) {
+				*reset = true;
+				break;
+			}
+
+			last_mac_status = mac_status;
+		}
+
 		udelay(AH_TIME_QUANTUM);
 	}
 
 	if (i == 0) {
 		ath_err(common,
-			"DMA failed to stop in %d ms AR_CR=0x%08x AR_DIAG_SW=0x%08x\n",
+			"DMA failed to stop in %d ms AR_CR=0x%08x AR_DIAG_SW=0x%08x DMADBG_7=0x%08x\n",
 			AH_RX_STOP_DMA_TIMEOUT / 1000,
 			REG_READ(ah, AR_CR),
-			REG_READ(ah, AR_DIAG_SW));
+			REG_READ(ah, AR_DIAG_SW),
+			REG_READ(ah, AR_DMADBG_7));
 		return false;
 	} else {
 		return true;
diff --git a/drivers/net/wireless/ath/ath9k/mac.h b/drivers/net/wireless/ath/ath9k/mac.h
index b2b2ff8..c2a5938 100644
--- a/drivers/net/wireless/ath/ath9k/mac.h
+++ b/drivers/net/wireless/ath/ath9k/mac.h
@@ -695,7 +695,7 @@ bool ath9k_hw_setrxabort(struct ath_hw *ah, bool set);
 void ath9k_hw_putrxbuf(struct ath_hw *ah, u32 rxdp);
 void ath9k_hw_startpcureceive(struct ath_hw *ah, bool is_scanning);
 void ath9k_hw_abortpcurecv(struct ath_hw *ah);
-bool ath9k_hw_stopdmarecv(struct ath_hw *ah);
+bool ath9k_hw_stopdmarecv(struct ath_hw *ah, bool *reset);
 int ath9k_hw_beaconq_setup(struct ath_hw *ah);
 
 /* Interrupt Handling */
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index a9c3f46..dcd19bc 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -486,12 +486,12 @@ start_recv:
 bool ath_stoprecv(struct ath_softc *sc)
 {
 	struct ath_hw *ah = sc->sc_ah;
-	bool stopped;
+	bool stopped, reset = false;
 
 	spin_lock_bh(&sc->rx.rxbuflock);
 	ath9k_hw_abortpcurecv(ah);
 	ath9k_hw_setrxfilter(ah, 0);
-	stopped = ath9k_hw_stopdmarecv(ah);
+	stopped = ath9k_hw_stopdmarecv(ah, &reset);
 
 	if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA)
 		ath_edma_stop_recv(sc);
@@ -506,7 +506,7 @@ bool ath_stoprecv(struct ath_softc *sc)
 			"confusing the DMA engine when we start RX up\n");
 		ATH_DBG_WARN_ON_ONCE(!stopped);
 	}
-	return stopped;
+	return stopped || reset;
 }
 
 void ath_flushrecv(struct ath_softc *sc)
diff --git a/drivers/net/wireless/iwlegacy/iwl-3945-hw.h b/drivers/net/wireless/iwlegacy/iwl-3945-hw.h
index 779d3cb..5c3a68d 100644
--- a/drivers/net/wireless/iwlegacy/iwl-3945-hw.h
+++ b/drivers/net/wireless/iwlegacy/iwl-3945-hw.h
@@ -74,8 +74,6 @@
 /* RSSI to dBm */
 #define IWL39_RSSI_OFFSET	95
 
-#define IWL_DEFAULT_TX_POWER	0x0F
-
 /*
  * EEPROM related constants, enums, and structures.
  */
diff --git a/drivers/net/wireless/iwlegacy/iwl-4965-hw.h b/drivers/net/wireless/iwlegacy/iwl-4965-hw.h
index 08b189c..fc6fa28 100644
--- a/drivers/net/wireless/iwlegacy/iwl-4965-hw.h
+++ b/drivers/net/wireless/iwlegacy/iwl-4965-hw.h
@@ -804,9 +804,6 @@ struct iwl4965_scd_bc_tbl {
 
 #define IWL4965_DEFAULT_TX_RETRY  15
 
-/* Limit range of txpower output target to be between these values */
-#define IWL4965_TX_POWER_TARGET_POWER_MIN	(0)	/* 0 dBm: 1 milliwatt */
-
 /* EEPROM */
 #define IWL4965_FIRST_AMPDU_QUEUE	10
 
diff --git a/drivers/net/wireless/iwlegacy/iwl-core.c b/drivers/net/wireless/iwlegacy/iwl-core.c
index a209a0e..2b08efb 100644
--- a/drivers/net/wireless/iwlegacy/iwl-core.c
+++ b/drivers/net/wireless/iwlegacy/iwl-core.c
@@ -160,6 +160,7 @@ int iwl_legacy_init_geos(struct iwl_priv *priv)
 	struct ieee80211_channel *geo_ch;
 	struct ieee80211_rate *rates;
 	int i = 0;
+	s8 max_tx_power = 0;
 
 	if (priv->bands[IEEE80211_BAND_2GHZ].n_bitrates ||
 	    priv->bands[IEEE80211_BAND_5GHZ].n_bitrates) {
@@ -235,8 +236,8 @@ int iwl_legacy_init_geos(struct iwl_priv *priv)
 
 			geo_ch->flags |= ch->ht40_extension_channel;
 
-			if (ch->max_power_avg > priv->tx_power_device_lmt)
-				priv->tx_power_device_lmt = ch->max_power_avg;
+			if (ch->max_power_avg > max_tx_power)
+				max_tx_power = ch->max_power_avg;
 		} else {
 			geo_ch->flags |= IEEE80211_CHAN_DISABLED;
 		}
@@ -249,6 +250,10 @@ int iwl_legacy_init_geos(struct iwl_priv *priv)
 				 geo_ch->flags);
 	}
 
+	priv->tx_power_device_lmt = max_tx_power;
+	priv->tx_power_user_lmt = max_tx_power;
+	priv->tx_power_next = max_tx_power;
+
 	if ((priv->bands[IEEE80211_BAND_5GHZ].n_channels == 0) &&
 	     priv->cfg->sku & IWL_SKU_A) {
 		IWL_INFO(priv, "Incorrectly detected BG card as ABG. "
@@ -1124,11 +1129,11 @@ int iwl_legacy_set_tx_power(struct iwl_priv *priv, s8 tx_power, bool force)
 	if (!priv->cfg->ops->lib->send_tx_power)
 		return -EOPNOTSUPP;
 
-	if (tx_power < IWL4965_TX_POWER_TARGET_POWER_MIN) {
+	/* 0 dBm mean 1 milliwatt */
+	if (tx_power < 0) {
 		IWL_WARN(priv,
-			 "Requested user TXPOWER %d below lower limit %d.\n",
-			 tx_power,
-			 IWL4965_TX_POWER_TARGET_POWER_MIN);
+			 "Requested user TXPOWER %d below 1 mW.\n",
+			 tx_power);
 		return -EINVAL;
 	}
 
diff --git a/drivers/net/wireless/iwlegacy/iwl-eeprom.c b/drivers/net/wireless/iwlegacy/iwl-eeprom.c
index 04c5648..cb346d1 100644
--- a/drivers/net/wireless/iwlegacy/iwl-eeprom.c
+++ b/drivers/net/wireless/iwlegacy/iwl-eeprom.c
@@ -471,13 +471,6 @@ int iwl_legacy_init_channel_map(struct iwl_priv *priv)
 					     flags & EEPROM_CHANNEL_RADAR))
 				       ? "" : "not ");
 
-			/* Set the tx_power_user_lmt to the highest power
-			 * supported by any channel */
-			if (eeprom_ch_info[ch].max_power_avg >
-						priv->tx_power_user_lmt)
-				priv->tx_power_user_lmt =
-				    eeprom_ch_info[ch].max_power_avg;
-
 			ch_info++;
 		}
 	}
diff --git a/drivers/net/wireless/iwlegacy/iwl3945-base.c b/drivers/net/wireless/iwlegacy/iwl3945-base.c
index 28eb3d8..cc7ebce 100644
--- a/drivers/net/wireless/iwlegacy/iwl3945-base.c
+++ b/drivers/net/wireless/iwlegacy/iwl3945-base.c
@@ -3825,10 +3825,6 @@ static int iwl3945_init_drv(struct iwl_priv *priv)
 	priv->force_reset[IWL_FW_RESET].reset_duration =
 		IWL_DELAY_NEXT_FORCE_FW_RELOAD;
 
-
-	priv->tx_power_user_lmt = IWL_DEFAULT_TX_POWER;
-	priv->tx_power_next = IWL_DEFAULT_TX_POWER;
-
 	if (eeprom->version < EEPROM_3945_EEPROM_VERSION) {
 		IWL_WARN(priv, "Unsupported EEPROM version: 0x%04X\n",
 			 eeprom->version);
diff --git a/drivers/net/wireless/iwlegacy/iwl4965-base.c b/drivers/net/wireless/iwlegacy/iwl4965-base.c
index 91b3d8b..d484c36 100644
--- a/drivers/net/wireless/iwlegacy/iwl4965-base.c
+++ b/drivers/net/wireless/iwlegacy/iwl4965-base.c
@@ -3140,12 +3140,6 @@ static int iwl4965_init_drv(struct iwl_priv *priv)
 
 	iwl_legacy_init_scan_params(priv);
 
-	/* Set the tx_power_user_lmt to the lowest power level
-	 * this value will get overwritten by channel max power avg
-	 * from eeprom */
-	priv->tx_power_user_lmt = IWL4965_TX_POWER_TARGET_POWER_MIN;
-	priv->tx_power_next = IWL4965_TX_POWER_TARGET_POWER_MIN;
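
A minimal userspace sketch of the logic in the hunks above, assuming a plain array of per-channel limits (valid channels only) in place of the driver's channel structures. The names `max_channel_power` and `tx_power_valid` are hypothetical, not driver functions. The scan starts from 0 dBm, as `max_tx_power` does in the patch, and the validity check mirrors only the new `tx_power < 0` test (0 dBm is 1 mW, since P[mW] = 10^(dBm/10)).

```c
#include <stddef.h>
#include <stdint.h>

/* Highest per-channel average power limit in dBm. The scan starts at
 * 0 dBm, like max_tx_power in the patched iwl_legacy_init_geos(). */
static int8_t max_channel_power(const int8_t *max_power_avg, size_t n)
{
	int8_t max_tx_power = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (max_power_avg[i] > max_tx_power)
			max_tx_power = max_power_avg[i];
	return max_tx_power;
}

/* Mirrors the new lower bound only: anything below 0 dBm (1 mW)
 * is rejected. Returns 1 if the request passes this check. */
static int tx_power_valid(int8_t tx_power)
{
	return tx_power >= 0;
}
```

In the patch the single maximum is then copied to `tx_power_device_lmt`, `tx_power_user_lmt` and `tx_power_next`, so all three start from the same value.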


* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Alexander Duyck @ 2011-04-14 19:08 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Peter Zijlstra, Wei Gu, netdev, Kirsher, Jeffrey T,
	Mike Galbraith
In-Reply-To: <1302803357.2744.1.camel@edumazet-laptop>

On 4/14/2011 10:49 AM, Eric Dumazet wrote:
> On Thursday, 14 April 2011 at 18:57 +0200, Eric Dumazet wrote:
>> On Thursday, 14 April 2011 at 18:56 +0200, Peter Zijlstra wrote:
>>> On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:
>>>
>>>> I'm doing some more digging into this now.  One thought that occurred to
>>>> me is that if the patch you mention is having some sort of effect this
>>>> could be a sign of perhaps a kernel timer or scheduling problem.
>>>
>>> Right, so the removal of the NO_HZ throttle will allow the CPU to go
>>> into C states more often, this could result in longer wake-up times for
>>> IRQs.
>>>
>>> We reverted because:
>>>    - it caused significant battery drain due to not going into C states
>>>      often enough, and
>>>    - it's a much better idea to implement these things in the idle
>>>      governor since it already has the job of guesstimating the idle
>>>      duration.
>>>
>>> I really can't remember back far enough to even come up with a theory of
>>> why kernels prior to merging the NO_HZ throttle would not exhibit this
>>> problem.
>>>
>>>
>>>
>>
>> Normally, Wei Gu already asked to not use C states.
>>
>> http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01804533/c01804533.pdf
>>
>> How can we/he check this ?
>>
>>
>
> Anyway, this could explain a latency problem, not packet drops.
>
> With NAPI, we should get few hardware irqs under load.
>
> Once the softirq has started, the scheduler is out of the equation.

The problem is on these newer systems it is becoming significantly 
harder to get locked into the polling only state.  In many cases we will 
just complete all of the RX work in a single poll and go back to 
interrupts.  This is especially true when traffic is spread out across 
multiple queues and CPUs.

I'm thinking that maybe powertop results for before that patch and after 
that patch should be pretty telling.  It should tell us if C states are 
active, and if so it will also tell us if we are being woken by 
interrupts or if we are staying in the polling state.
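
The behaviour described above can be illustrated with a small userspace model of a NAPI-style poll routine. This is a sketch under stated assumptions, not ixgbe code: `model_poll`, `struct rx_queue` and the budget of 64 used in the note below are hypothetical. The one rule it borrows from NAPI is that a poll which handles fewer packets than its budget completes and re-enables the interrupt, while a poll that uses the whole budget stays in polling mode.

```c
#include <stdbool.h>

/* Hypothetical model of one RX queue; not a kernel structure. */
struct rx_queue {
	unsigned int pending;	/* packets waiting in the ring */
	bool irq_enabled;	/* true once we fall back to interrupts */
};

/* Process up to 'budget' packets and return how many were handled.
 * If the ring was drained before the budget ran out, polling ends
 * and the interrupt is re-armed, as a NAPI poll routine would do. */
static unsigned int model_poll(struct rx_queue *q, unsigned int budget)
{
	unsigned int work_done = q->pending < budget ? q->pending : budget;

	q->pending -= work_done;
	q->irq_enabled = work_done < budget;
	return work_done;
}
```

With a budget of 64, a queue holding 10 packets leaves the poll with interrupts re-enabled, whereas a queue holding 200 stays in polling mode. Spreading the same load over many queues pushes each queue toward the first case.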

Thanks,

Alex


* Re: [net-next-2.6 RFC PATCH v3] ethtool: allow custom interval for physical identification
From: Jon Mason @ 2011-04-14 18:55 UTC (permalink / raw)
  To: Bruce Allan
  Cc: netdev, Ben Hutchings, Sathya Perla, Subbu Seetharaman,
	Ajit Khaparde, Michael Chan, Eilon Greenstein, Divy Le Ray,
	Don Fry, Solarflare linux maintainers, Steve Hodgson,
	Stephen Hemminger, Matt Carlson
In-Reply-To: <20110413230910.16317.11372.stgit@gitlad.jf.intel.com>

On Wed, Apr 13, 2011 at 04:09:10PM -0700, Bruce Allan wrote:
> When physical identification of an adapter is done by toggling the
> mechanism on and off through software utilizing the set_phys_id operation,
> it is done with a fixed duration for both on and off states.  Some drivers
> may want to set a custom duration for the on/off intervals.  This patch
> changes the API so the return code from the driver's entry point when it
> is called with ETHTOOL_ID_ACTIVE can specify the frequency at which to
> cycle the on/off states, and updates the drivers that have already been
> converted to use the new set_phys_id and use the synchronous method for
> identifying an adapter.
> 
> The physical identification frequency set in the updated drivers is based
> on how it was done prior to the introduction of set_phys_id.
> 
> Compile tested only.  Also fixes a compiler warning in sfc.
> 
> v2: drivers do not return -EINVAL for ETHTOOL_ID_ACTIVE
> v3: fold patchset into single patch and cleanup per Ben's feedback
> 
> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Acked-by: Jon Mason <jdmason@kudzu.us>
> Cc: Ben Hutchings <bhutchings@solarflare.com>
> Cc: Sathya Perla <sathya.perla@emulex.com>
> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
> Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
> Cc: Michael Chan <mchan@broadcom.com>
> Cc: Eilon Greenstein <eilong@broadcom.com>
> Cc: Divy Le Ray <divy@chelsio.com>
> Cc: Don Fry <pcnet32@frontier.com>
> Cc: Jon Mason <jdmason@kudzu.us>
> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
> Cc: Steve Hodgson <shodgson@solarflare.com>
> Cc: Stephen Hemminger <shemminger@linux-foundation.org>
> Cc: Matt Carlson <mcarlson@broadcom.com>
> ---
> 
>  drivers/net/benet/be_ethtool.c    |    2 +-
>  drivers/net/bnx2.c                |    2 +-
>  drivers/net/bnx2x/bnx2x_ethtool.c |    2 +-
>  drivers/net/cxgb3/cxgb3_main.c    |    2 +-
>  drivers/net/ewrk3.c               |    2 +-
>  drivers/net/niu.c                 |    2 +-
>  drivers/net/pcnet32.c             |    2 +-
>  drivers/net/s2io.c                |    2 +-
>  drivers/net/sfc/ethtool.c         |    6 +++---
>  drivers/net/skge.c                |    2 +-
>  drivers/net/sky2.c                |    2 +-
>  drivers/net/tg3.c                 |    2 +-
>  include/linux/ethtool.h           |    6 ++++--
>  net/core/ethtool.c                |   31 ++++++++++++++++---------------
>  14 files changed, 34 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
> index 96f5502..80226e4 100644
> --- a/drivers/net/benet/be_ethtool.c
> +++ b/drivers/net/benet/be_ethtool.c
> @@ -516,7 +516,7 @@ be_set_phys_id(struct net_device *netdev,
>  	case ETHTOOL_ID_ACTIVE:
>  		be_cmd_get_beacon_state(adapter, adapter->hba_port_num,
>  					&adapter->beacon_state);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		be_cmd_set_beacon_state(adapter, adapter->hba_port_num, 0, 0,
> diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
> index 0a52079..bf729ee 100644
> --- a/drivers/net/bnx2.c
> +++ b/drivers/net/bnx2.c
> @@ -7473,7 +7473,7 @@ bnx2_set_phys_id(struct net_device *dev, enum ethtool_phys_id_state state)
>  
>  		bp->leds_save = REG_RD(bp, BNX2_MISC_CFG);
>  		REG_WR(bp, BNX2_MISC_CFG, BNX2_MISC_CFG_LEDMODE_MAC);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		REG_WR(bp, BNX2_EMAC_LED, BNX2_EMAC_LED_OVERRIDE |
> diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
> index ad7d91e..0a5e88d 100644
> --- a/drivers/net/bnx2x/bnx2x_ethtool.c
> +++ b/drivers/net/bnx2x/bnx2x_ethtool.c
> @@ -2025,7 +2025,7 @@ static int bnx2x_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		bnx2x_set_led(&bp->link_params, &bp->link_vars,
> diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
> index 802c7a7..a087e06 100644
> --- a/drivers/net/cxgb3/cxgb3_main.c
> +++ b/drivers/net/cxgb3/cxgb3_main.c
> @@ -1757,7 +1757,7 @@ static int set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_OFF:
>  		t3_set_reg_field(adapter, A_T3DBG_GPIO_EN, F_GPIO0_OUT_VAL, 0);
> diff --git a/drivers/net/ewrk3.c b/drivers/net/ewrk3.c
> index c7ce443..17b6027 100644
> --- a/drivers/net/ewrk3.c
> +++ b/drivers/net/ewrk3.c
> @@ -1618,7 +1618,7 @@ static int ewrk3_set_phys_id(struct net_device *dev,
>  		/* Prevent ISR from twiddling the LED */
>  		lp->led_mask = 0;
>  		spin_unlock_irq(&lp->hw_lock);
> -		return -EINVAL;
> +		return 2;	/* cycle on/off twice per second */
>  
>  	case ETHTOOL_ID_ON:
>  		cr = inb(EWRK3_CR);
> diff --git a/drivers/net/niu.c b/drivers/net/niu.c
> index 3fa1e9c..ea2272f 100644
> --- a/drivers/net/niu.c
> +++ b/drivers/net/niu.c
> @@ -7896,7 +7896,7 @@ static int niu_set_phys_id(struct net_device *dev,
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
>  		np->orig_led_state = niu_led_state_save(np);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		niu_force_led(np, 1);
> diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
> index e89afb9..0a1efba 100644
> --- a/drivers/net/pcnet32.c
> +++ b/drivers/net/pcnet32.c
> @@ -1038,7 +1038,7 @@ static int pcnet32_set_phys_id(struct net_device *dev,
>  		for (i = 4; i < 8; i++)
>  			lp->save_regs[i - 4] = a->read_bcr(ioaddr, i);
>  		spin_unlock_irqrestore(&lp->lock, flags);
> -		return -EINVAL;
> +		return 2;	/* cycle on/off twice per second */
>  
>  	case ETHTOOL_ID_ON:
>  	case ETHTOOL_ID_OFF:
> diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
> index 2d5cc61..2302d97 100644
> --- a/drivers/net/s2io.c
> +++ b/drivers/net/s2io.c
> @@ -5541,7 +5541,7 @@ static int s2io_ethtool_set_led(struct net_device *dev,
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
>  		sp->adapt_ctrl_org = readq(&bar0->gpio_control);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		s2io_set_led(sp, true);
> diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
> index 644f7c1..5d8468f 100644
> --- a/drivers/net/sfc/ethtool.c
> +++ b/drivers/net/sfc/ethtool.c
> @@ -182,7 +182,7 @@ static int efx_ethtool_phys_id(struct net_device *net_dev,
>  			       enum ethtool_phys_id_state state)
>  {
>  	struct efx_nic *efx = netdev_priv(net_dev);
> -	enum efx_led_mode mode;
> +	enum efx_led_mode mode = EFX_LED_DEFAULT;
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ON:
> @@ -194,8 +194,8 @@ static int efx_ethtool_phys_id(struct net_device *net_dev,
>  	case ETHTOOL_ID_INACTIVE:
>  		mode = EFX_LED_DEFAULT;
>  		break;
> -	default:
> -		return -EINVAL;
> +	case ETHTOOL_ID_ACTIVE:
> +		return 1;	/* cycle on/off once per second */
>  	}
>  
>  	efx->type->set_id_led(efx, mode);
> diff --git a/drivers/net/skge.c b/drivers/net/skge.c
> index 310dcbc..176d784 100644
> --- a/drivers/net/skge.c
> +++ b/drivers/net/skge.c
> @@ -753,7 +753,7 @@ static int skge_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 2;	/* cycle on/off twice per second */
>  
>  	case ETHTOOL_ID_ON:
>  		skge_led(skge, LED_MODE_TST);
> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
> index a4b8fe5..c8d0451 100644
> --- a/drivers/net/sky2.c
> +++ b/drivers/net/sky2.c
> @@ -3813,7 +3813,7 @@ static int sky2_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  	case ETHTOOL_ID_INACTIVE:
>  		sky2_led(sky2, MO_LED_NORM);
>  		break;
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index 9d7defc..7c1a9dd 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -10292,7 +10292,7 @@ static int tg3_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		tw32(MAC_LED_CTRL, LED_CTRL_LNKLED_OVERRIDE |
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index ad22a68..9de3127 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -798,8 +798,10 @@ bool ethtool_invalid_flags(struct net_device *dev, u32 data, u32 supported);
>   *	attached to it.  The implementation may update the indicator
>   *	asynchronously or synchronously, but in either case it must return
>   *	quickly.  It is initially called with the argument %ETHTOOL_ID_ACTIVE,
> - *	and must either activate asynchronous updates or return -%EINVAL.
> - *	If it returns -%EINVAL then it will be called again at intervals with
> + *	and must either activate asynchronous updates and return zero, return
> + *	a negative error or return a positive frequency for synchronous
> + *	indication (e.g. 1 for one on/off cycle per second).  If it returns
> + *	a frequency then it will be called again at intervals with the
>   *	argument %ETHTOOL_ID_ON or %ETHTOOL_ID_OFF and should set the state of
>   *	the indicator accordingly.  Finally, it is called with the argument
>   *	%ETHTOOL_ID_INACTIVE and must deactivate the indicator.  Returns a
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index 41dee2d..13d79f5 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -1669,7 +1669,7 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
>  		return dev->ethtool_ops->phys_id(dev, id.data);
>  
>  	rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_ACTIVE);
> -	if (rc && rc != -EINVAL)
> +	if (rc < 0)
>  		return rc;
>  
>  	/* Drop the RTNL lock while waiting, but prevent reentry or
> @@ -1684,21 +1684,22 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
>  		schedule_timeout_interruptible(
>  			id.data ? (id.data * HZ) : MAX_SCHEDULE_TIMEOUT);
>  	} else {
> -		/* Driver expects to be called periodically */
> +		/* Driver expects to be called at twice the frequency in rc */
> +		int n = rc * 2, i, interval = HZ / n;
> +
> +		/* Count down seconds */
>  		do {
> -			rtnl_lock();
> -			rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_ON);
> -			rtnl_unlock();
> -			if (rc)
> -				break;
> -			schedule_timeout_interruptible(HZ / 2);
> -
> -			rtnl_lock();
> -			rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_OFF);
> -			rtnl_unlock();
> -			if (rc)
> -				break;
> -			schedule_timeout_interruptible(HZ / 2);
> +			/* Count down iterations per second */
> +			i = n;
> +			do {
> +				rtnl_lock();
> +				rc = dev->ethtool_ops->set_phys_id(dev,
> +				    (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
> +				rtnl_unlock();
> +				if (rc)
> +					break;
> +				schedule_timeout_interruptible(interval);
> +			} while (!signal_pending(current) && --i != 0);
>  		} while (!signal_pending(current) &&
>  			 (id.data == 0 || --id.data != 0));
>  	}
> 
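
The timing in the net/core/ethtool.c hunk above can be sketched in userspace as follows. The function names are hypothetical, `hz` is passed in rather than taken from the kernel's HZ, and the sketch assumes the driver returned a positive frequency that is small enough for `hz / (freq * 2)` to be non-zero.

```c
enum id_state { ID_OFF, ID_ON };	/* stand-ins for ETHTOOL_ID_OFF/ON */

/* Number of set_phys_id() calls per second for a driver-returned
 * frequency: one ON call and one OFF call per cycle. */
static int calls_per_second(int freq)
{
	return freq * 2;
}

/* Sleep between two calls, in ticks, for a tick rate of 'hz'. */
static int interval_ticks(int freq, int hz)
{
	return hz / calls_per_second(freq);
}

/* State requested when the inner countdown stands at 'i'
 * (i runs from calls_per_second(freq) down to 1). */
static enum id_state state_at(int i)
{
	return (i & 1) ? ID_OFF : ID_ON;
}
```

Because the countdown starts at an even value, the first call in each second is ON and the last is OFF. A driver returning 2 with a tick rate of 1000 is therefore called every 250 ticks, four times per second.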


* [PATCH 2/2] bna: fix memory leak during RX path cleanup
From: Rasesh Mody @ 2011-04-14 18:05 UTC (permalink / raw)
  To: davem, netdev; +Cc: Rasesh Mody, Debashis Dutt
In-Reply-To: <1302804319-15677-1-git-send-email-rmody@brocade.com>

The memory leak was caused by unintentional assignment of the Rx path
destroy callback function pointer to NULL just after correct
initialization.

Signed-off-by: Debashis Dutt <ddutt@brocade.com>
Signed-off-by: Rasesh Mody <rmody@brocade.com>
---
 drivers/net/bna/bnad.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bna/bnad.c b/drivers/net/bna/bnad.c
index b9f2534..e588511 100644
--- a/drivers/net/bna/bnad.c
+++ b/drivers/net/bna/bnad.c
@@ -1837,7 +1837,6 @@ bnad_setup_rx(struct bnad *bnad, uint rx_id)
 	/* Initialize the Rx event handlers */
 	rx_cbfn.rcb_setup_cbfn = bnad_cb_rcb_setup;
 	rx_cbfn.rcb_destroy_cbfn = bnad_cb_rcb_destroy;
-	rx_cbfn.rcb_destroy_cbfn = NULL;
 	rx_cbfn.ccb_setup_cbfn = bnad_cb_ccb_setup;
 	rx_cbfn.ccb_destroy_cbfn = bnad_cb_ccb_destroy;
 	rx_cbfn.rx_cleanup_cbfn = bnad_cb_rx_cleanup;
-- 
1.7.1
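
A generic illustration of why clearing the callback pointer leaks, written as a userspace sketch. It does not use the driver's data structures: `struct rx_callbacks`, `free_buf`, `teardown` and `released_by_teardown` are all hypothetical names, and a 64-byte heap buffer stands in for whatever the real destroy callback releases.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's Rx callback table. */
struct rx_callbacks {
	void (*rcb_destroy)(void **buf);
};

static void free_buf(void **buf)
{
	free(*buf);
	*buf = NULL;
}

/* Teardown only releases the buffer if a destroy callback is set. */
static void teardown(const struct rx_callbacks *cb, void **buf)
{
	if (cb->rcb_destroy)
		cb->rcb_destroy(buf);
}

/* Returns 1 if teardown released the buffer, 0 if it was left behind,
 * -1 if the allocation failed. A leftover buffer is freed here so the
 * demonstration itself does not leak. */
static int released_by_teardown(int overwrite_with_null)
{
	struct rx_callbacks cb = { .rcb_destroy = free_buf };
	void *buf = malloc(64);
	int released;

	if (!buf)
		return -1;
	if (overwrite_with_null)
		cb.rcb_destroy = NULL;	/* models the line the patch removes */

	teardown(&cb, &buf);
	released = (buf == NULL);
	free(buf);	/* free(NULL) is a no-op */
	return released;
}
```

With the stray NULL assignment in place, teardown has nothing to call and the buffer survives it; without the assignment, the callback runs and the buffer is released.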



* [PATCH 1/2] bna: fix for clean fw re-initialization
From: Rasesh Mody @ 2011-04-14 18:05 UTC (permalink / raw)
  To: davem, netdev; +Cc: Rasesh Mody, Debashis Dutt

During a kernel crash, the bna control path state machine and firmware do
not get a notification and hence are not cleanly shut down. The registers
holding driver/IOC state information are not reset back to valid
disabled/parking values. This causes subsequent driver initialization
to hang during kdump kernel boot. During the initialization of the first
PCI function, this patch resets the corresponding registers when an
unclean shutdown is detected by reading chip registers. This makes sure
that the IOC/firmware gets a clean re-initialization.

Signed-off-by: Debashis Dutt <ddutt@brocade.com>
Signed-off-by: Rasesh Mody <rmody@brocade.com>
---
 drivers/net/bna/bfa_ioc.c    |   31 ++++++++++++++++++-------------
 drivers/net/bna/bfa_ioc.h    |    1 +
 drivers/net/bna/bfa_ioc_ct.c |   28 ++++++++++++++++++++++++++++
 drivers/net/bna/bfi.h        |    6 ++++--
 4 files changed, 51 insertions(+), 15 deletions(-)

diff --git a/drivers/net/bna/bfa_ioc.c b/drivers/net/bna/bfa_ioc.c
index e3de0b8..7581518 100644
--- a/drivers/net/bna/bfa_ioc.c
+++ b/drivers/net/bna/bfa_ioc.c
@@ -38,6 +38,8 @@
 #define bfa_ioc_map_port(__ioc) ((__ioc)->ioc_hwif->ioc_map_port(__ioc))
 #define bfa_ioc_notify_fail(__ioc)			\
 			((__ioc)->ioc_hwif->ioc_notify_fail(__ioc))
+#define bfa_ioc_sync_start(__ioc)               \
+			((__ioc)->ioc_hwif->ioc_sync_start(__ioc))
 #define bfa_ioc_sync_join(__ioc)			\
 			((__ioc)->ioc_hwif->ioc_sync_join(__ioc))
 #define bfa_ioc_sync_leave(__ioc)			\
@@ -602,7 +604,7 @@ bfa_iocpf_sm_fwcheck(struct bfa_iocpf *iocpf, enum iocpf_event event)
 	switch (event) {
 	case IOCPF_E_SEMLOCKED:
 		if (bfa_ioc_firmware_lock(ioc)) {
-			if (bfa_ioc_sync_complete(ioc)) {
+			if (bfa_ioc_sync_start(ioc)) {
 				iocpf->retry_count = 0;
 				bfa_ioc_sync_join(ioc);
 				bfa_fsm_set_state(iocpf, bfa_iocpf_sm_hwinit);
@@ -1314,7 +1316,7 @@ bfa_nw_ioc_fwver_cmp(struct bfa_ioc *ioc, struct bfi_ioc_image_hdr *fwhdr)
  * execution context (driver/bios) must match.
  */
 static bool
-bfa_ioc_fwver_valid(struct bfa_ioc *ioc)
+bfa_ioc_fwver_valid(struct bfa_ioc *ioc, u32 boot_env)
 {
 	struct bfi_ioc_image_hdr fwhdr, *drv_fwhdr;
 
@@ -1325,7 +1327,7 @@ bfa_ioc_fwver_valid(struct bfa_ioc *ioc)
 	if (fwhdr.signature != drv_fwhdr->signature)
 		return false;
 
-	if (fwhdr.exec != drv_fwhdr->exec)
+	if (swab32(fwhdr.param) != boot_env)
 		return false;
 
 	return bfa_nw_ioc_fwver_cmp(ioc, &fwhdr);
@@ -1352,9 +1354,12 @@ bfa_ioc_hwinit(struct bfa_ioc *ioc, bool force)
 {
 	enum bfi_ioc_state ioc_fwstate;
 	bool fwvalid;
+	u32 boot_env;
 
 	ioc_fwstate = readl(ioc->ioc_regs.ioc_fwstate);
 
+	boot_env = BFI_BOOT_LOADER_OS;
+
 	if (force)
 		ioc_fwstate = BFI_IOC_UNINIT;
 
@@ -1362,10 +1367,10 @@ bfa_ioc_hwinit(struct bfa_ioc *ioc, bool force)
 	 * check if firmware is valid
 	 */
 	fwvalid = (ioc_fwstate == BFI_IOC_UNINIT) ?
-		false : bfa_ioc_fwver_valid(ioc);
+		false : bfa_ioc_fwver_valid(ioc, boot_env);
 
 	if (!fwvalid) {
-		bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, ioc->pcidev.device_id);
+		bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, boot_env);
 		return;
 	}
 
@@ -1396,7 +1401,7 @@ bfa_ioc_hwinit(struct bfa_ioc *ioc, bool force)
 	/**
 	 * Initialize the h/w for any other states.
 	 */
-	bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, ioc->pcidev.device_id);
+	bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, boot_env);
 }
 
 void
@@ -1506,7 +1511,7 @@ bfa_ioc_hb_stop(struct bfa_ioc *ioc)
  */
 static void
 bfa_ioc_download_fw(struct bfa_ioc *ioc, u32 boot_type,
-		    u32 boot_param)
+		    u32 boot_env)
 {
 	u32 *fwimg;
 	u32 pgnum, pgoff;
@@ -1558,10 +1563,10 @@ bfa_ioc_download_fw(struct bfa_ioc *ioc, u32 boot_type,
 	/*
 	 * Set boot type and boot param at the end.
 	*/
-	writel((swab32(swab32(boot_type))), ((ioc->ioc_regs.smem_page_start)
+	writel(boot_type, ((ioc->ioc_regs.smem_page_start)
 			+ (BFI_BOOT_TYPE_OFF)));
-	writel((swab32(swab32(boot_param))), ((ioc->ioc_regs.smem_page_start)
-			+ (BFI_BOOT_PARAM_OFF)));
+	writel(boot_env, ((ioc->ioc_regs.smem_page_start)
+			+ (BFI_BOOT_LOADER_OFF)));
 }
 
 static void
@@ -1721,7 +1726,7 @@ bfa_ioc_pll_init(struct bfa_ioc *ioc)
  * as the entry vector.
  */
 static void
-bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_param)
+bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_env)
 {
 	void __iomem *rb;
 
@@ -1734,7 +1739,7 @@ bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_param)
 	 * Initialize IOC state of all functions on a chip reset.
 	 */
 	rb = ioc->pcidev.pci_bar_kva;
-	if (boot_param == BFI_BOOT_TYPE_MEMTEST) {
+	if (boot_type == BFI_BOOT_TYPE_MEMTEST) {
 		writel(BFI_IOC_MEMTEST, (rb + BFA_IOC0_STATE_REG));
 		writel(BFI_IOC_MEMTEST, (rb + BFA_IOC1_STATE_REG));
 	} else {
@@ -1743,7 +1748,7 @@ bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_param)
 	}
 
 	bfa_ioc_msgflush(ioc);
-	bfa_ioc_download_fw(ioc, boot_type, boot_param);
+	bfa_ioc_download_fw(ioc, boot_type, boot_env);
 
 	/**
 	 * Enable interrupts just before starting LPU
diff --git a/drivers/net/bna/bfa_ioc.h b/drivers/net/bna/bfa_ioc.h
index e4974bc..bd48abe 100644
--- a/drivers/net/bna/bfa_ioc.h
+++ b/drivers/net/bna/bfa_ioc.h
@@ -194,6 +194,7 @@ struct bfa_ioc_hwif {
 					bool msix);
 	void		(*ioc_notify_fail)	(struct bfa_ioc *ioc);
 	void		(*ioc_ownership_reset)	(struct bfa_ioc *ioc);
+	bool		(*ioc_sync_start)       (struct bfa_ioc *ioc);
 	void		(*ioc_sync_join)	(struct bfa_ioc *ioc);
 	void		(*ioc_sync_leave)	(struct bfa_ioc *ioc);
 	void		(*ioc_sync_ack)		(struct bfa_ioc *ioc);
diff --git a/drivers/net/bna/bfa_ioc_ct.c b/drivers/net/bna/bfa_ioc_ct.c
index 469997c..87aecdf 100644
--- a/drivers/net/bna/bfa_ioc_ct.c
+++ b/drivers/net/bna/bfa_ioc_ct.c
@@ -41,6 +41,7 @@ static void bfa_ioc_ct_map_port(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_isr_mode_set(struct bfa_ioc *ioc, bool msix);
 static void bfa_ioc_ct_notify_fail(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_ownership_reset(struct bfa_ioc *ioc);
+static bool bfa_ioc_ct_sync_start(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_sync_join(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_sync_leave(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_sync_ack(struct bfa_ioc *ioc);
@@ -63,6 +64,7 @@ bfa_nw_ioc_set_ct_hwif(struct bfa_ioc *ioc)
 	nw_hwif_ct.ioc_isr_mode_set = bfa_ioc_ct_isr_mode_set;
 	nw_hwif_ct.ioc_notify_fail = bfa_ioc_ct_notify_fail;
 	nw_hwif_ct.ioc_ownership_reset = bfa_ioc_ct_ownership_reset;
+	nw_hwif_ct.ioc_sync_start = bfa_ioc_ct_sync_start;
 	nw_hwif_ct.ioc_sync_join = bfa_ioc_ct_sync_join;
 	nw_hwif_ct.ioc_sync_leave = bfa_ioc_ct_sync_leave;
 	nw_hwif_ct.ioc_sync_ack = bfa_ioc_ct_sync_ack;
@@ -345,6 +347,32 @@ bfa_ioc_ct_ownership_reset(struct bfa_ioc *ioc)
 /**
  * Synchronized IOC failure processing routines
  */
+static bool
+bfa_ioc_ct_sync_start(struct bfa_ioc *ioc)
+{
+	u32 r32 = readl(ioc->ioc_regs.ioc_fail_sync);
+	u32 sync_reqd = bfa_ioc_ct_get_sync_reqd(r32);
+
+	/*
+	 * Driver load time.  If the sync required bit for this PCI fn
+	 * is set, it is due to an unclean exit by the driver for this
+	 * PCI fn in the previous incarnation. Whoever comes here first
+	 * should clean it up, no matter which PCI fn.
+	 */
+
+	if (sync_reqd & bfa_ioc_ct_sync_pos(ioc)) {
+		writel(0, ioc->ioc_regs.ioc_fail_sync);
+		writel(1, ioc->ioc_regs.ioc_usage_reg);
+		writel(BFI_IOC_UNINIT, ioc->ioc_regs.ioc_fwstate);
+		writel(BFI_IOC_UNINIT, ioc->ioc_regs.alt_ioc_fwstate);
+		return true;
+	}
+
+	return bfa_ioc_ct_sync_complete(ioc);
+}
+/**
+ * Synchronized IOC failure processing routines
+ */
 static void
 bfa_ioc_ct_sync_join(struct bfa_ioc *ioc)
 {
diff --git a/drivers/net/bna/bfi.h b/drivers/net/bna/bfi.h
index a973968..6050379 100644
--- a/drivers/net/bna/bfi.h
+++ b/drivers/net/bna/bfi.h
@@ -184,12 +184,14 @@ enum bfi_mclass {
 #define BFI_IOC_MSGLEN_MAX	32	/* 32 bytes */
 
 #define BFI_BOOT_TYPE_OFF		8
-#define BFI_BOOT_PARAM_OFF		12
+#define BFI_BOOT_LOADER_OFF		12
 
-#define BFI_BOOT_TYPE_NORMAL 		0	/* param is device id */
+#define BFI_BOOT_TYPE_NORMAL 		0
 #define	BFI_BOOT_TYPE_FLASH		1
 #define	BFI_BOOT_TYPE_MEMTEST		2
 
+#define BFI_BOOT_LOADER_OS		0
+
 #define BFI_BOOT_MEMTEST_RES_ADDR   0x900
 #define BFI_BOOT_MEMTEST_RES_SIG    0xA0A1A2A3
 
-- 
1.7.1
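
The clean-up branch of the new `bfa_ioc_ct_sync_start()` can be modelled in userspace as below. This is only a sketch: the register block is an ordinary struct, the "one sync-required bit per PCI function" layout and the value of `FW_UNINIT` are assumptions, and `sync_start_cleanup` is a hypothetical name. Where the sketch returns false, the real function falls through to `bfa_ioc_ct_sync_complete()`.

```c
#include <stdbool.h>
#include <stdint.h>

#define FW_UNINIT 0u	/* stand-in for BFI_IOC_UNINIT; value assumed */

/* Hypothetical in-memory model of the chip registers the patch touches. */
struct ioc_regs {
	uint32_t fail_sync;	/* sync-required bits, one per PCI fn (assumed) */
	uint32_t usage;
	uint32_t fwstate;
	uint32_t alt_fwstate;
};

/* Returns true if a stale sync-required bit for 'pci_fn' was found and
 * the registers were reset, false if nothing needed cleaning. */
static bool sync_start_cleanup(struct ioc_regs *r, unsigned int pci_fn)
{
	uint32_t my_bit = 1u << pci_fn;

	if (!(r->fail_sync & my_bit))
		return false;

	r->fail_sync = 0;
	r->usage = 1;
	r->fwstate = FW_UNINIT;
	r->alt_fwstate = FW_UNINIT;
	return true;
}
```

The reset clears the sync bits of every function, not only the caller's, which matches the patch comment that whoever arrives first cleans up.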



* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Eric Dumazet @ 2011-04-14 17:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexander Duyck, Wei Gu, netdev, Kirsher, Jeffrey T,
	Mike Galbraith
In-Reply-To: <1302800221.3248.39.camel@edumazet-laptop>

On Thursday, 14 April 2011 at 18:57 +0200, Eric Dumazet wrote:
> On Thursday, 14 April 2011 at 18:56 +0200, Peter Zijlstra wrote:
> > On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:
> > 
> > > I'm doing some more digging into this now.  One thought that occurred to 
> > > me is that if the patch you mention is having some sort of effect this 
> > > could be a sign of perhaps a kernel timer or scheduling problem.
> > 
> > Right, so the removal of the NO_HZ throttle will allow the CPU to go
> > into C states more often, this could result in longer wake-up times for
> > IRQs.
> > 
> > We reverted because:
> >   - it caused significant battery drain due to not going into C states
> >     often enough, and
> >   - it's a much better idea to implement these things in the idle
> >     governor since it already has the job of guesstimating the idle
> >     duration.
> > 
> > I really can't remember back far enough to even come up with a theory of
> > why kernels prior to merging the NO_HZ throttle would not exhibit this
> > problem.
> > 
> > 
> > 
> 
> Normally, Wei Gu already asked to not use C states.
> 
> http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01804533/c01804533.pdf
> 
> How can we/he check this ?
> 
> 

Anyway, this could explain a latency problem, not packet drops.

With NAPI, we should get few hardware irqs under load.

Once the softirq has started, the scheduler is out of the equation.





* [PATCH] net: export skb_clone_tx_timestamp
From: Richard Cochran @ 2011-04-14 17:35 UTC (permalink / raw)
  To: netdev; +Cc: David Miller

MAC drivers compiled as modules may well want to call this function via
the skb_tx_timestamp inline function. This patch exports the function in
order to let this happen.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
 net/core/timestamping.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 7e7ca37..3b00a6b 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -68,6 +68,7 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
 		break;
 	}
 }
+EXPORT_SYMBOL_GPL(skb_clone_tx_timestamp);
 
 void skb_complete_tx_timestamp(struct sk_buff *skb,
 			       struct skb_shared_hwtstamps *hwtstamps)
-- 
1.7.0.4



* ipv6 multicasting: "interface-local" scope
From: Kristoff Bonne @ 2011-04-14 17:23 UTC (permalink / raw)
  To: netdev

Hi,


(This is a repost of a message I posted yesterday in
linux.network.general but to which I have not seen any replies).


If I understand RFC 4291 correctly, IPv6 multicast addresses of the form
ffx1::/16 (scope field 1) have "interface-local" scope.

In draft-ietf-ipngwg-scoping-arch-02, I read this:
"The interface-local scope spans a single interface only; a multicast
address of interface-local scope is useful only for loopback delivery of
multicasts within a single node, for example, as a form of inter-process
communication within a computer".


This is exactly what I want to do for an application I am working on.

However, when I try it with a small program that sends UDP packets to
the IPv6 multicast address ff11::1234, I do see them show up on remote
machines on my LAN!
So it looks like these packets do leave my local computer, which, I
think, is the opposite of what RFC 4291 says should happen.

Am I missing something? or is this a bug?
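
For reference, the scope of a multicast address sits in the low four bits of its second byte (RFC 4291, section 2.7): 1 is interface-local, 2 is link-local. A minimal sketch that reads it from a textual address, where `multicast_scope` is a hypothetical helper name:

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Returns the 4-bit scope of an IPv6 multicast address given as text,
 * or -1 if the string does not parse or is not multicast (ff00::/8). */
static int multicast_scope(const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;
	if (addr.s6_addr[0] != 0xff)
		return -1;
	return addr.s6_addr[1] & 0x0f;
}
```

For ff11::1234 this yields scope 1, so the address itself is interface-local; the helper says nothing about how a given kernel routes such packets.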



The machine on which I tested this runs Ubuntu 10.04.1 LTS with
kernel 2.6.32-30-generic.



Anybody any ideas?



links:
- RFC 4291: http://tools.ietf.org/html/rfc4291#section-2.7
- draft-ietf-ipngwg-scoping-arch-02:
  http://tools.ietf.org/html/draft-ietf-ipngwg-scoping-arch-02

Cheerio! Kr. Bonne.




* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Eric Dumazet @ 2011-04-14 16:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexander Duyck, Wei Gu, netdev, Kirsher, Jeffrey T,
	Mike Galbraith
In-Reply-To: <1302800202.2035.32.camel@laptop>

On Thursday, 14 April 2011 at 18:56 +0200, Peter Zijlstra wrote:
> On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:
> 
> > I'm doing some more digging into this now.  One thought that occurred to 
> > me is that if the patch you mention is having some sort of effect this 
> > could be a sign of perhaps a kernel timer or scheduling problem.
> 
> Right, so the removal of the NO_HZ throttle will allow the CPU to go
> into C states more often, this could result in longer wake-up times for
> IRQs.
> 
> We reverted because:
>   - it caused significant battery drain due to not going into C states
>     often enough, and
>   - it's a much better idea to implement these things in the idle
>     governor since it already has the job of guesstimating the idle
>     duration.
> 
> I really can't remember back far enough to even come up with a theory of
> why kernels prior to merging the NO_HZ throttle would not exhibit this
> problem.
> 
> 
> 

Normally, Wei Gu already asked to not use C states.

http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01804533/c01804533.pdf

How can we/he check this ?





* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Peter Zijlstra @ 2011-04-14 16:56 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Wei Gu, Eric Dumazet, netdev, Kirsher, Jeffrey T, Mike Galbraith
In-Reply-To: <4DA723F1.7000901@intel.com>

On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:

> I'm doing some more digging into this now.  One thought that occurred to 
> me is that if the patch you mention is having some sort of effect this 
> could be a sign of perhaps a kernel timer or scheduling problem.

Right, so the removal of the NO_HZ throttle will allow the CPU to go
into C states more often, this could result in longer wake-up times for
IRQs.

We reverted because:
  - it caused significant battery drain due to not going into C states
    often enough, and
  - it's a much better idea to implement these things in the idle
    governor since it already has the job of guesstimating the idle
    duration.

I really can't remember back far enough to even come up with a theory of
why kernels prior to merging the NO_HZ throttle would not exhibit this
problem.





* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Eric Dumazet @ 2011-04-14 16:45 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Wei Gu, Peter Zijlstra, netdev, Kirsher, Jeffrey T
In-Reply-To: <4DA723F1.7000901@intel.com>

On Thursday, 14 April 2011 at 09:42 -0700, Alexander Duyck wrote:

> The only issue I have found so far with the ixgbe driver is the fact 
> that rx_no_buffer_count is apparently always going to be 0 on
> 82599, and that isn't so much a driver problem as a hardware limitation
> as the HW counter was removed in 82599.  However, since the hardware was
> capable of going faster on the other kernels what this likely means is 
> that the rx_missed_errors are due to the driver not providing Rx buffers 
> fast enough.
> 
> I'm doing some more digging into this now.  One thought that occurred to 
> me is that if the patch you mention is having some sort of effect this 
> could be a sign of perhaps a kernel timer or scheduling problem.

We could try to instrument the delay between the hardware IRQ and the
NAPI handler being called.
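
A rough userspace model of such instrumentation, as a sketch only: take a
timestamp in the hard IRQ handler, take another when the NAPI poll routine
runs, and count the difference in power-of-two buckets. The names
lat_probe, lat_irq() and lat_poll() are hypothetical; in a driver the
timestamps would presumably come from something like ktime_get().

```c
#include <stdint.h>

#define LAT_BUCKETS 32

struct lat_probe {
	uint64_t irq_ns;		/* time of first IRQ since last poll */
	int armed;			/* 1 if irq_ns is valid */
	uint64_t hist[LAT_BUCKETS];	/* hist[i]: delays in [2^i, 2^(i+1)) ns,
					 * hist[0] also counts a delay of 0 */
};

/* Index of the highest set bit, capped at LAT_BUCKETS - 1. */
static unsigned int lat_bucket(uint64_t delta_ns)
{
	unsigned int b = 0;

	while (delta_ns > 1 && b < LAT_BUCKETS - 1) {
		delta_ns >>= 1;
		b++;
	}
	return b;
}

/* Call from the hard IRQ handler. */
void lat_irq(struct lat_probe *p, uint64_t now_ns)
{
	if (!p->armed) {
		p->irq_ns = now_ns;
		p->armed = 1;
	}
}

/* Call at the start of the NAPI poll routine. */
void lat_poll(struct lat_probe *p, uint64_t now_ns)
{
	if (!p->armed)
		return;
	p->hist[lat_bucket(now_ns - p->irq_ns)]++;
	p->armed = 0;
}
```

If wake-ups from deep C states are the cause, the histogram on the bad
kernel should show more samples in the high buckets than on a good one.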





* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Alexander Duyck @ 2011-04-14 16:42 UTC (permalink / raw)
  To: Wei Gu; +Cc: Eric Dumazet, Peter Zijlstra, netdev, Kirsher, Jeffrey T
In-Reply-To: <D12839161ADD3A4B8DA63D1A134D084026E49536AF@ESGSCCMS0001.eapac.ericsson.se>

On 4/13/2011 11:58 PM, Wei Gu wrote:
> I ran the single-flow test; it shows no rx errors at 300 kpps. But when I start multiple flows with the same 300 kpps of traffic, it looks really bad, with a high rx_missed_errors count.
>
> Multiple Flow:
> SUM: 191925 ETH8: 0  ETH10: 191925 ETH6: 0 ETH4: 0
> SUM: 214634 ETH8: 0  ETH10: 214634 ETH6: 0 ETH4: 0
> SUM: 237600 ETH8: 0  ETH10: 237600 ETH6: 0 ETH4: 0
> SUM: 198925 ETH8: 0  ETH10: 198925 ETH6: 0 ETH4: 0
> SUM: 249290 ETH8: 0  ETH10: 249290 ETH6: 0 ETH4: 0
>
> Single Flow:
> SUM: 302018 ETH8: 0  ETH10: 302018 ETH6: 0 ETH4: 0
> SUM: 301849 ETH8: 0  ETH10: 301849 ETH6: 0 ETH4: 0
> SUM: 302163 ETH8: 0  ETH10: 302163 ETH6: 0 ETH4: 0
>
> Thanks
> WeiGu
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Thursday, April 14, 2011 2:34 PM
> To: Wei Gu
> Cc: Alexander Duyck; Peter Zijlstra; netdev; Kirsher, Jeffrey T
> Subject: RE: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
>
> On Thursday, 14 April 2011 at 08:07 +0200, Eric Dumazet wrote:
>> On Thursday, 14 April 2011 at 13:42 +0800, Wei Gu wrote:
>>> Hi guys,
>>> Do you think this is a bug in the kernel from 2.6.35.2 with the Intel 10GE ixgbe driver?
>>> If so, shall I file a bug in Bugzilla, and under which category? I'm not sure whether it is a driver problem or a scheduler problem.
>>
>> This makes no sense to me.
>>
>> What is the maximum throughput you can get in pps before having packet
>> drops ?
>>
>> Please try with a single flow (to hit one queue, one cpu)
>>
>> Thanks
>>
>
> Also, please check whether using smaller or bigger packets changes this maximum throughput.
>
>

The only issue I have found so far with the ixgbe driver is the fact 
that rx_no_buffer_count is apparently always going to be 0 on
82599, and that isn't so much a driver problem as a hardware limitation
as the HW counter was removed in 82599.  However, since the hardware was
capable of going faster on the other kernels what this likely means is 
that the rx_missed_errors are due to the driver not providing Rx buffers 
fast enough.

I'm doing some more digging into this now.  One thought that occurred to 
me is that if the patch you mention is having some sort of effect this 
could be a sign of perhaps a kernel timer or scheduling problem.

Thanks,

Alex


* Re: [PATCH net-next] sfc: make function tables const
From: Ben Hutchings @ 2011-04-14 16:37 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Steve Hodgson, Solarflare linux maintainers, netdev
In-Reply-To: <20110414085012.3e45d7b4@nehalam>

On Thu, 2011-04-14 at 08:50 -0700, Stephen Hemminger wrote:
> The phy, mac, and board information structures should be const.

efx_nic_type actually holds information and function pointers about
different generations of the controller, not different boards.

> Since the tables contain function pointers, this improves security
> (at least theoretically).
> 
> Compile tested only.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* [PATCH v2] ip6_pol_route panic: Do not propagate LOOPBACK to VLAN
From: Krishna Kumar @ 2011-04-14 16:07 UTC (permalink / raw)
  To: davem; +Cc: netdev, Krishna Kumar
In-Reply-To: <20110414160704.32251.17281.sendpatchset@krkumar2.in.ibm.com>

I have tested two ways of fixing this panic:
	1. PATCH1: Do not allow vlan on lo.
	2. PATCH2: Do not propagate LOOPBACK to vlan devices.

Isn't it better to use PATCH1 and disallow vlan on lo?

The result of this patch is:

# modprobe 8021q
# vconfig add lo 43
# ifconfig lo.43 hw ether 00:80:48:BA:d1:30
# ping6 -c 3 fe80::280:48ff:feba:d130
connect: Cannot assign requested address
(no panic)

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 net/8021q/vlan_dev.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -ruNp org/net/8021q/vlan_dev.c new2/net/8021q/vlan_dev.c
--- org/net/8021q/vlan_dev.c	2011-04-14 20:42:56.000000000 +0530
+++ new2/net/8021q/vlan_dev.c	2011-04-14 20:44:35.000000000 +0530
@@ -525,7 +525,8 @@ static int vlan_dev_init(struct net_devi
 
 	/* IFF_BROADCAST|IFF_MULTICAST; ??? */
 	dev->flags  = real_dev->flags & ~(IFF_UP | IFF_PROMISC | IFF_ALLMULTI |
-					  IFF_MASTER | IFF_SLAVE);
+					  IFF_MASTER | IFF_SLAVE |
+					  IFF_LOOPBACK);
 	dev->iflink = real_dev->ifindex;
 	dev->state  = (real_dev->state & ((1<<__LINK_STATE_NOCARRIER) |
 					  (1<<__LINK_STATE_DORMANT))) |

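
The effect of the extra mask bit can be shown in isolation. This is a
sketch, not kernel code: the X_IFF_* constants mirror the IFF_* values
from <linux/if.h> under a prefix so it builds without kernel headers, and
the two helper functions are hypothetical names for the expression before
and after the patch.

```c
/* Same values as IFF_* in <linux/if.h>, prefixed to avoid clashes. */
#define X_IFF_UP	0x1
#define X_IFF_LOOPBACK	0x8
#define X_IFF_RUNNING	0x40
#define X_IFF_PROMISC	0x100
#define X_IFF_ALLMULTI	0x200
#define X_IFF_MASTER	0x400
#define X_IFF_SLAVE	0x800

/* Flags a VLAN device inherits from its real device, before the patch. */
unsigned int vlan_flags_before(unsigned int real_flags)
{
	return real_flags & ~(X_IFF_UP | X_IFF_PROMISC | X_IFF_ALLMULTI |
			      X_IFF_MASTER | X_IFF_SLAVE);
}

/* The same, with IFF_LOOPBACK also masked out as in the patch. */
unsigned int vlan_flags_after(unsigned int real_flags)
{
	return real_flags & ~(X_IFF_UP | X_IFF_PROMISC | X_IFF_ALLMULTI |
			      X_IFF_MASTER | X_IFF_SLAVE |
			      X_IFF_LOOPBACK);
}
```

With real_flags = UP | LOOPBACK | RUNNING, as on lo, the first helper
still returns a value with the LOOPBACK bit set, and the second does not.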

* [PATCH v1] ip6_pol_route panic: Do not allow VLAN on loopback
From: Krishna Kumar @ 2011-04-14 16:07 UTC (permalink / raw)
  To: davem; +Cc: netdev, Krishna Kumar

I have tested two ways of fixing this panic:
	1. PATCH1: Do not allow vlan on lo.
	2. PATCH2: Do not propagate LOOPBACK to vlan devices.

Isn't it better to use PATCH1 and disallow vlan on lo?

The result of this patch is:

# modprobe 8021q
# vconfig add lo 43
ERROR: trying to add VLAN #43 to IF -:lo:-  error: Operation not supported

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 drivers/net/loopback.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -ruNp org/drivers/net/loopback.c new1/drivers/net/loopback.c
--- org/drivers/net/loopback.c	2011-04-14 20:45:46.000000000 +0530
+++ new1/drivers/net/loopback.c	2011-04-14 20:47:09.000000000 +0530
@@ -173,7 +173,8 @@ static void loopback_setup(struct net_de
 		| NETIF_F_RXCSUM
 		| NETIF_F_HIGHDMA
 		| NETIF_F_LLTX
-		| NETIF_F_NETNS_LOCAL;
+		| NETIF_F_NETNS_LOCAL
+		| NETIF_F_VLAN_CHALLENGED;
 	dev->ethtool_ops	= &loopback_ethtool_ops;
 	dev->header_ops		= &eth_header_ops;
 	dev->netdev_ops		= &loopback_ops;

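
Why setting this one feature bit is enough: the 8021q code checks the real
device's features before creating a VLAN on it and refuses with
-EOPNOTSUPP, which is the "Operation not supported" that vconfig reports
above. A simplified userspace sketch of that check follows; the struct,
the function name and the bit value are stand-ins for illustration, not
the kernel's definitions.

```c
#include <errno.h>

/* Illustrative bit only; the real flag is defined in <linux/netdevice.h>. */
#define X_NETIF_F_VLAN_CHALLENGED	(1u << 10)

/* Hypothetical stand-in for the relevant part of struct net_device. */
struct x_net_device {
	const char *name;
	unsigned int features;
};

/* Returns 0 if a VLAN may be stacked on real_dev, -EOPNOTSUPP otherwise. */
int x_vlan_check_real_dev(const struct x_net_device *real_dev)
{
	if (real_dev->features & X_NETIF_F_VLAN_CHALLENGED)
		return -EOPNOTSUPP;
	return 0;
}
```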

* Re: [PATCH v2] net: filter: Just In Time compiler
From: Eric Dumazet @ 2011-04-14 16:05 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Hagen Paul Pfeifer, David Miller, netdev,
	Arnaldo Carvalho de Melo, Ben Hutchings
In-Reply-To: <4DA71876.803@redhat.com>

On Thursday, 14 April 2011 at 18:53 +0300, Avi Kivity wrote:

> IMO, it will.  I'll try to have gcc optimize your example filter later.

Sure, you can JIT a C program from BPF. It should take maybe 30 minutes.
It is certainly easier than JITing binary/assembly code :)

Now take a look at how I call the slow path; I am not sure gcc will
actually generate better code, because of C calling conventions.

Loading a filter should be fast.
Invoking a compiler is just too much work for BPF.
Remember loading a filter is available to any user.

This idea would be good for netfilter stuff, because we don't load
iptables rules that often.

But still, the netfilter main loop can be converted to a kernel JIT, most
probably. All the complex stuff (matches, targets) must call external
procedures anyway.





* Re: Network performance with small packets
From: Michael S. Tsirkin @ 2011-04-14 16:03 UTC (permalink / raw)
  To: Rusty Russell
  Cc: habanero, Shirley Ma, Krishna Kumar2, David Miller, kvm, netdev,
	steved, Tom Lendacky, borntraeger
In-Reply-To: <87bp09ax7a.fsf@rustcorp.com.au>

On Thu, Apr 14, 2011 at 08:58:41PM +0930, Rusty Russell wrote:
> On Tue, 12 Apr 2011 23:01:12 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Thu, Mar 10, 2011 at 12:19:42PM +1030, Rusty Russell wrote:
> > > Here's an old patch where I played with implementing this:
> > 
> > ...
> > 
> > > 
> > > virtio: put last_used and last_avail index into ring itself.
> > > 
> > > Generally, the other end of the virtio ring doesn't need to see where
> > > you're up to in consuming the ring.  However, to completely understand
> > > what's going on from the outside, this information must be exposed.
> > > For example, if you want to save and restore a virtio_ring, but you're
> > > not the consumer because the kernel is using it directly.
> > > 
> > > Fortunately, we have room to expand:
> > 
> > This seems to be true for x86 kvm and lguest but is it true
> > for s390?
> 
> Yes, as the ring is page aligned so there's always room.
> 
> > Will this last bit work on s390?
> > If I understand correctly the memory is allocated by host there?
> 
> They have to offer the feature, so if they have some way of allocating
> non-page-aligned amounts of memory, they'll have to add those extra 2
> bytes.
> 
> So I think it's OK...
> Rusty.

To clarify, my concern is that we always seem to try to map
these extra 2 bytes, which conceivably might fail?
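
For the page-aligned case, the arithmetic can be sketched as follows. This
assumes the ring layout that linux/virtio_ring.h used at the time (16-byte
descriptors, 2-byte avail entries behind a 4-byte header, 8-byte used
entries behind a 4-byte header, with the used ring starting on an align
boundary). The helper names are made up for illustration.

```c
#include <stddef.h>

#define X_DESC_SIZE	16u	/* assumed sizeof(struct vring_desc) */
#define X_USED_ELEM	8u	/* assumed sizeof(struct vring_used_elem) */

static size_t x_round_up(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Offset just past the last avail ring entry. */
size_t x_avail_end(unsigned int num)
{
	return (size_t)X_DESC_SIZE * num + 2u * (2u + num);
}

/* Unused bytes between the avail ring and the start of the used ring. */
size_t x_avail_slack(unsigned int num, size_t align)
{
	return x_round_up(x_avail_end(num), align) - x_avail_end(num);
}

/* Offset just past the last used ring entry. */
size_t x_used_end(unsigned int num, size_t align)
{
	return x_round_up(x_avail_end(num), align) + 2u * 2u +
	       (size_t)X_USED_ELEM * num;
}

/* Unused bytes between the used ring and the next align boundary. */
size_t x_used_slack(unsigned int num, size_t align)
{
	return x_round_up(x_used_end(num, align), align) -
	       x_used_end(num, align);
}
```

For num = 256 and align = 4096 this gives 3580 spare bytes after the avail
ring and 2044 after the used ring, so 2 extra bytes fit in both places as
long as whole pages are mapped. It says nothing about a host that
allocates non-page-aligned amounts, which is the case in question.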

-- 
MST
