* Re: [PATCH net-next-2.6] l2tp: fix potential rcu race
From: Paul E. McKenney @ 2011-05-12 14:26 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, James Chapman
In-Reply-To: <1305174156.3232.26.camel@edumazet-laptop>
On Thu, May 12, 2011 at 06:22:36AM +0200, Eric Dumazet wrote:
> While trying to remove useless synchronize_rcu() calls, I found l2tp is
> indeed incorrectly using two of such calls, but also bumps tunnel
> refcount after list insertion.
>
> tunnel refcount must be incremented before being made publically visible
> by rcu readers.
>
> This fix can be applied to 2.6.35+ and might need a backport for older
> kernels, since things were shuffled in commit fd558d186df2c
> (l2tp: Split pppol2tp patch into separate l2tp and ppp parts)
Ouch! Good catch, Eric!
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> CC: James Chapman <jchapman@katalix.com>
> ---
>
> I based this patch on net-next-2.6 because of recent commits in this
> file and linux-2.6 being on late RC phase. But its a stable candidate.
>
> net/l2tp/l2tp_core.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 9be095e..ed8a233 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1435,16 +1435,15 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
>
> /* Add tunnel to our list */
> INIT_LIST_HEAD(&tunnel->list);
> - spin_lock_bh(&pn->l2tp_tunnel_list_lock);
> - list_add_rcu(&tunnel->list, &pn->l2tp_tunnel_list);
> - spin_unlock_bh(&pn->l2tp_tunnel_list_lock);
> - synchronize_rcu();
> atomic_inc(&l2tp_tunnel_count);
>
> /* Bump the reference count. The tunnel context is deleted
> - * only when this drops to zero.
> + * only when this drops to zero. Must be done before list insertion
> */
> l2tp_tunnel_inc_refcount(tunnel);
> + spin_lock_bh(&pn->l2tp_tunnel_list_lock);
> + list_add_rcu(&tunnel->list, &pn->l2tp_tunnel_list);
> + spin_unlock_bh(&pn->l2tp_tunnel_list_lock);
>
> err = 0;
> err:
> @@ -1636,7 +1635,6 @@ struct l2tp_session *l2tp_session_create(int priv_size, struct l2tp_tunnel *tunn
> hlist_add_head_rcu(&session->global_hlist,
> l2tp_session_id_hash_2(pn, session_id));
> spin_unlock_bh(&pn->l2tp_session_hlist_lock);
> - synchronize_rcu();
> }
>
> /* Ignore management session in session count value */
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] bonding,llc: Fix structure sizeof incompatibility for some PDUs
From: Ben Hutchings @ 2011-05-12 14:33 UTC (permalink / raw)
To: Vitalii Demianets
Cc: Jay Vosburgh, Andy Gospodarek, Arnaldo Carvalho de Melo, netdev,
bonding-devel
In-Reply-To: <201105121557.06679.vitas@nppfactor.kiev.ua>
On Thu, 2011-05-12 at 15:57 +0300, Vitalii Demianets wrote:
> With some combinations of arch/compiler the sizeof operator on structure
> returns value greater than expected.
Yes, some architecture ABIs set a minimum alignment for structures so
they can have greater alignment than any of their members.
> In cases when the structure is used for
> mapping PDU fields it may lead to unexpected results (e.g. holes and
> alignment problems in skb data). Attribute "packed" prevents this undesired
> behavior.
[...]
The '__packed' macro is preferred.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [Bugme-new] [Bug 33502] New: Caught 64-bit read from uninitialized memory in __alloc_skb
From: Christoph Lameter @ 2011-05-12 14:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: Vegard Nossum, Pekka Enberg, casteyde.christian, Andrew Morton,
netdev, bugzilla-daemon, bugme-daemon
In-Reply-To: <1305083543.2437.39.camel@edumazet-laptop>
On Wed, 11 May 2011, Eric Dumazet wrote:
> > The cpu sets a page flag called PageFrozen() and points a per cpu pointer
> > to the page.
> >
> >
>
> So, if I understand you, there is no problem at all and no patch even
> needed ? I can start a stress test and you guarantee there wont be a
> crash ?
>
> Sorry, its 5h11 in the morning here ;)
There is a problem if an interrupt or a preemption occurs and there is no
object left on the page. Then the current page will be unfrozen and a new
page put into place for allocation. The old page may then be freed by some other
process on another processor before we continue the interrupted
slab_alloc().
When slab_alloc() resumes in this scenario then it will ultimately see
that the tid was incremented and so the cmpxchg will fail. But before we
do the cmpxchgwe determine the pointer to the next object. And for that
we access the old page. The access must not cause a page fault (which it
currently does with CONFIG_DEBUG_PAGEALLOC). That is why we need the patch introducing
get_freepointer_safe()
The result does not matter since we will repeat the cmpxchg loop.
^ permalink raw reply
* Re: future developments of usbnet
From: Alan Stern @ 2011-05-12 14:37 UTC (permalink / raw)
To: Oliver Neukum
Cc: David Miller, shemminger-ZtmgI6mnKB3QT0dZR+AlfA,
tom.leiming-Re5JQEeQqe8AvxtiuMwx3w, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-usb-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <201105120959.28473.oliver-GvhC2dPhHPQdnm+yROfE0A@public.gmane.org>
On Thu, 12 May 2011, Oliver Neukum wrote:
> Am Mittwoch, 11. Mai 2011, 19:47:27 schrieb David Miller:
>
> > Basically once you take you interrupt, and disable device interrupts,
> > the generic net device layer calls your ->poll() routing with a "weight"
> > You should not process more RX packets than this value.
> >
> > If you have less than "weight" work to do, you should do a napi_complete(),
> > which takes you out of the polling group, and re-enable device interrupts.
> >
> > So the idea is that you keep getting ->poll()'d until there is no more
> > RX work to do.
> >
> > The "weight" argument implements fairness amongst competing, actively
> > polling, devices on the same CPU.
> >
>
> Thank you, this is very informative. Our problem here is that USB doesn't work
> sanely without interrupts. We can stop IO regarding rx, but we cannot stop
> interrupts if we want to do rx.
The idea behind NAPI can be adapted to the usbnet context. The basic
idea is that the driver is allowed to accept only a limited number of
packets, driven by polling from the NAPI core.
Therefore usbnet's poll routine should take the "weight" argument as an
indication of how many outstanding rx URBs are allowed. Each time the
poll routine is called, it should check to see if any rx URBs have
completed since the previous poll. If not then there is no network
traffic, so usbnet can take itself out of the poll loop. Otherwise,
the number of outstanding URBs should be adjusted (by unlinking some or
submitting more -- subject to some fixed maximum limit) to match the
new "weight".
Does that make sense?
Alan Stern
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH v4 1/1] can: add pruss CAN driver.
From: Oliver Hartkopp @ 2011-05-12 14:41 UTC (permalink / raw)
To: Subhasish Ghosh
Cc: sachi-EvXpCiN+lbve9wHmmfpqLFaTQe2KTcn/,
davinci-linux-open-source-VycZQUHpC/PFrsHnngEfi1aTQe2KTcn/,
Arnd Bergmann, Netdev-u79uwXL29TY76Z2rM5mHXA, nsekhar-l0cyMroinI0,
open list, CAN NETWORK DRIVERS, Alan Cox, Marc Kleine-Budde,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
m-watkins-l0cyMroinI0, Wolfgang Grandegger
In-Reply-To: <4DCB88A4.2010901-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
On 12.05.2011 09:13, Wolfgang Grandegger wrote:
> On 05/11/2011 11:44 PM, Arnd Bergmann wrote:
>> On Wednesday 11 May 2011, Arnd Bergmann wrote:
>>> If that interpretation is right, I would seriously recommend rethinking
>>> the design of the CAN firmware for pruss, so you can start doing something
>>> useful with the offload engine that fits into the Socket CAN API, or that
>>> would be a useful extension to Socket CAN that is also implementable in
>>> the kernel for all other drivers in a meaningful way.
>>
>> I've looked some more into the CAN socket implementation, and I suppose that
>> the idea of the pruss driver was really to help do the work from the
>> can_rcv_filter function in hardware.
>
> That software filter is per socket while the hardware filter will be per
> device.
Hi all,
i took some while to get behind all the arguments for me :-)
Wolfgangs suggestion
> A simple interface using:
>
> ip link set can0 type can filter <id>:<mask> [<id>:<mask> ...]
>
indeed would just be fine - but IMHO it doesn't help for the pruss CAN driver.
The problem is, that you plan to filter on a CAN-identifier base. This is not
only very application dependent (as Kurt already pointed out) - it also does
not bring any safety that your system does not explode on heavy CAN load.
E.g. assume you need the CAN-IDs 0x100, 0x200 and 0x300 in your application
and for that reason you configure these IDs in the pruss CAN driver.
What if someone generates a 100% CAN busload exactly on CAN-ID 0x100 then?
Worst case (1MBit/s, DLC=0) you would need to handle about 21.000 irqs/s for
the correctly received CAN frames with the filtered CAN-ID 0x100 ...
At the beginnig of the SocketCAN development, we had a PowerPC (E603) @ 133MHz
that was able to handle 4 (dumb) SJA1000 CAN controllers without problems.
Maybe i should tell a bit more about what's happening at CAN frame receive time:
1. IRQ happens
2. Read CAN frame from CAN controller registers
3. Allocate a socketbuffer (skb) and queue it into the netdevice rx queue
4. Softirq handles the new skb
5. Check the (specialized/optimized) CAN-ID filters for this CAN device
6. Enqueue the data to the sockets recv buffer(s) and/or drop the skb.
This all depends heavily on Linux networking (skb handling, caching, etc) and
is pretty fast and optimized!! That was also the reason why it ran on the old
PowerPC that smoothly. The mostly seen effect if anything drops is when the
application (holding the socket) was not fast enough to handle the incoming
data. NB: For that reason we implemented a CAN content filter (CAN_BCM) that
is able to do content filtering and timeout monitoring in Kernelspace - all
performed in the SoftIRQ.
Having 'Mailboxes' bound to CAN-IDs is something that's useful for 8/16 bit
CPUs where an application is tightly bound to the embedded ECUs functionality.
IMO you should try to implement an 'open' pruss CAN driver without filtering
as it doesn't really help (see above). The networking stack can cope with the
load. In all cases (with/without filters) the reduction the irqload could be a
way to investigate (e.g. think about dropping CAN frames based on the irqload).
But you should try the 'open' way first. And i think with your current driver
you could just filter for one CAN-ID (e.g. 0x100) and then produce the heavy
CAN busload with this CAN-ID 0x100 and a second CAN node, right?
Regards,
Oliver
^ permalink raw reply
* Re: [PATCH net-next] net:set valid name before calling ndo_init()
From: Jiri Pirko @ 2011-05-12 15:17 UTC (permalink / raw)
To: Weiping Pan(潘卫平)
Cc: davem, eric.dumazet, xiaosuo, mirq-linux, netdev, linux-kernel
In-Reply-To: <1300936598-3971-1-git-send-email-panweiping3@gmail.com>
Thu, Mar 24, 2011 at 04:16:38AM CET, panweiping3@gmail.com wrote:
>From: Pan Weiping <panweiping3@gmail.com>
>
>A bug of bonding was invloved by e815d19ffe02bdfda1260949ef2b1806171,
>see example 1 and 2.
>
>In register_netdevice(), the name of net_device is not valid until
>dev_get_valid_name() is called. But dev->netdev_ops->ndo_init(that is
>bond_init) is called before dev_get_valid_name(),
>and it uses the invalid name of net_device.
>
>I think register_netdevice() should make sure that the name of net_device is
>valid before calling ndo_init().
>
>example 1:
>modprobe bonding
>ls /proc/net/bonding/bond%d
>
>ps -eLf
>root 3398 2 3398 0 1 21:34 ? 00:00:00 [bond%d]
>
>example 2:
>modprobe bonding max_bonds=3
>
>[ 170.100292] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
>[ 170.101090] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
>[ 170.102469] ------------[ cut here ]------------
>[ 170.103150] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
>[ 170.104075] Hardware name: VirtualBox
>[ 170.105065] proc_dir_entry 'bonding/bond%d' already registered
>[ 170.105613] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
>[ 170.108397] Pid: 3457, comm: modprobe Not tainted 2.6.39-rc2+ #14
>[ 170.108935] Call Trace:
>[ 170.109382] [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
>[ 170.109911] [<c051a42a>] ? proc_register+0x126/0x157
>[ 170.110329] [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
>[ 170.110846] [<c051a42a>] proc_register+0x126/0x157
>[ 170.111870] [<c051a4dd>] proc_create_data+0x82/0x98
>[ 170.112335] [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
>[ 170.112905] [<f94dd806>] bond_init+0x77/0xa5 [bonding]
>[ 170.113319] [<c0721ac6>] register_netdevice+0x8c/0x1d3
>[ 170.113848] [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
>[ 170.114322] [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
>[ 170.114879] [<c0401240>] do_one_initcall+0x76/0x122
>[ 170.115317] [<f94f4000>] ? 0xf94f3fff
>[ 170.115799] [<c0463f1e>] sys_init_module+0x1286/0x140d
>[ 170.116879] [<c07c6d9f>] sysenter_do_call+0x12/0x28
>[ 170.117404] ---[ end trace 64e4fac3ae5fff1a ]---
>[ 170.117924] bond%d: Warning: failed to register to debugfs
>[ 170.128728] ------------[ cut here ]------------
>[ 170.129360] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
>[ 170.130323] Hardware name: VirtualBox
>[ 170.130797] proc_dir_entry 'bonding/bond%d' already registered
>[ 170.131315] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
>[ 170.133731] Pid: 3457, comm: modprobe Tainted: G W 2.6.39-rc2+ #14
>[ 170.134308] Call Trace:
>[ 170.134743] [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
>[ 170.135305] [<c051a42a>] ? proc_register+0x126/0x157
>[ 170.135820] [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
>[ 170.137168] [<c051a42a>] proc_register+0x126/0x157
>[ 170.137700] [<c051a4dd>] proc_create_data+0x82/0x98
>[ 170.138174] [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
>[ 170.138745] [<f94dd806>] bond_init+0x77/0xa5 [bonding]
>[ 170.139278] [<c0721ac6>] register_netdevice+0x8c/0x1d3
>[ 170.139828] [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
>[ 170.140361] [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
>[ 170.140927] [<c0401240>] do_one_initcall+0x76/0x122
>[ 170.141494] [<f94f4000>] ? 0xf94f3fff
>[ 170.141975] [<c0463f1e>] sys_init_module+0x1286/0x140d
>[ 170.142463] [<c07c6d9f>] sysenter_do_call+0x12/0x28
>[ 170.142974] ---[ end trace 64e4fac3ae5fff1b ]---
>[ 170.144949] bond%d: Warning: failed to register to debugfs
>
>Signed-off-by: Pan Weiping <panweiping3@gmail.com>
>---
> net/core/dev.c | 8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 75898a3..f289117 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -5412,6 +5412,10 @@ int register_netdevice(struct net_device *dev)
>
> dev->iflink = -1;
>
>+ ret = dev_get_valid_name(dev, dev->name);
>+ if (ret < 0)
>+ goto out;
>+
> /* Init, if this function is available */
> if (dev->netdev_ops->ndo_init) {
> ret = dev->netdev_ops->ndo_init(dev);
>@@ -5422,10 +5426,6 @@ int register_netdevice(struct net_device *dev)
> }
> }
>
>- ret = dev_get_valid_name(dev, dev->name);
>- if (ret < 0)
>- goto err_uninit;
>-
> dev->ifindex = dev_new_index(net);
> if (dev->iflink == -1)
> dev->iflink = dev->ifindex;
>--
>1.7.4
>
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
This is looking good to me.
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
^ permalink raw reply
* Re: [PATCH net-next] net:set valid name before calling ndo_init()
From: Eric Dumazet @ 2011-05-12 15:33 UTC (permalink / raw)
To: Weiping Pan(潘卫平)
Cc: David S. Miller, Tom Herbert, Michał Mirosław,
Ben Hutchings, open list:NETWORKING [GENERAL], open list
In-Reply-To: <1305207541-13311-1-git-send-email-panweiping3@gmail.com>
Le jeudi 12 mai 2011 à 21:39 +0800, Weiping Pan(潘卫平) a écrit :
> From: Pan Weiping <panweiping3@gmail.com>
>
> A bug of bonding was invloved by e815d19ffe02bdfda1260949ef2b1806171,
> see example 1 and 2.
>
I cant find e815d19ffe02bdfda1260949ef2b1806171 in net-next-2.6
but I do find 1c5cae815d19ffe02bdfda1260949ef2b1806171
Please always use following when referring to a commit :
... in commit 1c5cae815d19 (net: call dev_alloc_name from
register_netdevice) ...
- just limit to _first_ 12 chars, no need to have full length
- give the commit title
^ permalink raw reply
* Re: [PATCH net-next] net:set valid name before calling ndo_init()
From: Changli Gao @ 2011-05-12 15:32 UTC (permalink / raw)
To: Weiping Pan(潘卫平)
Cc: davem, eric.dumazet, mirq-linux, jpirko, netdev, linux-kernel
In-Reply-To: <1300936598-3971-1-git-send-email-panweiping3@gmail.com>
On Thu, Mar 24, 2011 at 11:16 AM, Weiping Pan(潘卫平)
<panweiping3@gmail.com> wrote:
> From: Pan Weiping <panweiping3@gmail.com>
>
> A bug of bonding was invloved by e815d19ffe02bdfda1260949ef2b1806171,
> see example 1 and 2.
>
> In register_netdevice(), the name of net_device is not valid until
> dev_get_valid_name() is called. But dev->netdev_ops->ndo_init(that is
> bond_init) is called before dev_get_valid_name(),
> and it uses the invalid name of net_device.
>
> I think register_netdevice() should make sure that the name of net_device is
> valid before calling ndo_init().
>
> example 1:
> modprobe bonding
> ls /proc/net/bonding/bond%d
>
> ps -eLf
> root 3398 2 3398 0 1 21:34 ? 00:00:00 [bond%d]
>
> example 2:
> modprobe bonding max_bonds=3
>
> [ 170.100292] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
> [ 170.101090] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
> [ 170.102469] ------------[ cut here ]------------
> [ 170.103150] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
> [ 170.104075] Hardware name: VirtualBox
> [ 170.105065] proc_dir_entry 'bonding/bond%d' already registered
> [ 170.105613] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
> [ 170.108397] Pid: 3457, comm: modprobe Not tainted 2.6.39-rc2+ #14
> [ 170.108935] Call Trace:
> [ 170.109382] [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
> [ 170.109911] [<c051a42a>] ? proc_register+0x126/0x157
> [ 170.110329] [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
> [ 170.110846] [<c051a42a>] proc_register+0x126/0x157
> [ 170.111870] [<c051a4dd>] proc_create_data+0x82/0x98
> [ 170.112335] [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
> [ 170.112905] [<f94dd806>] bond_init+0x77/0xa5 [bonding]
> [ 170.113319] [<c0721ac6>] register_netdevice+0x8c/0x1d3
> [ 170.113848] [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
> [ 170.114322] [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
> [ 170.114879] [<c0401240>] do_one_initcall+0x76/0x122
> [ 170.115317] [<f94f4000>] ? 0xf94f3fff
> [ 170.115799] [<c0463f1e>] sys_init_module+0x1286/0x140d
> [ 170.116879] [<c07c6d9f>] sysenter_do_call+0x12/0x28
> [ 170.117404] ---[ end trace 64e4fac3ae5fff1a ]---
> [ 170.117924] bond%d: Warning: failed to register to debugfs
> [ 170.128728] ------------[ cut here ]------------
> [ 170.129360] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
> [ 170.130323] Hardware name: VirtualBox
> [ 170.130797] proc_dir_entry 'bonding/bond%d' already registered
> [ 170.131315] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
> [ 170.133731] Pid: 3457, comm: modprobe Tainted: G W 2.6.39-rc2+ #14
> [ 170.134308] Call Trace:
> [ 170.134743] [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
> [ 170.135305] [<c051a42a>] ? proc_register+0x126/0x157
> [ 170.135820] [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
> [ 170.137168] [<c051a42a>] proc_register+0x126/0x157
> [ 170.137700] [<c051a4dd>] proc_create_data+0x82/0x98
> [ 170.138174] [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
> [ 170.138745] [<f94dd806>] bond_init+0x77/0xa5 [bonding]
> [ 170.139278] [<c0721ac6>] register_netdevice+0x8c/0x1d3
> [ 170.139828] [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
> [ 170.140361] [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
> [ 170.140927] [<c0401240>] do_one_initcall+0x76/0x122
> [ 170.141494] [<f94f4000>] ? 0xf94f3fff
> [ 170.141975] [<c0463f1e>] sys_init_module+0x1286/0x140d
> [ 170.142463] [<c07c6d9f>] sysenter_do_call+0x12/0x28
> [ 170.142974] ---[ end trace 64e4fac3ae5fff1b ]---
> [ 170.144949] bond%d: Warning: failed to register to debugfs
>
> Signed-off-by: Pan Weiping <panweiping3@gmail.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
--
Regards,
Changli Gao(xiaosuo@gmail.com)
^ permalink raw reply
* Re: [PATCH] bonding,llc: Fix structure sizeof incompatibility for some PDUs
From: Vitalii Demianets @ 2011-05-12 15:45 UTC (permalink / raw)
To: Ben Hutchings
Cc: Jay Vosburgh, Andy Gospodarek, Arnaldo Carvalho de Melo, netdev,
bonding-devel
In-Reply-To: <1305210816.5214.9.camel@bwh-desktop>
On Thursday 12 May 2011 17:33:36 Ben Hutchings wrote:
[...]
> The '__packed' macro is preferred.
Ok, it looks more compatible, i"ll resubmit the patch with this macro.
Also, I'm slightly in doubt if I should split the patch in 2 parts, one for
bonding an one for LLC?
^ permalink raw reply
* Re: [PATCH] bonding,llc: Fix structure sizeof incompatibility for some PDUs
From: Joe Perches @ 2011-05-12 15:52 UTC (permalink / raw)
To: Vitalii Demianets
Cc: Ben Hutchings, Jay Vosburgh, Andy Gospodarek,
Arnaldo Carvalho de Melo, netdev, bonding-devel
In-Reply-To: <201105121845.49090.vitas@nppfactor.kiev.ua>
On Thu, 2011-05-12 at 18:45 +0300, Vitalii Demianets wrote:
> On Thursday 12 May 2011 17:33:36 Ben Hutchings wrote:
> [...]
> > The '__packed' macro is preferred.
> Also, I'm slightly in doubt if I should split the patch in 2 parts, one for
> bonding an one for LLC?
What arch needs __packed for
struct mac_addr {
unsigned char addr[6];
};
?
^ permalink raw reply
* Re: [PATCH] iproute2: use IFLA_TXQLEN when it is available
From: Stephen Hemminger @ 2011-05-12 15:57 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Changli Gao, netdev, kuznet
In-Reply-To: <1305175255.20318.4.camel@edumazet-laptop>
I just applied Eric's patch
--
^ permalink raw reply
* Re: [PATCH] bonding,llc: Fix structure sizeof incompatibility for some PDUs
From: Eric Dumazet @ 2011-05-12 16:02 UTC (permalink / raw)
To: Joe Perches
Cc: Vitalii Demianets, Ben Hutchings, Jay Vosburgh, Andy Gospodarek,
Arnaldo Carvalho de Melo, netdev, bonding-devel
In-Reply-To: <1305215554.6124.35.camel@Joe-Laptop>
Le jeudi 12 mai 2011 à 08:52 -0700, Joe Perches a écrit :
> On Thu, 2011-05-12 at 18:45 +0300, Vitalii Demianets wrote:
> > On Thursday 12 May 2011 17:33:36 Ben Hutchings wrote:
> > [...]
> > > The '__packed' macro is preferred.
> > Also, I'm slightly in doubt if I should split the patch in 2 parts, one for
> > bonding an one for LLC?
>
> What arch needs __packed for
>
> struct mac_addr {
> unsigned char addr[6];
> };
>
> ?
Believe it or not, they do exist. (ARM ...)
See "struct ethhdr" definition
Check commit 7cd636fe9ce5de0051c112 for another argument.
^ permalink raw reply
* Re: [RFC PATCH] net: fold dev_disable_lro() into netdev_fix_features()
From: Michał Mirosław @ 2011-05-12 16:06 UTC (permalink / raw)
To: David Miller
Cc: netdev, shemminger, kuznet, pekkas, jmorris, yoshfuji, kaber,
eric.dumazet, therbert, bhutchings, bridge
In-Reply-To: <20110509.120811.246541047.davem@davemloft.net>
On Mon, May 09, 2011 at 12:08:11PM -0700, David Miller wrote:
> From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Date: Sat, 7 May 2011 13:48:02 +0200 (CEST)
> > This moves checks that device is forwarding from bridge, IPv4 and IPv6
> > code into netdev_fix_features(). As a side effect, after device is no longer
> > forwarding it gets LRO back. This also means that user is not allowed to
> > enable LRO after device is put to forwarding mode.
> We need to keep the check in the protocols because we don't want to
> be testing protocol specific device state in generic code like
> net/core/dev.c
I modified it to use priv_flags that mirror the protocol-internal state.
Patch follows.
Best Regards,
Michał Mirosław
^ permalink raw reply
* [RFC PATCH v2] net: fold dev_disable_lro() into netdev_fix_features()
From: Michał Mirosław @ 2011-05-12 16:06 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Stephen Hemminger, Alexey Kuznetsov,
Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, Eric Dumazet, Tom Herbert, Ben Hutchings, bridge
In-Reply-To: <20110507114803.0D80A13A6B@rere.qmqm.pl>
This implements checks for forwarding mode in netdev_fix_features() using
dev->priv_flags bits. As a side effect, after device is no longer
forwarding it gets LRO back. This also means that user is not allowed to
enable LRO after device is put to forwarding mode.
This patch depends on removal of discrete offload setting ethtool ops.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
v2: remove protocol-specific checks from core code
include/linux/if.h | 5 +++++
include/linux/netdevice.h | 1 -
net/bridge/br_if.c | 6 +++---
net/core/dev.c | 25 +++++--------------------
net/ipv4/devinet.c | 30 +++++++++++++++++++-----------
net/ipv6/addrconf.c | 17 +++++++++++++----
6 files changed, 45 insertions(+), 39 deletions(-)
diff --git a/include/linux/if.h b/include/linux/if.h
index 3bc63e6..1aef6a7 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -76,6 +76,11 @@
#define IFF_BRIDGE_PORT 0x4000 /* device used as bridge port */
#define IFF_OVS_DATAPATH 0x8000 /* device used as Open vSwitch
* datapath port */
+#define IFF_FORWARDING_IPV4 0x10000 /* IPv4 router port */
+#define IFF_FORWARDING_IPV6 0x20000 /* IPv6 router port */
+
+#define IFF_LRO_FORBIDDEN (IFF_BRIDGE_PORT | \
+ IFF_FORWARDING_IPV4 | IFF_FORWARDING_IPV6)
#define IF_GET_IFACE 0x0001 /* for querying only */
#define IF_GET_PROTO 0x0002
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 41972b9..f698730 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1628,7 +1628,6 @@ extern struct net_device *__dev_get_by_name(struct net *net, const char *name);
extern int dev_alloc_name(struct net_device *dev, const char *name);
extern int dev_open(struct net_device *dev);
extern int dev_close(struct net_device *dev);
-extern void dev_disable_lro(struct net_device *dev);
extern int dev_queue_xmit(struct sk_buff *skb);
extern int register_netdevice(struct net_device *dev);
extern void unregister_netdevice_queue(struct net_device *dev,
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 5dbdfdf..568a9dd 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -158,6 +158,8 @@ static void del_nbp(struct net_bridge_port *p)
br_netpoll_disable(p);
call_rcu(&p->rcu, destroy_nbp_rcu);
+
+ netdev_update_features(dev);
}
/* called with RTNL */
@@ -368,11 +370,9 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
dev->priv_flags |= IFF_BRIDGE_PORT;
- dev_disable_lro(dev);
-
list_add_rcu(&p->list, &br->port_list);
- netdev_update_features(br->dev);
+ netdev_change_features(dev);
spin_lock_bh(&br->lock);
changed_addr = br_stp_recalculate_bridge_id(br);
diff --git a/net/core/dev.c b/net/core/dev.c
index 491b8eb..b70b6c2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1296,26 +1296,6 @@ int dev_close(struct net_device *dev)
EXPORT_SYMBOL(dev_close);
-/**
- * dev_disable_lro - disable Large Receive Offload on a device
- * @dev: device
- *
- * Disable Large Receive Offload (LRO) on a net device. Must be
- * called under RTNL. This is needed if received packets may be
- * forwarded to another interface.
- */
-void dev_disable_lro(struct net_device *dev)
-{
- dev->wanted_features &= ~NETIF_F_LRO;
- netdev_update_features(dev);
-
- if (unlikely(dev->features & NETIF_F_LRO))
- netdev_WARN(dev, "failed to disable LRO!\n");
-
-}
-EXPORT_SYMBOL(dev_disable_lro);
-
-
static int dev_boot_phase = 1;
/**
@@ -5241,6 +5221,11 @@ u32 netdev_fix_features(struct net_device *dev, u32 features)
}
}
+ if (features & NETIF_F_LRO && dev->priv_flags & IFF_LRO_FORBIDDEN) {
+ netdev_info(dev, "Disabling LRO for forwarding interface.\n");
+ features &= NETIF_F_LRO;
+ }
+
return features;
}
EXPORT_SYMBOL(netdev_fix_features);
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 0d4a184..9552c68 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -221,6 +221,7 @@ void in_dev_finish_destroy(struct in_device *idev)
printk(KERN_DEBUG "in_dev_finish_destroy: %p=%s\n",
idev, dev ? dev->name : "NIL");
#endif
+ dev->priv_flags &= ~IFF_FORWARDING_IPV4;
dev_put(dev);
if (!idev->dead)
pr_err("Freeing alive in_device %p\n", idev);
@@ -229,6 +230,16 @@ void in_dev_finish_destroy(struct in_device *idev)
}
EXPORT_SYMBOL(in_dev_finish_destroy);
+static void mark_ipv4_forwarding(struct net_device *dev, bool on)
+{
+ if (on)
+ dev->priv_flags |= IFF_FORWARDING_IPV4;
+ else
+ dev->priv_flags &= ~IFF_FORWARDING_IPV4;
+
+ netdev_update_features(dev);
+}
+
static struct in_device *inetdev_init(struct net_device *dev)
{
struct in_device *in_dev;
@@ -245,8 +256,7 @@ static struct in_device *inetdev_init(struct net_device *dev)
in_dev->arp_parms = neigh_parms_alloc(dev, &arp_tbl);
if (!in_dev->arp_parms)
goto out_kfree;
- if (IPV4_DEVCONF(in_dev->cnf, FORWARDING))
- dev_disable_lro(dev);
+ mark_ipv4_forwarding(dev, IPV4_DEVCONF(in_dev->cnf, FORWARDING));
/* Reference in_dev->dev */
dev_hold(dev);
/* Account for reference dev->ip_ptr (below) */
@@ -1475,14 +1485,12 @@ static void inet_forward_change(struct net *net)
IPV4_DEVCONF_DFLT(net, FORWARDING) = on;
for_each_netdev(net, dev) {
- struct in_device *in_dev;
- if (on)
- dev_disable_lro(dev);
- rcu_read_lock();
- in_dev = __in_dev_get_rcu(dev);
- if (in_dev)
+ struct in_device *in_dev = __in_dev_get_rtnl(dev);
+
+ if (in_dev) {
IN_DEV_CONF_SET(in_dev, FORWARDING, on);
- rcu_read_unlock();
+ mark_ipv4_forwarding(in_dev->dev, on);
+ }
}
}
@@ -1527,11 +1535,11 @@ static int devinet_sysctl_forward(ctl_table *ctl, int write,
}
if (valp == &IPV4_DEVCONF_ALL(net, FORWARDING)) {
inet_forward_change(net);
- } else if (*valp) {
+ } else {
struct ipv4_devconf *cnf = ctl->extra1;
struct in_device *idev =
container_of(cnf, struct in_device, cnf);
- dev_disable_lro(idev->dev);
+ mark_ipv4_forwarding(idev->dev, *valp);
}
rtnl_unlock();
rt_cache_flush(net, 0);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f2f9b2e..7e0e8fa 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -333,6 +333,7 @@ void in6_dev_finish_destroy(struct inet6_dev *idev)
#ifdef NET_REFCNT_DEBUG
printk(KERN_DEBUG "in6_dev_finish_destroy: %s\n", dev ? dev->name : "NIL");
#endif
+ dev->priv_flags &= ~IFF_FORWARDING_IPV6;
dev_put(dev);
if (!idev->dead) {
pr_warning("Freeing alive inet6 device %p\n", idev);
@@ -344,6 +345,16 @@ void in6_dev_finish_destroy(struct inet6_dev *idev)
EXPORT_SYMBOL(in6_dev_finish_destroy);
+static void mark_ipv6_forwarding(struct net_device *dev, bool on)
+{
+ if (on)
+ dev->priv_flags |= IFF_FORWARDING_IPV6;
+ else
+ dev->priv_flags &= ~IFF_FORWARDING_IPV6;
+
+ netdev_update_features(dev);
+}
+
static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
{
struct inet6_dev *ndev;
@@ -370,8 +381,7 @@ static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
kfree(ndev);
return NULL;
}
- if (ndev->cnf.forwarding)
- dev_disable_lro(dev);
+ mark_ipv6_forwarding(dev, ndev->cnf.forwarding);
/* We refer to the device */
dev_hold(dev);
@@ -469,8 +479,7 @@ static void dev_forward_change(struct inet6_dev *idev)
if (!idev)
return;
dev = idev->dev;
- if (idev->cnf.forwarding)
- dev_disable_lro(dev);
+ mark_ipv6_forwarding(dev, idev->cnf.forwarding);
if (dev && (dev->flags & IFF_MULTICAST)) {
if (idev->cnf.forwarding)
ipv6_dev_mc_inc(dev, &in6addr_linklocal_allrouters);
--
1.7.2.5
^ permalink raw reply related
* pull request: sfc-2.6 2011-05-12
From: Ben Hutchings @ 2011-05-12 16:07 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-net-drivers
The following changes since commit ff538818f4a82c4cf02d2d6bd6ac5c7360b9d41d:
sysctl: net: call unregister_net_sysctl_table where needed (2011-05-02 16:12:14 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6.git sfc-2.6.39
Ben Hutchings (1):
sfc: Always map MCDI shared memory as uncacheable
Sorry for sending this so late. I was really hoping to get to the
bottom of this issue and find a simple fix.
Ben.
drivers/net/sfc/mcdi.c | 49 ++++++++++++++++++++++++++++------------------
drivers/net/sfc/nic.h | 2 +
drivers/net/sfc/siena.c | 25 ++++++++++++++++++++---
3 files changed, 53 insertions(+), 23 deletions(-)
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* [PATCH net-2.6] sfc: Always map MCDI shared memory as uncacheable
From: Ben Hutchings @ 2011-05-12 16:09 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1305216428.5214.13.camel@bwh-desktop>
We enabled write-combining for memory-mapped registers in commit
65f0b417dee94f779ce9b77102b7d73c93723b39, but inhibited it for the
MCDI shared memory where this is not supported. However,
write-combining mappings also allow read-reordering, which may also
be a problem.
I found that when an SFC9000-family controller is connected to an
Intel 3000 chipset, and write-combining is enabled, the controller
stops responding to PCIe read requests during driver initialisation
while the driver is polling for completion of an MCDI command. This
results in an NMI and system hang. Adding read memory barriers
between all reads to the shared memory area appears to reduce but not
eliminate the probability of this.
We have not yet established whether this is a bug in our BIU or in the
PCIe bridge. For now, work around by mapping the shared memory area
separately.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
drivers/net/sfc/mcdi.c | 49 ++++++++++++++++++++++++++++------------------
drivers/net/sfc/nic.h | 2 +
drivers/net/sfc/siena.c | 25 ++++++++++++++++++++---
3 files changed, 53 insertions(+), 23 deletions(-)
diff --git a/drivers/net/sfc/mcdi.c b/drivers/net/sfc/mcdi.c
index d984790..3dd45ed 100644
--- a/drivers/net/sfc/mcdi.c
+++ b/drivers/net/sfc/mcdi.c
@@ -50,6 +50,20 @@ static inline struct efx_mcdi_iface *efx_mcdi(struct efx_nic *efx)
return &nic_data->mcdi;
}
+static inline void
+efx_mcdi_readd(struct efx_nic *efx, efx_dword_t *value, unsigned reg)
+{
+ struct siena_nic_data *nic_data = efx->nic_data;
+ value->u32[0] = (__force __le32)__raw_readl(nic_data->mcdi_smem + reg);
+}
+
+static inline void
+efx_mcdi_writed(struct efx_nic *efx, const efx_dword_t *value, unsigned reg)
+{
+ struct siena_nic_data *nic_data = efx->nic_data;
+ __raw_writel((__force u32)value->u32[0], nic_data->mcdi_smem + reg);
+}
+
void efx_mcdi_init(struct efx_nic *efx)
{
struct efx_mcdi_iface *mcdi;
@@ -70,8 +84,8 @@ static void efx_mcdi_copyin(struct efx_nic *efx, unsigned cmd,
const u8 *inbuf, size_t inlen)
{
struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
- unsigned pdu = FR_CZ_MC_TREG_SMEM + MCDI_PDU(efx);
- unsigned doorbell = FR_CZ_MC_TREG_SMEM + MCDI_DOORBELL(efx);
+ unsigned pdu = MCDI_PDU(efx);
+ unsigned doorbell = MCDI_DOORBELL(efx);
unsigned int i;
efx_dword_t hdr;
u32 xflags, seqno;
@@ -92,30 +106,28 @@ static void efx_mcdi_copyin(struct efx_nic *efx, unsigned cmd,
MCDI_HEADER_SEQ, seqno,
MCDI_HEADER_XFLAGS, xflags);
- efx_writed(efx, &hdr, pdu);
+ efx_mcdi_writed(efx, &hdr, pdu);
- for (i = 0; i < inlen; i += 4) {
- _efx_writed(efx, *((__le32 *)(inbuf + i)), pdu + 4 + i);
- /* use wmb() within loop to inhibit write combining */
- wmb();
- }
+ for (i = 0; i < inlen; i += 4)
+ efx_mcdi_writed(efx, (const efx_dword_t *)(inbuf + i),
+ pdu + 4 + i);
/* ring the doorbell with a distinctive value */
- _efx_writed(efx, (__force __le32) 0x45789abc, doorbell);
- wmb();
+ EFX_POPULATE_DWORD_1(hdr, EFX_DWORD_0, 0x45789abc);
+ efx_mcdi_writed(efx, &hdr, doorbell);
}
static void efx_mcdi_copyout(struct efx_nic *efx, u8 *outbuf, size_t outlen)
{
struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
- unsigned int pdu = FR_CZ_MC_TREG_SMEM + MCDI_PDU(efx);
+ unsigned int pdu = MCDI_PDU(efx);
int i;
BUG_ON(atomic_read(&mcdi->state) == MCDI_STATE_QUIESCENT);
BUG_ON(outlen & 3 || outlen >= 0x100);
for (i = 0; i < outlen; i += 4)
- *((__le32 *)(outbuf + i)) = _efx_readd(efx, pdu + 4 + i);
+ efx_mcdi_readd(efx, (efx_dword_t *)(outbuf + i), pdu + 4 + i);
}
static int efx_mcdi_poll(struct efx_nic *efx)
@@ -123,7 +135,7 @@ static int efx_mcdi_poll(struct efx_nic *efx)
struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
unsigned int time, finish;
unsigned int respseq, respcmd, error;
- unsigned int pdu = FR_CZ_MC_TREG_SMEM + MCDI_PDU(efx);
+ unsigned int pdu = MCDI_PDU(efx);
unsigned int rc, spins;
efx_dword_t reg;
@@ -149,8 +161,7 @@ static int efx_mcdi_poll(struct efx_nic *efx)
time = get_seconds();
- rmb();
- efx_readd(efx, ®, pdu);
+ efx_mcdi_readd(efx, ®, pdu);
/* All 1's indicates that shared memory is in reset (and is
* not a valid header). Wait for it to come out reset before
@@ -177,7 +188,7 @@ static int efx_mcdi_poll(struct efx_nic *efx)
respseq, mcdi->seqno);
rc = EIO;
} else if (error) {
- efx_readd(efx, ®, pdu + 4);
+ efx_mcdi_readd(efx, ®, pdu + 4);
switch (EFX_DWORD_FIELD(reg, EFX_DWORD_0)) {
#define TRANSLATE_ERROR(name) \
case MC_CMD_ERR_ ## name: \
@@ -211,21 +222,21 @@ out:
/* Test and clear MC-rebooted flag for this port/function */
int efx_mcdi_poll_reboot(struct efx_nic *efx)
{
- unsigned int addr = FR_CZ_MC_TREG_SMEM + MCDI_REBOOT_FLAG(efx);
+ unsigned int addr = MCDI_REBOOT_FLAG(efx);
efx_dword_t reg;
uint32_t value;
if (efx_nic_rev(efx) < EFX_REV_SIENA_A0)
return false;
- efx_readd(efx, ®, addr);
+ efx_mcdi_readd(efx, ®, addr);
value = EFX_DWORD_FIELD(reg, EFX_DWORD_0);
if (value == 0)
return 0;
EFX_ZERO_DWORD(reg);
- efx_writed(efx, ®, addr);
+ efx_mcdi_writed(efx, ®, addr);
if (value == MC_STATUS_DWORD_ASSERT)
return -EINTR;
diff --git a/drivers/net/sfc/nic.h b/drivers/net/sfc/nic.h
index a42db6e..d91701a 100644
--- a/drivers/net/sfc/nic.h
+++ b/drivers/net/sfc/nic.h
@@ -143,10 +143,12 @@ static inline struct falcon_board *falcon_board(struct efx_nic *efx)
/**
* struct siena_nic_data - Siena NIC state
* @mcdi: Management-Controller-to-Driver Interface
+ * @mcdi_smem: MCDI shared memory mapping. The mapping is always uncacheable.
* @wol_filter_id: Wake-on-LAN packet filter id
*/
struct siena_nic_data {
struct efx_mcdi_iface mcdi;
+ void __iomem *mcdi_smem;
int wol_filter_id;
};
diff --git a/drivers/net/sfc/siena.c b/drivers/net/sfc/siena.c
index e4dd898..837869b 100644
--- a/drivers/net/sfc/siena.c
+++ b/drivers/net/sfc/siena.c
@@ -220,12 +220,26 @@ static int siena_probe_nic(struct efx_nic *efx)
efx_reado(efx, ®, FR_AZ_CS_DEBUG);
efx->net_dev->dev_id = EFX_OWORD_FIELD(reg, FRF_CZ_CS_PORT_NUM) - 1;
+ /* Initialise MCDI */
+ nic_data->mcdi_smem = ioremap_nocache(efx->membase_phys +
+ FR_CZ_MC_TREG_SMEM,
+ FR_CZ_MC_TREG_SMEM_STEP *
+ FR_CZ_MC_TREG_SMEM_ROWS);
+ if (!nic_data->mcdi_smem) {
+ netif_err(efx, probe, efx->net_dev,
+ "could not map MCDI at %llx+%x\n",
+ (unsigned long long)efx->membase_phys +
+ FR_CZ_MC_TREG_SMEM,
+ FR_CZ_MC_TREG_SMEM_STEP * FR_CZ_MC_TREG_SMEM_ROWS);
+ rc = -ENOMEM;
+ goto fail1;
+ }
efx_mcdi_init(efx);
/* Recover from a failed assertion before probing */
rc = efx_mcdi_handle_assertion(efx);
if (rc)
- goto fail1;
+ goto fail2;
/* Let the BMC know that the driver is now in charge of link and
* filter settings. We must do this before we reset the NIC */
@@ -280,6 +294,7 @@ fail4:
fail3:
efx_mcdi_drv_attach(efx, false, NULL);
fail2:
+ iounmap(nic_data->mcdi_smem);
fail1:
kfree(efx->nic_data);
return rc;
@@ -359,6 +374,8 @@ static int siena_init_nic(struct efx_nic *efx)
static void siena_remove_nic(struct efx_nic *efx)
{
+ struct siena_nic_data *nic_data = efx->nic_data;
+
efx_nic_free_buffer(efx, &efx->irq_status);
siena_reset_hw(efx, RESET_TYPE_ALL);
@@ -368,7 +385,8 @@ static void siena_remove_nic(struct efx_nic *efx)
efx_mcdi_drv_attach(efx, false, NULL);
/* Tear down the private nic state */
- kfree(efx->nic_data);
+ iounmap(nic_data->mcdi_smem);
+ kfree(nic_data);
efx->nic_data = NULL;
}
@@ -606,8 +624,7 @@ struct efx_nic_type siena_a0_nic_type = {
.default_mac_ops = &efx_mcdi_mac_operations,
.revision = EFX_REV_SIENA_A0,
- .mem_map_size = (FR_CZ_MC_TREG_SMEM +
- FR_CZ_MC_TREG_SMEM_STEP * FR_CZ_MC_TREG_SMEM_ROWS),
+ .mem_map_size = FR_CZ_MC_TREG_SMEM, /* MC_TREG_SMEM mapped separately */
.txd_ptr_tbl_base = FR_BZ_TX_DESC_PTR_TBL,
.rxd_ptr_tbl_base = FR_BZ_RX_DESC_PTR_TBL,
.buf_tbl_base = FR_BZ_BUF_FULL_TBL,
--
1.7.4
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* Re: [PATCH] bonding,llc: Fix structure sizeof incompatibility for some PDUs
From: Vitalii Demianets @ 2011-05-12 16:10 UTC (permalink / raw)
To: Joe Perches
Cc: Ben Hutchings, Jay Vosburgh, Andy Gospodarek,
Arnaldo Carvalho de Melo, netdev, bonding-devel
In-Reply-To: <1305215554.6124.35.camel@Joe-Laptop>
On Thursday 12 May 2011 18:52:34 you wrote:
[...]
> What arch needs __packed for
>
> struct mac_addr {
> unsigned char addr[6];
> };
>
> ?
Currently I'm working with AT91SAM9260 which is an ARM CPU, and without
__packed I get sizeof(struct mac_addr) == 8.
^ permalink raw reply
* Re: [PATCH] bonding,llc: Fix structure sizeof incompatibility for some PDUs
From: Joe Perches @ 2011-05-12 16:16 UTC (permalink / raw)
To: Vitalii Demianets
Cc: Jay Vosburgh, Ben Hutchings, Andy Gospodarek,
Arnaldo Carvalho de Melo, netdev, bonding-devel
In-Reply-To: <201105121910.13090.vitas@nppfactor.kiev.ua>
On Thu, 2011-05-12 at 19:10 +0300, Vitalii Demianets wrote:
> On Thursday 12 May 2011 18:52:34 you wrote:
> [...]
> > What arch needs __packed for
> > struct mac_addr {
> > unsigned char addr[6];
> > };
> > ?
> Currently I'm working with AT91SAM9260 which is an ARM CPU, and without
> __packed I get sizeof(struct mac_addr) == 8.
Thanks.
^ permalink raw reply
* Re: [RFC PATCH v2] net: fold dev_disable_lro() into netdev_fix_features()
From: Ben Hutchings @ 2011-05-12 16:29 UTC (permalink / raw)
To: Michał Mirosław
Cc: netdev, David S. Miller, Stephen Hemminger, Alexey Kuznetsov,
Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, Eric Dumazet, Tom Herbert, bridge
In-Reply-To: <20110512160640.2A0B713A6B@rere.qmqm.pl>
On Thu, 2011-05-12 at 18:06 +0200, Michał Mirosław wrote:
> This implements checks for forwarding mode in netdev_fix_features() using
> dev->priv_flags bits. As a side effect, after device is no longer
> forwarding it gets LRO back. This also means that user is not allowed to
> enable LRO after device is put to forwarding mode.
>
> This patch depends on removal of discrete offload setting ethtool ops.
This is nice, but:
[...]
> --- a/net/bridge/br_if.c
> +++ b/net/bridge/br_if.c
> @@ -158,6 +158,8 @@ static void del_nbp(struct net_bridge_port *p)
> br_netpoll_disable(p);
>
> call_rcu(&p->rcu, destroy_nbp_rcu);
> +
> + netdev_update_features(dev);
> }
>
> /* called with RTNL */
> @@ -368,11 +370,9 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
>
> dev->priv_flags |= IFF_BRIDGE_PORT;
>
> - dev_disable_lro(dev);
> -
> list_add_rcu(&p->list, &br->port_list);
>
> - netdev_update_features(br->dev);
> + netdev_change_features(dev);
>
> spin_lock_bh(&br->lock);
> changed_addr = br_stp_recalculate_bridge_id(br);
Why netdev_change_features() here? I thought that was primarily for use
when vlan_features may have been changed.
[...]
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
[...]
> @@ -5241,6 +5221,11 @@ u32 netdev_fix_features(struct net_device *dev, u32 features)
> }
> }
>
> + if (features & NETIF_F_LRO && dev->priv_flags & IFF_LRO_FORBIDDEN) {
> + netdev_info(dev, "Disabling LRO for forwarding interface.\n");
> + features &= NETIF_F_LRO;
[...]
Mising '~'.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [RFC PATCH v2] net: fold dev_disable_lro() into netdev_fix_features()
From: Michał Mirosław @ 2011-05-12 16:35 UTC (permalink / raw)
To: Ben Hutchings
Cc: netdev, David S. Miller, Stephen Hemminger, Alexey Kuznetsov,
Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, Eric Dumazet, Tom Herbert, bridge
In-Reply-To: <1305217747.5214.17.camel@bwh-desktop>
On Thu, May 12, 2011 at 05:29:07PM +0100, Ben Hutchings wrote:
> On Thu, 2011-05-12 at 18:06 +0200, Michał Mirosław wrote:
> > This implements checks for forwarding mode in netdev_fix_features() using
> > dev->priv_flags bits. As a side effect, after device is no longer
> > forwarding it gets LRO back. This also means that user is not allowed to
> > enable LRO after device is put to forwarding mode.
> >
> > This patch depends on removal of discrete offload setting ethtool ops.
>
> This is nice, but:
>
> [...]
> > --- a/net/bridge/br_if.c
> > +++ b/net/bridge/br_if.c
> > @@ -158,6 +158,8 @@ static void del_nbp(struct net_bridge_port *p)
> > br_netpoll_disable(p);
> >
> > call_rcu(&p->rcu, destroy_nbp_rcu);
> > +
> > + netdev_update_features(dev);
> > }
> >
> > /* called with RTNL */
> > @@ -368,11 +370,9 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
> >
> > dev->priv_flags |= IFF_BRIDGE_PORT;
> >
> > - dev_disable_lro(dev);
> > -
> > list_add_rcu(&p->list, &br->port_list);
> >
> > - netdev_update_features(br->dev);
> > + netdev_change_features(dev);
> >
> > spin_lock_bh(&br->lock);
> > changed_addr = br_stp_recalculate_bridge_id(br);
> Why netdev_change_features() here? I thought that was primarily for use
> when vlan_features may have been changed.
Because this recalculates port's features and then cascades to bridge's
device. I could have written:
netdev_update_features(dev);
netdev_update_features(br->dev);
But that might cause redundant recalculations for br->dev (first through
notifier calls). I'll add a comment here.
>
> [...]
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> [...]
> > @@ -5241,6 +5221,11 @@ u32 netdev_fix_features(struct net_device *dev, u32 features)
> > }
> > }
> >
> > + if (features & NETIF_F_LRO && dev->priv_flags & IFF_LRO_FORBIDDEN) {
> > + netdev_info(dev, "Disabling LRO for forwarding interface.\n");
> > + features &= NETIF_F_LRO;
> [...]
>
> Mising '~'.
Fixed.
Best Regards,
Michał Mirosław
^ permalink raw reply
* [RFC PATCH v3] net: fold dev_disable_lro() into netdev_fix_features()
From: Michał Mirosław @ 2011-05-12 16:37 UTC (permalink / raw)
To: netdev
Cc: netdev, David S. Miller, Stephen Hemminger, Alexey Kuznetsov,
Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, Eric Dumazet, Tom Herbert, bridge
In-Reply-To: <20110512160640.2A0B713A6B@rere.qmqm.pl>
This implements checks for forwarding mode in netdev_fix_features() using
dev->priv_flags bits. As a side effect, after device is no longer
forwarding it gets LRO back. This also means that user is not allowed to
enable LRO after device is put to forwarding mode.
This patch depends on removal of discrete offload setting ethtool ops.
v2: remove protocol-specific checks from core code
v3: add comment and missing negation
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
include/linux/if.h | 5 +++++
include/linux/netdevice.h | 1 -
net/bridge/br_if.c | 9 ++++++---
net/core/dev.c | 25 +++++--------------------
net/ipv4/devinet.c | 30 +++++++++++++++++++-----------
net/ipv6/addrconf.c | 17 +++++++++++++----
6 files changed, 48 insertions(+), 39 deletions(-)
diff --git a/include/linux/if.h b/include/linux/if.h
index 3bc63e6..1aef6a7 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -76,6 +76,11 @@
#define IFF_BRIDGE_PORT 0x4000 /* device used as bridge port */
#define IFF_OVS_DATAPATH 0x8000 /* device used as Open vSwitch
* datapath port */
+#define IFF_FORWARDING_IPV4 0x10000 /* IPv4 router port */
+#define IFF_FORWARDING_IPV6 0x20000 /* IPv6 router port */
+
+#define IFF_LRO_FORBIDDEN (IFF_BRIDGE_PORT | \
+ IFF_FORWARDING_IPV4 | IFF_FORWARDING_IPV6)
#define IF_GET_IFACE 0x0001 /* for querying only */
#define IF_GET_PROTO 0x0002
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 41972b9..f698730 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1628,7 +1628,6 @@ extern struct net_device *__dev_get_by_name(struct net *net, const char *name);
extern int dev_alloc_name(struct net_device *dev, const char *name);
extern int dev_open(struct net_device *dev);
extern int dev_close(struct net_device *dev);
-extern void dev_disable_lro(struct net_device *dev);
extern int dev_queue_xmit(struct sk_buff *skb);
extern int register_netdevice(struct net_device *dev);
extern void unregister_netdevice_queue(struct net_device *dev,
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 5dbdfdf..31ba100 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -158,6 +158,8 @@ static void del_nbp(struct net_bridge_port *p)
br_netpoll_disable(p);
call_rcu(&p->rcu, destroy_nbp_rcu);
+
+ netdev_update_features(dev);
}
/* called with RTNL */
@@ -368,11 +370,12 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
dev->priv_flags |= IFF_BRIDGE_PORT;
- dev_disable_lro(dev);
-
list_add_rcu(&p->list, &br->port_list);
- netdev_update_features(br->dev);
+ /* possibly disable LRO and force recalculation of bridge dev's
+ * features through NETDEV_FEAT_CHANGE notifier call.
+ */
+ netdev_change_features(dev);
spin_lock_bh(&br->lock);
changed_addr = br_stp_recalculate_bridge_id(br);
diff --git a/net/core/dev.c b/net/core/dev.c
index 491b8eb..7c7df7c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1296,26 +1296,6 @@ int dev_close(struct net_device *dev)
EXPORT_SYMBOL(dev_close);
-/**
- * dev_disable_lro - disable Large Receive Offload on a device
- * @dev: device
- *
- * Disable Large Receive Offload (LRO) on a net device. Must be
- * called under RTNL. This is needed if received packets may be
- * forwarded to another interface.
- */
-void dev_disable_lro(struct net_device *dev)
-{
- dev->wanted_features &= ~NETIF_F_LRO;
- netdev_update_features(dev);
-
- if (unlikely(dev->features & NETIF_F_LRO))
- netdev_WARN(dev, "failed to disable LRO!\n");
-
-}
-EXPORT_SYMBOL(dev_disable_lro);
-
-
static int dev_boot_phase = 1;
/**
@@ -5241,6 +5221,11 @@ u32 netdev_fix_features(struct net_device *dev, u32 features)
}
}
+ if (features & NETIF_F_LRO && dev->priv_flags & IFF_LRO_FORBIDDEN) {
+ netdev_info(dev, "Disabling LRO for forwarding interface.\n");
+ features &= ~NETIF_F_LRO;
+ }
+
return features;
}
EXPORT_SYMBOL(netdev_fix_features);
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 0d4a184..9552c68 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -221,6 +221,7 @@ void in_dev_finish_destroy(struct in_device *idev)
printk(KERN_DEBUG "in_dev_finish_destroy: %p=%s\n",
idev, dev ? dev->name : "NIL");
#endif
+ dev->priv_flags &= ~IFF_FORWARDING_IPV4;
dev_put(dev);
if (!idev->dead)
pr_err("Freeing alive in_device %p\n", idev);
@@ -229,6 +230,16 @@ void in_dev_finish_destroy(struct in_device *idev)
}
EXPORT_SYMBOL(in_dev_finish_destroy);
+static void mark_ipv4_forwarding(struct net_device *dev, bool on)
+{
+ if (on)
+ dev->priv_flags |= IFF_FORWARDING_IPV4;
+ else
+ dev->priv_flags &= ~IFF_FORWARDING_IPV4;
+
+ netdev_update_features(dev);
+}
+
static struct in_device *inetdev_init(struct net_device *dev)
{
struct in_device *in_dev;
@@ -245,8 +256,7 @@ static struct in_device *inetdev_init(struct net_device *dev)
in_dev->arp_parms = neigh_parms_alloc(dev, &arp_tbl);
if (!in_dev->arp_parms)
goto out_kfree;
- if (IPV4_DEVCONF(in_dev->cnf, FORWARDING))
- dev_disable_lro(dev);
+ mark_ipv4_forwarding(dev, IPV4_DEVCONF(in_dev->cnf, FORWARDING));
/* Reference in_dev->dev */
dev_hold(dev);
/* Account for reference dev->ip_ptr (below) */
@@ -1475,14 +1485,12 @@ static void inet_forward_change(struct net *net)
IPV4_DEVCONF_DFLT(net, FORWARDING) = on;
for_each_netdev(net, dev) {
- struct in_device *in_dev;
- if (on)
- dev_disable_lro(dev);
- rcu_read_lock();
- in_dev = __in_dev_get_rcu(dev);
- if (in_dev)
+ struct in_device *in_dev = __in_dev_get_rtnl(dev);
+
+ if (in_dev) {
IN_DEV_CONF_SET(in_dev, FORWARDING, on);
- rcu_read_unlock();
+ mark_ipv4_forwarding(in_dev->dev, on);
+ }
}
}
@@ -1527,11 +1535,11 @@ static int devinet_sysctl_forward(ctl_table *ctl, int write,
}
if (valp == &IPV4_DEVCONF_ALL(net, FORWARDING)) {
inet_forward_change(net);
- } else if (*valp) {
+ } else {
struct ipv4_devconf *cnf = ctl->extra1;
struct in_device *idev =
container_of(cnf, struct in_device, cnf);
- dev_disable_lro(idev->dev);
+ mark_ipv4_forwarding(idev->dev, *valp);
}
rtnl_unlock();
rt_cache_flush(net, 0);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f2f9b2e..7e0e8fa 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -333,6 +333,7 @@ void in6_dev_finish_destroy(struct inet6_dev *idev)
#ifdef NET_REFCNT_DEBUG
printk(KERN_DEBUG "in6_dev_finish_destroy: %s\n", dev ? dev->name : "NIL");
#endif
+ dev->priv_flags &= ~IFF_FORWARDING_IPV6;
dev_put(dev);
if (!idev->dead) {
pr_warning("Freeing alive inet6 device %p\n", idev);
@@ -344,6 +345,16 @@ void in6_dev_finish_destroy(struct inet6_dev *idev)
EXPORT_SYMBOL(in6_dev_finish_destroy);
+static void mark_ipv6_forwarding(struct net_device *dev, bool on)
+{
+ if (on)
+ dev->priv_flags |= IFF_FORWARDING_IPV6;
+ else
+ dev->priv_flags &= ~IFF_FORWARDING_IPV6;
+
+ netdev_update_features(dev);
+}
+
static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
{
struct inet6_dev *ndev;
@@ -370,8 +381,7 @@ static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
kfree(ndev);
return NULL;
}
- if (ndev->cnf.forwarding)
- dev_disable_lro(dev);
+ mark_ipv6_forwarding(dev, ndev->cnf.forwarding);
/* We refer to the device */
dev_hold(dev);
@@ -469,8 +479,7 @@ static void dev_forward_change(struct inet6_dev *idev)
if (!idev)
return;
dev = idev->dev;
- if (idev->cnf.forwarding)
- dev_disable_lro(dev);
+ mark_ipv6_forwarding(dev, idev->cnf.forwarding);
if (dev && (dev->flags & IFF_MULTICAST)) {
if (idev->cnf.forwarding)
ipv6_dev_mc_inc(dev, &in6addr_linklocal_allrouters);
--
1.7.2.5
^ permalink raw reply related
* [PATCH RESEND net-next] ethtool: bring back missing comma in netdev features strings
From: franco @ 2011-05-12 16:42 UTC (permalink / raw)
To: netdev; +Cc: maheshb, mirqus, franco
The issue was introduced in commit eed2a12f1ed9aabf.
Signed-off-by: Franco Fichtner <franco@lastsummer.de>
---
net/core/ethtool.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index b6f4058..b8c2b10 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -361,7 +361,7 @@ static const char
netdev_features_strings[ETHTOOL_DEV_FEATURE_WORDS * 32][ETH_GS
/* NETIF_F_NTUPLE */ "rx-ntuple-filter",
/* NETIF_F_RXHASH */ "rx-hashing",
/* NETIF_F_RXCSUM */ "rx-checksum",
- /* NETIF_F_NOCACHE_COPY */ "tx-nocache-copy"
+ /* NETIF_F_NOCACHE_COPY */ "tx-nocache-copy",
/* NETIF_F_LOOPBACK */ "loopback",
};
--
1.7.3.2.493.g0b0cd
^ permalink raw reply related
* Re: [PATCH RESEND net-next] ethtool: bring back missing comma in netdev features strings
From: Michał Mirosław @ 2011-05-12 16:45 UTC (permalink / raw)
To: franco; +Cc: netdev, maheshb
In-Reply-To: <20110512164204.EC8391220061@host64.kissl.de>
2011/5/12 <franco@lastsummer.de>:
> The issue was introduced in commit eed2a12f1ed9aabf.
>
> Signed-off-by: Franco Fichtner <franco@lastsummer.de>
> ---
> net/core/ethtool.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index b6f4058..b8c2b10 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -361,7 +361,7 @@ static const char
> netdev_features_strings[ETHTOOL_DEV_FEATURE_WORDS * 32][ETH_GS
> /* NETIF_F_NTUPLE */ "rx-ntuple-filter",
> /* NETIF_F_RXHASH */ "rx-hashing",
> /* NETIF_F_RXCSUM */ "rx-checksum",
> - /* NETIF_F_NOCACHE_COPY */ "tx-nocache-copy"
> + /* NETIF_F_NOCACHE_COPY */ "tx-nocache-copy",
> /* NETIF_F_LOOPBACK */ "loopback",
> };
Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
^ permalink raw reply
* Re: [RFC PATCH v2] net: fold dev_disable_lro() into netdev_fix_features()
From: Stephen Hemminger @ 2011-05-12 16:57 UTC (permalink / raw)
To: Ben Hutchings
Cc: Michał Mirosław, netdev, David S. Miller,
Alexey Kuznetsov, Pekka Savola (ipv6), James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Eric Dumazet, Tom Herbert,
bridge
In-Reply-To: <1305217747.5214.17.camel@bwh-desktop>
On Thu, 12 May 2011 17:29:07 +0100
Ben Hutchings <bhutchings@solarflare.com> wrote:
> > dev->priv_flags |= IFF_BRIDGE_PORT;
> >
> > - dev_disable_lro(dev);
> > -
> > list_add_rcu(&p->list, &br->port_list);
> >
> > - netdev_update_features(br->dev);
> > + netdev_change_features(dev);
> >
> > spin_lock_bh(&br->lock);
> > changed_addr = br_stp_recalculate_bridge_id(br);
>
> Why netdev_change_features() here? I thought that was primarily for use
> when vlan_features may have been changed.
Setting IFF_BRIDGE_PORT in priv_flags causes change_features
to disable LRO.
--
^ permalink raw reply
* Re: [PATCHv3 net-next-2.6] ethtool: Added support for FW dump
From: Ben Hutchings @ 2011-05-12 17:04 UTC (permalink / raw)
To: Anirban Chakraborty; +Cc: netdev, David Miller
In-Reply-To: <1305154448-9687-2-git-send-email-anirban.chakraborty@qlogic.com>
On Wed, 2011-05-11 at 15:54 -0700, Anirban Chakraborty wrote:
> Added code to take FW dump via ethtool. Dump level can be controlled via setting the
> dump flag. A get function is provided to query the current setting of the dump flag.
> Dump data is obtained from the driver via a separate get function.
>
> Changes from v2:
> Provided separate commands for get flag and data.
> Check for minimum of the two buffer length obtained via ethtool and driver and
> use that for dump buffer
> Pass up the driver return error codes up to the caller.
> Added kernel doc comments.
>
> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
> ---
> include/linux/ethtool.h | 28 +++++++++++++++
> net/core/ethtool.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 116 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index bd0b50b..5ac7e18 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -601,6 +601,24 @@ struct ethtool_flash {
> char data[ETHTOOL_FLASH_MAX_FILENAME];
> };
>
> +/**
> + * struct ethtool_dump - used for retrieving, setting device dump
> + * @cmd: Command number - %ETHTOOL_GET_DUMP_FLAG, %ETHTOOL_GET_DUMP_DATA, or
> + * %ETHTOOL_SET_DUMP
> + * @version: FW version of the dump, filled in by driver
> + * @flag: driver dependent flag for dump setting, filled in by driver during
> + * get and filled in by ethtool for set operation
> + * @len: length of dump data, returned by driver for get operation
This is also used as the length of the user buffer on entry to
ETHTOOL_GET_DUMP_DATA.
> + * @data: data collected for get dump data operation
> + */
> +struct ethtool_dump {
> + __u32 cmd;
> + __u32 version;
> + __u32 flag;
> + __u32 len;
> + u8 data[0];
> +};
> +
> /* for returning and changing feature sets */
>
> /**
> @@ -853,6 +871,9 @@ bool ethtool_invalid_flags(struct net_device *dev, u32 data, u32 supported);
> * @get_channels: Get number of channels.
> * @set_channels: Set number of channels. Returns a negative error code or
> * zero.
> + * @get_dump_flag: Get dump flag indicating current dump settings of the device.
This operation must also provide the dump length.
> + * @get_dump_data: Get dump data.
> + * @set_dump: Set dump specific flags to the device.
> *
> * All operations are optional (i.e. the function pointer may be set
> * to %NULL) and callers must take this into account. Callers must
> @@ -927,6 +948,10 @@ struct ethtool_ops {
> const struct ethtool_rxfh_indir *);
> void (*get_channels)(struct net_device *, struct ethtool_channels *);
> int (*set_channels)(struct net_device *, struct ethtool_channels *);
> + int (*get_dump_flag)(struct net_device *, struct ethtool_dump *);
> + int (*get_dump_data)(struct net_device *,
> + struct ethtool_dump *, void *);
> + int (*set_dump)(struct net_device *, struct ethtool_dump *);
>
> };
> #endif /* __KERNEL__ */
[...]
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
[...]
> +static int ethtool_get_dump_data(struct net_device *dev,
> + void __user *useraddr)
> +{
> + int ret;
> + __u32 len;
> + struct ethtool_dump dump, tmp;
> + const struct ethtool_ops *ops = dev->ethtool_ops;
> + void *data = NULL;
> +
> + if (!dev->ethtool_ops->get_dump_data ||
> + !dev->ethtool_ops->get_dump_flag)
> + return -EOPNOTSUPP;
> +
> + if (copy_from_user(&dump, useraddr, sizeof(dump)))
> + return -EFAULT;
> +
> + memset(&tmp, 0, sizeof(tmp));
> + tmp.cmd = dump.cmd;
tmp.cmd should be ETHTOOL_GET_DUMP_FLAG.
> + ret = ops->get_dump_flag(dev, &tmp);
> + if (ret)
> + return ret;
> +
> + len = (tmp.len > dump.len) ? dump.len : tmp.len;
> +
> + data = vzalloc(len);
> + if (!data)
> + return -ENOMEM;
[...]
The kernel buffer size must be tmp.len, otherwise the driver may overrun
the buffer.
'len' is the minimum of the two buffer sizes and should only be used as
the size copied to user space (which you have done).
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox