* Re: net-next warning at kernel/softirq.c:159 local_bh_enable
From: Or Gerlitz @ 2012-07-06 4:17 UTC
To: David Miller; +Cc: netdev, Shlomo Pongratz
In-Reply-To: <alpine.LRH.2.00.1207060616590.21283@ogerlitz.voltaire.com>
FWIW - I've updated my net-next clone to the latest and still
see this, even after the IPv6 related fix.
Or.
IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
------------[ cut here ]------------
WARNING: at kernel/softirq.c:159 local_bh_enable+0x45/0xc2()
Hardware name: X7DWU
Modules linked in: ib_ipoib ib_cm ib_sa ib_uverbs netconsole nfs lockd auth_rpcgss nfs_acl autofs4 sunrpc fcoe libfcoe libfc scsi_transport_fc 8021q ipmi_devintf ipmi_si ipmi_msghandler ipv6 dm_mirror dm_region_hash dm_log uinput igb ptp pps_core mlx4_ib ib_mad ib_core mlx4_en mlx4_core sg kvm_intel kvm microcode pcspkr rng_core ioatdma dca dm_mod shpchp button sr_mod ext3 jbd usb_storage sd_mod ata_piix libata scsi_mod ehci_hcd uhci_hcd floppy [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc5-12501-g1405080-dirty #86
Call Trace:
<IRQ> [<ffffffff81027f1d>] warn_slowpath_common+0x80/0x98
[<ffffffffa0245266>] ? ip6_neigh_lookup+0x157/0x187 [ipv6]
[<ffffffff81027f4a>] warn_slowpath_null+0x15/0x17
[<ffffffff8102f56a>] local_bh_enable+0x45/0xc2
[<ffffffffa0245266>] ip6_neigh_lookup+0x157/0x187 [ipv6]
[<ffffffffa0245168>] ? ip6_neigh_lookup+0x59/0x187 [ipv6]
[<ffffffffa030895d>] ipoib_mcast_send+0x375/0x48b [ib_ipoib]
[<ffffffffa0308908>] ? ipoib_mcast_send+0x320/0x48b [ib_ipoib]
[<ffffffff8102f5e2>] ? local_bh_enable+0xbd/0xc2
[<ffffffffa030494f>] ipoib_path_lookup+0x2f3/0x331 [ib_ipoib]
[<ffffffffa0245168>] ? ip6_neigh_lookup+0x59/0x187 [ipv6]
[<ffffffffa0304b4d>] ipoib_start_xmit+0x1c0/0x54f [ib_ipoib]
[<ffffffffa030498d>] ? ipoib_path_lookup+0x331/0x331 [ib_ipoib]
[<ffffffff812c1be3>] dev_hard_start_xmit+0x449/0x60e
[<ffffffff812c1804>] ? dev_hard_start_xmit+0x6a/0x60e
[<ffffffff812d95b9>] sch_direct_xmit+0x72/0x201
[<ffffffff812c2221>] dev_queue_xmit+0x479/0x70e
[<ffffffff812c1da8>] ? dev_hard_start_xmit+0x60e/0x60e
[<ffffffff812ca20e>] neigh_connected_output+0xb6/0xd4
[<ffffffffa023c85b>] ip6_finish_output2+0x304/0x38e [ipv6]
[<ffffffffa023c777>] ? ip6_finish_output2+0x220/0x38e [ipv6]
[<ffffffffa0245057>] ? ip6_mtu+0x88/0xa1 [ipv6]
[<ffffffffa023ca29>] ip6_finish_output+0x144/0x149 [ipv6]
[<ffffffffa023cb41>] ip6_output+0x113/0x11c [ipv6]
[<ffffffffa024c51f>] ndisc_send_skb+0x176/0x200 [ipv6]
[<ffffffffa024c413>] ? ndisc_send_skb+0x6a/0x200 [ipv6]
[<ffffffff81065350>] ? mark_held_locks+0xcd/0xf5
[<ffffffffa024ce1a>] __ndisc_send+0x51/0x5d [ipv6]
[<ffffffffa024e4a0>] ndisc_send_ns+0x75/0x82 [ipv6]
[<ffffffffa02413ee>] addrconf_dad_timer+0x118/0x136 [ipv6]
[<ffffffff8103547a>] run_timer_softirq+0x27a/0x38b
[<ffffffff810353b5>] ? run_timer_softirq+0x1b5/0x38b
[<ffffffffa02412d6>] ? addrconf_rs_timer+0xf2/0xf2 [ipv6]
[<ffffffff8102f746>] __do_softirq+0xff/0x1de
[<ffffffff81362dcc>] call_softirq+0x1c/0x26
[<ffffffff81003090>] do_softirq+0x38/0x80
[<ffffffff8102f41f>] irq_exit+0x4e/0x83
[<ffffffff81019d2e>] smp_apic_timer_interrupt+0x86/0x94
[<ffffffff8136255c>] apic_timer_interrupt+0x6c/0x80
<EOI> [<ffffffff810083ec>] ? mwait_idle+0x13c/0x208
[<ffffffff810083e3>] ? mwait_idle+0x133/0x208
[<ffffffff810088d1>] cpu_idle+0x6e/0xab
[<ffffffff81343a23>] rest_init+0xc7/0xce
[<ffffffff8134395c>] ? csum_partial_copy_generic+0x16c/0x16c
[<ffffffff8167fbf3>] start_kernel+0x332/0x33f
[<ffffffff8167f6f6>] ? kernel_init+0x19d/0x19d
[<ffffffff8167f2b4>] x86_64_start_reservations+0xb8/0xbd
[<ffffffff8167f3a6>] x86_64_start_kernel+0xed/0xf4
---[ end trace f71d7a59b646ab87 ]---
* [PATCH 5 2/2] OMAP4 PANDA register ethernet and wlan for automatic mac allocation
From: Andy Green @ 2012-07-06 4:09 UTC
To: linux-omap
Cc: s-jan, arnd, patches, tony, netdev, linux-kernel, rostedt,
linux-arm-kernel
In-Reply-To: <20120706040938.6669.77083.stgit@build.warmcat.com>
From: Andy Green <andy@warmcat.com>
This provides the board-specific device paths needed to get
the panda boardfile working with the eth-mac-platform api.
On Pandaboard / ES, neither the onboard Ethernet nor the onboard WLAN
module has onboard arrangements for MAC storage. Without this
series the result is a randomized MAC per boot and consequent DHCP problems,
or in the case of wlan0 a MAC set by a firmware file in the rootfs
which, unless customized, yields a MAC of 00:00:00:00:00:00. No
official MAC is reserved for either network device even if you do
take the approach of customizing the firmware file.
This gets sane, consistent MAC addresses on both devices which
should stand a good probability of differing between PandaBoards.
Signed-off-by: Andy Green <andy.green@linaro.org>
---
arch/arm/mach-omap2/Kconfig | 1 +
arch/arm/mach-omap2/board-omap4panda.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 31 insertions(+)
diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index 83fb31c..61c6a3d 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -358,6 +358,7 @@ config MACH_OMAP4_PANDA
select OMAP_PACKAGE_CBL
select OMAP_PACKAGE_CBS
select REGULATOR_FIXED_VOLTAGE if REGULATOR
+ select ETH_MAC_PLATFORM
config MACH_PCM049
bool "OMAP4 based phyCORE OMAP4"
diff --git a/arch/arm/mach-omap2/board-omap4panda.c b/arch/arm/mach-omap2/board-omap4panda.c
index 982fb26..75d93cc 100644
--- a/arch/arm/mach-omap2/board-omap4panda.c
+++ b/arch/arm/mach-omap2/board-omap4panda.c
@@ -32,7 +32,10 @@
#include <linux/wl12xx.h>
#include <linux/platform_data/omap-abe-twl6040.h>
+#include <net/eth-mac-platform.h>
+
#include <mach/hardware.h>
+#include <mach/id.h>
#include <asm/hardware/gic.h>
#include <asm/mach-types.h>
#include <asm/mach/arch.h>
@@ -486,16 +489,43 @@ static void omap4_panda_init_rev(void)
}
}
+/*
+ * These device paths represent onboard network devices which have
+ * no MAC address set at boot, and need synthetic ones assigning
+ */
+static __initdata struct eth_mac_platform panda_eth_mac_platform[] = {
+
+ { /* smsc USB <-> Ethernet bridge */
+ .device_path = "usb1/1-1/1-1.1/1-1.1:1.0",
+ },
+ { /* wlan0 module */
+ .device_path = "wl12xx",
+ },
+ { /* terminator */
+ }
+};
+
static void __init omap4_panda_init(void)
{
int package = OMAP_PACKAGE_CBS;
int ret;
+ int n;
if (omap_rev() == OMAP4430_REV_ES1_0)
package = OMAP_PACKAGE_CBL;
omap4_mux_init(board_mux, NULL, package);
omap_panda_wlan_data.irq = gpio_to_irq(GPIO_WIFI_IRQ);
+
+ /*
+ * provide MAC addresses computed from device ID for network
+ * devices that have no MAC address otherwise on Panda
+ */
+ for (n = 0; n < ARRAY_SIZE(panda_eth_mac_platform) - 1; n++)
+ omap_die_id_to_ethernet_mac(panda_eth_mac_platform[n].mac, n);
+ if (eth_mac_platform_register_device_macs(panda_eth_mac_platform))
+ pr_err("Unable to register eth_mac_platform devices\n");
+
ret = wl12xx_set_platform_data(&omap_panda_wlan_data);
if (ret)
pr_err("error setting wl12xx data: %d\n", ret);
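Editorial note: the patch above calls omap_die_id_to_ethernet_mac(), whose body is not part of this message. As a rough userspace sketch of the general idea (fold unique ID bits into a locally administered unicast MAC), something like the following could be used. The name example_id_to_mac(), the 64-bit ID width, and the way the index is mixed in are all assumptions made for illustration, not the OMAP implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/*
 * Hypothetical helper, NOT the omap_die_id_to_ethernet_mac() from the
 * series: fold a 64-bit chip ID and a per-device index into a locally
 * administered unicast MAC address.
 */
static void example_id_to_mac(uint64_t chip_id, unsigned int index,
			      uint8_t mac[ETH_ALEN])
{
	size_t i;

	/* Use the low 48 bits of the ID, most significant byte first. */
	for (i = 0; i < ETH_ALEN; i++)
		mac[i] = (uint8_t)(chip_id >> (8 * (ETH_ALEN - 1 - i)));

	/* Mix the index into the last octet so each device differs. */
	mac[ETH_ALEN - 1] ^= (uint8_t)index;

	mac[0] |= 0x02;	/* set the locally administered bit */
	mac[0] &= 0xfe;	/* clear the multicast (group) bit */
}
```

Calling it with index 0 for the Ethernet bridge and index 1 for wlan0 would mirror the loop in omap4_panda_init(), which passes the array position as the second argument.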
* [PATCH 5 1/2] NET ethernet introduce eth-mac-platform helper
From: Andy Green @ 2012-07-06 4:09 UTC
To: linux-omap
Cc: s-jan, arnd, patches, tony, netdev, linux-kernel, rostedt,
linux-arm-kernel
From: Andy Green <andy@warmcat.com>
This introduces a small helper in net/ethernet, which registers a network
notifier at core_initcall time, and accepts registrations mapping expected
asynchronously-probed network device paths (like, "usb1/1-1/1-1.1/1-1.1:1.0")
and the MAC that is needed to be assigned to the device when it appears.
This allows platform code to enforce valid, consistent MAC addresses on to
devices that have not been probed at boot-time, but due to being wired on the
board are always present at the same interface. It has been tested with USB
and SDIO probed devices.
Other parts of this series provide an OMAP API that computes a valid
locally administered MAC address from CPU ID bits that are unique for each
physical SoC, and register those against devices wired to the board.
This solves a longstanding problem in at least Panda case that there are no
reserved MACs for either onboard Ethernet or onboard WLAN module, and without
this patch a randomized MAC is assigned to Ethernet and 00:00:00:00:00:00 or
0xdeadbeef is assigned as the WLAN MAC address. The series provides sane,
constant locally-administered MAC addresses that have a high probability of
differing between boards.
To make use of this safely you also need to make sure that any drivers that
may compete for the bus ordinal you are using (eg, mUSB and ehci in Panda
case) are loaded in a deterministic order.
At registration it makes a copy of the incoming data, so the data may be
__initdata or otherwise transitory. Registration can be called multiple times
so registrations from Device Tree and platform may be mixed.
Since it needs to be called quite early in boot and there is no lifecycle for
what it does, it could not be modular and is not a driver.
Via suggestions from Arnd Bergmann and Tony Lindgren (and Alan Cox for the
network notifier concept).
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andy Green <andy.green@linaro.org>
---
include/net/eth-mac-platform.h | 40 ++++++++++
net/Kconfig | 5 +
net/ethernet/Makefile | 3 +
net/ethernet/eth-mac-platform.c | 150 +++++++++++++++++++++++++++++++++++++++
4 files changed, 197 insertions(+), 1 deletion(-)
create mode 100644 include/net/eth-mac-platform.h
create mode 100644 net/ethernet/eth-mac-platform.c
diff --git a/include/net/eth-mac-platform.h b/include/net/eth-mac-platform.h
new file mode 100644
index 0000000..752f1de
--- /dev/null
+++ b/include/net/eth-mac-platform.h
@@ -0,0 +1,40 @@
+/*
+ * eth-mac-platform.h: Enforces platform-defined MAC for Async probed devices
+ */
+
+#ifndef __ETH_NET_MAC_PLATFORM_H__
+#define __ETH_NET_MAC_PLATFORM_H__
+
+#include <linux/if_ether.h>
+
+/**
+ * struct eth_mac_platform - associates asynchronously probed device path with
+ * MAC address to be assigned to the device when it
+ * is created
+ *
+ * @device_path: device path name of network device
+ * @mac: MAC address to assign to network device matching device path
+ * @list: can be left uninitialized when passing from platform
+ */
+
+struct eth_mac_platform {
+ char *device_path;
+ u8 mac[ETH_ALEN];
+ struct list_head list; /* unused in platform data usage */
+};
+
+#ifdef CONFIG_NET
+/**
+ * eth_mac_platform_register_device_macs - add an array of device path to
+ * monitor and MAC to apply when the network
+ * device at the device path appears
+ * @macs: array of struct eth_mac_platform terminated by entry with
+ * NULL device_path
+ */
+int eth_mac_platform_register_device_macs(const struct eth_mac_platform *macs);
+#else
+static inline int eth_mac_platform_register_device_macs(
+ const struct eth_mac_platform *macs) { return 0; }
+#endif /* !CONFIG_NET */
+
+#endif /* __ETH_NET_MAC_PLATFORM_H__ */
diff --git a/net/Kconfig b/net/Kconfig
index 245831b..dd8ab96 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -335,9 +335,12 @@ source "net/caif/Kconfig"
source "net/ceph/Kconfig"
source "net/nfc/Kconfig"
-
endif # if NET
+# used by board / dt platform to enforce Ethernet MACs for Async-probed devices
+config ETH_MAC_PLATFORM
+ bool
+
# Used by archs to tell that they support BPF_JIT
config HAVE_BPF_JIT
bool
diff --git a/net/ethernet/Makefile b/net/ethernet/Makefile
index 7cef1d8..7362f46 100644
--- a/net/ethernet/Makefile
+++ b/net/ethernet/Makefile
@@ -5,3 +5,6 @@
obj-y += eth.o
obj-$(subst m,y,$(CONFIG_IPX)) += pe2.o
obj-$(subst m,y,$(CONFIG_ATALK)) += pe2.o
+ifneq ($(CONFIG_NET),)
+obj-$(CONFIG_ETH_MAC_PLATFORM) += eth-mac-platform.o
+endif
diff --git a/net/ethernet/eth-mac-platform.c b/net/ethernet/eth-mac-platform.c
new file mode 100644
index 0000000..9b2ad69
--- /dev/null
+++ b/net/ethernet/eth-mac-platform.c
@@ -0,0 +1,150 @@
+/*
+ * Helper to allow platform code to enforce association of a locally-
+ * administered MAC address automatically on asynchronously probed devices,
+ * such as SDIO and USB based devices.
+ *
+ * Because the "device path" is used for matching, this is only useful for
+ * network assets physically wired on the board, and also requires any
+ * different drivers that can compete for bus ordinals (eg mUSB vs ehci) to
+ * have fixed initialization ordering, eg, by having ehci in monolithic
+ * kernel
+ *
+ * Neither a driver nor a module, as it needs to be callable from the
+ * machine file before the network devices are registered.
+ *
+ * (c) 2012 Andy Green <andy.green@linaro.org>
+ */
+
+#include <linux/netdevice.h>
+#include <net/eth-mac-platform.h>
+
+static LIST_HEAD(eth_mac_platform_list);
+static DEFINE_MUTEX(eth_mac_platform_mutex);
+
+static struct eth_mac_platform *__eth_mac_platform_check(struct device *dev)
+{
+ const char *path;
+ const char *p;
+ const char *try;
+ int len;
+ struct device *devn;
+ struct eth_mac_platform *tmp;
+ struct list_head *pos;
+
+ list_for_each(pos, &eth_mac_platform_list) {
+
+ tmp = list_entry(pos, struct eth_mac_platform, list);
+
+ try = tmp->device_path;
+
+ p = try + strlen(try);
+ devn = dev;
+
+ while (devn) {
+
+ path = dev_name(devn);
+ len = strlen(path);
+
+ if ((p - try) < len) {
+ devn = NULL;
+ continue;
+ }
+
+ p -= len;
+
+ if (strncmp(path, p, len)) {
+ devn = NULL;
+ continue;
+ }
+
+ devn = devn->parent;
+ if (p == try)
+ return tmp;
+
+ if (devn != NULL && (p - try) < 2)
+ devn = NULL;
+
+ p--;
+ if (devn != NULL && *p != '/')
+ devn = NULL;
+ }
+ }
+
+ return NULL;
+}
+
+static int eth_mac_platform_netdev_event(struct notifier_block *this,
+ unsigned long event, void *ptr)
+{
+ struct net_device *dev = ptr;
+ struct sockaddr sa;
+ struct eth_mac_platform *match;
+
+ if (event != NETDEV_REGISTER)
+ return NOTIFY_DONE;
+
+ mutex_lock(&eth_mac_platform_mutex);
+
+ match = __eth_mac_platform_check(dev->dev.parent);
+ if (match == NULL)
+ goto bail;
+
+ sa.sa_family = dev->type;
+ memcpy(sa.sa_data, match->mac, sizeof match->mac);
+ dev->netdev_ops->ndo_set_mac_address(dev, &sa);
+
+bail:
+ mutex_unlock(&eth_mac_platform_mutex);
+
+ return NOTIFY_DONE;
+}
+
+int eth_mac_platform_register_device_macs(const struct eth_mac_platform *macs)
+{
+ struct eth_mac_platform *next;
+ int ret = 0;
+
+ mutex_lock(&eth_mac_platform_mutex);
+
+ while (macs->device_path) {
+
+ next = kmemdup(macs, sizeof(*macs), GFP_KERNEL);
+ if (!next) {
+ ret = -ENOMEM;
+ goto bail;
+ }
+
+ next->device_path = kstrdup(macs->device_path, GFP_KERNEL);
+ if (!next->device_path) {
+ kfree(next);
+ ret = -ENOMEM;
+ goto bail;
+ }
+
+ list_add(&next->list, &eth_mac_platform_list);
+
+ macs++;
+ }
+bail:
+ mutex_unlock(&eth_mac_platform_mutex);
+
+ return ret;
+}
+
+static struct notifier_block eth_mac_platform_netdev_notifier = {
+ .notifier_call = eth_mac_platform_netdev_event,
+ .priority = 1,
+};
+
+static int __init eth_mac_platform_init(void)
+{
+ int ret;
+
+ ret = register_netdevice_notifier(&eth_mac_platform_netdev_notifier);
+ if (ret)
+ pr_err("eth_mac_platform_init: Notifier registration failed\n");
+
+ return ret;
+}
+
+core_initcall(eth_mac_platform_init);
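Editorial note: the matching rule in __eth_mac_platform_check() compares the registered device path against the device names found by walking from the network device's parent up through its ancestors, right to left. A minimal userspace sketch of that rule follows. struct example_dev and example_path_matches() are stand-ins invented here for illustration; the kernel code uses struct device and dev_name(), and this is a restatement of the idea rather than the patch's exact loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct device: only the name and parent link matter here. */
struct example_dev {
	const char *name;
	const struct example_dev *parent;
};

/*
 * Return true if walking from dev up through its parents spells out the
 * whole of path, read right to left with '/' between components.
 */
static bool example_path_matches(const struct example_dev *dev,
				 const char *path)
{
	const char *end = path + strlen(path);	/* one past the unmatched part */

	while (dev) {
		size_t len = strlen(dev->name);

		if ((size_t)(end - path) < len)
			return false;	/* name longer than what is left */
		end -= len;
		if (strncmp(dev->name, end, len) != 0)
			return false;
		if (end == path)
			return true;	/* consumed the whole path */
		end--;			/* step back onto the separator */
		if (*end != '/')
			return false;
		dev = dev->parent;
	}
	return false;			/* ran out of ancestors first */
}
```

Because the match is anchored at the leaf and stops once the path is consumed, any ancestors above the first path component (for example the host controller above usb1) do not need to appear in the registered string.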
* Re: net-next warning at kernel/softirq.c:159 local_bh_enable
From: David Miller @ 2012-07-06 4:08 UTC
To: ogerlitz; +Cc: netdev, shlomop
In-Reply-To: <alpine.LRH.2.00.1207060616590.21283@ogerlitz.voltaire.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Fri, 6 Jul 2012 06:20:58 +0300
> Shlomo saw too the ipv6 dst crash that was reported over
> the list and also this warning when he rebased to net-next yesterday
Here is what I committed to fix this:
====================
[PATCH] ipoib: Need to do dst_neigh_lookup_skb() outside of priv->lock.
Otherwise local_bh_enable() complains.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index fbb95ee..7cecb16 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -658,9 +658,15 @@ static int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast)
void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb)
{
struct ipoib_dev_priv *priv = netdev_priv(dev);
+ struct dst_entry *dst = skb_dst(skb);
struct ipoib_mcast *mcast;
+ struct neighbour *n;
unsigned long flags;
+ n = NULL;
+ if (dst)
+ n = dst_neigh_lookup_skb(dst, skb);
+
spin_lock_irqsave(&priv->lock, flags);
if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags) ||
@@ -715,12 +721,6 @@ void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb)
out:
if (mcast && mcast->ah) {
- struct dst_entry *dst = skb_dst(skb);
- struct neighbour *n = NULL;
-
- rcu_read_lock();
- if (dst)
- n = dst_neigh_lookup_skb(dst, skb);
if (n) {
if (!*to_ipoib_neigh(n)) {
struct ipoib_neigh *neigh;
@@ -735,13 +735,14 @@ out:
}
neigh_release(n);
}
- rcu_read_unlock();
spin_unlock_irqrestore(&priv->lock, flags);
ipoib_send(dev, skb, mcast->ah, IB_MULTICAST_QPN);
return;
}
unlock:
+ if (n)
+ neigh_release(n);
spin_unlock_irqrestore(&priv->lock, flags);
}
--
1.7.10.4
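Editorial note: the fix above is purely about ordering. The traces show the neighbour lookup ending in local_bh_enable(), and the patch moves that lookup ahead of spin_lock_irqsave(). The toy userspace model below illustrates why the order matters, on the assumption (implied by the commit message but not spelled out in it) that the check at softirq.c:159 fires when local_bh_enable() runs with interrupts disabled. Every model_* name is invented for this sketch; none of this is kernel code.

```c
#include <stdbool.h>

static bool model_irqs_disabled;	/* stands in for irqs_disabled() */
static int model_warnings;		/* counts simulated WARN hits */

static void model_lock_irqsave(void)      { model_irqs_disabled = true; }
static void model_unlock_irqrestore(void) { model_irqs_disabled = false; }

/* Models local_bh_enable(): complains if called with IRQs off. */
static void model_local_bh_enable(void)
{
	if (model_irqs_disabled)
		model_warnings++;
}

/* Models a neighbour lookup that finishes with a BH enable. */
static void model_neigh_lookup(void)
{
	model_local_bh_enable();
}

/* Old ordering: lookup while holding the IRQ-disabling lock. */
static int model_send_lookup_under_lock(void)
{
	model_warnings = 0;
	model_lock_irqsave();
	model_neigh_lookup();
	model_unlock_irqrestore();
	return model_warnings;
}

/* New ordering: lookup first, then take the lock. */
static int model_send_lookup_before_lock(void)
{
	model_warnings = 0;
	model_neigh_lookup();
	model_lock_irqsave();
	model_unlock_irqrestore();
	return model_warnings;
}
```

In the model, only the first ordering records a warning, which matches the shape of the change: the reference taken by the early lookup is then dropped with neigh_release() on both exit paths.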
* net-next warning at kernel/softirq.c:159 local_bh_enable
From: Or Gerlitz @ 2012-07-06 3:20 UTC
To: David Miller; +Cc: netdev, Shlomo Pongratz
Dave,
Shlomo also saw the ipv6 dst crash that was reported on
the list, as well as this warning, when he rebased to net-next yesterday
Or.
[ 285.551769] ------------[ cut here ]------------
[ 285.552693] WARNING: at kernel/softirq.c:159 local_bh_enable+0x7a/0xa0()
[ 285.552693] Hardware name: X7DWU
[ 285.552693] Modules linked in: ib_ipoib ib_cm ib_sa mlx4_ib mlx4_en
mlx4_core ib_mad ib_core netconsole nfs lockd fscache auth_rpcgss nfs_acl
bnx2fc cnic 8021q uio garp fcoe libfcoe stp libfc llc scsi_transport_fc
scsi_tgt sunrpc binfmt_misc uinput joydev coretemp kvm_intel kvm microcode
pcspkr serio_raw i2c_i801 lpc_ich mfd_core i5400_edac edac_core i5k_amb
igb ptp pps_core ioatdma ixgbe dca mdio shpchp usb_storage floppy radeon
ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded:
scsi_wait_scan]
[ 285.552693] Pid: 1662, comm: avahi-daemon Not tainted 3.5.0-rc5+ #2
[ 285.552693] Call Trace:
[ 285.552693] [<ffffffff81058ebf>] warn_slowpath_common+0x7f/0xc0
[ 285.552693] [<ffffffff81058f1a>] warn_slowpath_null+0x1a/0x20
[ 285.552693] [<ffffffff810618ba>] local_bh_enable+0x7a/0xa0
[ 285.552693] [<ffffffff815294e0>] ipv4_neigh_lookup+0xc0/0x130
[ 285.552693] [<ffffffffa04f0966>] ipoib_mcast_send+0x1b6/0x480 [ib_ipoib]
[ 285.552693] [<ffffffff81093716>] ? update_sd_lb_stats+0x146/0x690
[ 285.552693] [<ffffffff814eb35b>] ? __alloc_skb+0x4b/0x170
[ 285.552693] [<ffffffffa04ec15d>] ipoib_path_lookup+0x11d/0x300 [ib_ipoib]
[ 285.552693] [<ffffffff814eb35b>] ? __alloc_skb+0x4b/0x170
[ 285.552693] [<ffffffff814e7718>] ? sock_alloc_send_pskb+0x1d8/0x380
[ 285.552693] [<ffffffffa04ec97d>] ipoib_start_xmit+0x1ad/0x430 [ib_ipoib]
[ 285.552693] [<ffffffff814fa64a>] dev_hard_start_xmit+0x24a/0x670
[ 285.552693] [<ffffffff8151790a>] sch_direct_xmit+0xfa/0x1d0
[ 285.552693] [<ffffffff814fabff>] dev_queue_xmit+0x18f/0x5e0
[ 285.552693] [<ffffffff81503667>] neigh_connected_output+0xc7/0x110
[ 285.552693] [<ffffffff81535c0d>] ip_finish_output+0x2ed/0x410
[ 285.552693] [<ffffffff81535ee9>] ip_mc_output+0x119/0x250
[ 285.552693] [<ffffffff81534fe9>] ip_local_out+0x29/0x30
[ 285.552693] [<ffffffff8153500b>] ip_send_skb+0x1b/0x50
[ 285.552693] [<ffffffff815598ad>] udp_send_skb+0x10d/0x3a0
[ 285.552693] [<ffffffff81534190>] ? ip_append_page+0x500/0x500
[ 285.552693] [<ffffffff8155ab9b>] udp_sendmsg+0x33b/0x970
[ 285.552693] [<ffffffff81080650>] ? update_rmtp+0x80/0x80
[ 285.552693] [<ffffffff81198011>] ? do_sys_poll+0x351/0x510
[ 285.552693] [<ffffffff81563a78>] inet_sendmsg+0x48/0xb0
[ 285.552693] [<ffffffff814e3398>] sock_sendmsg+0xf8/0x130
[ 285.552693] [<ffffffff81197330>] ? __pollwait+0xf0/0xf0
[ 285.552693] [<ffffffff81060707>] ? current_fs_time+0x27/0x30
[ 285.552693] [<ffffffff8119d3e1>] ? update_time+0x81/0xc0
[ 285.552693] [<ffffffff81197330>] ? __pollwait+0xf0/0xf0
[ 285.552693] [<ffffffff8119d469>] ? file_update_time+0x49/0xe0
[ 285.552693] [<ffffffff814e1e6e>] ? move_addr_to_kernel+0x4e/0x90
[ 285.552693] [<ffffffff814e1690>] ? copy_from_user+0x30/0x40
[ 285.552693] [<ffffffff814e4acd>] __sys_sendmsg+0x3fd/0x420
[ 285.552693] [<ffffffff81184aa2>] ? do_sync_write+0xe2/0x120
[ 285.552693] [<ffffffff811c1b05>] ? fsnotify+0x1d5/0x2f0
[ 285.552693] [<ffffffff814e4cf9>] sys_sendmsg+0x49/0x90
[ 285.552693] [<ffffffff81607da9>] system_call_fastpath+0x16/0x1b
[ 285.552693] ---[ end trace cabd6aea7003ad6f ]---
* Re: [net-next RFC V5 5/5] virtio_net: support negotiating the number of queues through ctrl vq
From: Jason Wang @ 2012-07-06 3:20 UTC
To: Sasha Levin
Cc: krkumar2, habanero, mashirle, kvm, mst, netdev, linux-kernel,
virtualization, edumazet, tahm, jwhan, davem, sri
In-Reply-To: <1341492679.18786.18.camel@lappy>
On 07/05/2012 08:51 PM, Sasha Levin wrote:
> On Thu, 2012-07-05 at 18:29 +0800, Jason Wang wrote:
>> @@ -1387,6 +1404,10 @@ static int virtnet_probe(struct virtio_device *vdev)
>> if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
>> vi->has_cvq = true;
>>
>> + /* Use single tx/rx queue pair as default */
>> + vi->num_queue_pairs = 1;
>> + vi->total_queue_pairs = num_queue_pairs;
> The code is using this "default" even if the amount of queue pairs it
> wants was specified during initialization. This basically limits any
> device to use 1 pair when starting up.
>
Yes, currently the virtio-net driver uses 1 txq/rxq pair by default,
since multiqueue may not perform better in every kind of workload. So it's
better to keep that as the default and let the user enable multiqueue
with ethtool -L.
* Re: [net-next RFC V5 2/5] virtio_ring: move queue_index to vring_virtqueue
From: Jason Wang @ 2012-07-06 3:17 UTC
To: Sasha Levin
Cc: krkumar2, habanero, mashirle, kvm, mst, netdev, linux-kernel,
virtualization, edumazet, tahm, jwhan, davem, sri
In-Reply-To: <1341488454.18786.15.camel@lappy>
On 07/05/2012 07:40 PM, Sasha Levin wrote:
> On Thu, 2012-07-05 at 18:29 +0800, Jason Wang wrote:
>> Instead of storing the queue index in virtio infos, this patch moves them to
>> vring_virtqueue and introduces helpers to set and get the value. This would
>> simplify the management and tracing.
>>
>> Signed-off-by: Jason Wang<jasowang@redhat.com>
> This patch actually fails to compile:
>
> drivers/virtio/virtio_mmio.c: In function ‘vm_notify’:
> drivers/virtio/virtio_mmio.c:229:13: error: ‘struct virtio_mmio_vq_info’ has no member named ‘queue_index’
> drivers/virtio/virtio_mmio.c: In function ‘vm_del_vq’:
> drivers/virtio/virtio_mmio.c:278:13: error: ‘struct virtio_mmio_vq_info’ has no member named ‘queue_index’
> make[2]: *** [drivers/virtio/virtio_mmio.o] Error 1
>
> It probably misses the following hunks:
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index f5432b6..12b6180 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -222,11 +222,10 @@ static void vm_reset(struct virtio_device *vdev)
> static void vm_notify(struct virtqueue *vq)
> {
> struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
> - struct virtio_mmio_vq_info *info = vq->priv;
>
> /* We write the queue's selector into the notification register to
> * signal the other end */
> - writel(info->queue_index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> + writel(virtqueue_get_queue_index(vq), vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> }
>
> /* Notify all virtqueues on an interrupt. */
> @@ -275,7 +274,7 @@ static void vm_del_vq(struct virtqueue *vq)
> vring_del_virtqueue(vq);
>
> /* Select and deactivate the queue */
> - writel(info->queue_index, vm_dev->base + VIRTIO_MMIO_QUEUE_SEL);
> + writel(virtqueue_get_queue_index(vq), vm_dev->base + VIRTIO_MMIO_QUEUE_SEL);
> writel(0, vm_dev->base + VIRTIO_MMIO_QUEUE_PFN);
>
> size = PAGE_ALIGN(vring_size(info->num, VIRTIO_MMIO_VRING_ALIGN));
>
Oops, I missed the virtio mmio part, thanks for pointing this out.
* Re: [PATCH net-next] asix: avoid copies in tx path
From: Ming Lei @ 2012-07-06 1:16 UTC
To: Eric Dumazet
Cc: David Miller, netdev, Greg Kroah-Hartman, Allan Chou,
Trond Wuellner, Grant Grundler
In-Reply-To: <1341498661.2583.4162.camel@edumazet-glaptop>
On Thu, Jul 5, 2012 at 10:31 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> I noticed excess calls to skb_copy_expand() or memmove() in asix driver.
>
> This driver needs to push 4 bytes in front of frame (packet_len)
> and maybe add 4 bytes after the end (if padlen is 4)
>
> So it should set needed_headroom & needed_tailroom to avoid
> copies. But its not enough, because many packets are cloned
> before entering asix_tx_fixup() and this driver use skb_cloned()
> as a lazy way to check if it can push and put additional bytes in frame.
>
> Avoid skb_copy_expand() expensive call, using following rules :
>
> - We are allowed to push 4 bytes in headroom if skb_header_cloned()
> is false (and if we have 4 bytes of headroom)
>
> - We are allowed to put 4 bytes at tail if skb_cloned()
> is false (and if we have 4 bytes of tailroom)
>
> TCP packets for example are cloned, but skb_header_release()
> was called in tcp stack, allowing us to use headroom for our needs.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Allan Chou <allan@asix.com.tw>
> Cc: Trond Wuellner <trond@chromium.org>
> Cc: Grant Grundler <grundler@chromium.org>
> Cc: Paul Stewart <pstew@chromium.org>
> Cc: Ming Lei <tom.leiming@gmail.com>
After testing the patch on beagle-xm with an external DLINK DUB-E100 NIC,
the transmit performance increased from ~75Mbps to ~91Mbps with
DEBUG_SLAB enabled. The test command and result follow:
[root@root]#iperf -c 192.168.0.103 -w 131072 -t 10
------------------------------------------------------------
Client connecting to 192.168.0.103, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
[ 3] local 192.168.0.102 port 57888 connected with 192.168.0.103 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 109 MBytes 91.6 Mbits/sec
Tested-by: Ming Lei <ming.lei@canonical.com>
Thanks,
--
Ming Lei
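Editorial note: the two rules quoted in the commit message above can be written down as a single predicate. The sketch below is a userspace restatement only. struct example_skb_state and example_can_fixup_in_place() are names invented here, with plain fields standing in for what skb_header_cloned(), skb_cloned(), skb_headroom() and skb_tailroom() would report; it is not the asix_tx_fixup() code from the patch.

```c
#include <stdbool.h>

/* Plain-data stand-in for the skb properties the rules depend on. */
struct example_skb_state {
	bool header_cloned;	/* what skb_header_cloned() would report */
	bool cloned;		/* what skb_cloned() would report */
	unsigned int headroom;	/* bytes free in front of the data */
	unsigned int tailroom;	/* bytes free after the data */
};

/*
 * Return true if the 4-byte length header, plus padlen bytes of padding
 * (0 or 4 in the driver), can be added without copying the skb.
 */
static bool example_can_fixup_in_place(const struct example_skb_state *s,
				       unsigned int padlen)
{
	/* Rule 1: pushing in front needs an unshared header and 4 bytes. */
	bool head_ok = !s->header_cloned && s->headroom >= 4;
	/* Rule 2: appending needs an uncloned skb and enough tailroom. */
	bool tail_ok = padlen == 0 || (!s->cloned && s->tailroom >= padlen);

	return head_ok && tail_ok;
}
```

The TCP case from the commit message corresponds to cloned being true while header_cloned is false: the header push is allowed, and a copy is only needed when padding must also be appended.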
* Re: TCP transmit performance regression
From: Ming Lei @ 2012-07-06 0:45 UTC
To: Eric Dumazet; +Cc: Network Development, David Miller
In-Reply-To: <1341500196.2583.4222.camel@edumazet-glaptop>
On Thu, Jul 5, 2012 at 10:56 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-07-05 at 22:01 +0800, Ming Lei wrote:
>
>> At default SMSC95xx turbo mode is true, rx buffer will be very big
>> (17.5K). Or the large rx buffer size puts limit on concurrent URBs/SKBs
>> count. Both two may cause the problem.
>
> I see. So we should try to recycle these large rx buffers in usbnet
> instead of allocating/freeing them for each incoming packet.
>
> Following patch does the copybreak of all incoming frames.
>
> It has nice property of not lying anymore on skb truesize ;)
>
> It should be applied on both sender and receiver
In fact, I run the command below on the test beagle-xm box with the SMSC95xx
NIC:
iperf -c 192.168.0.103 -w 131072 -t 10
and the command below on an x86 production machine (e1000e NIC)
running Ubuntu 12.04:
iperf -s -w 131072
The current problem is that transmit performance on beagle-xm is
poor with the above iperf test if DEBUG_SLAB is enabled. But if
I set dev->rx_urb_size to 2048, the transmit performance
doubles, so it looks like the problem is caused by the large rx buffer.
>
> drivers/net/usb/smsc95xx.c | 19 +++----------------
> 1 file changed, 3 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> index b1112e7..3d9566f 100644
> --- a/drivers/net/usb/smsc95xx.c
> +++ b/drivers/net/usb/smsc95xx.c
> @@ -1080,30 +1080,17 @@ static int smsc95xx_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
> return 0;
> }
>
> - /* last frame in this batch */
> - if (skb->len == size) {
> - if (dev->net->features & NETIF_F_RXCSUM)
> - smsc95xx_rx_csum_offload(skb);
> - skb_trim(skb, skb->len - 4); /* remove fcs */
> - skb->truesize = size + sizeof(struct sk_buff);
> -
> - return 1;
> - }
> -
> - ax_skb = skb_clone(skb, GFP_ATOMIC);
> + ax_skb = netdev_alloc_skb_ip_align(dev->net, size);
> if (unlikely(!ax_skb)) {
> netdev_warn(dev->net, "Error allocating skb\n");
> return 0;
> }
>
> - ax_skb->len = size;
> - ax_skb->data = packet;
> - skb_set_tail_pointer(ax_skb, size);
> + memcpy(skb_put(ax_skb, size), packet, size);
>
> if (dev->net->features & NETIF_F_RXCSUM)
> smsc95xx_rx_csum_offload(ax_skb);
> - skb_trim(ax_skb, ax_skb->len - 4); /* remove fcs */
> - ax_skb->truesize = size + sizeof(struct sk_buff);
> + __skb_trim(ax_skb, ax_skb->len - 4); /* remove fcs */
>
> usbnet_skb_return(dev, ax_skb);
> }
>
>
Unfortunately, the patch still brings no improvement to the transmit
performance of beagle-xm.
Thanks,
--
Ming Lei
* Re: Network namespace and bonding WARNING at fs/proc/generic.c remove_proc_entry
From: Eric W. Biederman @ 2012-07-06 0:41 UTC
To: Serge E. Hallyn; +Cc: Dilip Daya, linux-kernel, containers, netdev
In-Reply-To: <20120705220749.GA11255@mail.hallyn.com>
"Serge E. Hallyn" <serge@hallyn.com> writes:
> Quoting Dilip Daya (dilip.daya@hp.com):
>> Hi,
>>
>> I'd discussed the following with Serge Hallyn.
>>
>> => Environment based on 3.2.18 / x86_64 kernel.
>> => WARNING: at fs/proc/generic.c:808 remove_proc_entry+0xdb/0x21f()
>> => WARNING: at fs/proc/generic.c:849 remove_proc_entry+0x208/0x21f()
>
> Hi,
>
> thanks much for sending this. I'm still getting this error on
> 3.5.0-2-generic (today's ubuntu quantal kernel)
>
>> network namespace and bonding
>> -----------------------------
>>
>> * Migrate two phy nics from host to netns (netns0).
>> - ip link set ethX netns netns0
>>
>> * In host environment:
>> - load bonding module, /sbin/modprobe -v bonding mode=1 miimon=100
>> - /sys/class/net/bond0 exists.
>> - /proc/net/bonding/bond0 exists.
>> - /sys/class/net/bonding_masters has bond0.
>>
>> * Migrate bond0 to netns (netns0):
>> - ip link set bond0 netns netns0.
>>
>> * Within netns (netns0):
>> - /sys/class/net/bonding_masters is empty.
>> - /sys/class/net/bond0 exist.
>> - configure bond0 and ifenslave with two phy nics.
>> - /proc/net/bonding/bond0 does not exist within netns0, but does
>> exist in the host environment.
>> - /sys/class/net/bonding_masters is empty.
>
> mine is not empty, fwiw. However
>
>> - ping to remote end of bond0 works.
>>
>> * Within netns (netns0), flushing ethX and bondY:
>> - down bond0 and its phy nic interfaces:
>> - ip link set ... down
>> - ip addr flush dev [bond0 | eth#]
>> - deleting bond0, /sbin/ip link del dev bond0
>
> Yup I still get a remove_proc_entry WARNING at fs/proc/generic.c:808,
> which is the warning when (!de)
It looks like Dilip is running an old kernel. There should have been
some version of /sys/class/net/bonding_masters in every network
namespace since sometime in 2009.
From the warning it looks like the proc files are being added/removed
to the wrong network namespace. So in one namespace we get an error
when we delete the moved device and in the other network namespace
we get an error when we remove the /proc/directory.
An old kernel without proper network namespace support is the only
reason I can imagine someone would be moving an existing bond device
between network namespaces.
If there are other reasons for wanting to move a bonding device between
network namespaces it is possible to catch the NETDEV_UNREGISTER and
NETDEV_REGISTER events to remove/add the per device proc files at the
appropriate time.
However, since moving bonding devices appears to be an unneeded operation,
let's just do things simply and forbid moving bonding devices between
network namespaces. Serge, Dilip, can you two test the patch below
and see if it fixes the warnings?
Eric
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2ee8cf9..818ed64 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4345,6 +4345,9 @@ static void bond_setup(struct net_device *bond_dev)
bond_dev->priv_flags |= IFF_BONDING;
bond_dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
+ /* Don't allow bond devices to change network namespaces. */
+ bond_dev->features |= NETIF_F_LOCAL;
+
/* At first, we block adding VLANs. That's the only way to
* prevent problems that occur when adding VLANs over an
* empty bond. The block will be removed once non-challenged
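The effect of marking a device namespace-local can be pictured with a small userspace model. This is only a sketch: the struct and function names are hypothetical stand-ins, and it assumes the namespace-change path rejects flagged devices with -EINVAL rather than reproducing the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the "do not move between namespaces" feature bit. */
#define SKETCH_F_NETNS_LOCAL (1u << 0)

struct sketch_net {
	int id;
};

struct sketch_netdev {
	unsigned int features;
	struct sketch_net *net;
};

/* Move dev into target unless the device is flagged namespace-local. */
static int sketch_change_netns(struct sketch_netdev *dev,
			       struct sketch_net *target)
{
	assert(dev != NULL && target != NULL);

	if (dev->features & SKETCH_F_NETNS_LOCAL)
		return -EINVAL;	/* assumed error code for a refused move */

	dev->net = target;
	return 0;
}
```

With the flag set, the device stays in its original namespace, so its per-device proc files are never left behind in the wrong one.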
* Re: BISECTED: Re: REGRESSION: 3.4.0->3.5.0-rc2 kernel WARNING on cable plug on Acer Aspire One, no network
From: Alex Villacís Lasso @ 2012-07-06 0:35 UTC (permalink / raw)
To: Marek Szyprowski; +Cc: 'Francois Romieu', netdev
In-Reply-To: <012601cd5a7b$886fd4c0$994f7e40$%szyprowski@samsung.com>
On 05/07/12 01:58, Marek Szyprowski wrote:
> Hello,
>
> On Thursday, July 05, 2012 6:15 AM Alex Villacís Lasso wrote:
>
>> On 04/07/12 02:02, Marek Szyprowski wrote:
>>> Hello,
>>>
>>> On Tuesday, July 03, 2012 4:27 PM Alex Villacís Lasso wrote:
>>>
>>>> On 03/07/12 00:40, Marek Szyprowski wrote:
>>>>> Hi Alex,
>>>>>
>>>>> On Tuesday, July 03, 2012 4:45 AM Alex Villacís Lasso wrote:
>>>>>
>>>>>> -------- Original message --------
>>>>>> Subject: BISECTED: Re: REGRESSION: 3.4.0->3.5.0-rc2 kernel WARNING on cable
>>>>>> plug on Acer Aspire One, no network
>>>>>> Date: Mon, 02 Jul 2012 21:33:41 -0500
>>>>>> From: Alex Villacís Lasso <a_villacis@palosanto.com>
>>>>>> To: Francois Romieu <romieu@fr.zoreil.com>
>>>>>> CC: netdev@vger.kernel.org
>>>>>> On 01/07/12 08:50, Alex Villacís Lasso wrote:
>>>>>>> On 11/06/12 16:38, Francois Romieu wrote:
>>>>>>>> Alex Villacís Lasso <a_villacis@palosanto.com> :
>>>>>>>> [...]
>>>>>>>>> $ grep XID dmesg-3.5.0-rc2.txt
>>>>>>>>> [ 15.873858] r8169 0000:02:00.0: eth0: RTL8102e at 0xf7c0e000,
>>>>>>>>> 00:1e:68:e5:5d:b1, XID 04a00000 IRQ 44
>>>>>>>> The 8102e has not been touched by that many suspect patches but I do
>>>>>>>> not see where the problem is :o(
>>>>>>>>
>>>>>>>> Can you peel off the r8169 patches between 3.4.0 and 3.5-rc ?
>>>>>>>>
>>>>>>> Still present in 3.5-rc5. Bisection still in progress.
>>>>>>>
>>>>>> My full bisection points to this commit:
>>>>>>
>>>>>> commit 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6
>>>>>> Author: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>> Date: Thu Dec 29 13:09:51 2011 +0100
>>>>>>
>>>>>> X86: integrate CMA with DMA-mapping subsystem
>>>>>>
>>>>>> This patch adds support for CMA to dma-mapping subsystem for x86
>>>>>> architecture that uses common pci-dma/pci-nommu implementation. This
>>>>>> allows to test CMA on KVM/QEMU and a lot of common x86 boxes.
>>>>>>
>>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>>>>> CC: Michal Nazarewicz <mina86@mina86.com>
>>>>>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>>>>>>
>>>>>> Is this commit somehow messing with the network card DMA?
>>>>> This commit in fact touches DMA-mapping subsystem and introduces a bug,
>>>>> which has been finally fixed by commit c080e26edc3a2a3 merged to v3.5-rc3.
>>>>> After applying it the DMA-mapping subsystem should work exactly the same way
>>>>> as in v3.4. Could you please check if it fixes this issue?
>>>>>
>>>>> Best regards
>>>> No. It still fails in 3.5-rc5, as mentioned before.
>>> Hmm. I was a bit confused, because both the subject and git bisect log pointed to v3.5-rc2,
>>> which had that bug. Maybe there is some other issue present in v3.5-rc5 not related to
>>> my patches?
>>>
>>> Could you check with v3.5-rc5 if reverting patch c080e26edc3a2a3cdfa4c430c663ee1c3bbd8fae
>>> and 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6 fixes the problems with rtl driver?
>>>
>>> Best regards
>> Reverting the two patches indeed fixes the bug on -rc5.
> That's really strange. Could you check if you have CMA disabled in the config? After preparing
> a c080e26edc3a2a3cdfa4c430c663ee1c3bbd8fae fixup patch, I was really convinced that there are
> no functional changes in x86 dma mapping code when CMA is disabled. I will provide some
> patches to revert different parts of my changes, so we will find which line causes issues.
>
> Best regards
The affected system is an Acer Aspire One, a 32-bit only system. The
CMA option does not appear in menuconfig at all, and it does not appear
in the .config file as either set or unset. I assume this means that
CMA is disabled.
* [PATCH net] cnic: Don't use netdev->base_addr
From: Michael Chan @ 2012-07-06 0:21 UTC (permalink / raw)
To: davem; +Cc: netdev
commit c0357e975afdbbedab5c662d19bef865f02adc17
bnx2: stop using net_device.{base_addr, irq}.
removed netdev->base_addr so we need to update cnic to get the MMIO
base address from pci_resource_start(). Otherwise, mmap of the uio
device will fail.
Signed-off-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/ethernet/broadcom/cnic.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index c95e7b5..3c95065 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -1053,12 +1053,13 @@ static int cnic_init_uio(struct cnic_dev *dev)
uinfo = &udev->cnic_uinfo;
- uinfo->mem[0].addr = dev->netdev->base_addr;
+ uinfo->mem[0].addr = pci_resource_start(dev->pcidev, 0);
uinfo->mem[0].internal_addr = dev->regview;
- uinfo->mem[0].size = dev->netdev->mem_end - dev->netdev->mem_start;
uinfo->mem[0].memtype = UIO_MEM_PHYS;
if (test_bit(CNIC_F_BNX2_CLASS, &dev->flags)) {
+ uinfo->mem[0].size = MB_GET_CID_ADDR(TX_TSS_CID +
+ TX_MAX_TSS_RINGS + 1);
uinfo->mem[1].addr = (unsigned long) cp->status_blk.gen &
PAGE_MASK;
if (cp->ethdev->drv_state & CNIC_DRV_STATE_USING_MSIX)
@@ -1068,6 +1069,8 @@ static int cnic_init_uio(struct cnic_dev *dev)
uinfo->name = "bnx2_cnic";
} else if (test_bit(CNIC_F_BNX2X_CLASS, &dev->flags)) {
+ uinfo->mem[0].size = pci_resource_len(dev->pcidev, 0);
+
uinfo->mem[1].addr = (unsigned long) cp->bnx2x_def_status_blk &
PAGE_MASK;
uinfo->mem[1].size = sizeof(*cp->bnx2x_def_status_blk);
--
1.7.1
* Re: [PATCH net-next] cnic: Fix mmap regression.
From: Michael Chan @ 2012-07-05 23:34 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20120705.153638.790030674286651971.davem@davemloft.net>
On Thu, 2012-07-05 at 15:36 -0700, David Miller wrote:
> From: "Michael Chan" <mchan@broadcom.com>
> Date: Thu, 5 Jul 2012 14:59:46 -0700
>
> > Or you want me to send you the equivalent patches for net.
>
> Please do so.
>
OK. I'll send you one patch to fix it in net, instead of one that
causes a regression and another one to fix it.
* Re: [PATCH] force dentry revalidation after namespace change
From: Eric W. Biederman @ 2012-07-05 23:31 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-kernel, netdev, Andrew Morton, Tejun Heo,
Greg Kroah-Hartman
In-Reply-To: <1341496805-26394-1-git-send-email-glommer@parallels.com>
Glauber Costa <glommer@parallels.com> writes:
> When we change the namespace tag of a sysfs entry, the associated dentry
> is still kept around. readdir() will work correctly and not display the
> old entries, but open() will still succeed, so will reads and writes.
>
> This will no longer happen if sysfs is remounted, hinting that this is a
> cache-related problem.
Equivalently to remounting you can do
echo 3 > /proc/sys/vm/drop_caches.
> I am using the following sequence to demonstrate that:
>
> shell1:
> ip link add type veth
> unshare -nm
>
> shell2:
> ip link set veth1 <pid_of_shell_1>
> cat /sys/devices/virtual/net/veth1/ifindex
>
> Before that patch, this will succeed (fail to fail). After it, it will
> correctly return an error. Differently from a normal rename, which we
> handle fine, changing the object namespace will keep its path intact.
> So this check seems necessary as well.
Overall good bug spotting, and good spotting of where the fix should
live.
Your summary should have said:
[PATCH] fail dentry revalidation after namespace change
And you have the test slightly wrong below.
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> CC: Tejun Heo <tj@kernel.org>
> CC: Eric W. Biederman <ebiederm@xmission.com>
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> fs/sysfs/dir.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index e6bb9b2..c24bdd9 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -307,6 +307,7 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> struct sysfs_dirent *sd;
> int is_dir;
> + int type;
>
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
> @@ -314,6 +315,10 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> sd = dentry->d_fsdata;
> mutex_lock(&sysfs_mutex);
>
> + type = sysfs_ns_type(sd);
> + if (sd->s_ns && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
> + goto out_bad;
> +
First, this check should be down below, after the other rename
checks.
Second, the test should be:
type = KOBJ_NS_TYPE_NONE;
if (sd->s_parent)
type = sysfs_ns_type(sd->s_parent);
if (type && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
goto out_bad;
The important difference is that the type comes from the directory the
dirent is in, not from the dirent itself.
> /* The sysfs dirent has been deleted */
> if (sd->s_flags & SYSFS_FLAG_REMOVED)
> goto out_bad;
Glauber, do you think you can fix your patch and resubmit?
Eric
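The corrected test above, where the namespace type is taken from the parent directory, can be modeled in userspace roughly as follows. All names here are hypothetical stand-ins for the sysfs structures, so treat this as a sketch of the logic rather than the actual sysfs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace types; 0 means "not namespaced". */
enum sketch_ns_type {
	SKETCH_NS_NONE = 0,
	SKETCH_NS_NET = 1,
	SKETCH_NS_TYPES
};

struct sketch_dirent {
	struct sketch_dirent *parent;
	enum sketch_ns_type ns_type;	/* namespace type of this directory's children */
	const void *ns;			/* namespace tag carried by this entry */
};

/* Per-mount view: one active namespace tag per namespace type. */
struct sketch_sb_info {
	const void *ns[SKETCH_NS_TYPES];
};

/* Return false when the entry's tag no longer matches the mount's namespace. */
static bool sketch_ns_matches(const struct sketch_sb_info *info,
			      const struct sketch_dirent *sd)
{
	enum sketch_ns_type type = SKETCH_NS_NONE;

	assert(info != NULL && sd != NULL);

	/* The type comes from the containing directory, not the entry itself. */
	if (sd->parent)
		type = sd->parent->ns_type;

	if (type != SKETCH_NS_NONE && info->ns[type] != sd->ns)
		return false;
	return true;
}
```

An entry whose parent is not namespaced always passes, which is why keying the test on the entry's own tag alone gives the wrong answer.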
* Re: [iproute2] display vlan configuration
From: John Fastabend @ 2012-07-05 23:20 UTC (permalink / raw)
To: Fabien C.; +Cc: netdev
In-Reply-To: <4FF61DE9.7000507@jetable.org>
On 7/5/2012 4:06 PM, Fabien C. wrote:
> Hello,
>
> it looks like there is no way to show the vlan configuration with iproute (nor with any other tool apparently).
>
> This can lead to trouble since :
> # ip link add link eth0 name eth2.333 type vlan id 444
>
> will create an interface that will show up like this with "ip link show" :
> 51: eth2.333@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>
> The only hint we have is the interface name, which may not be related to the vlan id we set earlier.
Here you need to show the details,
#ip -d link show dev eth2.333
From my current setup,
# ip -d link show dev vlan0
33: vlan0@eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
link/ether 00:1b:21:55:23:59 brd ff:ff:ff:ff:ff:ff
vlan id 101 <REORDER_HDR>
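For scripts that need the id programmatically, a minimal sketch of pulling it out of the detailed output might look like this. It assumes the "vlan id N" text format shown in the output above, which other iproute2 versions may print differently, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the VLAN id found in the text of "ip -d link show",
 * or -1 if no "vlan id N" field is present.
 */
static int sketch_parse_vlan_id(const char *text)
{
	const char *p;
	int id;

	assert(text != NULL);

	p = strstr(text, "vlan id ");
	if (p == NULL)
		return -1;
	if (sscanf(p, "vlan id %d", &id) != 1)
		return -1;
	return id;
}
```

Plain "ip link show" output has no such field, so the function returns -1 for it.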
* [iproute2] display vlan configuration
From: Fabien C. @ 2012-07-05 23:06 UTC (permalink / raw)
To: netdev
Hello,
it looks like there is no way to show the vlan configuration with iproute (nor with any other tool apparently).
This can lead to trouble since :
# ip link add link eth0 name eth2.333 type vlan id 444
will create an interface that will show up like this with "ip link show" :
51: eth2.333@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
The only hint we have is the interface name, which may not be related to the vlan id we set earlier.
Is there any way to get that information?
Thanks,
Fabien
* Re: [B.A.T.M.A.N.] [PATCH net] Bug fix for batman-adv 2012-07-06
From: Antonio Quartulli @ 2012-07-05 22:51 UTC (permalink / raw)
To: davem; +Cc: netdev, b.a.t.m.a.n
In-Reply-To: <1341528514-27906-1-git-send-email-ordex@autistici.org>
On Fri, Jul 06, 2012 at 12:48:33 +0200, Antonio Quartulli wrote:
> here I have a fix intended for net/linux-3.5.
...
Hello David,
here you have our instructions to resolve the conflicts that you will hit while
merging net into net-next:
Conflict 1 (bridge_loop_avoidance.c):
<<<<<<<
int batadv_bla_rx(struct batadv_priv *bat_priv, struct sk_buff *skb, short vid)
=======
int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid,
bool is_bcast)
>>>>>>>
resolves to:
int batadv_bla_rx(struct batadv_priv *bat_priv, struct sk_buff *skb, short vid,
bool is_bcast)
Conflict 2 (bridge_loop_avoidance.h):
<<<<<<<
int batadv_bla_rx(struct batadv_priv *bat_priv, struct sk_buff *skb, short vid);
int batadv_bla_tx(struct batadv_priv *bat_priv, struct sk_buff *skb, short vid);
int batadv_bla_is_backbone_gw(struct sk_buff *skb,
struct batadv_orig_node *orig_node, int hdr_size);
int batadv_bla_claim_table_seq_print_text(struct seq_file *seq, void *offset);
int batadv_bla_is_backbone_gw_orig(struct batadv_priv *bat_priv, uint8_t *orig);
int batadv_bla_check_bcast_duplist(struct batadv_priv *bat_priv,
struct batadv_bcast_packet *bcast_packet,
int hdr_size);
void batadv_bla_update_orig_address(struct batadv_priv *bat_priv,
struct batadv_hard_iface *primary_if,
struct batadv_hard_iface *oldif);
int batadv_bla_init(struct batadv_priv *bat_priv);
void batadv_bla_free(struct batadv_priv *bat_priv);
=======
int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid,
bool is_bcast);
int bla_tx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid);
int bla_is_backbone_gw(struct sk_buff *skb,
struct orig_node *orig_node, int hdr_size);
int bla_claim_table_seq_print_text(struct seq_file *seq, void *offset);
int bla_is_backbone_gw_orig(struct bat_priv *bat_priv, uint8_t *orig);
int bla_check_bcast_duplist(struct bat_priv *bat_priv,
struct bcast_packet *bcast_packet, int hdr_size);
void bla_update_orig_address(struct bat_priv *bat_priv,
struct hard_iface *primary_if,
struct hard_iface *oldif);
int bla_init(struct bat_priv *bat_priv);
void bla_free(struct bat_priv *bat_priv);
>>>>>>>
resolves to:
int batadv_bla_rx(struct batadv_priv *bat_priv, struct sk_buff *skb, short vid,
bool is_bcast);
int batadv_bla_tx(struct batadv_priv *bat_priv, struct sk_buff *skb, short vid);
int batadv_bla_is_backbone_gw(struct sk_buff *skb,
struct batadv_orig_node *orig_node, int hdr_size);
int batadv_bla_claim_table_seq_print_text(struct seq_file *seq, void *offset);
int batadv_bla_is_backbone_gw_orig(struct batadv_priv *bat_priv, uint8_t *orig);
int batadv_bla_check_bcast_duplist(struct batadv_priv *bat_priv,
struct batadv_bcast_packet *bcast_packet,
int hdr_size);
void batadv_bla_update_orig_address(struct batadv_priv *bat_priv,
struct batadv_hard_iface *primary_if,
struct batadv_hard_iface *oldif);
int batadv_bla_init(struct batadv_priv *bat_priv);
void batadv_bla_free(struct batadv_priv *bat_priv);
Conflict 3 (bridge_loop_avoidance.h):
<<<<<<<
static inline int batadv_bla_rx(struct batadv_priv *bat_priv,
struct sk_buff *skb, short vid)
=======
static inline int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb,
short vid, bool is_bcast)
>>>>>>>
resolves to:
static inline int batadv_bla_rx(struct batadv_priv *bat_priv,
struct sk_buff *skb, short vid, bool is_bcast)
Conflict 4 (soft-interface.c):
<<<<<<<
__be16 ethertype = __constant_htons(BATADV_ETH_P_BATMAN);
=======
bool is_bcast;
is_bcast = (batadv_header->packet_type == BAT_BCAST);
>>>>>>>
resolves to:
bool is_bcast;
__be16 ethertype = __constant_htons(BATADV_ETH_P_BATMAN);
is_bcast = (batadv_header->packet_type == BATADV_BCAST);
Conflict 5 (soft-interface.c):
<<<<<<<
if (batadv_bla_rx(bat_priv, skb, vid))
=======
if (bla_rx(bat_priv, skb, vid, is_bcast))
>>>>>>>
resolves to:
if (batadv_bla_rx(bat_priv, skb, vid, is_bcast))
Wrong merge by git (soft-interface.c):
line 270 must look like this:
struct batadv_header *batadv_header = (struct batadv_header *)skb->data;
--
Antonio Quartulli
..each of us alone is worth nothing..
Ernesto "Che" Guevara
* [PATCH net] batman-adv: check incoming packet type for bla
From: Antonio Quartulli @ 2012-07-05 22:48 UTC (permalink / raw)
To: davem; +Cc: netdev, b.a.t.m.a.n, Simon Wunderlich, Simon Wunderlich
In-Reply-To: <1341528514-27906-1-git-send-email-ordex@autistici.org>
From: Simon Wunderlich <simon.wunderlich@s2003.tu-chemnitz.de>
If the gateway functionality is used, some broadcast packets (DHCP
requests) may be transmitted as unicast packets. As the bridge loop
avoidance code now only considers the payload Ethernet destination,
it may drop the DHCP request for clients which are claimed by other
backbone gateways, because it falsely infers from the broadcast address
that the right backbone gateway should have handled the broadcast.
Fix this by checking and delegating the batman-adv packet type used
for transmission.
Reported-by: Guido Iribarren <guidoiribarren@buenosaireslibre.org>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
---
net/batman-adv/bridge_loop_avoidance.c | 15 +++++++++++----
net/batman-adv/bridge_loop_avoidance.h | 5 +++--
net/batman-adv/soft-interface.c | 6 +++++-
3 files changed, 19 insertions(+), 7 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c
index 8bf9751..c5863f4 100644
--- a/net/batman-adv/bridge_loop_avoidance.c
+++ b/net/batman-adv/bridge_loop_avoidance.c
@@ -1351,6 +1351,7 @@ void bla_free(struct bat_priv *bat_priv)
* @bat_priv: the bat priv with all the soft interface information
* @skb: the frame to be checked
* @vid: the VLAN ID of the frame
+ * @is_bcast: the packet came in a broadcast packet type.
*
* bla_rx avoidance checks if:
* * we have to race for a claim
@@ -1361,7 +1362,8 @@ void bla_free(struct bat_priv *bat_priv)
* process the skb.
*
*/
-int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid)
+int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid,
+ bool is_bcast)
{
struct ethhdr *ethhdr;
struct claim search_claim, *claim = NULL;
@@ -1380,7 +1382,7 @@ int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid)
if (unlikely(atomic_read(&bat_priv->bla_num_requests)))
/* don't allow broadcasts while requests are in flight */
- if (is_multicast_ether_addr(ethhdr->h_dest))
+ if (is_multicast_ether_addr(ethhdr->h_dest) && is_bcast)
goto handled;
memcpy(search_claim.addr, ethhdr->h_source, ETH_ALEN);
@@ -1406,8 +1408,13 @@ int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid)
}
/* if it is a broadcast ... */
- if (is_multicast_ether_addr(ethhdr->h_dest)) {
- /* ... drop it. the responsible gateway is in charge. */
+ if (is_multicast_ether_addr(ethhdr->h_dest) && is_bcast) {
+ /* ... drop it. the responsible gateway is in charge.
+ *
+ * We need to check is_bcast because with the gateway
+ * feature, broadcasts (like DHCP requests) may be sent
+ * using a unicast packet type.
+ */
goto handled;
} else {
/* seems the client considers us as its best gateway.
diff --git a/net/batman-adv/bridge_loop_avoidance.h b/net/batman-adv/bridge_loop_avoidance.h
index e39f93a..dc5227b 100644
--- a/net/batman-adv/bridge_loop_avoidance.h
+++ b/net/batman-adv/bridge_loop_avoidance.h
@@ -23,7 +23,8 @@
#define _NET_BATMAN_ADV_BLA_H_
#ifdef CONFIG_BATMAN_ADV_BLA
-int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid);
+int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid,
+ bool is_bcast);
int bla_tx(struct bat_priv *bat_priv, struct sk_buff *skb, short vid);
int bla_is_backbone_gw(struct sk_buff *skb,
struct orig_node *orig_node, int hdr_size);
@@ -41,7 +42,7 @@ void bla_free(struct bat_priv *bat_priv);
#else /* ifdef CONFIG_BATMAN_ADV_BLA */
static inline int bla_rx(struct bat_priv *bat_priv, struct sk_buff *skb,
- short vid)
+ short vid, bool is_bcast)
{
return 0;
}
diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index 6e2530b..a0ec0e4 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -256,7 +256,11 @@ void interface_rx(struct net_device *soft_iface,
struct bat_priv *bat_priv = netdev_priv(soft_iface);
struct ethhdr *ethhdr;
struct vlan_ethhdr *vhdr;
+ struct batman_header *batadv_header = (struct batman_header *)skb->data;
short vid __maybe_unused = -1;
+ bool is_bcast;
+
+ is_bcast = (batadv_header->packet_type == BAT_BCAST);
/* check if enough space is available for pulling, and pull */
if (!pskb_may_pull(skb, hdr_size))
@@ -302,7 +306,7 @@ void interface_rx(struct net_device *soft_iface,
/* Let the bridge loop avoidance check the packet. If will
* not handle it, we can safely push it up.
*/
- if (bla_rx(bat_priv, skb, vid))
+ if (bla_rx(bat_priv, skb, vid, is_bcast))
goto out;
netif_rx(skb);
--
1.7.9.4
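The condition this patch changes can be isolated in a few lines of userspace C. The helper names below are hypothetical; the sketch only models the decision "drop as a broadcast another gateway is responsible for", assuming a multicast address is one with the low bit of the first octet set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Group (multicast or broadcast) addresses have the low bit of octet 0 set. */
static bool sketch_is_multicast_ether_addr(const uint8_t *addr)
{
	assert(addr != NULL);
	return (addr[0] & 0x01) != 0;
}

/*
 * Drop only when the payload destination is a group address AND the
 * frame also arrived in a broadcast packet type. A DHCP request that
 * was sent in a unicast packet type must not be dropped here.
 */
static bool sketch_bla_drop_as_broadcast(const uint8_t *h_dest, bool is_bcast)
{
	return sketch_is_multicast_ether_addr(h_dest) && is_bcast;
}
```

Before the change, the second operand was missing, so the payload address alone decided the drop.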
* [PATCH net] Bug fix for batman-adv 2012-07-06
From: Antonio Quartulli @ 2012-07-05 22:48 UTC (permalink / raw)
To: davem; +Cc: netdev, b.a.t.m.a.n
here I have a fix intended for net/linux-3.5.
The bug, discovered by Guido Iribarren and fixed by Simon Wunderlich, is caused
by the wrong interaction between the Bridge Loop Avoidance and the Gateway
feature of batman-adv.
Let me know if there are problems.
Thank you,
Antonio
The following changes since commit 9e85a6f9dc231f3ed3c1dc1b12217505d970142a:
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux (2012-07-03 18:06:49 -0700)
are available in the git repository at:
git://git.open-mesh.org/linux-merge.git tags/batman-adv-fix-for-davem
for you to fetch changes up to 2d3f6ccc4ea5c74d4b4af1b47c56b4cff4bbfcb7:
batman-adv: check incoming packet type for bla (2012-07-06 00:08:46 +0200)
----------------------------------------------------------------
Included changes:
- fix a bug generated by the wrong interaction between the GW feature and the
Bridge Loop Avoidance
----------------------------------------------------------------
Simon Wunderlich (1):
batman-adv: check incoming packet type for bla
net/batman-adv/bridge_loop_avoidance.c | 15 +++++++++++----
net/batman-adv/bridge_loop_avoidance.h | 5 +++--
net/batman-adv/soft-interface.c | 6 +++++-
3 files changed, 19 insertions(+), 7 deletions(-)
* Re: [PATCH net-next] cnic: Fix mmap regression.
From: David Miller @ 2012-07-05 22:36 UTC (permalink / raw)
To: mchan; +Cc: netdev
In-Reply-To: <1341525586.7472.25.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>
From: "Michael Chan" <mchan@broadcom.com>
Date: Thu, 5 Jul 2012 14:59:46 -0700
> Or you want me to send you the equivalent patches for net.
Please do so.
* Re: [PATCH] force dentry revalidation after namespace change
From: Serge E. Hallyn @ 2012-07-05 22:17 UTC (permalink / raw)
To: Glauber Costa
Cc: linux-kernel, netdev, Andrew Morton, Tejun Heo, Eric W. Biederman,
Greg Kroah-Hartman
In-Reply-To: <1341496805-26394-1-git-send-email-glommer@parallels.com>
Quoting Glauber Costa (glommer@parallels.com):
> When we change the namespace tag of a sysfs entry, the associated dentry
> is still kept around. readdir() will work correctly and not display the
> old entries, but open() will still succeed, so will reads and writes.
>
> This will no longer happen if sysfs is remounted, hinting that this is a
> cache-related problem.
>
> I am using the following sequence to demonstrate that:
>
> shell1:
> ip link add type veth
> unshare -nm
>
> shell2:
> ip link set veth1 <pid_of_shell_1>
> cat /sys/devices/virtual/net/veth1/ifindex
>
> Before that patch, this will succeed (fail to fail). After it, it will
Confirmed that it currently fails to fail :)
> correctly return an error. Differently from a normal rename, which we
> handle fine, changing the object namespace will keep its path intact.
> So this check seems necessary as well.
>
> Signed-off-by: Glauber Costa <glommer@parallels.com>
Haven't run it, but the patch looks good. Thanks, Glauber.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
> CC: Tejun Heo <tj@kernel.org>
> CC: Eric W. Biederman <ebiederm@xmission.com>
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> fs/sysfs/dir.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index e6bb9b2..c24bdd9 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -307,6 +307,7 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> struct sysfs_dirent *sd;
> int is_dir;
> + int type;
>
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
> @@ -314,6 +315,10 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> sd = dentry->d_fsdata;
> mutex_lock(&sysfs_mutex);
>
> + type = sysfs_ns_type(sd);
> + if (sd->s_ns && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
> + goto out_bad;
> +
> /* The sysfs dirent has been deleted */
> if (sd->s_flags & SYSFS_FLAG_REMOVED)
> goto out_bad;
> --
> 1.7.10.4
>
* Re: [PATCH net-next] cnic: Fix mmap regression.
From: Michael Chan @ 2012-07-05 21:59 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20120629.153425.24594752441419170.davem@davemloft.net>
On Fri, 2012-06-29 at 15:34 -0700, David Miller wrote:
> From: "Michael Chan" <mchan@broadcom.com>
> Date: Fri, 29 Jun 2012 12:32:45 -0700
>
> > commit 1f85d58cdf15354a7120fc9ccc9bb9c45b53af88
> > cnic: Remove uio mem[0].
> >
> > introduced a regression as older versions of userspace app still rely
> > on this mmap. Restore the mmap functionality and get the base address
> > from pci_resource_start() as the netdev->base_addr has been deprecated for
> > PCI devices.
> >
> > Update version to 2.5.12.
> >
> > Signed-off-by: Michael Chan <mchan@broadocm.com>
>
> I really couldn't believe what you guys were doing in the original
> commit, but I decided to let you do stupid things and find out the
> hard way that removing any user visible interface is basically
> impossible.
>
> Applied, thanks.
>
David, this patch plus the earlier commit are also needed for the net
tree because netdev->base_addr was removed there. Can you apply these
directly to the net tree? Or you want me to send you the equivalent
patches for net. Thanks.
* [PATCH] gianfar: fix potential sk_wmem_alloc imbalance
From: Eric Dumazet @ 2012-07-05 21:45 UTC (permalink / raw)
To: David Miller
Cc: netdev, Manfred Rudigier, Claudiu Manoil, Jiajun Wu,
Paul Gortmaker, Andy Fleming
From: Eric Dumazet <edumazet@google.com>
commit db83d136d7f753 (gianfar: Fix missing sock reference when
processing TX time stamps) added a potential sk_wmem_alloc imbalance.
If the new skb has a different truesize than the old one, we can get a
negative sk_wmem_alloc once the new skb is orphaned at TX completion.
Now that we no longer orphan skbs early in dev_hard_start_xmit(), this
can probably lead to fatal bugs.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Manfred Rudigier <manfred.rudigier@omicron.at>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Cc: Jiajun Wu <b06378@freescale.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Andy Fleming <afleming@freescale.com>
---
Note : I don't have the hardware and discovered this problem by code
analysis. So please compile and run this patch before Acking it,
thanks !
BTW, dev->needed_headroom should be set to GMAC_FCB_LEN + GMAC_TXPAL_LEN
to avoid reallocations...
drivers/net/ethernet/freescale/gianfar.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index f2db8fc..ab1d80f 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2063,10 +2063,9 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
return NETDEV_TX_OK;
}
- /* Steal sock reference for processing TX time stamps */
- swap(skb_new->sk, skb->sk);
- swap(skb_new->destructor, skb->destructor);
- kfree_skb(skb);
+ if (skb->sk)
+ skb_set_owner_w(skb_new, skb->sk);
+ consume_skb(skb);
skb = skb_new;
}
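The imbalance described in the commit message can be reproduced with a toy accounting model. Everything below is a hypothetical userspace stand-in: the counter starts at zero for simplicity, and the helpers only mimic the charge/uncharge idea, not the real socket code.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_sock {
	int wmem_alloc;		/* bytes currently charged to the socket */
};

struct sketch_skb {
	struct sketch_sock *sk;
	int truesize;
};

/* Charge the skb's own truesize to the socket and record ownership. */
static void sketch_set_owner_w(struct sketch_skb *skb, struct sketch_sock *sk)
{
	assert(skb != NULL && sk != NULL);
	skb->sk = sk;
	sk->wmem_alloc += skb->truesize;
}

/* Uncharge the skb's truesize when it is orphaned or freed. */
static void sketch_orphan(struct sketch_skb *skb)
{
	assert(skb != NULL);
	if (skb->sk) {
		skb->sk->wmem_alloc -= skb->truesize;
		skb->sk = NULL;
	}
}

/* Old approach: hand the socket pointer over without re-charging. */
static void sketch_steal_owner(struct sketch_skb *skb_new,
			       struct sketch_skb *skb)
{
	assert(skb_new != NULL && skb != NULL);
	skb_new->sk = skb->sk;
	skb->sk = NULL;
}
```

Stealing ownership from a 512-byte skb into a 768-byte one and then orphaning the new skb leaves the counter at -256, whereas charging the new skb separately and releasing both returns it to zero.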
* Re: [PATCH 0/5] rtcache remove respin
From: David Miller @ 2012-07-05 21:32 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1341515017.3265.6.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 05 Jul 2012 21:03:37 +0200
> If route cache is removed, I believe we can remove all paddings.
>
> Each tcp session will have its own dst_entry, instead of being shared.
Not really, the routing cache removal patches have poor performance
and won't go-in as-is. :-) Once PMTU/redirect/TCP-metrics are reworked
I plan to do things like the patch below to make the performance loss
more acceptable.
And then I'll do the same for input routes too, at which point your
'noref' case can be put back.
So really, we have to consider how to rework the layout of this
structure.
Thanks.
====================
ipv4: Cache output routes in fib_info nexthops.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
include/net/ip_fib.h | 3 +++
net/ipv4/fib_semantics.c | 2 ++
net/ipv4/route.c | 9 +++++++++
3 files changed, 14 insertions(+)
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 3dc7c96..ff9f0c4 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -45,6 +45,7 @@ struct fib_config {
};
struct fib_info;
+struct rtable;
struct fib_nh {
struct net_device *nh_dev;
@@ -63,6 +64,8 @@ struct fib_nh {
__be32 nh_gw;
__be32 nh_saddr;
int nh_saddr_genid;
+
+ struct rtable *rth;
};
/*
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index c46c20b..f3ada74 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -148,6 +148,8 @@ static void free_fib_info_rcu(struct rcu_head *head)
change_nexthops(fi) {
if (nexthop_nh->nh_dev)
dev_put(nexthop_nh->nh_dev);
+ if (nexthop_nh->rth)
+ dst_release(&nexthop_nh->rth->dst);
} endfor_nexthops(fi);
release_net(fi->fib_net);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 9f68f74..35bfd98 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -914,6 +914,8 @@ static void rt_set_nexthop(struct rtable *rt, const struct flowi4 *fl4,
#ifdef CONFIG_IP_ROUTE_CLASSID
dst->tclassid = FIB_RES_NH(*res).nh_tclassid;
#endif
+ FIB_RES_NH(*res).rth = rt;
+ dst_clone(&rt->dst);
}
if (dst_mtu(dst) > IP_MAX_MTU)
@@ -1399,6 +1401,13 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
fi = NULL;
}
+ if (fi) {
+ rth = FIB_RES_NH(*res).rth;
+ if (rth) {
+ dst_use(&rth->dst, jiffies);
+ return rth;
+ }
+ }
rth = rt_dst_alloc(dev_out,
IN_DEV_CONF_GET(in_dev, NOPOLICY),
IN_DEV_CONF_GET(in_dev, NOXFRM));
--
1.7.10
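The caching idea in the patch above, one output route remembered per nexthop and reused on later lookups, can be sketched in userspace like this. The types and helpers are hypothetical, and the reference counting is simplified to plain integers under the assumption that the cache and each caller hold one reference apiece.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_route {
	int refcnt;	/* references held by the cache and by callers */
	int uses;	/* how many lookups returned this route */
};

struct sketch_nexthop {
	struct sketch_route *rth;	/* cached output route, or NULL */
};

/*
 * Return the cached route if the nexthop has one; otherwise install
 * the freshly allocated route in the cache and return it.
 */
static struct sketch_route *sketch_mkroute_output(struct sketch_nexthop *nh,
						  struct sketch_route *fresh)
{
	assert(nh != NULL && fresh != NULL);

	if (nh->rth) {
		nh->rth->refcnt++;	/* reference for this caller */
		nh->rth->uses++;
		return nh->rth;
	}

	fresh->refcnt = 2;		/* one for the caller, one for the cache */
	fresh->uses = 1;
	nh->rth = fresh;
	return fresh;
}

/* Drop the cache's reference when the nexthop goes away. */
static void sketch_free_nexthop(struct sketch_nexthop *nh)
{
	assert(nh != NULL);
	if (nh->rth) {
		nh->rth->refcnt--;
		nh->rth = NULL;
	}
}
```

A second lookup through the same nexthop returns the first route and ignores the fresh candidate, which is the allocation the cache is meant to save.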
* Re: ipv6 problem with 6lowpan
From: David Miller @ 2012-07-05 21:22 UTC (permalink / raw)
To: alex.bluesman.smirnov; +Cc: netdev
In-Reply-To: <CAJmB2rD8U1ihy4Ai6y5QGjj4f7txDabszesrNrQ=pgEbscePqQ@mail.gmail.com>
Should be fixed by Steffen Klassert's patch, which I just pushed into net-next.