Netdev List


* Re: [regression] fib6_del() bug from 2.6.34-rc1 still present in 2.6.34
From: Shan Wei @ 2010-05-24  1:56 UTC (permalink / raw)
  To: Julien BLACHE; +Cc: netdev, linux-kernel
In-Reply-To: <87632g53tx.fsf@sonic.technologeek.org>

Julien BLACHE wrote, at 05/22/2010 11:55 PM:
> Hi,
> 
> [subscribed to lkml but not netdev, Cc me on replies]
> 
> I'm seeing a warning in fib6_del() that is very close to what was
> reported by Emil S Tantilov back in march/april for 2.6.34-rc1:
> 
> <http://kerneltrap.org/mailarchive/linux-netdev/2010/4/9/6274401/thread>
> 
> It looks like there hasn't been a fix, other than what was mentioned in
> this thread for net-next and Emil reported that it did not fix it for
> him. So it looks like it's still there, alive and kicking.
> 
> This is the warning I'm getting:

I saw the warning on 2.6.34-rc3, but on the net-next tree it disappeared.
So I think the bug is fixed in the net-next tree. My NIC driver is r8169.

The net-next tree is newer than 2.6.34; maybe the fix was not queued for
2.6.34, so the warning may still be present there.
Can you try the net-next tree first?

-- 
Best Regards
-----
Shan Wei


> 
> ------------[ cut here ]------------
> WARNING: at net/ipv6/ip6_fib.c:1160 fib6_del+0x506/0x5b0()
> Hardware name: MacBookPro2,2
> Modules linked in: sco bnep rfcomm l2cap crc16 cpufreq_userspace cpufreq_powersave cpufreq_conservative nfsd nfs lockd auth_rpcgss sunrpc uinput btusb ath9k ath9k_common mac80211 ath9k_hw ath isight_firmware joydev cfg80211 i2c_i801 ohci1394 ieee1394 [last unloaded: scsi_wait_scan]
> Pid: 4020, comm: ifconfig Not tainted 2.6.34 #1
> Call Trace:
>  [<ffffffff810389d3>] ? warn_slowpath_common+0x73/0xb0
>  [<ffffffff8142f956>] ? fib6_del+0x506/0x5b0
>  [<ffffffff8102ceb3>] ? __wake_up+0x43/0x70
>  [<ffffffff813c68ef>] ? netlink_broadcast+0x21f/0x410
>  [<ffffffff8142c2ab>] ? __ip6_del_rt+0x4b/0x80
>  [<ffffffff8142c436>] ? ip6_del_rt+0x26/0x30
>  [<ffffffff81426dff>] ? __ipv6_ifa_notify+0x15f/0x200
>  [<ffffffff81428d99>] ? addrconf_ifdown+0x159/0x350
>  [<ffffffff8142915d>] ? addrconf_notify+0xed/0x920
>  [<ffffffff81043d33>] ? lock_timer_base+0x33/0x70
>  [<ffffffff810445ab>] ? mod_timer+0x11b/0x1a0
>  [<ffffffff81054826>] ? notifier_call_chain+0x46/0x70
>  [<ffffffff813b1ae5>] ? __dev_notify_flags+0x65/0x90
>  [<ffffffff813b1b4b>] ? dev_change_flags+0x3b/0x70
>  [<ffffffff813fd2a2>] ? devinet_ioctl+0x602/0x750
>  [<ffffffff813a12ea>] ? T.945+0x1a/0x50
>  [<ffffffff813a1589>] ? sock_ioctl+0x59/0x2a0
>  [<ffffffff810bee55>] ? vfs_ioctl+0x35/0xd0
>  [<ffffffff810bf018>] ? do_vfs_ioctl+0x88/0x570
>  [<ffffffff810bf549>] ? sys_ioctl+0x49/0x80
>  [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
> ---[ end trace b5a833c8e5539431 ]---
> 
> I can reliably reproduce it on both ath9k and sky2 with the
> following sequence:
> 
>  # ifconfig eth0 up
>  # ifconfig eth0 add 2001:7a8:5dd7:123::12/64
>  # ifconfig eth0 down
>  # ifconfig eth0 up
>  # ifconfig eth0 add 2001:7a8:5dd7:123::12/64 <=== fails
>  # ifconfig eth0 down <=== triggers the warning
> 
> Note that this sequence is equivalent to:
>  # ifup eth0
>  # ifdown eth0
>  # ifup eth0 (will fail because it cannot add the v6 address)
>  # ifconfig eth0 down
> 
> This regression breaks ifupdown as it always tries to add the v6 address
> when configuring the interface. It's a behaviour change compared to
> previous kernel versions.
> 
> It looks like triggering this warning a couple times (3-4) in a row ends
> up locking up the machine, too.
> 
> I can test patches etc.
> 
> JB.
> 


* [PATCH] ethoc: fix null dereference in ethoc_probe
From: Thomas Chou @ 2010-05-24  2:44 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, thierry.reding, nios2-dev, Dan Carpenter,
	Thomas Chou
In-Reply-To: <20100522155455.GE22515@bicker>

Dan reported the patch 0baa080c75c: "ethoc: use system memory
as buffer" introduced a potential null dereference.

  1060  free:
  1061          if (priv->dma_alloc)
                    ^^^^^^^^^^^^^^^
	priv can be null here.

He also suggested that the error handling is not complete.

This patch fixes the null priv issue and improves resource
release in ethoc_probe() and ethoc_remove().

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
---
 drivers/net/ethoc.c |   34 ++++++++++++++++++++++++++++++----
 1 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index a8d9250..fddd5f5 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -174,6 +174,7 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
  * @iobase:	pointer to I/O memory region
  * @membase:	pointer to buffer memory region
  * @dma_alloc:	dma allocated buffer size
+ * @io_region_size:	I/O memory region size
  * @num_tx:	number of send buffers
  * @cur_tx:	last send buffer written
  * @dty_tx:	last buffer actually sent
@@ -193,6 +194,7 @@ struct ethoc {
 	void __iomem *iobase;
 	void __iomem *membase;
 	int dma_alloc;
+	resource_size_t io_region_size;
 
 	unsigned int num_tx;
 	unsigned int cur_tx;
@@ -944,6 +946,7 @@ static int ethoc_probe(struct platform_device *pdev)
 	priv = netdev_priv(netdev);
 	priv->netdev = netdev;
 	priv->dma_alloc = 0;
+	priv->io_region_size = mmio->end - mmio->start + 1;
 
 	priv->iobase = devm_ioremap_nocache(&pdev->dev, netdev->base_addr,
 			resource_size(mmio));
@@ -1049,20 +1052,34 @@ static int ethoc_probe(struct platform_device *pdev)
 	ret = register_netdev(netdev);
 	if (ret < 0) {
 		dev_err(&netdev->dev, "failed to register interface\n");
-		goto error;
+		goto error2;
 	}
 
 	goto out;
 
+error2:
+	netif_napi_del(&priv->napi);
 error:
 	mdiobus_unregister(priv->mdio);
 free_mdio:
 	kfree(priv->mdio->irq);
 	mdiobus_free(priv->mdio);
 free:
-	if (priv->dma_alloc)
-		dma_free_coherent(NULL, priv->dma_alloc, priv->membase,
-			netdev->mem_start);
+	if (priv) {
+		if (priv->dma_alloc)
+			dma_free_coherent(NULL, priv->dma_alloc, priv->membase,
+					  netdev->mem_start);
+		else if (priv->membase)
+			devm_iounmap(&pdev->dev, priv->membase);
+		if (priv->iobase)
+			devm_iounmap(&pdev->dev, priv->iobase);
+	}
+	if (mem)
+		devm_release_mem_region(&pdev->dev, mem->start,
+					mem->end - mem->start + 1);
+	if (mmio)
+		devm_release_mem_region(&pdev->dev, mmio->start,
+					mmio->end - mmio->start + 1);
 	free_netdev(netdev);
 out:
 	return ret;
@@ -1080,6 +1097,7 @@ static int ethoc_remove(struct platform_device *pdev)
 	platform_set_drvdata(pdev, NULL);
 
 	if (netdev) {
+		netif_napi_del(&priv->napi);
 		phy_disconnect(priv->phy);
 		priv->phy = NULL;
 
@@ -1091,6 +1109,14 @@ static int ethoc_remove(struct platform_device *pdev)
 		if (priv->dma_alloc)
 			dma_free_coherent(NULL, priv->dma_alloc, priv->membase,
 				netdev->mem_start);
+		else {
+			devm_iounmap(&pdev->dev, priv->membase);
+			devm_release_mem_region(&pdev->dev, netdev->mem_start,
+				netdev->mem_end - netdev->mem_start + 1);
+		}
+		devm_iounmap(&pdev->dev, priv->iobase);
+		devm_release_mem_region(&pdev->dev, netdev->base_addr,
+			priv->io_region_size);
 		unregister_netdev(netdev);
 		free_netdev(netdev);
 	}
-- 
1.7.1.86.g0e460



* [RFC PATCH 0/2] netdev: show the number of tx-packets in device
From: Koki Sanagi @ 2010-05-24  4:49 UTC (permalink / raw)
  To: netdev; +Cc: davem, nhorman, kaneshige.kenji, izumi.taku

This patch set adds tracepoints to dev_hard_start_xmit, consume_skb and
dev_kfree_skb_irq, plus a perf script that calculates the time from entry to
ndo_start_xmit until dev_kfree_skb_* and the number of tx-packets in the device.

-Perf script description
This script calculates two metrics.

metric1: lap time between start_xmit - free_skb
The script calculates the time a packet takes from entry to ndo_start_xmit
until dev_kfree_skb_irq or consume_skb, i.e. how long the driver/device owns
the packet. It outputs the average over all packets and the maximum.

metric2: number of packets in device
From the lap times above, we can calculate the number of packets in the device
at a given time. The script outputs the average and the maximum of that
number.
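
As a rough illustration of the two metrics (this is not the perf script from
patch 2/2), the following Python sketch computes them from a list of
(time_ns, kind, skbaddr) tuples. The function name tx_metrics and the event
kinds "xmit" and "free" are made up for the example. The average number of
packets in the device is taken as the total lap time divided by the observed
time span, which matches how the report script derives it.

```python
def tx_metrics(events):
    """events: iterable of (time_ns, kind, skbaddr); kind is "xmit" or "free"."""
    in_flight = {}        # skbaddr -> time the skb was handed to the driver
    laps = []             # completed lap times in ns
    max_in_flight = 0
    first = last = None
    for t, kind, skb in sorted(events, key=lambda e: e[0]):
        if first is None:
            first = t
        last = t
        if kind == "xmit":
            # A retried skb overwrites its start time and is not counted twice.
            in_flight[skb] = t
            max_in_flight = max(max_in_flight, len(in_flight))
        elif kind == "free" and skb in in_flight:
            laps.append(t - in_flight.pop(skb))
    span = (last - first) if laps else 0
    return {
        "packets": len(laps),
        "avg_lap_ns": sum(laps) / len(laps) if laps else 0.0,
        "max_lap_ns": max(laps, default=0),
        "avg_in_device": sum(laps) / span if span else 0.0,
        "max_in_device": max_in_flight,
    }
```

Only packets that were both transmitted and freed inside the trace contribute
to the lap-time figures, so a trace that starts or ends mid-burst will slightly
undercount.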

-Merit
These tracepoints and the script have two merits.

1. Detecting a packed tx-ring of network device
2. Detecting a defect of transmit functionality of network device

merit1: Detecting a packed tx-ring of network device

Using the attached script, we can get the maximum number of packets in a
device. If it reaches the number of packets the device can own, the tx-ring of
that device is full (packed), which costs transmit performance: the driver
either drops packets or stops the queue and rejects new packets until some
space is free again. So, to keep good transmit performance, the tx-ring should
always have some free space, and these tracepoints and the script help to
check that.

To check this merit, I ran a test.

Before describing the test, note that the maximum number of tx-packets e1000e
can own is (tx-ring size - 20) / 2 packets, because e1000e reserves 20
descriptors for frags and each packet needs 2 descriptors due to tx-checksum.
So, if tx-ring size is 256,
(256 - 20) / 2 = 118 packets
If tx-ring size is 512,
(512 - 20) / 2 = 246 packets
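
The arithmetic above can be written as a small helper. This is only a sketch:
the function name is hypothetical, and the constants 20 and 2 are taken from
the description in this mail, not read from the e1000e driver.

```python
def e1000e_max_tx_packets(ring_size, reserved_desc=20, desc_per_packet=2):
    """Upper bound on packets the device can own, per the formula above."""
    return (ring_size - reserved_desc) // desc_per_packet


for size in (256, 512):
    print("tx-ring size=%d -> %d packets" % (size, e1000e_max_tx_packets(size)))
```

Comparing this bound with the "max" value reported by the script tells whether
the tx-ring ever filled up during the trace.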

Environment:
Test NIC:     Intel 82571EB (InterruptThrottleRate=1000)
Peer NIC:     Broadcom BCM5703X
InterruptThrottleRate was set to 1000 to deliberately make the tx-ring packed.

Test load tool:
netperf -H XXX.XXX.XXX.XXX -t UDP_STREAM -- -m 1

In this environment, I compared the following two cases.
1. Tx-ring size=256
2. Tx-ring size=512

Result:
1.The case of Tx-ring=256:
eth0    TX packets=1137811
        lap time between start_xmit - free_skb:
            avg=0.795687msec
            max=0.985911msec
        number of packets in device:
            avg=  64.66
            max= 118

netperf's result:
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec
112640       1   10.00     1179077      0       0.94
109568           10.00      544750              0.44

2.The case of Tx-ring=512:
eth0    TX packets=1531629
        lap time between start_xmit - free_skb:
            avg=0.370052msec
            max=0.982069msec
        number of packets in device:
            avg=  49.70
            max= 164

netperf's result:
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec
112640       1   10.00     1577840      0       1.26
109568           10.00      871058              0.70

In the case of tx-ring size=256 (default), the maximum number of packets in
the device reaches 118, so the tx-ring fills up and becomes a cause of
transmit performance loss.
On the other hand, in the case of tx-ring size=512, e1000e can own 246 packets,
but the maximum number of packets in the device (164) never reaches that, so
the tx-ring always has some free space and there is no performance loss caused
by a packed tx-ring.
Indeed, transmit throughput is better with tx-ring size=512 than with tx-ring
size=256.
In this way, the number of packets in the device can be used to tune the
tx-ring size or other parameters to avoid a packed tx-ring, and it is related
to network transmit performance.

merit2: Detecting a defect of transmit functionality of network device

When a device can't transmit due to a hardware fault or a driver bug (I've
encountered this), we can detect it, because in that case dev_hard_start_xmit
is passed but dev_kfree_skb_* never is.
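
A minimal sketch of that check, assuming the same made-up (time_ns, kind,
skbaddr) event tuples with kinds "xmit" and "free"; the function name and the
timeout parameter are hypothetical and not part of the posted script. It lists
skbs that were handed to the driver but not freed within the timeout.

```python
def stuck_skbs(events, now_ns, timeout_ns):
    """Return skb addresses passed to start_xmit but not freed in time."""
    pending = {}                      # skbaddr -> time of the last start_xmit
    for t, kind, skb in events:
        if kind == "xmit":
            pending[skb] = t
        elif kind == "free":
            pending.pop(skb, None)    # a free for an unknown skb is ignored
    return sorted(skb for skb, t in pending.items() if now_ns - t > timeout_ns)
```

A healthy device should return an empty list once the timeout is well above
the maximum lap time seen in normal operation.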

NOTE:
This script has some known problems:

-The number of tx-packets reported by netperf and by this script are not equal.
-Sometimes the max number of packets in device is larger than the device can own.

Thanks,
Koki Sanagi.


* [RFC PATCH 1/2] netdev: add tracepoint to dev_hard_start_xmit, consume_skb and dev_kfree_skb_irq
From: Koki Sanagi @ 2010-05-24  4:51 UTC (permalink / raw)
  To: netdev; +Cc: davem, nhorman, kaneshige.kenji, izumi.taku
In-Reply-To: <4BFA0551.3080304@jp.fujitsu.com>

This patch adds tracepoints to dev_hard_start_xmit, consume_skb and
dev_kfree_skb_irq to monitor tx-packets in the device.

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 include/trace/events/skb.h |   64 ++++++++++++++++++++++++++++++++++++++++++++
 net/core/dev.c             |    4 +++
 net/core/skbuff.c          |    1 +
 3 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 4b2be6d..ed97580 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -9,6 +9,52 @@
 #include <linux/tracepoint.h>
 
 /*
+ *  netdev_start_xmit - invoked when skb is passed to the driver
+ *  @skb:		pointer to struct sk_buff
+ *  @dev:		pointer to struct net_device
+ */
+TRACE_EVENT(netdev_start_xmit,
+
+	TP_PROTO(struct sk_buff *skb,
+		 struct net_device *dev),
+
+	TP_ARGS(skb, dev),
+
+	TP_STRUCT__entry(
+		__field(	const void *,	skbaddr		)
+		__field(	unsigned int,	len		)
+		__string(	name,		dev->name	)
+       ),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = skb->len;
+		__assign_str(name, dev->name);
+	),
+
+	TP_printk("dev=%s skbaddr=%p len=%u",
+		__get_str(name), __entry->skbaddr, __entry->len)
+);
+
+TRACE_EVENT(consume_skb,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+	),
+
+	TP_printk("skbaddr=%p",
+		__entry->skbaddr)
+);
+
+/*
  * Tracepoint for free an sk_buff:
  */
 TRACE_EVENT(kfree_skb,
@@ -35,6 +81,24 @@ TRACE_EVENT(kfree_skb,
 		__entry->skbaddr, __entry->protocol, __entry->location)
 );
 
+TRACE_EVENT(dev_kfree_skb_irq,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+	),
+
+	TP_printk("skbaddr=%p",
+		__entry->skbaddr)
+);
+
 TRACE_EVENT(skb_copy_datagram_iovec,
 
 	TP_PROTO(const struct sk_buff *skb, int len),
diff --git a/net/core/dev.c b/net/core/dev.c
index 32611c8..647d812 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -130,6 +130,7 @@
 #include <linux/jhash.h>
 #include <linux/random.h>
 #include <trace/events/napi.h>
+#include <trace/events/skb.h>
 #include <linux/pci.h>
 
 #include "net-sysfs.h"
@@ -1577,6 +1578,7 @@ void dev_kfree_skb_irq(struct sk_buff *skb)
 		struct softnet_data *sd;
 		unsigned long flags;
 
+		trace_dev_kfree_skb_irq(skb);
 		local_irq_save(flags);
 		sd = &__get_cpu_var(softnet_data);
 		skb->next = sd->completion_queue;
@@ -1919,6 +1921,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 				goto gso;
 		}
 
+		trace_netdev_start_xmit(skb, dev);
 		rc = ops->ndo_start_xmit(skb, dev);
 		if (rc == NETDEV_TX_OK)
 			txq_trans_update(txq);
@@ -1939,6 +1942,7 @@ gso:
 		if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
 			skb_dst_drop(nskb);
 
+		trace_netdev_start_xmit(nskb, dev);
 		rc = ops->ndo_start_xmit(nskb, dev);
 		if (unlikely(rc != NETDEV_TX_OK)) {
 			if (rc & ~NETDEV_TX_MASK)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a9b0e1f..b9c963c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -466,6 +466,7 @@ void consume_skb(struct sk_buff *skb)
 		smp_rmb();
 	else if (likely(!atomic_dec_and_test(&skb->users)))
 		return;
+	trace_consume_skb(skb);
 	__kfree_skb(skb);
 }
 EXPORT_SYMBOL(consume_skb);



* [RFC PATCH 2/2] netdev: perf script to show the number of tx-packets in device
From: Koki Sanagi @ 2010-05-24  4:52 UTC (permalink / raw)
  To: netdev; +Cc: davem, nhorman, kaneshige.kenji, izumi.taku
In-Reply-To: <4BFA0551.3080304@jp.fujitsu.com>

This patch adds a perf script that calculates the time from entry to
ndo_start_xmit until dev_kfree_skb_* and the number of tx-packets in the device.

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 .../scripts/python/bin/tx-packet-in-device-record  |    2 +
 .../scripts/python/bin/tx-packet-in-device-report  |    4 +
 tools/perf/scripts/python/tx-packet-in-device.py   |  109 ++++++++++++++++++++
 3 files changed, 115 insertions(+), 0 deletions(-)

diff --git a/tools/perf/scripts/python/bin/tx-packet-in-device-record b/tools/perf/scripts/python/bin/tx-packet-in-device-record
new file mode 100644
index 0000000..18f2356
--- /dev/null
+++ b/tools/perf/scripts/python/bin/tx-packet-in-device-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -c 1 -f -a -M -R -e skb:netdev_start_xmit -e skb:consume_skb -e skb:dev_kfree_skb_irq
diff --git a/tools/perf/scripts/python/bin/tx-packet-in-device-report b/tools/perf/scripts/python/bin/tx-packet-in-device-report
new file mode 100644
index 0000000..8ef4cc2
--- /dev/null
+++ b/tools/perf/scripts/python/bin/tx-packet-in-device-report
@@ -0,0 +1,4 @@
+#!/bin/bash
+# description: tx-packets in device and start_xmit-to-free lap time
+# args: [comm]
+perf trace -s ~/libexec/perf-core/scripts/python/tx-packet-in-device.py $1
diff --git a/tools/perf/scripts/python/tx-packet-in-device.py b/tools/perf/scripts/python/tx-packet-in-device.py
new file mode 100644
index 0000000..fb1933f
--- /dev/null
+++ b/tools/perf/scripts/python/tx-packet-in-device.py
@@ -0,0 +1,109 @@
+# perf trace event handlers, generated by perf trace -g python
+# Licensed under the terms of the GNU GPL License version 2
+
+# The common_* event handler fields are the most useful fields common to
+# all events.  They don't necessarily correspond to the 'common_*' fields
+# in the format files.  Those fields not available as handler params can
+# be retrieved using Python functions of the form common_*(context).
+# See the perf-trace-python Documentation for the list of available functions.
+
+import os
+import sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+	'/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+# skb_dic = {skbaddr: {name:*, start_time:*}}
+#
+# skbaddr: address of skb through dev_hard_start_xmit
+# name: name of device
+# start_time: the time the skb passed dev_hard_start_xmit
+skb_dic = {};
+
+# dev_stat_dic = {name: {pkt_in_tx:*, max_pkt_in_tx:*, total_pkt:*,
+#			 prev_time:*, total_time:*, max_lap_time:*,
+#			 total_lap_time:*}}
+#
+# name: name of device
+# pkt_in_tx: tx-packets a device has currently
+# max_pkt_in_tx: maximum of the above
+# total_pkt: total tx-packets through a device
+# prev_time: the time starting xmit or freeing skb
+#            happened previously
+# total_time: the time from first starting xmit to now
+# max_lap_time: maximum time from starting xmit to freeing skb
+# total_lap_time: sum of time tx-packet is in device
+dev_stat_dic = {};
+
+def trace_end():
+	for name in sorted(dev_stat_dic.keys()):
+		cstat = dev_stat_dic[name]
+		print "%s\tTX packets=%d" %\
+			(name, cstat['total_pkt'])
+		print "\tlap time between start_xmit - free_skb:"
+		avg_nsec = avg(1.0 * cstat['total_lap_time'],
+				cstat['total_pkt'])
+		print "\t    avg=%fmsec" % (avg_nsec / 1000000.0)
+		print "\t    max=%fmsec" % (cstat['max_lap_time'] / 1000000.0)
+		print "\tnumber of packets in device:"
+		print "\t    avg=%7.2f" % avg(cstat['total_lap_time'] * 1.0,
+						cstat['total_time'])
+		print "\t    max=%4d" % cstat['max_pkt_in_tx']
+		print ""
+
+def skb__dev_kfree_skb_irq(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr):
+		free_skb(event_name, context, common_cpu,
+		common_secs, common_nsecs, common_pid, common_comm,
+		skbaddr)
+
+def skb__consume_skb(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr):
+		free_skb(event_name, context, common_cpu,
+		common_secs, common_nsecs, common_pid, common_comm,
+		skbaddr)
+
+def free_skb(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr):
+		if skbaddr in skb_dic.keys():
+			ctime = nsecs(common_secs, common_nsecs)
+			lap_time = ctime - skb_dic[skbaddr]['start_time']
+			cstat = dev_stat_dic[skb_dic[skbaddr]['name']]
+			cstat['total_lap_time'] += lap_time;
+			cstat['total_pkt'] += 1;
+			cstat = dev_stat_dic[skb_dic[skbaddr]['name']]
+			cstat['total_time'] += ctime - cstat['prev_time']
+			cstat['prev_time'] = ctime
+			cstat['pkt_in_tx'] -= 1;
+			if lap_time > cstat['max_lap_time']:
+				cstat['max_lap_time'] = lap_time
+			del skb_dic[skbaddr]
+
+def skb__netdev_start_xmit(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr, len, name):
+		retry = 0
+		ctime = nsecs(common_secs, common_nsecs)
+		if skbaddr in skb_dic.keys():
+			retry = 1;
+		skb_dic[skbaddr] = {'name':name, 'start_time':ctime}
+		if name not in dev_stat_dic.keys():
+			dev_stat_dic[name] = {'pkt_in_tx':0, 'max_pkt_in_tx':0,\
+					'total_pkt':0,\
+					'prev_time':ctime, 'total_time':0,\
+					'max_lap_time':0, 'total_lap_time':0}
+		cstat = dev_stat_dic[name]
+		cstat['total_time'] += ctime - cstat['prev_time']
+		cstat['prev_time'] = ctime
+		if retry == 0:
+			cstat['pkt_in_tx'] += 1
+		if cstat['pkt_in_tx'] > cstat['max_pkt_in_tx']:
+			cstat['max_pkt_in_tx'] = cstat['pkt_in_tx']
+



* Re: cls_cgroup: Store classid in struct sock
From: Herbert Xu @ 2010-05-24  5:42 UTC (permalink / raw)
  To: Neil Horman; +Cc: David Miller, eric.dumazet, bmb, tgraf, nhorman, netdev
In-Reply-To: <20100522122632.GA2075@localhost.localdomain>

On Sat, May 22, 2010 at 08:26:32AM -0400, Neil Horman wrote:
>
> first place.  When we register the cgroup subsystem, we don't register an attach
> method, so we never get a chance to assign task_cls_state(tsk)->classid to any
> non-zero value.  I've got a version of the patch that adds that, but for some

No, I don't think you need an attach method.

The task_struct has a pointer to the cgroups, which in turn has
a pointer to the cgroup_subsys_state that we allocated in the
cls_cgroup module.

Did you try building cls_cgroup into the kernel?

Perhaps there is a bug in how we register it at run-time, or
perhaps the cgroups infrastructure itself is buggy.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* [PATCH] ipv4: Allow configuring subnets as local addresses
From: Tom Herbert @ 2010-05-24  5:54 UTC (permalink / raw)
  To: davem; +Cc: netdev

This patch allows a host to be configured to respond to any address in
a specified range as if it were local, without actually needing to
configure the address on an interface.  This is done through routing
table configuration.  For instance, to configure a host to respond
to any address in 10.1/16 received on eth0 as a local address we can do:

ip rule add from all iif eth0 lookup 200
ip route add local 10.1/16 dev lo proto kernel scope host src 127.0.0.1 table 200

This host is now reachable by any 10.1/16 address (route lookup on
input for packets received on eth0 can find the route).  On output, the
rule will not be matched so that this host can still send packets to
10.1/16 (not sent on loopback).  Presumably, external routing can be
configured to make sense out of this.

To make this work, we needed to modify the logic in finding the
interface which is assigned a given source address for output
(ip_dev_find).  We perform a normal fib_lookup instead of just a
lookup on the local table, and in the lookup we ignore the input
interface for matching.
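
The effect of the new flag on rule matching can be modelled in a few lines of
Python. This is a sketch of the input-interface check only; the function name
is made up, and the other checks in fib_rule_match() (oif and the rest) are not
modelled. The flag value is the one defined in the diff below.

```python
FLOWI_FLAG_MATCH_ANY_IIF = 0x02


def rule_matches_iif(rule_iifindex, flow_iif, flow_flags):
    """Model of the iif test in fib_rule_match() with this patch applied."""
    if (rule_iifindex and rule_iifindex != flow_iif
            and not (flow_flags & FLOWI_FLAG_MATCH_ANY_IIF)):
        return False    # rule is bound to another interface: no match
    return True         # no iif on the rule, same iif, or iif ignored
```

With the flag set, as ip_dev_find() now does, a rule such as "iif eth0 lookup
200" is considered even though the lookup has no input interface, which is how
the local route in table 200 gets found for the source address.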

This patch is useful to implement IP-anycast for subnets of virtual
addresses.

Signed-off-by: Tom Herbert <therbert@google.com>
---
diff --git a/include/net/flow.h b/include/net/flow.h
index bb08692..0ac3fb5 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -49,6 +49,7 @@ struct flowi {
 	__u8	proto;
 	__u8	flags;
 #define FLOWI_FLAG_ANYSRC 0x01
+#define FLOWI_FLAG_MATCH_ANY_IIF 0x02
 	union {
 		struct {
 			__be16	sport;
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 42e84e0..f6e18b2 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -182,7 +182,8 @@ static int fib_rule_match(struct fib_rule *rule, struct fib_rules_ops *ops,
 {
 	int ret = 0;

-	if (rule->iifindex && (rule->iifindex != fl->iif))
+	if (rule->iifindex && (rule->iifindex != fl->iif) &&
+	    !(fl->flags & FLOWI_FLAG_MATCH_ANY_IIF))
 		goto out;

 	if (rule->oifindex && (rule->oifindex != fl->oif))
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4f0ed45..64f953e 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -153,17 +153,16 @@ static void fib_flush(struct net *net)

 struct net_device * ip_dev_find(struct net *net, __be32 addr)
 {
-	struct flowi fl = { .nl_u = { .ip4_u = { .daddr = addr } } };
+	struct flowi fl = { .nl_u = { .ip4_u = { .daddr = addr } },
+			    .flags = FLOWI_FLAG_MATCH_ANY_IIF };
 	struct fib_result res;
 	struct net_device *dev = NULL;
-	struct fib_table *local_table;

 #ifdef CONFIG_IP_MULTIPLE_TABLES
 	res.r = NULL;
 #endif

-	local_table = fib_get_table(net, RT_TABLE_LOCAL);
-	if (!local_table || fib_table_lookup(local_table, &fl, &res))
+	if (fib_lookup(net, &fl, &res))
 		return NULL;
 	if (res.type != RTN_LOCAL)
 		goto out;


* Re: [net-next-2.6 PATCH 1/2] enic: bug fix: sprintf UUID to string as u8[] rather than u16[] array
From: David Miller @ 2010-05-24  6:11 UTC (permalink / raw)
  To: scofeldm; +Cc: netdev, chrisw
In-Reply-To: <20100523032952.20200.30148.stgit@savbu-pc100.cisco.com>

From: Scott Feldman <scofeldm@cisco.com>
Date: Sat, 22 May 2010 20:29:52 -0700

> From: Scott Feldman <scofeldm@cisco.com>
> 
> 
> 
> Signed-off-by: Scott Feldman <scofeldm@cisco.com>

Applied.


* Re: [PATCH] ethoc: fix null dereference in ethoc_probe
From: David Miller @ 2010-05-24  6:11 UTC (permalink / raw)
  To: thomas; +Cc: netdev, linux-kernel, thierry.reding, nios2-dev, error27
In-Reply-To: <1274669042-1901-1-git-send-email-thomas@wytron.com.tw>

From: Thomas Chou <thomas@wytron.com.tw>
Date: Mon, 24 May 2010 10:44:02 +0800

> Dan reported the patch 0baa080c75c: "ethoc: use system memory
> as buffer" introduced a potential null dereference.
> 
>   1060  free:
>   1061          if (priv->dma_alloc)
>                     ^^^^^^^^^^^^^^^
> 	priv can be null here.
> 
> He also suggested that the error handling is not complete.
> 
> This patch fixes the null priv issue and improves resource
> release in ethoc_probe() and ethoc_remove().
> 
> Reported-by: Dan Carpenter <error27@gmail.com>
> Signed-off-by: Thomas Chou <thomas@wytron.com.tw>

Applied.


* Re: [net-next-2.6 PATCH 2/2] enic: Use random mac addr when associating port-profile
From: David Miller @ 2010-05-24  6:11 UTC (permalink / raw)
  To: scofeldm; +Cc: netdev, chrisw
In-Reply-To: <20100523032957.20200.59700.stgit@savbu-pc100.cisco.com>

From: Scott Feldman <scofeldm@cisco.com>
Date: Sat, 22 May 2010 20:29:58 -0700

> From: Scott Feldman <scofeldm@cisco.com>
> 
> Use random mac addr for interface when associating port-profile to
> dynamic enic device, in the case no mac addr was previously assigned.
> 
> Signed-off-by: Scott Feldman <scofeldm@cisco.com>

Applied.


* Re: [PATCH] net_sched: Fix qdisc_notify()
From: David Miller @ 2010-05-24  6:11 UTC (permalink / raw)
  To: eric.dumazet; +Cc: hadi, kaber, blp, netdev
In-Reply-To: <1274596664.5020.40.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 23 May 2010 08:37:44 +0200

> [PATCH] net_sched: Fix qdisc_notify()
> 
> Ben Pfaff reported a kernel oops and provided a test program to
> reproduce it.
> 
> https://kerneltrap.org/mailarchive/linux-netdev/2010/5/21/6277805
> 
> tc_fill_qdisc() should not be called for builtin qdisc, or it
> dereferences a NULL pointer to get device ifindex.
> 
> Fix is to always use tc_qdisc_dump_ignore() before calling
> tc_fill_qdisc().
> 
> Reported-by: Ben Pfaff <blp@nicira.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied and queued up for -stable.


* Re: [PATCH] ieee802154: Fix possible NULL pointer dereference in wpan_phy_alloc
From: David Miller @ 2010-05-24  6:12 UTC (permalink / raw)
  To: dkirjanov; +Cc: dbaryshkov, netdev
In-Reply-To: <20100523154545.GA22578@hera.kernel.org>

From: Denis Kirjanov <dkirjanov@hera.kernel.org>
Date: Sun, 23 May 2010 15:45:45 +0000

> Check for NULL pointer after kzalloc
> 
> Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>

Applied.


* Re: [PATCH] rtnetlink: Fix error handling in do_setlink()
From: David Miller @ 2010-05-24  6:12 UTC (permalink / raw)
  To: chrisw; +Cc: dhowells, netdev
In-Reply-To: <20100522065212.GW8301@sequoia.sous-sol.org>

From: Chris Wright <chrisw@sous-sol.org>
Date: Fri, 21 May 2010 23:52:12 -0700

> * David Howells (dhowells@redhat.com) wrote:
>> Commit c02db8c6290bb992442fec1407643c94cc414375:
>> 
>> 	Author:  Chris Wright <chrisw@sous-sol.org>
>> 	Date:    Sun May 16 01:05:45 2010 -0700
>> 	Subject: rtnetlink: make SR-IOV VF interface symmetric
>> 
>> adds broken error handling to do_setlink() in net/core/rtnetlink.c.  The
>> problem is the following chunk of code:
>> 
>> 	if (tb[IFLA_VFINFO_LIST]) {
>> 		struct nlattr *attr;
>> 		int rem;
>> 		nla_for_each_nested(attr, tb[IFLA_VFINFO_LIST], rem) {
>> 			if (nla_type(attr) != IFLA_VF_INFO)
>>   ---->				goto errout;
>> 			err = do_setvfinfo(dev, attr);
>> 			if (err < 0)
>> 				goto errout;
>> 			modified = 1;
>> 		}
>> 	}
>> 
>> which can get to errout without setting err, resulting in the following error:
>> 
>> net/core/rtnetlink.c: In function 'do_setlink':
>> net/core/rtnetlink.c:904: warning: 'err' may be used uninitialized in this function
>> 
>> Change the code to return -EINVAL in this case.  Note that this might not be
>> the appropriate error though.
>> 
>> Signed-off-by: David Howells <dhowells@redhat.com>
> 
> Acked-by: Chris Wright <chrisw@sous-sol.org>

Applied, thanks.


* Re: [PATCH] net-caif: drop redundant Kconfig entries
From: David Miller @ 2010-05-24  6:12 UTC (permalink / raw)
  To: vapier; +Cc: netdev, sjur.brandeland
In-Reply-To: <1274474720-9865-1-git-send-email-vapier@gentoo.org>

From: Mike Frysinger <vapier@gentoo.org>
Date: Fri, 21 May 2010 16:45:20 -0400

> There is already a submenu entry that is always displayed, so there is
> no need to also show a dedicated CAIF comment.
> 
> Drop dead commented code while we're here, and change the submenu text
> to better match the style everyone else is using.
> 
> Signed-off-by: Mike Frysinger <vapier@gentoo.org>

Applied.


* Re: [PATCH 1/3] ethoc: Write bus addresses to registers
From: David Miller @ 2010-05-24  6:17 UTC (permalink / raw)
  To: jonas; +Cc: netdev
In-Reply-To: <1274161166-18521-2-git-send-email-jonas@southpole.se>

From: Jonas Bonn <jonas@southpole.se>
Date: Tue, 18 May 2010 07:39:24 +0200

> The ethoc driver should be writing bus addresses to the ethoc registers, not
> virtual addresses.
> 
> Also, use bus_to_virt instead of phys_to_virt to make this explicit.

Portable drivers must not use bus_to_virt().

You should keep track of the virtual addresses of the mapped RX
packets in your RX ring datastructure.


* Re: [PATCH] net: Fix definition of netif_vdbg() when VERBOSE_DEBUG is not defined
From: David Miller @ 2010-05-24  6:21 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, linux-net-drivers, joe
In-Reply-To: <1274201792.2113.3.camel@achroite.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Tue, 18 May 2010 17:56:32 +0100

> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

Applied, thanks Ben.


* Re: [PATCH] net-2.6 : V2 - fix dev_get_valid_name
From: David Miller @ 2010-05-24  6:25 UTC (permalink / raw)
  To: opurdila; +Cc: daniel.lezcano, netdev
In-Reply-To: <201005211625.06234.opurdila@ixiacom.com>

From: Octavian Purdila <opurdila@ixiacom.com>
Date: Fri, 21 May 2010 16:25:06 +0300

> On Friday 21 May 2010 16:10:13 you wrote:
>> On 05/19/2010 10:12 PM, Daniel Lezcano wrote:
>> > Signed-off-by: Daniel Lezcano<daniel.lezcano@free.fr>
 ...
> Reviewed-by: Octavian Purdila <opurdila@ixiacom.com>

Applied, thanks everyone.


* Re: fec: add support for PHY interface platform data
From: David Miller @ 2010-05-24  6:32 UTC (permalink / raw)
  To: baruch; +Cc: netdev

This patch no longer applies cleanly at all.

Pretty much in every portion of the FEC driver your patch touches, the
code looks totally different in the current tree.

You'll need to respin this against Linus's current tree if you still
want me to apply it.

Thanks.


* Re: [PATCH 0/2] fixes to arp_notify for virtual machine migration use case
From: David Miller @ 2010-05-24  6:37 UTC (permalink / raw)
  To: Ian.Campbell; +Cc: netdev, shemminger, jeremy.fitzhardinge, stable
In-Reply-To: <1273671554.7572.11190.camel@zakaz.uk.xensource.com>

From: Ian Campbell <Ian.Campbell@citrix.com>
Date: Wed, 12 May 2010 14:39:14 +0100

> Ian Campbell (2):
>       arp_notify: generate broadcast ARP reply not request.
>       arp_notify: generate arp_notify event on NETDEV_CHANGE too

I don't agree with these changes.

For the first one, I think the documentation is just wrong and the
code is what expresses the intent.  The idea is not to spam the
world with a broadcast, only interested parties.

Patch #2 I have major issues with: carriers flapping occasionally is
very common.  I have several interfaces which do this even on lightly
loaded networks.  If we decide to do something like this (big "if")
it would need to be rate limited so that it doesn't trigger due to
normal flapping.

If you want your VM networking devices to trigger this event, maybe
the best thing to do is to create a special notification, which
would let us avoid doing this ARP notify for spurious physical
network device carrier flaps.
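The rate limiting mentioned above could look roughly like the following user-space sketch. It is an assumption about one possible policy (at most one notification per interval), not anything proposed in the patches; notify_limiter and notify_allowed are hypothetical names, and the kernel's own rate-limit helpers are deliberately not used here.

```c
#include <stdint.h>

/* Hypothetical limiter: allow at most one notification per interval. */
struct notify_limiter {
    uint64_t interval_ms;  /* minimum spacing between notifications */
    uint64_t last_ms;      /* time of the last allowed notification */
    int      primed;       /* 0 until the first notification is sent */
};

/* Return 1 if a notification may be sent at time now_ms, else 0.
 * A carrier flap inside the interval is simply ignored. */
static int notify_allowed(struct notify_limiter *l, uint64_t now_ms)
{
    if (l->primed && now_ms - l->last_ms < l->interval_ms)
        return 0;
    l->primed = 1;
    l->last_ms = now_ms;
    return 1;
}
```

With a 1000 ms interval, flaps at 200 ms and 999 ms after an allowed notification are suppressed, and the next one at 1000 ms goes through.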


* Re: [PATCH net-2.6 1/6] caif: Bugfix - wait_ev*_timeout returns long.
From: David Miller @ 2010-05-24  6:40 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: netdev, sjurbr, linus.walleij, marcel
In-Reply-To: <1274444172-31969-1-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Fri, 21 May 2010 14:16:07 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Discovered bug when testing on 64bit architecture.
> Fixed by using long to store result from wait_event_interruptible_timeout.
> 
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>

Applied.
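The patch description does not spell out the failure, so the following is only a sketch of the general hazard: on 64-bit targets a long can hold values an int cannot, so a long return value stored in an int may be mangled. The helper names fits_in_int and classify_wait_result are hypothetical.

```c
#include <limits.h>

/* Return 1 if a long value (for example a remaining timeout) survives
 * being stored in an int, 0 if it would not. */
static int fits_in_int(long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Hypothetical caller that keeps the result in a long and classifies it
 * the way wait_event_interruptible_timeout() results are usually read:
 * negative = interrupted, 0 = timed out, positive = time remaining. */
static int classify_wait_result(long ret)
{
    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;
    return 1;
}
```

Where long is wider than int, fits_in_int(LONG_MAX) is 0, which is why the result needs to stay in a long until it has been checked.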


* Re: [PATCH net-2.6 2/6] caif: Bugfix - use standard Linux lists
From: David Miller @ 2010-05-24  6:40 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: netdev, sjurbr, linus.walleij, marcel
In-Reply-To: <1274444172-31969-2-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Fri, 21 May 2010 14:16:08 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Discovered bug when running high number of parallel connect requests.
> Replace buggy home brewed list with linux/list.h.
> 
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>

Applied.


* Re: [PATCH net-2.6 3/6] caif: Bugfix - handle mem-allocation failures
From: David Miller @ 2010-05-24  6:40 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: netdev, sjurbr, linus.walleij, marcel
In-Reply-To: <1274444172-31969-3-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Fri, 21 May 2010 14:16:09 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Discovered bugs when injecting slab allocation failures.
> Add checks on all memory allocation.
> 
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>

Applied.
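As a rough user-space illustration of the pattern described in the quoted patch (check every allocation and unwind on failure), and not the CAIF code itself: struct channel, channel_init and failing_alloc are hypothetical, and the always-failing allocator stands in for slab fault injection.

```c
#include <errno.h>
#include <stdlib.h>

/* Allocator signature; must be malloc-compatible because free() is used
 * to release what it returns. */
typedef void *(*alloc_fn)(size_t);

struct channel {
    char *rx_buf;
    char *tx_buf;
};

/* Allocator that always fails, standing in for fault injection. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Check each allocation and undo earlier ones on failure. */
static int channel_init(struct channel *ch, size_t len, alloc_fn alloc)
{
    ch->rx_buf = alloc(len);
    if (!ch->rx_buf)
        return -ENOMEM;

    ch->tx_buf = alloc(len);
    if (!ch->tx_buf) {
        free(ch->rx_buf);
        ch->rx_buf = NULL;
        return -ENOMEM;
    }
    return 0;
}
```

Calling channel_init() with malloc succeeds under normal conditions, while passing failing_alloc exercises the error path without crashing or leaking.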


* Re: [PATCH net-2.6 4/6] caif: Bugfix - Poll can't return POLLHUP while connecting.
From: David Miller @ 2010-05-24  6:40 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: netdev, sjurbr, linus.walleij, marcel
In-Reply-To: <1274444172-31969-4-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Fri, 21 May 2010 14:16:10 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Discovered bug when testing async connect.
> While connecting poll should not return POLLHUP,
> but POLLOUT when connected. 
> Also fixed the sysfs flow-control-counters.
> 
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>

Applied.
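The poll behaviour described in the quoted patch can be sketched as follows. This is an assumption-laden illustration, not the CAIF implementation: the state enum and poll_mask() are hypothetical, and only the standard POLLIN, POLLOUT and POLLHUP bits from <poll.h> are used.

```c
#include <poll.h>

/* Hypothetical connection states; the real CAIF socket states differ. */
enum conn_state { ST_CONNECTING, ST_CONNECTED, ST_DISCONNECTED };

/* Compute the event mask a poll implementation might report. */
static short poll_mask(enum conn_state st, int rx_pending, int tx_allowed)
{
    short mask = 0;

    if (st == ST_DISCONNECTED)
        mask |= POLLHUP;   /* hang-up only once the link is really down */
    if (rx_pending)
        mask |= POLLIN;    /* data is queued for reading */
    if (st == ST_CONNECTED && tx_allowed)
        mask |= POLLOUT;   /* writable only after connect has completed */
    return mask;
}
```

A caller doing an asynchronous connect would then see no events while connecting and POLLOUT once the connection is established.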


* Re: [PATCH net-2.6 5/6] caif: Bugfix - missing spin_unlock
From: David Miller @ 2010-05-24  6:40 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: netdev, sjurbr, linus.walleij, marcel
In-Reply-To: <1274444172-31969-5-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Fri, 21 May 2010 14:16:11 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Splint found missing spin_unlock.
> Corrected this and some other trivial splint warnings.
> 
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>

Applied.


* Re: [PATCH net-2.6 6/6] caif: Bugfix - use MSG_TRUNC in receive
From: David Miller @ 2010-05-24  6:40 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: netdev, sjurbr, linus.walleij, marcel
In-Reply-To: <1274444172-31969-6-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Fri, 21 May 2010 14:16:12 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Fixed handling when the skb doesn't fit in the user buffer:
> instead of returning -EMSGSIZE, the buffer is truncated (just
> as unix seqpacket does).
> 
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>

Applied.
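A minimal user-space sketch of the truncating behaviour described in the quoted patch, assuming a packet held in a flat buffer rather than an skb. recv_truncating() is a hypothetical helper; MSG_TRUNC is the standard flag from <sys/socket.h>.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Copy one packet into a user buffer, truncating instead of failing.
 * Returns the number of bytes copied and sets MSG_TRUNC in *msg_flags
 * when the packet was longer than the buffer. */
static ssize_t recv_truncating(const void *pkt, size_t pkt_len,
                               void *buf, size_t buf_len, int *msg_flags)
{
    size_t copy = pkt_len;

    if (copy > buf_len) {
        copy = buf_len;
        *msg_flags |= MSG_TRUNC;  /* tell the caller data was dropped */
    }
    memcpy(buf, pkt, copy);
    return (ssize_t)copy;
}
```

The caller gets the leading bytes of the packet and can test the flag to learn that the rest was discarded, instead of receiving an error and no data.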
