Netdev List

* Re: [PATCH v2] fail dentry revalidation after namespace change
From: Andrew Morton @ 2012-07-10  2:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Glauber Costa, linux-kernel, netdev, Greg Thelen, Serge Hallyn,
	Tejun Heo, Greg Kroah-Hartman
In-Reply-To: <87629wxu1i.fsf@xmission.com>

On Mon, 09 Jul 2012 18:51:37 -0700 ebiederm@xmission.com (Eric W. Biederman) wrote:

> Andrew Morton <akpm@linux-foundation.org> writes:
> 
> > On Mon, 09 Jul 2012 17:30:48 -0700 ebiederm@xmission.com (Eric W. Biederman) wrote:
> >
> >> Andrew Morton <akpm@linux-foundation.org> writes:
> >> 
> >> >>  {
> >> >>  	struct sysfs_dirent *sd;
> >> >>  	int is_dir;
> >> >> +	int type;
> >> >>  
> >> >>  	if (nd->flags & LOOKUP_RCU)
> >> >>  		return -ECHILD;
> >> >> @@ -326,6 +327,13 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> >> >>  	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
> >> >>  		goto out_bad;
> >> >>  
> >> >> +	/* The sysfs dirent has been moved to a different namespace */
> >> >> +	type = KOBJ_NS_TYPE_NONE;
> >> >> +	if (sd->s_parent)
> >> >> +		type = sysfs_ns_type(sd->s_parent);
> >> >> +	if (type && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
> >> >
> >> > eww, the code is assuming that KOBJ_NS_TYPE_NONE has a value of zero. 
> >> > Don't do that; it smells bad.
> >> 
> >> Gag.  An incomplete change in idiom.
> >> 
> >> KOBJ_NS_TYPE_NONE is explicitly defined as 0 so that it can be used
> >> this way, and every where else in fs/sysfs/dir.c uses this idiom.
> >
> > One man's idiom is another man's idiocy.
> 
> And code that uses inconsistent idioms is even harder to read.

Not true.  That patch is more readable when it is changed to use
correct types.  If only because readers don't need to go in and check
that KOBJ_NS_TYPE_NONE has value zero.

> > Seriously.  What sort of idea is that?  Create an enumerated type and
> > then just ignore it?
> 
> It isn't ignored.  It just has a well defined NULL value. That is hardly
> controversial.

If it's uncontroversial, why are we talking about it?  Why did I, an
experienced C and kernel developer, think that it looked stupid and
possibly buggy?

I'm uncomfortable with propagating this idiotic and unnecessary trick
any further.  It's better to fix it.

> >> Pray tell in what parallel universe is that monstrosity above more
> >> readable than the line it replaces?
> >
> > Don't be silly, it is not a "monstrosity".  The code it is modifying
> > contains an unneeded test-and-branch.  It's a test and branch which the
> > compiler might be able to avoid.  If we can demonstrate that the
> > compiler does indeed optimise it, or if we can find a less monstrous
> > way of implementing it then fine.  Otherwise, efficiency wins.
> 
> Efficiency wins?  In a rarely used function?  Which kernel are you
> working on?

One in which we frequently optimise uncommon code paths.

> Readable maintainable code wins.  Unreadable code causes regressions.

Dude, the whole reason for having enums and enumerated types is for
readability and maintainability.  If we didn't care about that, we'd
use literal constants everywhere.  And here you are arguing against
that readability and maintainability.

If you want to say "yes, the sysfs code is bad but I can't be bothered
fixing it all" then grumble, but OK.  But for heaven's sake, don't go
and *defend* what that code is doing.
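
For readers following the disagreement, the two styles can be put side by side in a small user-space sketch. Everything here is a simplified, hypothetical stand-in (`enum ns_type`, `struct dirent_info`, `struct sb_info`, `ns_changed`); none of it is the real sysfs code. It only shows the explicit comparison against the "none" enumerator, which stays correct whatever numeric value that enumerator has, in place of the `if (type && ...)` truthiness test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's namespace type enum. */
enum ns_type {
	NS_TYPE_NONE = 0,
	NS_TYPE_NET,
	NS_TYPES
};

/* Hypothetical, heavily simplified stand-in for a sysfs dirent. */
struct dirent_info {
	enum ns_type parent_type;	/* NS_TYPE_NONE when the parent is untagged */
	const void *ns;			/* namespace tag this entry belongs to */
};

/* Hypothetical stand-in for the per-superblock namespace table. */
struct sb_info {
	const void *ns[NS_TYPES];	/* one namespace tag per type */
};

/* True when the entry is tagged and its tag no longer matches the mount. */
static bool ns_changed(const struct sb_info *sb, const struct dirent_info *sd)
{
	enum ns_type type = sd->parent_type;

	assert(type < NS_TYPES);
	/* Explicit comparison: no reliance on NS_TYPE_NONE being zero. */
	if (type != NS_TYPE_NONE && sb->ns[type] != sd->ns)
		return true;
	return false;
}
```

With the explicit form, a reader does not have to look up the enum definition to convince themselves the test is right, which is the readability point being argued above.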


* RE: [PATCH 0/4] Add a driver for the ASIX AX88172A
From: ASIX Allan Email [office] @ 2012-07-10  2:20 UTC (permalink / raw)
  To: 'Mark Lord', 'Grant Grundler'
  Cc: 'Christian Riesch', netdev, 'Oliver Neukum',
	'Eric Dumazet', 'Ming Lei',
	'Michael Riesch'
In-Reply-To: <4FFB5AC6.3000506@teksavvy.com>

Dear All,

From the ASIX support viewpoint, it might be hard to support all AX88172A target applications with the Linux kernel's native ax88172a.c driver, because some AX88172A applications are embedded in customers' special target designs, such as AX88172A (PHY mode or Dual-PHY mode) + external MAC controller on-board designs. For these kinds of applications, the AX88172A Linux driver was qualified directly at our customers' sites, which means ASIX does not have those customers' AX88172A devices available for testing.

But for some AX88172A target applications, such as USB dongles built from AX88172A + external Fiber PHY or AX88172A + external National DP83640 PHY (which Christian is testing), it is possible to support them with the Linux kernel's native ax88172a.c driver, but *** the readme of this driver source might need to indicate clearly which kinds of AX88172A devices have been verified with this native ax88172a.c driver ***. This will keep users from being confused in the future about why their AX88172A devices do not work properly with the Linux kernel's native driver.

Please let us know if you need more information. Thanks a lot. 

---
Best regards,
Allan Chou
Technical Support Division
ASIX Electronics Corporation
TEL: 886-3-5799500 ext.228
FAX: 886-3-5799558
E-mail: allan@asix.com.tw 
http://www.asix.com.tw/ 

-----Original Message-----
From: Mark Lord [mailto:kernel@teksavvy.com] 
Sent: Tuesday, July 10, 2012 6:27 AM
To: Grant Grundler
Cc: Christian Riesch; netdev@vger.kernel.org; Oliver Neukum; Eric Dumazet; Allan Chou; Ming Lei; Michael Riesch
Subject: Re: [PATCH 0/4] Add a driver for the ASIX AX88172A

On 12-07-09 01:45 PM, Grant Grundler wrote:
> Christian,
> Here's my $0.02 response to your questions.
> 
> On Fri, Jul 6, 2012 at 4:33 AM, Christian Riesch
> <christian.riesch@omicron.at> wrote:
> ...
>> I have a few questions:
>>
>> 1) Is it ok to factor out the common code like I did? Or should
>>    it go into a separate kernel module?
> 
> I think it's ok. I'd rather not see additional kernel modules unless
> the driver is substantially different.

I'll second that.  Ideally, somebody should pick up the pieces
from my aborted efforts last fall, and just get the real ASIX driver
itself tidied and into the kernel.  Then *everything* would work.

But I doubt that would be feasible at this point.

Cheers


* [PATCH] net: cgroup: fix access the unallocated memory in netprio cgroup
From: Gao feng @ 2012-07-10  2:31 UTC (permalink / raw)
  To: eric.dumazet, nhorman
  Cc: davem, linux-kernel, netdev, lizefan, tj, Gao feng, Eric Dumazet
In-Reply-To: <1341837625.3265.2748.camel@edumazet-glaptop>

There are some out-of-bounds accesses in the netprio cgroup.
When creating a new netprio cgroup, we only set a prioidx for
the new cgroup, without allocating memory for dev->priomap.

Because we don't want additional bounds checks in the fast
path, I think the best way is to allocate the memory when
creating a new netprio cgroup.

And because a netdev can be created or registered after the cgroup
is created, extend_netdev_table is also needed in write_priomap.

This patch adds a return value to update_netdev_tables & extend_netdev_table,
so when allocating new_priomap fails, write_priomap stops accessing
the priomap and returns -ENOMEM to userspace to tell the user
what happened.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <edumazet@google.com>
---
 net/core/netprio_cgroup.c |   43 +++++++++++++++++++++++++++++--------------
 1 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index aa907ed..3554f28 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -65,7 +65,7 @@ static void put_prioidx(u32 idx)
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
 }
 
-static void extend_netdev_table(struct net_device *dev, u32 new_len)
+static int extend_netdev_table(struct net_device *dev, u32 new_len)
 {
 	size_t new_size = sizeof(struct netprio_map) +
 			   ((sizeof(u32) * new_len));
@@ -77,7 +77,7 @@ static void extend_netdev_table(struct net_device *dev, u32 new_len)
 
 	if (!new_priomap) {
 		pr_warn("Unable to alloc new priomap!\n");
-		return;
+		return -ENOMEM;
 	}
 
 	for (i = 0;
@@ -90,10 +90,12 @@ static void extend_netdev_table(struct net_device *dev, u32 new_len)
 	rcu_assign_pointer(dev->priomap, new_priomap);
 	if (old_priomap)
 		kfree_rcu(old_priomap, rcu);
+	return 0;
 }
 
-static void update_netdev_tables(void)
+static int update_netdev_tables(void)
 {
+	int ret = 0;
 	struct net_device *dev;
 	u32 max_len = atomic_read(&max_prioidx) + 1;
 	struct netprio_map *map;
@@ -102,34 +104,44 @@ static void update_netdev_tables(void)
 	for_each_netdev(&init_net, dev) {
 		map = rtnl_dereference(dev->priomap);
 		if ((!map) ||
-		    (map->priomap_len < max_len))
-			extend_netdev_table(dev, max_len);
+		    (map->priomap_len < max_len)) {
+			ret = extend_netdev_table(dev, max_len);
+			if (ret < 0)
+				break;
+		}
 	}
 	rtnl_unlock();
+	return ret;
 }
 
 static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
 {
 	struct cgroup_netprio_state *cs;
-	int ret;
+	int ret = -EINVAL;
 
 	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
 	if (!cs)
 		return ERR_PTR(-ENOMEM);
 
-	if (cgrp->parent && cgrp_netprio_state(cgrp->parent)->prioidx) {
-		kfree(cs);
-		return ERR_PTR(-EINVAL);
-	}
+	if (cgrp->parent && cgrp_netprio_state(cgrp->parent)->prioidx)
+		goto out;
 
 	ret = get_prioidx(&cs->prioidx);
-	if (ret != 0) {
+	if (ret < 0) {
 		pr_warn("No space in priority index array\n");
-		kfree(cs);
-		return ERR_PTR(ret);
+		goto out;
+	}
+
+	ret = update_netdev_tables();
+	if (ret < 0) {
+		put_prioidx(cs->prioidx);
+		goto out;
 	}
 
 	return &cs->css;
+out:
+	kfree(cs);
+	return ERR_PTR(ret);
 }
 
 static void cgrp_destroy(struct cgroup *cgrp)
@@ -221,7 +233,10 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 	if (!dev)
 		goto out_free_devname;
 
-	update_netdev_tables();
+	ret = update_netdev_tables();
+	if (ret < 0)
+		goto out_free_devname;
+
 	ret = 0;
 	rcu_read_lock();
 	map = rcu_dereference(dev->priomap);
-- 
1.7.7.6
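
The allocate-up-front approach described in the commit message can be illustrated outside the kernel. The following is a minimal user-space sketch under stated assumptions: `struct prio_map`, `struct device`, `extend_table` and `update_tables` are hypothetical stand-ins, plain `calloc`/`free` replace the RCU-protected allocation, and there is no locking. It only shows the shape of the change: the table is grown for every device at creation time and an allocation failure is propagated as -ENOMEM instead of being dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct netprio_map. */
struct prio_map {
	size_t len;
	unsigned int prio[];	/* one priority slot per cgroup index */
};

/* Hypothetical stand-in for a net_device carrying a priomap pointer. */
struct device {
	struct prio_map *map;
};

/* Grow one device's table to new_len slots, keeping the old values. */
static int extend_table(struct device *dev, size_t new_len)
{
	struct prio_map *old = dev->map;
	size_t old_len = old ? old->len : 0;
	struct prio_map *new_map;

	assert(new_len > old_len);
	new_map = calloc(1, sizeof(*new_map) + new_len * sizeof(new_map->prio[0]));
	if (!new_map)
		return -ENOMEM;	/* reported to the caller, not swallowed */
	if (old)
		memcpy(new_map->prio, old->prio, old_len * sizeof(old->prio[0]));
	new_map->len = new_len;
	dev->map = new_map;
	free(old);
	return 0;
}

/* Make sure every device has at least max_len slots; stop at the first error. */
static int update_tables(struct device *devs, size_t ndevs, size_t max_len)
{
	for (size_t i = 0; i < ndevs; i++) {
		if (!devs[i].map || devs[i].map->len < max_len) {
			int ret = extend_table(&devs[i], max_len);

			if (ret < 0)
				return ret;
		}
	}
	return 0;
}
```

Because every table is already large enough once creation succeeds, a reader on the fast path can index `prio[idx]` without an extra bounds check, which is the motivation given in the commit message.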


* Re: [PATCH] net: cgroup: fix out of bounds accesses
From: Gao feng @ 2012-07-10  2:33 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, nhorman, linux-kernel, netdev, lizefan, tj
In-Reply-To: <20120709.145125.1903343847210013668.davem@davemloft.net>

On 2012-07-10 05:51, David Miller wrote:
> From: Gao feng <gaofeng@cn.fujitsu.com>
> Date: Mon, 09 Jul 2012 16:15:29 +0800
> 
>> On 2012-07-09 15:45, Eric Dumazet wrote:
>>> From: Eric Dumazet <edumazet@google.com>
>>>
>>> dev->priomap is allocated by extend_netdev_table() called from
>>> update_netdev_tables().
>>> And this is only called if write_priomap() is called.
>>>
>>> But if write_priomap() is not called, it seems we can have out of bounds
>>> accesses in cgrp_destroy(), read_priomap() & skb_update_prio()
>>>
>>> With help from Gao Feng
>>>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> Cc: Neil Horman <nhorman@tuxdriver.com>
>>> Cc: Gao feng <gaofeng@cn.fujitsu.com>
>>> ---
>>> net/core/dev.c            |    8 ++++++--
>>> net/core/netprio_cgroup.c |    4 ++--
>>> 2 files changed, 8 insertions(+), 4 deletions(-)
>>
>> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
> 
> Applied.
> 

Hi David

Please see my patch in this thread, I think it's a better way to fix this bug.

Thanks.


* Re: [PATCH] net: cgroup: fix out of bounds accesses
From: David Miller @ 2012-07-10  2:37 UTC (permalink / raw)
  To: gaofeng; +Cc: eric.dumazet, nhorman, linux-kernel, netdev, lizefan, tj
In-Reply-To: <4FFB9473.4040203@cn.fujitsu.com>

From: Gao feng <gaofeng@cn.fujitsu.com>
Date: Tue, 10 Jul 2012 10:33:23 +0800

> Please see my patch in this thread, I think it's a better way to fix this bug.

You'll need to work that out with Eric, fwiw I think his patch was
clean and just fine and it's staying in my tree.


* linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2012-07-10  3:08 UTC (permalink / raw)
  To: David Miller, netdev; +Cc: linux-next, linux-kernel, "Bjørn Mork"


Hi all,

After merging the net-next tree, today's linux-next build (x86_64
allmodconfig) failed like this:

drivers/net/usb/qmi_wwan.c:381:13: error: 'qmi_wwan_unbind_shared' undeclared here (not in a function)

Caused by a bad automatic merge between commit 6fecd35d4cd7 ("net:
qmi_wwan: add ZTE MF60") from the net tree and commit 230718bda1be ("net:
qmi_wwan: bind to both control and data interface") from the net-next
tree.

I added the following merge fix patch:

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Tue, 10 Jul 2012 13:06:01 +1000
Subject: [PATCH] net: fix for qmi_wwan_unbind_shared changes

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 drivers/net/usb/qmi_wwan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 06cfcc7..85c983d 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -378,7 +378,7 @@ static const struct driver_info qmi_wwan_force_int2 = {
 	.description	= "Qualcomm WWAN/QMI device",
 	.flags		= FLAG_WWAN,
 	.bind		= qmi_wwan_bind_shared,
-	.unbind		= qmi_wwan_unbind_shared,
+	.unbind		= qmi_wwan_unbind,
 	.manage_power	= qmi_wwan_manage_power,
 	.data		= BIT(2), /* interface whitelist bitmap */
 };
-- 
1.7.10.280.gaa39

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au



* [RFC PATCH v2 net-next] Re: [RFC PATCH] ppp: add support for L2 multihop / tunnel switching
From: Benjamin LaHaise @ 2012-07-10  3:27 UTC (permalink / raw)
  To: James Chapman; +Cc: netdev, linux-ppp
In-Reply-To: <20120709141511.GL19462@kvack.org>

Hello all,

Here is v2 of the PPP multihop patch.  This version adds a notifier hook to 
make sure that the multihop reference is dropped when the multihop target 
gets unregistered, ensuring that the references are properly dropped without 
leaking the devices.

		-ben

Not-yet-signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
 drivers/net/ppp/ppp_generic.c |  119 ++++++++++++++++++++++++++++++++++++++++--
 include/linux/if_ether.h      |    1 
 include/linux/ppp-ioctl.h     |    1 
 3 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 5c05572..6dc7eff 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -121,6 +121,7 @@ struct ppp {
 	unsigned long	last_xmit;	/* jiffies when last pkt sent 9c */
 	unsigned long	last_recv;	/* jiffies when last pkt rcvd a0 */
 	struct net_device *dev;		/* network interface device a4 */
+	struct net_device *multihop_if;	/* if to forward incoming frames to */
 	int		closing;	/* is device closing down? a8 */
 #ifdef CONFIG_PPP_MULTILINK
 	int		nxchan;		/* next channel to send something on */
@@ -272,6 +273,7 @@ static void unit_put(struct idr *p, int n);
 static void *unit_find(struct idr *p, int n);
 
 static struct class *ppp_class;
+static const struct net_device_ops ppp_netdev_ops;
 
 /* per net-namespace data */
 static inline struct ppp_net *ppp_pernet(struct net *net)
@@ -380,8 +382,9 @@ static int ppp_release(struct inode *unused, struct file *file)
 		file->private_data = NULL;
 		if (pf->kind == INTERFACE) {
 			ppp = PF_TO_PPP(pf);
-			if (file == ppp->owner)
+			if (file == ppp->owner) {
 				ppp_shutdown_interface(ppp);
+			}
 		}
 		if (atomic_dec_and_test(&pf->refcnt)) {
 			switch (pf->kind) {
@@ -553,6 +556,41 @@ static int get_filter(void __user *arg, struct sock_filter **p)
 }
 #endif /* CONFIG_PPP_FILTER */
 
+static int ppp_multihop_event(struct notifier_block *this, unsigned long event,
+			      void *ptr)
+{
+	struct net_device *event_dev = (struct net_device *)ptr;
+	struct net_device *master = event_dev->master;
+	struct ppp *ppp;
+
+	if (event_dev->netdev_ops != &ppp_netdev_ops)
+		return NOTIFY_DONE;
+	if (!master || (master->netdev_ops != &ppp_netdev_ops))
+		return NOTIFY_DONE;
+
+	ppp = netdev_priv(master);
+
+	switch (event) {
+	case NETDEV_UNREGISTER:
+		ppp_lock(ppp);
+		BUG_ON(ppp->multihop_if != event_dev);
+		ppp->multihop_if = NULL;
+		netdev_set_master(event_dev, NULL);
+		ppp_unlock(ppp);
+		dev_put(event_dev);
+		break;
+
+	default:
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block ppp_multihop_notifier = {
+	.notifier_call = ppp_multihop_event,
+};
+
 static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct ppp_file *pf = file->private_data;
@@ -738,6 +776,46 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		err = 0;
 		break;
 
+	case PPPIOCSMULTIHOP_IF:
+	{
+		struct net_device *multihop_if;
+		if (get_user(val, p))
+			break;
+		rtnl_lock();
+		ppp_lock(ppp);
+		err = 0;
+		multihop_if = ppp->multihop_if;
+		if (multihop_if && (val == -1)) {
+			ppp->multihop_if = NULL;
+			BUG_ON(multihop_if->master != ppp->dev);
+			netdev_set_master(multihop_if, NULL);
+			goto out_multihop;
+		}
+		err = -EBUSY;
+		multihop_if = NULL;
+		if (ppp->multihop_if)
+			goto out_multihop;
+		multihop_if = dev_get_by_index(&init_net, val);
+		err = -ENOENT;
+		if (!multihop_if)
+			goto out_multihop;
+		err = -EINVAL;
+		if (multihop_if->netdev_ops != &ppp_netdev_ops)
+			goto out_multihop;
+		err = netdev_set_master(multihop_if, ppp->dev);
+		if (err)
+			goto out_multihop;
+		ppp->multihop_if = multihop_if;
+		multihop_if = NULL;
+		err = 0;
+out_multihop:
+		ppp_unlock(ppp);
+		rtnl_unlock();
+		if (multihop_if)
+			dev_put(multihop_if);
+		break;
+	}
+
 #ifdef CONFIG_PPP_FILTER
 	case PPPIOCSPASS:
 	{
@@ -901,6 +979,7 @@ static int __init ppp_init(void)
 
 	pr_info("PPP generic driver version " PPP_VERSION "\n");
 
+	register_netdevice_notifier(&ppp_multihop_notifier);
 	err = register_pernet_device(&ppp_net_ops);
 	if (err) {
 		pr_err("failed to register PPP pernet device (%d)\n", err);
@@ -942,6 +1021,9 @@ ppp_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	int npi, proto;
 	unsigned char *pp;
 
+	if (skb->protocol == htons(ETH_P_PPP))
+		goto queue;
+
 	npi = ethertype_to_npindex(ntohs(skb->protocol));
 	if (npi < 0)
 		goto outf;
@@ -968,6 +1050,7 @@ ppp_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	proto = npindex_to_proto[npi];
 	put_unaligned_be16(proto, pp);
 
+queue:
 	skb_queue_tail(&ppp->file.xq, skb);
 	ppp_xmit_process(ppp);
 	return NETDEV_TX_OK;
@@ -1131,6 +1214,9 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
 	int len;
 	unsigned char *cp;
 
+	if (skb->protocol == htons(ETH_P_PPP))
+		goto xmit;
+
 	if (proto < 0x8000) {
 #ifdef CONFIG_PPP_FILTER
 		/* check if we should pass this packet */
@@ -1228,6 +1314,7 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
 		return;
 	}
 
+xmit:
 	ppp->xmit_pending = skb;
 	ppp_push(ppp);
 	return;
@@ -1259,7 +1346,8 @@ ppp_push(struct ppp *ppp)
 		return;
 	}
 
-	if ((ppp->flags & SC_MULTILINK) == 0) {
+	if (((ppp->flags & SC_MULTILINK) == 0) ||
+	    (skb->protocol == htons(ETH_P_PPP))) {
 		/* not doing multilink: send it down the first channel */
 		list = list->next;
 		pch = list_entry(list, struct channel, clist);
@@ -1599,6 +1687,14 @@ ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
 		goto done;
 	}
 
+	if (pch->ppp && pch->ppp->multihop_if) {
+		skb->protocol = htons(ETH_P_PPP);
+		skb->dev = pch->ppp->multihop_if;
+		skb->ip_summed = CHECKSUM_NONE;
+		dev_queue_xmit(skb);
+		goto done;
+	}
+
 	proto = PPP_PROTO(skb);
 	if (!pch->ppp || proto >= 0xc000 || proto == PPP_CCPFRAG) {
 		/* put it on the channel queue */
@@ -2709,18 +2805,28 @@ static void ppp_shutdown_interface(struct ppp *ppp)
 {
 	struct ppp_net *pn;
 
+	rtnl_lock();
 	pn = ppp_pernet(ppp->ppp_net);
 	mutex_lock(&pn->all_ppp_mutex);
 
 	/* This will call dev_close() for us. */
 	ppp_lock(ppp);
 	if (!ppp->closing) {
+		struct net_device *multihop_if = ppp->multihop_if;
 		ppp->closing = 1;
+		ppp->multihop_if = NULL;
 		ppp_unlock(ppp);
+		if (multihop_if)
+			netdev_set_master(multihop_if, NULL);
+		rtnl_unlock();
 		unregister_netdev(ppp->dev);
 		unit_put(&pn->units_idr, ppp->file.index);
-	} else
+		if (multihop_if)
+			dev_put(multihop_if);
+	} else {
 		ppp_unlock(ppp);
+		rtnl_unlock();
+	}
 
 	ppp->file.dead = 1;
 	ppp->owner = NULL;
@@ -2764,6 +2870,12 @@ static void ppp_destroy_interface(struct ppp *ppp)
 #endif /* CONFIG_PPP_FILTER */
 
 	kfree_skb(ppp->xmit_pending);
+	if (ppp->multihop_if) {
+		struct net_device *multihop_if = ppp->multihop_if;
+		ppp->multihop_if = NULL;
+		netdev_set_master(multihop_if, NULL);
+		dev_put(multihop_if);
+	}
 
 	free_netdev(ppp->dev);
 }
@@ -2901,6 +3013,7 @@ static void __exit ppp_cleanup(void)
 	device_destroy(ppp_class, MKDEV(PPP_MAJOR, 0));
 	class_destroy(ppp_class);
 	unregister_pernet_device(&ppp_net_ops);
+	unregister_netdevice_notifier(&ppp_multihop_notifier);
 }
 
 /*
diff --git a/include/linux/if_ether.h b/include/linux/if_ether.h
index 167ce5b..fe47a70 100644
--- a/include/linux/if_ether.h
+++ b/include/linux/if_ether.h
@@ -120,6 +120,7 @@
 #define ETH_P_PHONET	0x00F5		/* Nokia Phonet frames          */
 #define ETH_P_IEEE802154 0x00F6		/* IEEE802.15.4 frame		*/
 #define ETH_P_CAIF	0x00F7		/* ST-Ericsson CAIF protocol	*/
+#define ETH_P_PPP	0x00F8		/* Dummy type for PPP multihop	*/
 
 /*
  *	This is an Ethernet frame header.
diff --git a/include/linux/ppp-ioctl.h b/include/linux/ppp-ioctl.h
index 2d9a885..5571375 100644
--- a/include/linux/ppp-ioctl.h
+++ b/include/linux/ppp-ioctl.h
@@ -81,6 +81,7 @@ struct pppol2tp_ioc_stats {
  * Ioctl definitions.
  */
 
+#define	PPPIOCSMULTIHOP_IF	_IOWR('t', 91, int) /* set multihop if */
 #define	PPPIOCGFLAGS	_IOR('t', 90, int)	/* get configuration flags */
 #define	PPPIOCSFLAGS	_IOW('t', 89, int)	/* set configuration flags */
 #define	PPPIOCGASYNCMAP	_IOR('t', 88, int)	/* get async map */


* Re: [PATCH] net: cgroup: fix access the unallocated memory in netprio cgroup
From: Eric Dumazet @ 2012-07-10  4:14 UTC (permalink / raw)
  To: Gao feng; +Cc: nhorman, davem, linux-kernel, netdev, lizefan, tj, Eric Dumazet
In-Reply-To: <1341887508-20302-1-git-send-email-gaofeng@cn.fujitsu.com>

On Tue, 2012-07-10 at 10:31 +0800, Gao feng wrote:
> There are some out-of-bounds accesses in the netprio cgroup.
> When creating a new netprio cgroup, we only set a prioidx for
> the new cgroup, without allocating memory for dev->priomap.
> 
> Because we don't want additional bounds checks in the fast
> path, I think the best way is to allocate the memory when
> creating a new netprio cgroup.
> 
> And because a netdev can be created or registered after the cgroup
> is created, extend_netdev_table is also needed in write_priomap.
> 
> This patch adds a return value to update_netdev_tables & extend_netdev_table,
> so when allocating new_priomap fails, write_priomap stops accessing
> the priomap and returns -ENOMEM to userspace to tell the user
> what happened.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> Cc: Eric Dumazet <edumazet@google.com>
> ---

>  static void cgrp_destroy(struct cgroup *cgrp)
> @@ -221,7 +233,10 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
>  	if (!dev)
>  		goto out_free_devname;
>  
> -	update_netdev_tables();
> +	ret = update_netdev_tables();
> +	if (ret < 0)
> +		goto out_free_devname;
> +
>  	ret = 0;
>  	rcu_read_lock();
>  	map = rcu_dereference(dev->priomap);

Hi Gao

Is it still needed to call update_netdev_tables() from write_priomap() ?


* Re: [PATCH net-next 6/6] r8169: support RTL8168G
From: Hayes Wang @ 2012-07-10  5:36 UTC (permalink / raw)
  To: romieu; +Cc: netdev, linux-kernel, wfg, Hayes Wang
In-Reply-To: <c558386b836ee97762e12495101c6e373f20e69d.1341872752.git.romieu@fr.zoreil.com>

Fix incorrect argument in rtl_hw_init_8168g.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/ethernet/realtek/r8169.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 7ff3423..c29c5fb 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6753,14 +6753,14 @@ static void __devinit rtl_hw_init_8168g(struct rtl8169_private *tp)
 	msleep(1);
 	RTL_W8(MCU, RTL_R8(MCU) & ~NOW_IS_OOB);
 
-	data = r8168_mac_ocp_read(ioaddr, 0xe8de);
+	data = r8168_mac_ocp_read(tp, 0xe8de);
 	data &= ~(1 << 14);
 	r8168_mac_ocp_write(tp, 0xe8de, data);
 
 	if (!rtl_udelay_loop_wait_high(tp, &rtl_link_list_ready_cond, 100, 42))
 		return;
 
-	data = r8168_mac_ocp_read(ioaddr, 0xe8de);
+	data = r8168_mac_ocp_read(tp, 0xe8de);
 	data |= (1 << 15);
 	r8168_mac_ocp_write(tp, 0xe8de, data);
 
-- 
1.7.10.4


* [RFC] skbtrace: A trace infrastructure for networking subsystem
From: Li Yu @ 2012-07-10  6:07 UTC (permalink / raw)
  To: Linux Netdev List

Hi,

  This RFC introduces a tracing infrastructure for the networking
subsystem, together with a workable prototype.

  I noticed that blktrace helps file system and block subsystem
developers a lot; it can even help them find problems in the mm
subsystem. However, the "networkers" have no such luck. Although
tcpdump is very useful, they often still have to start an
investigation from the limited exported statistics counters, then dig
directly into the source code to guess at possible solutions, then
test their ideas, and, if luck does not arrive, start another
investigation-guess-test loop. This is difficult and time-consuming,
and it makes it hard to share experience or report problems. Many
users do not understand the protocol stack internals well enough; I
have seen "detailed reports" that still carry no information useful
for solving the problem.

  Unfortunately, the networking subsystem is rather performance
sensitive in the kernel, so we cannot add very detailed counters
directly. In fact, some folks have already tried to add more
statistics counters for detailed performance measurement, e.g. RFC4898
and its implementation, the Web10G project. Web10G is a great project
for researchers and engineers working on the TCP stack; it exports
per-connection details to userland through a procfs or netlink
interface. However, it depends tightly on TCP and its implementation,
so other protocol implementations need duplicated work to achieve the
same goal, and it has measurable overhead (5% - 10% in my simple
netperf TCP_STREAM benchmark). I think such a powerful tracing or
instrumentation feature should be able to be turned off at runtime,
with zero overhead when it is off.

  So why don't we write a blktrace-like utility for our sweet
networking subsystem? This is it: "skbtrace". I hope it can:

1. Provide an extensible tracing infrastructure that supports various
protocols instead of one specific protocol.

2. Be enabled or disabled at runtime, with zero overhead when it
is off. I think jump-label-optimized trace points are a good choice
for implementing this.

3. Provide tracing details at the per-connection/per-skb level. Please note
that skbtrace is not only for sk_buff tracing; it can also track
socket events. This also means we need some form of filtering,
otherwise we will get lost in tons of uninteresting trace data. I think
BPF is a good choice, but we need to extend BPF so that it
can handle data structures other than skb.

   The above is my basic idea; below are details of the current prototype
implementation.

   Like blktrace, skbtrace is based on the tracepoints
infrastructure and the relay file system. However, I have not implemented
any tracers like blktrace's, since I want to keep the kernel side as simple
(and, I hope, as fast) as possible. Basically, the trace points are just
optimized conditional statements; the slow path copies the
traced data to the ring buffer in the relay file system. The parameters of
this relay file system can be tuned through some exported files in the
skbtrace directory.

  There are three trace data files (channels) in the relay file system for
each CPU. They represent the ring buffers above, which save kernel trace
data for the different contexts:

  (1) trace.hardirq.cpuN, which saves trace data that comes from hardirq
context.
  (2) trace.softirq.cpuN, which saves trace data that comes from softirq
context.
  (3) trace.syscall.cpuN, which saves trace data that comes from process
context.

  Each trace record is written into one of the above channels, depending on
the context in which the trace point is called. Each trace record is
represented by a skbtrace_block struct; extended fields for specific
protocols can be appended at the end of it. For a global ordering of trace
data, this patch has a 64-bit atomic variable that generates a sequence
number for each trace record, so the userland utility is able to sort
out-of-order trace data across different channels and/or CPUs.
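
  The global ordering idea can be sketched in user space as follows. This is only an illustration under assumptions: `struct trace_rec`, `next_seq` and `sort_records` are hypothetical names, the record carries just a sequence number and a channel id rather than the real skbtrace_block layout, and C11 atomics stand in for the kernel's 64-bit atomic type.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, minimal trace record: sequence number plus source channel. */
struct trace_rec {
	uint64_t seq;
	int channel;
};

/* One global counter shared by every context and CPU. */
static _Atomic uint64_t trace_seq;

/* Producer side: hand out the next sequence number (1, 2, 3, ...). */
static uint64_t next_seq(void)
{
	return atomic_fetch_add(&trace_seq, 1) + 1;
}

/* qsort comparator ordering records by ascending sequence number. */
static int cmp_seq(const void *a, const void *b)
{
	const struct trace_rec *ra = a;
	const struct trace_rec *rb = b;

	return (ra->seq > rb->seq) - (ra->seq < rb->seq);
}

/* Consumer side: restore global order after concatenating all channels. */
static void sort_records(struct trace_rec *recs, size_t n)
{
	assert(recs != NULL || n == 0);
	if (n > 1)
		qsort(recs, n, sizeof(*recs), cmp_seq);
}
```

  A userland tool in this style would read every per-CPU channel file, append the records to one array, and call `sort_records()` once before printing.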

  For the trace filter feature, I selected BPF as the core engine. So far, it
can only filter sk_buff-based traces; I plan to extend BPF to
support other data structures. In fact, I once wrote a custom filter
implementation for TCP/IPv4, but that approach requires refactoring each
specific protocol implementation, which I did not like, so I discarded it.

  So far, I have implemented some skbtrace trace points:

  (1) skb_rps_info.

         I have seen some buggy drivers (or firmware?)
         that always set skb->rx_hash to zero. It also seems that RPS
         hashing does not work well in some corner cases.

  (2) tcp_connection and icsk_connection.

	To track basic TCP state transitions, e.g. TCP_LISTEN.

  (3) tcp_sendlimit.

        Personally, I am interested in the reasons why tcp_write_xmit()
        exits.

  (4) tcp_congestion.

       Oops, it is cwnd killer, isn't it?

  The userland utilities:

  (1) skbtrace, which records raw trace data to regular disk files.
  (2) skbparse, which parses raw trace data into human-readable strings.
                This still needs a lot of work; for now it is just a rough
		(but workable) demo for TCP/IPv4.

  You can get source code at github:

	https://github.com/Rover-Yu/skbtrace-userland
	https://github.com/Rover-Yu/skbtrace-kernel

  The source code of skbtrace-kernel is based on the net-next tree.

  Suggestions are welcome.

  Thanks.

Yu


* [net] net: Fix memory leak - vlan_info struct
From: Jeff Kirsher @ 2012-07-10  6:47 UTC (permalink / raw)
  To: davem; +Cc: Amir Hanania, netdev, gospo, sassmann, Jeff Kirsher

From: Amir Hanania <amir.hanania@intel.com>

In the driver reload test there is a memory leak.
The vlan_info structure was not freed when the driver was removed.
It was not released because nr_vids is still one after the last vlan was removed.
nr_vids is one because vlan zero is added to the interface when the interface
is set up, but vlan zero is not deleted at unregister.
Fix: delete vlan zero when we unregister the device.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 net/8021q/vlan.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 6089f0c..9096bcb 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -403,6 +403,9 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
 		break;
 
 	case NETDEV_DOWN:
+		if (dev->features & NETIF_F_HW_VLAN_FILTER)
+			vlan_vid_del(dev, 0);
+
 		/* Put all VLANs for this dev in the down state too.  */
 		for (i = 0; i < VLAN_N_VID; i++) {
 			vlandev = vlan_group_get_device(grp, i);
-- 
1.7.10.4
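
The counting problem described in the commit message can be reproduced with a toy model. This is a sketch under assumptions, not the 8021q code: `struct vid_info`, `struct dev_sketch`, `vid_add` and `vid_del` are hypothetical, and the model counts registered VIDs without tracking which ones they are. It shows why an implicitly added VID 0 keeps the per-device structure alive until it is deleted as well.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-device VLAN bookkeeping structure. */
struct vid_info {
	unsigned int nr_vids;	/* number of VIDs currently registered */
};

/* Hypothetical stand-in for a network device. */
struct dev_sketch {
	struct vid_info *info;	/* allocated on first add, freed on last del */
};

/* Register one VID; allocates the bookkeeping structure on first use. */
static int vid_add(struct dev_sketch *dev)
{
	if (!dev->info) {
		dev->info = calloc(1, sizeof(*dev->info));
		if (!dev->info)
			return -ENOMEM;
	}
	dev->info->nr_vids++;
	return 0;
}

/* Unregister one VID; frees the structure only when the count reaches zero. */
static void vid_del(struct dev_sketch *dev)
{
	if (!dev->info)
		return;
	assert(dev->info->nr_vids > 0);
	if (--dev->info->nr_vids == 0) {
		free(dev->info);
		dev->info = NULL;
	}
}
```

In this model, one `vid_add()` for the implicit VID 0 plus one for a user VLAN needs two matching `vid_del()` calls; omitting the one for VID 0 leaves `info` allocated, which corresponds to the leak the patch addresses.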


* Re: [PATCH net-next 6/6] r8169: support RTL8168G
From: Francois Romieu @ 2012-07-10  6:50 UTC (permalink / raw)
  To: Hayes Wang; +Cc: David S. Miller, netdev, linux-kernel, wfg
In-Reply-To: <1341898590-1253-1-git-send-email-hayeswang@realtek.com>

Hayes Wang <hayeswang@realtek.com> :
> fix incorrect argument in rtl_hw_init_8168g.
> 
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>

Thanks Hayes.

It's available with proper attribution and subject at:

git://violet.fr.zoreil.com/romieu/linux davem-next.r8169

-- 
Ueimor

^ permalink raw reply

* [v2 PATCH] ksz884x: fix Endian
From: roy.qing.li @ 2012-07-10  6:56 UTC (permalink / raw)
  To: netdev; +Cc: Tristram.Ha, bhutchings, joe

From: Li RongQing <roy.qing.li@gmail.com>

ETH_P_IPV6 is host endian and skb->protocol is big endian, so when
comparing them, using htons on skb->protocol is wrong; htons belongs
on the constant.

Also fix two code style issues: indentation and unnecessary
parentheses.

CC: Tristram Ha <Tristram.Ha@micrel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Joe Perches <joe@perches.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 drivers/net/ethernet/micrel/ksz884x.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/micrel/ksz884x.c b/drivers/net/ethernet/micrel/ksz884x.c
index eaf9ff0..0fbe2e2 100644
--- a/drivers/net/ethernet/micrel/ksz884x.c
+++ b/drivers/net/ethernet/micrel/ksz884x.c
@@ -4881,8 +4881,8 @@ static netdev_tx_t netdev_tx(struct sk_buff *skb, struct net_device *dev)
 	left = hw_alloc_pkt(hw, skb->len, num);
 	if (left) {
 		if (left < num ||
-				((CHECKSUM_PARTIAL == skb->ip_summed) &&
-				(ETH_P_IPV6 == htons(skb->protocol)))) {
+		    (CHECKSUM_PARTIAL == skb->ip_summed &&
+		     skb->protocol == htons(ETH_P_IPV6))) {
 			struct sk_buff *org_skb = skb;
 
 			skb = netdev_alloc_skb(dev, org_skb->len);
-- 
1.7.1
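
The byte-order rule behind this fix can be sketched in userspace C. The
snippet assumes the POSIX htons() from <arpa/inet.h>; struct pkt_model and
pkt_is_ipv6() are hypothetical names used only for illustration, and the
EtherType is redefined locally with the standard IPv6 value 0x86DD. The
field stays in network order and the conversion is applied to the constant.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ETH_P_IPV6 0x86DD	/* same value as the kernel's ETH_P_IPV6 */

/* Hypothetical stand-in for the one sk_buff field that matters here. */
struct pkt_model {
	uint16_t protocol;	/* stored in network (big-endian) byte order */
};

/* Compare in network order: convert the host-order constant, not the field. */
bool pkt_is_ipv6(const struct pkt_model *pkt)
{
	return pkt->protocol == htons(MODEL_ETH_P_IPV6);
}
```

On any given machine htons() and ntohs() perform the same byte swap (or
none at all), so the old and new expressions should evaluate to the same
result. The rewrite mainly keeps the types honest, so that annotation
checkers such as sparse do not complain.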

^ permalink raw reply related

* Re: [RFC PATCH] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Massimo Cetra @ 2012-07-10  6:58 UTC (permalink / raw)
  To: Lin Ming
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Julian Anastasov
In-Reply-To: <CAF1ivSZBMWYc5iKxhX5d_ykkMD4LauFP9M10dBwfmqvpYj=pHg@mail.gmail.com>

On 09/07/2012 14:00, Lin Ming wrote:

>> i spent a couple of days trying to figure out how to reproduce but you were
>> quicker and smarter than me.
>
> Could you also test it ? :-)
>

Of course.

I have already installed a 3.5-rc and a 3.2.22 with this patch and, so
far, I see no problems.

I'm only waiting a couple of days before reporting, to be sure the issue 
is gone.

Massimo

^ permalink raw reply

* Re: [PATCH net-next 6/6] r8169: support RTL8168G
From: Hayes Wang @ 2012-07-10  7:12 UTC (permalink / raw)
  To: romieu; +Cc: netdev, linux-kernel, Hayes Wang
In-Reply-To: <1341898590-1253-1-git-send-email-hayeswang@realtek.com>

1. Remove rtl_ocpdr_cond. No waiting is needed for mac_ocp_{write / read}.
2. Set ocp_base to OCP_STD_PHY_BASE after rtl8168g_1_hw_phy_config.
---
 drivers/net/ethernet/realtek/r8169.c |   14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index c29c5fb..7269175 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1043,13 +1043,6 @@ static void rtl_w1w0_phy_ocp(struct rtl8169_private *tp, int reg, int p, int m)
 	r8168_phy_ocp_write(tp, reg, (val | p) & ~m);
 }
 
-DECLARE_RTL_COND(rtl_ocpdr_cond)
-{
-	void __iomem *ioaddr = tp->mmio_addr;
-
-	return RTL_R32(OCPDR) & OCPAR_FLAG;
-}
-
 static void r8168_mac_ocp_write(struct rtl8169_private *tp, u32 reg, u32 data)
 {
 	void __iomem *ioaddr = tp->mmio_addr;
@@ -1058,8 +1051,6 @@ static void r8168_mac_ocp_write(struct rtl8169_private *tp, u32 reg, u32 data)
 		return;
 
 	RTL_W32(OCPDR, OCPAR_FLAG | (reg << 15) | data);
-
-	rtl_udelay_loop_wait_low(tp, &rtl_ocpdr_cond, 25, 10);
 }
 
 static u16 r8168_mac_ocp_read(struct rtl8169_private *tp, u32 reg)
@@ -1071,8 +1062,7 @@ static u16 r8168_mac_ocp_read(struct rtl8169_private *tp, u32 reg)
 
 	RTL_W32(OCPDR, reg << 15);
 
-	return rtl_udelay_loop_wait_high(tp, &rtl_ocpdr_cond, 25, 10) ?
-		RTL_R32(OCPDR) : ~0;
+	return RTL_R32(OCPDR);
 }
 
 #define OCP_STD_PHY_BASE	0xa400
@@ -3417,6 +3407,8 @@ static void rtl8168g_1_hw_phy_config(struct rtl8169_private *tp)
 	rtl_w1w0_phy_ocp(tp, 0xa438, 0x8000, 0x0000);
 
 	rtl_w1w0_phy_ocp(tp, 0xc422, 0x4000, 0x2000);
+
+	rtl_writephy(tp, 0x1f, 0x0000);
 }
 
 static void rtl8102e_hw_phy_config(struct rtl8169_private *tp)
-- 
1.7.10.4

^ permalink raw reply related

* Re: TCP transmit performance regression
From: Ming Lei @ 2012-07-10  7:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Network Development, David Miller
In-Reply-To: <1341895143.3265.4049.camel@edumazet-glaptop>

On Tue, Jul 10, 2012 at 12:39 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Please dont send private messages for discussing general linux stuff.
>
> Next time I wont reply.
>
> On Tue, 2012-07-10 at 12:00 +0800, Ming Lei wrote:
>> On Mon, Jul 9, 2012 at 9:54 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Mon, 2012-07-09 at 21:23 +0800, Ming Lei wrote:
>> >
>> >> Looks the patch replaces skb_clone with netdev_alloc_skb_ip_align and
>> >> introduces extra copies on incoming data, so would you mind explaining
>> >> it in a bit detail? And why is skb_clone not OK for the purpose?
>> >
>> > Problem with cloning is that some paths will have to make a private copy
>> > of the skb.
>>
>> Looks you convert some private copy into all copy in rx path, :-)
>
> For small speed device, a copy is probably unnoticed.

The copy still has some effect on low-speed devices; for example, your recent
patch on the asix driver improved tx performance from ~75M to ~92M.

>
> rtl8169 does that (copybreak) for security issues on Gbps link speed,
> and I get Gbps link speed on an old AMD host with no problem.
>
> As you discovered, the slowdown comes from SLAB debug on the 30K huge
> skb. To recover from this we must patch usbnet to not constantly
> allocate/free such big RX skb but recycle them. Once we do that, you'll
> find out that copybreak improves general performance on low ram devices
> by an order of magnitude.

It looks like your copybreak patch doesn't improve tx performance on smsc95xx.

>> >
>> > So you dont see the cost here in the driver, but later in upper stacks.
>> >
>> > Since this driver defaults to a huge RX area of more than 16Kbytes,
>> > a copy to a much smaller skb (we call this 'copybreak' in our jargon )
>> > is more than welcome to avoid OOM problems anyway.
>>
>> Looks 'memory compaction' has been implemented already to address
>> the big buffer allocation problem.
>
> Usually its too late (not enough ram to perform the compaction), and
> a collapse having to compact 3MB is very expensive and blows cpu caches.
>
> I noticed that on machines with 1GB or 2GB ram. These machines are
> called ChromeBooks and every lost network frame is analyzed in Google.
> And we had problems because some wifi adapters use 8KB skbs for incoming
> frames.

Kernel stack size is 8KB or more, so did you also see process creation
failures on your ChromeBook machines at the same time?

> (Not even 32KB !!! This is just crazy !!)
>
> Relying on TCP collapsing is just very lazy. What about other
> protocols ?
>
> I guess that on beagle this can happen very fast.

Previously I only saw usbnet OOMs triggered by kmalloc(GFP_ATOMIC),
while kmalloc(GFP_KERNEL) could succeed.
Some time later, the problem disappeared.

>>
>> Also the allocated huge RX SKB buffer will be freed after all cloned buffers
>> are consumed, so I still don't know what is the real problem with cloned buffer.
>>
>
> IF they are consumed.
>
> But IF they arent because application is not fast enough to drain, you
> end with sockets storing huge amount of data in their receive buffer.
>
> So a single 100 bytes payload holds the 32KB block.
>
> If you allowed your UDP socket to store 130.000 bytes of payload, you
> can consume 1.300 * 32KB = ~40 MB

It looks like that is one advantage of copybreak.

>
>
>> >
>> > TCP coalescing (skb_try_coalesce) for example wont work for cloned skbs,
>> > so TCP receive window will close pretty fast, and performance sucks in
>> > lossy environments (like the Internet)
>>
>> I didn't observe the above thing, so could you provide a way to reproduce it?
>>
>
> netstat -s can show you interesting TCP counters. But as driver lies on
> skb->truesize, you can also have unexpected crashes with malicious
> senders. With a 64 ratio, its easy to consume all ram.
>
> TCP coalescing is great as soon as you have Out Of Order queueing
> because of packet losses. You avoid expensive collapses and
> dropping/purge of OFO queue. Sender has to resend previously sent data.
>
>> Suppose the above is true, looks skb_clone is useless, isn't it?
>
> cloning has some uses, for example if you dont need to touch packet
> content, only mess with skb->data, skb->len, skb->tail.
>
> But if you need to change a single bit in the payload, or play with skb
> fragments (struct skb_shared_info), you have to make a full copy of the
> 30KB buffer, even if the skb contained only 10 bytes of payload.

So netdev_alloc_skb_ip_align() could be replaced with skb_clone()
in the asix driver, since no bits are touched in asix_rx_fixup? The default
MTU is 1500 and rx_urb_size is 2048.

If so, could we use copybreak only for the case of rx_urb_size > 4096?
And for ax88172, dev->rx_urb_size is always 2048, so it looks like the copy
is not needed at all.

> I would just switch off turbo mode by default, I doubt it has any
> advantage.

At least for smsc95xx, I don't think the feature is worth a 32K buffer.

>
> Coalescing up to 16K of incoming frames adds latency for no performance
> gain, once you do it the right way (that is without OOM risks).
> Currently, skb->truesize lie is very bad.
>



Thanks,
-- 
Ming Lei
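
The memory arithmetic quoted above (a 100-byte payload pinning a whole 32KB
RX buffer) can be written out as a small helper. This is a sketch for
illustration only: pinned_bytes() is a hypothetical function, and it ignores
any per-skb overhead beyond the RX buffer itself.

```c
#include <stddef.h>

/*
 * Bytes of RX buffer memory kept alive when every queued datagram holds
 * on to its whole RX buffer (the cloned-skb case discussed above).
 */
size_t pinned_bytes(size_t queued_payload, size_t payload_per_datagram,
		    size_t rx_buffer_size)
{
	size_t datagrams;

	if (payload_per_datagram == 0)
		return 0;

	datagrams = queued_payload / payload_per_datagram;
	return datagrams * rx_buffer_size;
}
```

With 130000 bytes of queued payload, 100-byte datagrams and 32KB buffers,
this gives 1300 * 32768 = 42598400 bytes, i.e. roughly 40 MB.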

^ permalink raw reply

* Re: linux-next: build failure after merge of the net-next tree
From: Bjørn Mork @ 2012-07-10  7:25 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: David Miller, netdev, linux-next, linux-kernel
In-Reply-To: <20120710130848.1014fbe05e5146a33a3c7d39@canb.auug.org.au>

Stephen Rothwell <sfr@canb.auug.org.au> writes:

> Hi all,
>
> After merging the net-next tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
>
> drivers/net/usb/qmi_wwan.c:381:13: error: 'qmi_wwan_unbind_shared' undeclared here (not in a function)
>
> Caused by a bad automatic merge between commit 6fecd35d4cd7 ("net:
> qmi_wwan: add ZTE MF60") from the net tree and commit 230718bda1be ("net:
> qmi_wwan: bind to both control and data interface") from the net-next
> tree.
>
> I added the following merge fix patch:
>
> From: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Tue, 10 Jul 2012 13:06:01 +1000
> Subject: [PATCH] net: fix for qmi_wwan_unbind_shared changes
>
> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
> ---
>  drivers/net/usb/qmi_wwan.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
> index 06cfcc7..85c983d 100644
> --- a/drivers/net/usb/qmi_wwan.c
> +++ b/drivers/net/usb/qmi_wwan.c
> @@ -378,7 +378,7 @@ static const struct driver_info qmi_wwan_force_int2 = {
>  	.description	= "Qualcomm WWAN/QMI device",
>  	.flags		= FLAG_WWAN,
>  	.bind		= qmi_wwan_bind_shared,
> -	.unbind		= qmi_wwan_unbind_shared,
> +	.unbind		= qmi_wwan_unbind,
>  	.manage_power	= qmi_wwan_manage_power,
>  	.data		= BIT(2), /* interface whitelist bitmap */
>  };


Looks good.  Thanks.


Bjørn

^ permalink raw reply

* net-next kernel NULL pointer dereference at fib_rules_tclass
From: Or Gerlitz @ 2012-07-10  7:16 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Shlomo Pongratz, Amir Vadai, Erez Shitrit

Hi Dave,

Using latest net-next (061a5c316b6526dbc729049a16243ec27937cc31) I
get the below crash during the boot cycle. The crash happens on a set of
nodes which use igb for their onboard 1G NIC, as soon as the device goes
up. Another group of nodes, in a second lab, uses bnx2 for the 1G NIC and
doesn't get this crash, but the kernel there is built with a different
.config.

Or.

Bringing up loopback interface:  [  OK  ]
Bringing up interface eth1:
Determining IP information for eth1...IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
Starting system logger: BUG: unable to handle kernel NULL pointer dereference at 00000000000000ac
IP: [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
PGD 223171067 PUD 22353e067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in:
 ipv6 dm_mirror dm_region_hash dm_log uinput igb ptp pps_core mlx4_ib ib_mad ib_core mlx4_en mlx4_core sg kvm_intel kvm microcode pcspkr rng_core ioatdma dca shpchp dm_mod button sr_mod ext3 jbd sd_mod usb_storage ata_piix libata scsi_mod ehci_hcd uhci_hcd floppy [last unloaded: scsi_wait_scan]

Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc5-12540-g061a5c3-dirty #94 Supermicro X7DWU/X7DWU
RIP: 0010:[<ffffffff81320393>]  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
RSP: 0018:ffff88022fc03a30  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88022fc03b54 RCX: 0000000000000050
RDX: 0000000000000020 RSI: 0000000000000001 RDI: ffff88022fc03a40
RBP: ffff88022fc03a30 R08: ffff88022fc03a70 R09: ffff88022fc03a40
R10: 0000000000000020 R11: ffff880225390a80 R12: 0000000000000001
R13: ffff88021cc7a000 R14: 0000000000000000 R15: ffff8802269c26c0
FS:  0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000ac CR3: 0000000222aeb000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task ffffffff81613410)
Stack:
 ffff88022fc03ac0 ffffffff81318956 ffff8802fd010010 ffff8802232d5a80
 ffff880222add880 ffff880223269a98 0000000000000020 ffff880200000000
 0000000100000000 ffff000000000000 12311eac2540eaf0 ffff88027e001eac
Call Trace:
 <IRQ>

 [<ffffffff81318956>] fib_validate_source+0x170/0x2a5
 [<ffffffff812e6603>] ip_route_input_common+0x6fe/0xd12
 [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
 [<ffffffff812e8461>] ip_rcv_finish+0x151/0x457
 [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
 [<ffffffff812e89a1>] ip_rcv+0x23a/0x260
 [<ffffffff812beae7>] __netif_receive_skb+0x3ac/0x415
 [<ffffffff812be86f>] ? __netif_receive_skb+0x134/0x415
 [<ffffffff81312ae5>] ? inet_gro_receive+0x81/0x23f
 [<ffffffff812b68da>] ? skb_free_head+0x47/0x49
 [<ffffffff812c035d>] netif_receive_skb+0xee/0xf7
 [<ffffffff812c071d>] ? dev_gro_receive+0x15f/0x2fb
 [<ffffffff812c063a>] ? dev_gro_receive+0x7c/0x2fb
 [<ffffffff81065644>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff812c044c>] napi_skb_finish+0x24/0x56
 [<ffffffff812c0bf0>] napi_gro_receive+0x10f/0x11e
 [<ffffffffa0216e85>] igb_poll+0x843/0xae5 [igb]
 [<ffffffff812c0e01>] ? net_rx_action+0x14c/0x1ee
 [<ffffffff812c0d76>] net_rx_action+0xc1/0x1ee
 [<ffffffff8102f746>] __do_softirq+0xff/0x1de
 [<ffffffff813631cc>] call_softirq+0x1c/0x26
 [<ffffffff81003090>] do_softirq+0x38/0x80
 [<ffffffff8102f41f>] irq_exit+0x4e/0x83
 [<ffffffff810028f9>] do_IRQ+0x98/0xaf
 [<ffffffff8135b52c>] common_interrupt+0x6c/0x6c
 <EOI>

 [<ffffffff810083ec>] ? mwait_idle+0x13c/0x208
 [<ffffffff810083e3>] ? mwait_idle+0x133/0x208
 [<ffffffff810088d1>] cpu_idle+0x6e/0xab
 [<ffffffff81343e13>] rest_init+0xc7/0xce
 [<ffffffff81343d4c>] ? csum_partial_copy_generic+0x16c/0x16c
 [<ffffffff8167fbf3>] start_kernel+0x332/0x33f
 [<ffffffff8167f6f6>] ? kernel_init+0x19d/0x19d
 [<ffffffff8167f2b4>] x86_64_start_reservations+0xb8/0xbd
 [<ffffffff8167f3a6>] x86_64_start_kernel+0xed/0xf4
Code: 81 31 c0 e8 a5 bb dd ff 48 83 c4 28 31 c0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 90 90 90 48 8b 57 20 55 31 c0 48 89 e5 48 85 d2 74 06 <8b> 82 8c 00 00 00 c9 c3 8b 47 7c 33 46 14 85 87 80 00 00 00 55
RIP  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
 RSP <ffff88022fc03a30>
CR2: 00000000000000ac
---[ end trace e7c6714b8de1c341 ]---
Kernel panic - not syncing: Fatal exception in interrupt
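
A note on reading the dump above, offered as a sketch rather than a
diagnosis. In the Code: bytes, the marked instruction <8b> 82 8c 00 00 00
decodes (assuming the usual x86-64 encoding) as a 32-bit load from RDX plus
displacement 0x8c, and it is preceded by a test/je on RDX (48 85 d2 74 06).
RDX is 0x20 in the register dump, so a NULL check on it passes and the load
then faults at a small non-NULL address. fault_address() below is a
hypothetical helper that just reproduces this arithmetic.

```c
#include <stdint.h>

/* Effective address of a "mov disp(%base), %reg" style load. */
uint64_t fault_address(uint64_t base, uint32_t disp)
{
	return base + disp;
}
```

fault_address(0x20, 0x8c) yields 0xac, which is the value reported in CR2.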

^ permalink raw reply

* net-next kernel NULL pointer dereference at fib_rules_tclass
From: Or Gerlitz @ 2012-07-10  7:29 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, Amir Vadai, Shlomo Pongratz, Erez Shitrit

Hi Dave,

Using latest net-next (061a5c316b6526dbc729049a16243ec27937cc31) I
get the below crash during the boot cycle. The crash happens on a set of
nodes which use igb for their onboard 1G NIC, as soon as the device goes
up. Another group of nodes, in a second lab, uses bnx2 for the 1G NIC and
doesn't get this crash, but the kernel there is built with a different
.config.

Or.


Bringing up loopback interface:  [  OK  ]
Bringing up interface eth1:
Determining IP information for eth1...IPv6: ADDRCONF(NETDEV_UP): eth1:
link is not ready
igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
Starting system logger: BUG: unable to handle kernel NULL pointer
dereference at 00000000000000ac
IP: [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
PGD 223171067 PUD 22353e067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in:
  ipv6 dm_mirror dm_region_hash dm_log uinput igb ptp pps_core mlx4_ib
ib_mad ib_core mlx4_en mlx4_core sg kvm_intel kvm microcode pcspkr
rng_core ioatdma dca shpchp dm_mod button sr_mod ext3 jbd sd_mod
usb_storage ata_piix libata scsi_mod ehci_hcd uhci_hcd floppy [last
unloaded: scsi_wait_scan]

Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc5-12540-g061a5c3-dirty #94
Supermicro X7DWU/X7DWU
RIP: 0010:[<ffffffff81320393>]  [<ffffffff81320393>]
fib_rules_tclass+0xf/0x17
RSP: 0018:ffff88022fc03a30  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88022fc03b54 RCX: 0000000000000050
RDX: 0000000000000020 RSI: 0000000000000001 RDI: ffff88022fc03a40
RBP: ffff88022fc03a30 R08: ffff88022fc03a70 R09: ffff88022fc03a40
R10: 0000000000000020 R11: ffff880225390a80 R12: 0000000000000001
R13: ffff88021cc7a000 R14: 0000000000000000 R15: ffff8802269c26c0
FS:  0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000ac CR3: 0000000222aeb000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task
ffffffff81613410)
Stack:
  ffff88022fc03ac0 ffffffff81318956 ffff8802fd010010 ffff8802232d5a80
  ffff880222add880 ffff880223269a98 0000000000000020 ffff880200000000
  0000000100000000 ffff000000000000 12311eac2540eaf0 ffff88027e001eac
Call Trace:
  <IRQ>

  [<ffffffff81318956>] fib_validate_source+0x170/0x2a5
  [<ffffffff812e6603>] ip_route_input_common+0x6fe/0xd12
  [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
  [<ffffffff812e8461>] ip_rcv_finish+0x151/0x457
  [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
  [<ffffffff812e89a1>] ip_rcv+0x23a/0x260
  [<ffffffff812beae7>] __netif_receive_skb+0x3ac/0x415
  [<ffffffff812be86f>] ? __netif_receive_skb+0x134/0x415
  [<ffffffff81312ae5>] ? inet_gro_receive+0x81/0x23f
  [<ffffffff812b68da>] ? skb_free_head+0x47/0x49
  [<ffffffff812c035d>] netif_receive_skb+0xee/0xf7
[<ffffffff812c071d>] ? dev_gro_receive+0x15f/0x2fb
  [<ffffffff812c063a>] ? dev_gro_receive+0x7c/0x2fb
  [<ffffffff81065644>] ? trace_hardirqs_on+0xd/0xf
  [<ffffffff812c044c>] napi_skb_finish+0x24/0x56
  [<ffffffff812c0bf0>] napi_gro_receive+0x10f/0x11e
  [<ffffffffa0216e85>] igb_poll+0x843/0xae5 [igb]
  [<ffffffff812c0e01>] ? net_rx_action+0x14c/0x1ee
  [<ffffffff812c0d76>] net_rx_action+0xc1/0x1ee
  [<ffffffff8102f746>] __do_softirq+0xff/0x1de
  [<ffffffff813631cc>] call_softirq+0x1c/0x26
  [<ffffffff81003090>] do_softirq+0x38/0x80
  [<ffffffff8102f41f>] irq_exit+0x4e/0x83
  [<ffffffff810028f9>] do_IRQ+0x98/0xaf
  [<ffffffff8135b52c>] common_interrupt+0x6c/0x6c
  <EOI>

  [<ffffffff810083ec>] ? mwait_idle+0x13c/0x208
  [<ffffffff810083e3>] ? mwait_idle+0x133/0x208
  [<ffffffff810088d1>] cpu_idle+0x6e/0xab
  [<ffffffff81343e13>] rest_init+0xc7/0xce
  [<ffffffff81343d4c>] ? csum_partial_copy_generic+0x16c/0x16c
  [<ffffffff8167fbf3>] start_kernel+0x332/0x33f
  [<ffffffff8167f6f6>] ? kernel_init+0x19d/0x19d
  [<ffffffff8167f2b4>] x86_64_start_reservations+0xb8/0xbd
  [<ffffffff8167f3a6>] x86_64_start_kernel+0xed/0xf4
Code: 81 31 c0 e8 a5 bb dd ff 48 83 c4 28 31 c0 5b 41 5c 41 5d 41 5e 41
5f c9 c3 90 90 90 48 8b 57 20 55 31 c0 48 89 e5 48 85 d2 74 06 <8b> 82
8c 00 00 00 c9 c3 8b 47 7c 33 46 14 85 87 80 00 00 00 55
RIP  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
  RSP <ffff88022fc03a30>
CR2: 00000000000000ac
---[ end trace e7c6714b8de1c341 ]---
Kernel panic - not syncing: Fatal exception in interrupt

^ permalink raw reply

* Re: 82571EB: Detected Hardware Unit Hang
From: Joe Jin @ 2012-07-10  7:40 UTC (permalink / raw)
  To: Joe Jin; +Cc: e1000-devel, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <4FFA9B96.6040901@oracle.com>

While debugging the driver I found that, before the "Detected Hardware Unit
Hang" message, the driver is unable to clean and reclaim the resources:

1457         while ((eop_desc->upper.data & cpu_to_le32(E1000_TXD_STAT_DD)) &&  <== here upper.data is always 0x300
1458                (count < tx_ring->count)) {
     <--- snip --->
1487         }


I checked all of the driver code and did not find anywhere that sets
E1000_TXD_STAT_DD in upper.data, so I guess upper.data is set by the hardware?
If the OS is a 32-bit system, what would happen?

Thanks in advance,
Joe 

On 07/09/12 16:51, Joe Jin wrote:
> Hi list,
> 
> I'm seeing a Unit Hang even with the latest e1000e driver 2.0.0 when doing
> an scp test. This issue is easy to reproduce on SUN FIRE X2270 M2: just
> copying a big file (>500M) from another server hits it at once.
> 
> Would you please help on this?
> 
> device info:
> # lspci -s 05:00.0 
> 05:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
> 
> # lspci -s 05:00.0 -n
> 05:00.0 0200: 8086:10bc (rev 06)
> 
> # ethtool -i eth0
> driver: e1000e
> version: 2.0.0-NAPI
> firmware-version: 5.10-2
> bus-info: 0000:05:00.0
> 
> # ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
> udp fragmentation offload: off
> generic segmentation offload: on
> generic-receive-offload: on
> 
> kernel log:
> -----------
> e1000e 0000:05:00.0: eth0: Detected Hardware Unit Hang:
>   TDH                  <6c>
>   TDT                  <81>
>   next_to_use          <81>
>   next_to_clean        <6b>
> buffer_info[next_to_clean]:
>   time_stamp           <fffc7a23>
>   next_to_watch        <71>
>   jiffies              <fffc8c0c>
>   next_to_watch.status <0>
> MAC Status             <80387>
> PHY Status             <792d>
> PHY 1000BASE-T Status  <3c00>
> PHY Extended Status    <3000>
> PCI Status             <10>
> e1000e 0000:05:00.0: eth0: Detected Hardware Unit Hang:
>   TDH                  <6c>
>   TDT                  <81>
>   next_to_use          <81>
>   next_to_clean        <6b>
> buffer_info[next_to_clean]:
>   time_stamp           <fffc7a23>
>   next_to_watch        <71>
>   jiffies              <fffc9bac>
>   next_to_watch.status <0>
> MAC Status             <80387>
> PHY Status             <792d>
> PHY 1000BASE-T Status  <3c00>
> PHY Extended Status    <3000>
> PCI Status             <10>
> ------------[ cut here ]------------
> WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x225/0x230()
> Hardware name: SUN FIRE X2270 M2
> NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
> Modules linked in: autofs4 hidp rfcomm bluetooth rfkill lockd sunrpc cpufreq_ondemand acpi_cpufreq mperf be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi video sbs sbshc acpi_pad acpi_ipmi ipmi_msghandler parport_pc lp parport e1000e(U) snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device igb snd_pcm_oss serio_raw snd_mixer_oss snd_pcm tpm_infineon snd_timer snd soundcore snd_page_alloc i2c_i801 iTCO_wdt i2c_core pcspkr i7core_edac iTCO_vendor_support ioatdma ghes dca edac_core hed dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod usb_storage sd_mod crc_t10dif sg ahci libahci ext3 jbd mbcache [last unloaded: microcode]
> Pid: 0, comm: swapper Not tainted 2.6.39-200.24.1.el5uek #1
> Call Trace:
>  [<c07d9ac5>] ? dev_watchdog+0x225/0x230
>  [<c045ba61>] warn_slowpath_common+0x81/0xa0
>  [<c07d9ac5>] ? dev_watchdog+0x225/0x230
>  [<c045bb23>] warn_slowpath_fmt+0x33/0x40
>  [<c07d9ac5>] dev_watchdog+0x225/0x230
>  [<c07d98a0>] ? dev_activate+0xb0/0xb0
>  [<c0468e82>] call_timer_fn+0x32/0xf0
>  [<c04bceb0>] ? rcu_check_callbacks+0x80/0x80
>  [<c046a76d>] run_timer_softirq+0xed/0x1b0
>  [<c07d98a0>] ? dev_activate+0xb0/0xb0
>  [<c0461a81>] __do_softirq+0x91/0x1a0
>  [<c04619f0>] ? local_bh_enable+0x80/0x80
>  <IRQ>  [<c0462295>] ? irq_exit+0x95/0xa0
>  [<c087f8b8>] ? smp_apic_timer_interrupt+0x38/0x42
>  [<c08784f5>] ? apic_timer_interrupt+0x31/0x38
>  [<c046007b>] ? do_exit+0x11b/0x370
>  [<c065eae4>] ? intel_idle+0xa4/0x100
>  [<c078d9b9>] ? cpuidle_idle_call+0xb9/0x1e0
>  [<c0411d77>] ? cpu_idle+0x97/0xd0
>  [<c085cbbd>] ? rest_init+0x5d/0x70
>  [<c0b07a7a>] ? start_kernel+0x28a/0x340
>  [<c0b074b0>] ? obsolete_checksetup+0xb0/0xb0
>  [<c0b070a4>] ? i386_start_kernel+0x64/0xb0
> ---[ end trace 5502b55cd4d4e5cb ]---
> e1000e 0000:05:00.0: eth0: Reset adapter
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> 
> Thanks,
> Joe
> 


-- 
Oracle <http://www.oracle.com>
Joe Jin | Software Development Senior Manager | +8610.6106.5624
ORACLE | Linux and Virtualization
No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing 

^ permalink raw reply

* [v2 PATCH] qlge: fix endian issue
From: roy.qing.li @ 2012-07-10  8:02 UTC (permalink / raw)
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

Commit 6d29b1ef introduced a bug: ntohs is __be16_to_cpu,
not cpu_to_be16.

We should always use htons on IP_OFFSET and IP_MF before comparing
them with fields of a network packet.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
v2 : Change my name
 drivers/net/ethernet/qlogic/qlge/qlge_main.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
index 09d8d33..7c520fa 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c
+++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
@@ -1546,7 +1546,7 @@ static void ql_process_mac_rx_page(struct ql_adapter *qdev,
 			struct iphdr *iph =
 				(struct iphdr *) ((u8 *)addr + ETH_HLEN);
 			if (!(iph->frag_off &
-				cpu_to_be16(IP_MF|IP_OFFSET))) {
+				htons(IP_MF|IP_OFFSET))) {
 				skb->ip_summed = CHECKSUM_UNNECESSARY;
 				netif_printk(qdev, rx_status, KERN_DEBUG,
 					     qdev->ndev,
@@ -1654,7 +1654,7 @@ static void ql_process_mac_rx_skb(struct ql_adapter *qdev,
 			/* Unfragmented ipv4 UDP frame. */
 			struct iphdr *iph = (struct iphdr *) skb->data;
 			if (!(iph->frag_off &
-				ntohs(IP_MF|IP_OFFSET))) {
+				htons(IP_MF|IP_OFFSET))) {
 				skb->ip_summed = CHECKSUM_UNNECESSARY;
 				netif_printk(qdev, rx_status, KERN_DEBUG,
 					     qdev->ndev,
@@ -1968,7 +1968,7 @@ static void ql_process_mac_split_rx_intr(struct ql_adapter *qdev,
 		/* Unfragmented ipv4 UDP frame. */
 			struct iphdr *iph = (struct iphdr *) skb->data;
 			if (!(iph->frag_off &
-				ntohs(IP_MF|IP_OFFSET))) {
+				htons(IP_MF|IP_OFFSET))) {
 				skb->ip_summed = CHECKSUM_UNNECESSARY;
 				netif_printk(qdev, rx_status, KERN_DEBUG, qdev->ndev,
 					     "TCP checksum done!\n");
-- 
1.7.1
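
The fragment check touched by this patch can be sketched in userspace C.
The snippet assumes the POSIX htons() from <arpa/inet.h>; ipv4_is_fragment()
is a hypothetical helper, and the MODEL_* constants are local copies of the
host-order values the kernel uses for IP_MF and IP_OFFSET.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_IP_MF	0x2000	/* "more fragments" flag, host order */
#define MODEL_IP_OFFSET	0x1FFF	/* fragment offset mask, host order */

/*
 * frag_off_be is the field as it appears on the wire (network byte order),
 * so the host-order mask is converted with htons() before the bitwise AND.
 */
bool ipv4_is_fragment(uint16_t frag_off_be)
{
	return (frag_off_be & htons(MODEL_IP_MF | MODEL_IP_OFFSET)) != 0;
}
```

A header with only the "don't fragment" bit (0x4000) set is reported as
unfragmented, while one with MF set or a non-zero offset is reported as a
fragment.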

^ permalink raw reply related

* Re: TCP transmit performance regression
From: Eric Dumazet @ 2012-07-10  8:28 UTC (permalink / raw)
  To: Ming Lei; +Cc: Network Development, David Miller
In-Reply-To: <CACVXFVPgqtSN3BrEXRxSv4yxaxCni495SxZNXBmYQpagmxk2tQ@mail.gmail.com>

On Tue, 2012-07-10 at 15:22 +0800, Ming Lei wrote:

> Kernel stack size is 8KB or more, so did you also see process creation
> failures on your ChromeBook machines at the same time?

I believe you are mixing up a lot of things.

Have you ever heard of socket limits?

All available ram on a machine is not for whoever wants it, thank God.

No: the TCP stack was dropping frames because of socket limits.

Only because skbs were fat (8KB allocated/truesize, for a single 1500
bytes frame).

If the application is fast and reads skbs as soon as they arrive, no problem
is detected.

But if the application is slow, or a TCP packet is lost on the network,
many packets are queued into the ofo queue. Eventually not enough room is
available -> we drop incoming frames, and the sender has to retransmit them.

So instead of loading your web pages as fast as possible, you have to
wait for retransmits.

So you see nothing at all, no kernel logs, no failed memory attempts.

It is only slower than necessary.
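
The effect of "fat" skbs on a socket limit can be sketched with one
division. This is a simplified model under stated assumptions:
frames_before_drop() is a hypothetical helper, and real accounting also
charges per-skb overhead that is ignored here.

```c
#include <stddef.h>

/*
 * Number of frames a receive queue can hold before its memory limit is
 * hit, when each frame is charged its allocated size (truesize) rather
 * than its payload length.
 */
size_t frames_before_drop(size_t rcvbuf_limit, size_t truesize_per_frame)
{
	if (truesize_per_frame == 0)
		return 0;
	return rcvbuf_limit / truesize_per_frame;
}
```

For an illustrative limit of 256KB (an assumed figure, not taken from the
thread), frames charged 8KB each fill the queue after 32 frames, while
frames charged 2KB each leave room for 128.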

^ permalink raw reply

* Re: [RFC PATCH] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Lin Ming @ 2012-07-10  8:34 UTC (permalink / raw)
  To: Massimo Cetra
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Julian Anastasov
In-Reply-To: <4FFBD289.7050909@navynet.it>

On Tue, Jul 10, 2012 at 2:58 PM, Massimo Cetra <mcetra@navynet.it> wrote:
> On 09/07/2012 14:00, Lin Ming wrote:
>
>>> i spent a couple of days trying to figure out how to reproduce but you
>>> were
>>> quicker and smarter than me.
>>
>>
>> Could you also test it ? :-)
>>
>
> Of course.
>
> I have already installed a 3.5-rc and a 3.2.22 with this patch and, so far,
> I see no problems.
>
> I'm only waiting a couple of days before reporting, to be sure the issue is
> gone.

Then could you reply to below thread after you confirm the issue is gone?

http://marc.info/?l=linux-netdev&m=134165707424765&w=2

It would be nice to add your "Reported-and-tested-by:".

Thanks,
Lin Ming

>
>
> Massimo

^ permalink raw reply

* Re: [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Simon Horman @ 2012-07-10  8:41 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Lin Ming, Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller
In-Reply-To: <alpine.LFD.2.00.1207071322490.5927@ja.ssi.bg>

On Sat, Jul 07, 2012 at 01:27:49PM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Sat, 7 Jul 2012, Lin Ming wrote:
> 
> > On Sat, 2012-07-07 at 12:48 +0300, Julian Anastasov wrote:
> > > 
> > > 	Very good. Thanks for tracking and fixing this bug.
> > > Can you send a copy to Simon Horman <horms@verge.net.au>
> > > with correct Subject. As this change can go to stable
> > > kernels you can also improve the comments, for example:
> > > 
> > > ipvs: fix oops on NAT reply in br_nf context
> > > 
> > > 	IPVS should not reset skb->nf_bridge in FORWARD hook
> > > by calling nf_reset for NAT replies. It triggers oops in
> > > br_nf_forward_finish.
> > > 
> > > [here follows your corrected description including
> > > the stack trace]
> > 
> > How about below? Can I have your ACK?
> > I'll resend this patch in another mail.
> 
> 	Very good. You can add my
> 
> Signed-off-by: Julian Anastasov <ja@ssi.bg>

Thanks, I will queue this up in my ipvs tree and see
about getting it included in 3.5

It seems to me that this problem has been present since 2.6.37
and thus is stable material.

^ permalink raw reply

* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
From: Lin Ming @ 2012-07-10  8:42 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, netdev, Shlomo Pongratz, Amir Vadai, Erez Shitrit
In-Reply-To: <alpine.LRH.2.00.1207101008270.9760@ogerlitz.voltaire.com>

On Tue, Jul 10, 2012 at 3:16 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> Hi Dave,
>
> Using latest net-next (061a5c316b6526dbc729049a16243ec27937cc31) I
> get the below crash during the boot cycle. The crash happens on a set of
> nodes which use igb for their onboard 1g nic, as soon as the device goes
> up. Another group of nodes, in a 2nd lab, which use bnx2 for their 1g
> NIC, doesn't get this crash, but the kernel there is built with a different
> .config.

Hi,

I got a similar panic, but not at boot time.
I'll look for the cause.

Regards,
Lin Ming

>
> Or.
>
> Bringing up loopback interface:  [  OK  ]
> Bringing up interface eth1:
> Determining IP information for eth1...IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
> igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> Starting system logger: BUG: unable to handle kernel NULL pointer dereference at 00000000000000ac
> IP: [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
> PGD 223171067 PUD 22353e067 PMD 0
> Oops: 0000 [#1] SMP
> CPU 0
> Modules linked in:
>  ipv6 dm_mirror dm_region_hash dm_log uinput igb ptp pps_core mlx4_ib ib_mad ib_core mlx4_en mlx4_core sg kvm_intel kvm microcode pcspkr rng_core ioatdma dca shpchp dm_mod button sr_mod ext3 jbd sd_mod usb_storage ata_piix libata scsi_mod ehci_hcd uhci_hcd floppy [last unloaded: scsi_wait_scan]
>
> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc5-12540-g061a5c3-dirty #94 Supermicro X7DWU/X7DWU
> RIP: 0010:[<ffffffff81320393>]  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
> RSP: 0018:ffff88022fc03a30  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff88022fc03b54 RCX: 0000000000000050
> RDX: 0000000000000020 RSI: 0000000000000001 RDI: ffff88022fc03a40
> RBP: ffff88022fc03a30 R08: ffff88022fc03a70 R09: ffff88022fc03a40
> R10: 0000000000000020 R11: ffff880225390a80 R12: 0000000000000001
> R13: ffff88021cc7a000 R14: 0000000000000000 R15: ffff8802269c26c0
> FS:  0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000000ac CR3: 0000000222aeb000 CR4: 00000000000007f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task ffffffff81613410)
> Stack:
>  ffff88022fc03ac0 ffffffff81318956 ffff8802fd010010 ffff8802232d5a80
>  ffff880222add880 ffff880223269a98 0000000000000020 ffff880200000000
>  0000000100000000 ffff000000000000 12311eac2540eaf0 ffff88027e001eac
> Call Trace:
>  <IRQ>
>
>  [<ffffffff81318956>] fib_validate_source+0x170/0x2a5
>  [<ffffffff812e6603>] ip_route_input_common+0x6fe/0xd12
>  [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
>  [<ffffffff812e8461>] ip_rcv_finish+0x151/0x457
>  [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
>  [<ffffffff812e89a1>] ip_rcv+0x23a/0x260
>  [<ffffffff812beae7>] __netif_receive_skb+0x3ac/0x415
>  [<ffffffff812be86f>] ? __netif_receive_skb+0x134/0x415
>  [<ffffffff81312ae5>] ? inet_gro_receive+0x81/0x23f
>  [<ffffffff812b68da>] ? skb_free_head+0x47/0x49
>  [<ffffffff812c035d>] netif_receive_skb+0xee/0xf7
>  [<ffffffff812c071d>] ? dev_gro_receive+0x15f/0x2fb
>  [<ffffffff812c063a>] ? dev_gro_receive+0x7c/0x2fb
>  [<ffffffff81065644>] ? trace_hardirqs_on+0xd/0xf
>  [<ffffffff812c044c>] napi_skb_finish+0x24/0x56
>  [<ffffffff812c0bf0>] napi_gro_receive+0x10f/0x11e
>  [<ffffffffa0216e85>] igb_poll+0x843/0xae5 [igb]
>  [<ffffffff812c0e01>] ? net_rx_action+0x14c/0x1ee
>  [<ffffffff812c0d76>] net_rx_action+0xc1/0x1ee
>  [<ffffffff8102f746>] __do_softirq+0xff/0x1de
>  [<ffffffff813631cc>] call_softirq+0x1c/0x26
>  [<ffffffff81003090>] do_softirq+0x38/0x80
>  [<ffffffff8102f41f>] irq_exit+0x4e/0x83
>  [<ffffffff810028f9>] do_IRQ+0x98/0xaf
>  [<ffffffff8135b52c>] common_interrupt+0x6c/0x6c
>  <EOI>
>
>  [<ffffffff810083ec>] ? mwait_idle+0x13c/0x208
>  [<ffffffff810083e3>] ? mwait_idle+0x133/0x208
>  [<ffffffff810088d1>] cpu_idle+0x6e/0xab
>  [<ffffffff81343e13>] rest_init+0xc7/0xce
>  [<ffffffff81343d4c>] ? csum_partial_copy_generic+0x16c/0x16c
>  [<ffffffff8167fbf3>] start_kernel+0x332/0x33f
>  [<ffffffff8167f6f6>] ? kernel_init+0x19d/0x19d
>  [<ffffffff8167f2b4>] x86_64_start_reservations+0xb8/0xbd
>  [<ffffffff8167f3a6>] x86_64_start_kernel+0xed/0xf4
> Code: 81 31 c0 e8 a5 bb dd ff 48 83 c4 28 31 c0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 90 90 90 48 8b 57 20 55 31 c0 48 89 e5 48 85 d2 74 06 <8b> 82 8c 00 00 00 c9 c3 8b 47 7c 33 46 14 85 87 80 00 00 00 55
> RIP  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
>  RSP <ffff88022fc03a30>
> CR2: 00000000000000ac
> ---[ end trace e7c6714b8de1c341 ]---
> Kernel panic - not syncing: Fatal exception in interrupt
