Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next v2 1/2] inetpeer: add namespace support for inetpeer
From: Gao feng @ 2012-06-08  5:27 UTC (permalink / raw)
  To: David Miller
  Cc: steffen.klassert-opNxpl+3fjRBDgjK7y7TUQ,
	herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w
In-Reply-To: <20120607.205348.913373847915721607.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

于 2012年06月08日 11:53, David Miller 写道:
> From: Gao feng <gaofeng@cn.fujitsu.com>
> Date: Fri, 08 Jun 2012 09:44:04 +0800
> 
>> Sorry for my pool english, I don't understand this.
>> Can you explain it for me?
> 
> Look at the end of the parts of the patch that change
> net/ipv4/inetpeer.c:
> 
> @@ -586,3 +617,3 @@  out:
>  	write_sequnlock_bh(&base->lock);
>  }
>  EXPORT_SYMBOL(inetpeer_invalidate_tree);
> 
> That makes no sense, it's a patch hunk with no "+" or "-"
> lines.


Thanks David,got it.

> 
> Your patch was corrupted by something.
> 
> This is extremely irritating.  I would recommend that you email
> patches to yourself, and try to apply the copies you receive
> in those emails.  Because that's what the people trying to use
> your patch are going to do.
> 

Sorry, I will try it myself next time. ;)
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: David Miller @ 2012-06-08  5:46 UTC (permalink / raw)
  To: therbert; +Cc: netdev
In-Reply-To: <alpine.DEB.2.00.1206072201200.13686@pokey.mtv.corp.google.com>

From: Tom Herbert <therbert@google.com>
Date: Thu, 7 Jun 2012 22:05:42 -0700 (PDT)

> @@ -381,7 +382,8 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
>  	return next;
>  }
>  
> -#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb))
> +#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb + \
> +    sizeof(struct qdisc_skb_cb)))
>  
>  /**
>   * bond_dev_queue_xmit - Prepare skb for xmit.

I know it's a little bit more work, but please declare a proper
datastructure which shows explicitly what's going on, like Infiniband
does in drivers/infiniband/ulp/ipoib/ipoib.h

struct bond_skb_cb {
	struct qdisc_skb_cb	qdisc_cb;
	u16			queue_mapping;
};

Actually, this probably means there is also a conflict and thus
queue mapping corruption possible for bonded infiniband. :-/

^ permalink raw reply

* Re: [PATCH (net.git) V3] stmmac: fix driver built w/ w/o both pci and platf modules
From: David Miller @ 2012-06-08  5:47 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: netdev, wfg
In-Reply-To: <1339133107-29098-1-git-send-email-peppe.cavallaro@st.com>

From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
Date: Fri,  8 Jun 2012 07:25:07 +0200

> The commit ba27ec66ffeb78cbf fixes the Kconfig of the
> driver when built as module allowing to select/unselect
> the PCI and Platform modules that are not anymore mutually
> exclusive. This patch fixes and guarantees that the driver
> builds on all the platforms w/ w/o PCI and when select/unselect
> the two stmmac supports. In case of there are some problems
> on both the configuration and the pci/pltf registration the
> module_init will fail.
> 
> v2: set the CONFIG_STMMAC_PLATFORM enabled by default.
> I've just noticed that this can actually help on
> some configurations that don't enable any STMMAC
> options by default (e.g. SPEAr).
> 
> v3: change printk level when do not register the driver.
> 
> Reported-by: Fengguang Wu <wfg@linux.intel.com>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>

Applied.

^ permalink raw reply

* Re: Deadlock, L2TP over IP are not working, 3.4.1
From: Eric Dumazet @ 2012-06-08  5:47 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Denys Fedoryshchenko, davem, netdev, linux-kernel
In-Reply-To: <20120607223752.GA4475@electric-eye.fr.zoreil.com>

On Fri, 2012-06-08 at 00:37 +0200, Francois Romieu wrote:
> Eric Dumazet <eric.dumazet@gmail.com> :
> [...]
> > If LLTX is used, this means several cpus can execute this code at the
> > same time.
> > 
> > You need percpu stats, or use atomic primitives.
> 
> Would adding percpu stats not be frown upon ?
> 

I have no idea how many l2tp_eth devices are setup at once in typical
conf.

> As atomic will defeat the 64 bits stats on 32 bits, I should probably stick
> to plain bh disabling lock.
> 

Not sure what you mean by "atomic will defeat the 64 bits stats on 32
bits", since you also need atomic for "32 bits stats" or "64 bits stats
on 64 bit arches"

I personally would just use "unsigned long". If people want 64bit stats
for their l2tp_eth, they certainly can use a 64bit host.

^ permalink raw reply

* [PATCH] net/sh-eth: Add support selecting MII function for SH7734 and R8A7740
From: Nobuhiro Iwamatsu @ 2012-06-08  6:01 UTC (permalink / raw)
  To: netdev; +Cc: Nobuhiro Iwamatsu

Ethernet IP of SH7734 and R8A7740 has selecting MII register.
The user needs to change a value according to MII to be used.
This adds the function to change the value of this register.

Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
---
 drivers/net/ethernet/renesas/sh_eth.c |  105 ++++++++++++++++++++-------------
 drivers/net/ethernet/renesas/sh_eth.h |    1 +
 2 files changed, 64 insertions(+), 42 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index be3c221..e3d03c8 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -49,6 +49,33 @@
 		NETIF_MSG_RX_ERR| \
 		NETIF_MSG_TX_ERR)
 
+#if defined(CONFIG_CPU_SUBTYPE_SH7734) || defined(CONFIG_CPU_SUBTYPE_SH7763) || \
+	defined(CONFIG_ARCH_R8A7740)
+static void sh_eth_select_mii(struct net_device *ndev)
+{
+	u32 value = 0x0;
+	struct sh_eth_private *mdp = netdev_priv(ndev);
+
+	switch (mdp->phy_interface) {
+	case PHY_INTERFACE_MODE_GMII:
+		value = 0x2;
+		break;
+	case PHY_INTERFACE_MODE_MII:
+		value = 0x1;
+		break;
+	case PHY_INTERFACE_MODE_RMII:
+		value = 0x0;
+		break;
+	default:
+		pr_warn("PHY interface mode was not setup. Set to MII.\n");
+		value = 0x1;
+		break;
+	}
+
+	sh_eth_write(ndev, value, RMII_MII);
+}
+#endif
+
 /* There is CPU dependent code */
 #if defined(CONFIG_CPU_SUBTYPE_SH7724)
 #define SH_ETH_RESET_DEFAULT	1
@@ -283,6 +310,7 @@ static struct sh_eth_cpu_data *sh_eth_get_cpu_data(struct sh_eth_private *mdp)
 #elif defined(CONFIG_CPU_SUBTYPE_SH7734) || defined(CONFIG_CPU_SUBTYPE_SH7763)
 #define SH_ETH_HAS_TSU	1
 static void sh_eth_reset_hw_crc(struct net_device *ndev);
+
 static void sh_eth_chip_reset(struct net_device *ndev)
 {
 	struct sh_eth_private *mdp = netdev_priv(ndev);
@@ -292,35 +320,6 @@ static void sh_eth_chip_reset(struct net_device *ndev)
 	mdelay(1);
 }
 
-static void sh_eth_reset(struct net_device *ndev)
-{
-	int cnt = 100;
-
-	sh_eth_write(ndev, EDSR_ENALL, EDSR);
-	sh_eth_write(ndev, sh_eth_read(ndev, EDMR) | EDMR_SRST_GETHER, EDMR);
-	while (cnt > 0) {
-		if (!(sh_eth_read(ndev, EDMR) & 0x3))
-			break;
-		mdelay(1);
-		cnt--;
-	}
-	if (cnt == 0)
-		printk(KERN_ERR "Device reset fail\n");
-
-	/* Table Init */
-	sh_eth_write(ndev, 0x0, TDLAR);
-	sh_eth_write(ndev, 0x0, TDFAR);
-	sh_eth_write(ndev, 0x0, TDFXR);
-	sh_eth_write(ndev, 0x0, TDFFR);
-	sh_eth_write(ndev, 0x0, RDLAR);
-	sh_eth_write(ndev, 0x0, RDFAR);
-	sh_eth_write(ndev, 0x0, RDFXR);
-	sh_eth_write(ndev, 0x0, RDFFR);
-
-	/* Reset HW CRC register */
-	sh_eth_reset_hw_crc(ndev);
-}
-
 static void sh_eth_set_duplex(struct net_device *ndev)
 {
 	struct sh_eth_private *mdp = netdev_priv(ndev);
@@ -377,9 +376,43 @@ static struct sh_eth_cpu_data sh_eth_my_cpu_data = {
 	.tsu		= 1,
 #if defined(CONFIG_CPU_SUBTYPE_SH7734)
 	.hw_crc     = 1,
+	.select_mii = 1,
 #endif
 };
 
+static void sh_eth_reset(struct net_device *ndev)
+{
+	int cnt = 100;
+
+	sh_eth_write(ndev, EDSR_ENALL, EDSR);
+	sh_eth_write(ndev, sh_eth_read(ndev, EDMR) | EDMR_SRST_GETHER, EDMR);
+	while (cnt > 0) {
+		if (!(sh_eth_read(ndev, EDMR) & 0x3))
+			break;
+		mdelay(1);
+		cnt--;
+	}
+	if (cnt == 0)
+		printk(KERN_ERR "Device reset fail\n");
+
+	/* Table Init */
+	sh_eth_write(ndev, 0x0, TDLAR);
+	sh_eth_write(ndev, 0x0, TDFAR);
+	sh_eth_write(ndev, 0x0, TDFXR);
+	sh_eth_write(ndev, 0x0, TDFFR);
+	sh_eth_write(ndev, 0x0, RDLAR);
+	sh_eth_write(ndev, 0x0, RDFAR);
+	sh_eth_write(ndev, 0x0, RDFXR);
+	sh_eth_write(ndev, 0x0, RDFFR);
+
+	/* Reset HW CRC register */
+	sh_eth_reset_hw_crc(ndev);
+
+	/* Select MII mode */
+	if (sh_eth_my_cpu_data.hw_crc)
+		sh_eth_select_mii(ndev);
+}
+
 static void sh_eth_reset_hw_crc(struct net_device *ndev)
 {
 	if (sh_eth_my_cpu_data.hw_crc)
@@ -397,19 +430,7 @@ static void sh_eth_chip_reset(struct net_device *ndev)
 	sh_eth_tsu_write(mdp, ARSTR_ARSTR, ARSTR);
 	mdelay(1);
 
-	switch (mdp->phy_interface) {
-	case PHY_INTERFACE_MODE_GMII:
-		mii = 2;
-		break;
-	case PHY_INTERFACE_MODE_MII:
-		mii = 1;
-		break;
-	case PHY_INTERFACE_MODE_RMII:
-	default:
-		mii = 0;
-		break;
-	}
-	sh_eth_write(ndev, mii, RMII_MII);
+	sh_eth_select_mii(ndev);
 }
 
 static void sh_eth_reset(struct net_device *ndev)
diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
index 57b8e1f..d6763b1392 100644
--- a/drivers/net/ethernet/renesas/sh_eth.h
+++ b/drivers/net/ethernet/renesas/sh_eth.h
@@ -757,6 +757,7 @@ struct sh_eth_cpu_data {
 	unsigned no_trimd:1;		/* E-DMAC DO NOT have TRIMD */
 	unsigned no_ade:1;	/* E-DMAC DO NOT have ADE bit in EESR */
 	unsigned hw_crc:1;	/* E-DMAC have CSMR */
+	unsigned select_mii:1;	/* EtherC have RMII_MII (MII select register) */
 };
 
 struct sh_eth_private {
-- 
1.7.10

^ permalink raw reply related

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  5:57 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.2.00.1206072201200.13686@pokey.mtv.corp.google.com>

On Thu, 2012-06-07 at 22:05 -0700, Tom Herbert wrote:
> In the transmit path of the bonding driver, skb->cb is used to
> stash the skb->queue_mapping so that the bonding device can set its
> own queue mapping.  This value becomes corrupted since the skb->cb is
> also used in __dev_xmit_skb.
> 
> When transmitting through bonding driver, bond_select_queue is
> called from dev_queue_xmit.  In bond_select_queue the original
> skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
> and skb->queue_mapping is overwritten with the bond driver queue.
> Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
> the packet length into skb->cb, thereby overwriting the stashed
> queue mappping.  In bond_dev_queue_xmit (called from hard_start_xmit),
> the queue mapping for the skb is set to the stashed value which is now
> the skb length and hence is an invalid queue for the slave device.
> 
> Fix is to set bond_queue_mapping to skb->cb +
> sizeof((struct qdisc_skb_cb)
> 
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---
>  drivers/net/bonding/bond_main.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 2ee8cf9..044c1c0 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -76,6 +76,7 @@
>  #include <net/route.h>
>  #include <net/net_namespace.h>
>  #include <net/netns/generic.h>
> +#include <net/sch_generic.h>
>  #include "bonding.h"
>  #include "bond_3ad.h"
>  #include "bond_alb.h"
> @@ -381,7 +382,8 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
>  	return next;
>  }
>  
> -#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb))
> +#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb + \
> +    sizeof(struct qdisc_skb_cb)))
>  
>  /**
>   * bond_dev_queue_xmit - Prepare skb for xmit.


Sorry this wont work in all cases.

Some qdisc also use skb->cb[]

maybe :

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 55ce96b..47cbfa2 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -220,6 +220,7 @@ struct tcf_proto {
 
 struct qdisc_skb_cb {
 	unsigned int		pkt_len;
+	unsigned int		bond_queue_mapping;
 	unsigned char		data[24];
 };
 

^ permalink raw reply related

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: David Miller @ 2012-06-08  6:02 UTC (permalink / raw)
  To: eric.dumazet; +Cc: therbert, netdev
In-Reply-To: <1339135057.6001.20.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 08 Jun 2012 07:57:37 +0200

> On Thu, 2012-06-07 at 22:05 -0700, Tom Herbert wrote:
>> In the transmit path of the bonding driver, skb->cb is used to
>> stash the skb->queue_mapping so that the bonding device can set its
>> own queue mapping.  This value becomes corrupted since the skb->cb is
>> also used in __dev_xmit_skb.
>> 
>> When transmitting through bonding driver, bond_select_queue is
>> called from dev_queue_xmit.  In bond_select_queue the original
>> skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
>> and skb->queue_mapping is overwritten with the bond driver queue.
>> Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
>> the packet length into skb->cb, thereby overwriting the stashed
>> queue mappping.  In bond_dev_queue_xmit (called from hard_start_xmit),
>> the queue mapping for the skb is set to the stashed value which is now
>> the skb length and hence is an invalid queue for the slave device.
>> 
>> Fix is to set bond_queue_mapping to skb->cb +
>> sizeof((struct qdisc_skb_cb)
>> 
>> Signed-off-by: Tom Herbert <therbert@google.com>
>> ---
>>  drivers/net/bonding/bond_main.c |    4 +++-
>>  1 files changed, 3 insertions(+), 1 deletions(-)
>> 
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 2ee8cf9..044c1c0 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -76,6 +76,7 @@
>>  #include <net/route.h>
>>  #include <net/net_namespace.h>
>>  #include <net/netns/generic.h>
>> +#include <net/sch_generic.h>
>>  #include "bonding.h"
>>  #include "bond_3ad.h"
>>  #include "bond_alb.h"
>> @@ -381,7 +382,8 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
>>  	return next;
>>  }
>>  
>> -#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb))
>> +#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb + \
>> +    sizeof(struct qdisc_skb_cb)))
>>  
>>  /**
>>   * bond_dev_queue_xmit - Prepare skb for xmit.
> 
> 
> Sorry this wont work in all cases.
> 
> Some qdisc also use skb->cb[]

Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
explicitly allocated:

>  	unsigned char		data[24];

there. :-)

^ permalink raw reply

* Re: Difficulties to get 1Gbps on be2net ethernet card
From: Eric Dumazet @ 2012-06-08  6:08 UTC (permalink / raw)
  To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev
In-Reply-To: <CAL8zT=i2EkZLz3J5ib1tLnOuXt=Buf0sCwr7UbQWRE3B4oB+JQ@mail.gmail.com>

On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:

> eth1      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)

> qdisc mq 0: dev eth1 root
>  Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
> requeues 17480)

OK, and "tc -s -d cl show dev eth1"

(How many queues are really used)

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  6:11 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, netdev
In-Reply-To: <20120607.230216.2014005732863772019.davem@davemloft.net>

On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 08 Jun 2012 07:57:37 +0200
> 
> > On Thu, 2012-06-07 at 22:05 -0700, Tom Herbert wrote:
> >> In the transmit path of the bonding driver, skb->cb is used to
> >> stash the skb->queue_mapping so that the bonding device can set its
> >> own queue mapping.  This value becomes corrupted since the skb->cb is
> >> also used in __dev_xmit_skb.
> >> 
> >> When transmitting through bonding driver, bond_select_queue is
> >> called from dev_queue_xmit.  In bond_select_queue the original
> >> skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
> >> and skb->queue_mapping is overwritten with the bond driver queue.
> >> Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
> >> the packet length into skb->cb, thereby overwriting the stashed
> >> queue mappping.  In bond_dev_queue_xmit (called from hard_start_xmit),
> >> the queue mapping for the skb is set to the stashed value which is now
> >> the skb length and hence is an invalid queue for the slave device.
> >> 
> >> Fix is to set bond_queue_mapping to skb->cb +
> >> sizeof((struct qdisc_skb_cb)
> >> 
> >> Signed-off-by: Tom Herbert <therbert@google.com>
> >> ---
> >>  drivers/net/bonding/bond_main.c |    4 +++-
> >>  1 files changed, 3 insertions(+), 1 deletions(-)
> >> 
> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >> index 2ee8cf9..044c1c0 100644
> >> --- a/drivers/net/bonding/bond_main.c
> >> +++ b/drivers/net/bonding/bond_main.c
> >> @@ -76,6 +76,7 @@
> >>  #include <net/route.h>
> >>  #include <net/net_namespace.h>
> >>  #include <net/netns/generic.h>
> >> +#include <net/sch_generic.h>
> >>  #include "bonding.h"
> >>  #include "bond_3ad.h"
> >>  #include "bond_alb.h"
> >> @@ -381,7 +382,8 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
> >>  	return next;
> >>  }
> >>  
> >> -#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb))
> >> +#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb + \
> >> +    sizeof(struct qdisc_skb_cb)))
> >>  
> >>  /**
> >>   * bond_dev_queue_xmit - Prepare skb for xmit.
> > 
> > 
> > Sorry this wont work in all cases.
> > 
> > Some qdisc also use skb->cb[]
> 
> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
> explicitly allocated:
> 
> >  	unsigned char		data[24];
> 
> there. :-)
> 

Yes, but some other layers can use the same trick so it might collide.

Inserting the bond field in qdisc_skb_cb (level0) is safer.

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: David Miller @ 2012-06-08  6:15 UTC (permalink / raw)
  To: eric.dumazet; +Cc: therbert, netdev
In-Reply-To: <1339135881.6001.25.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 08 Jun 2012 08:11:21 +0200

> On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:
>> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
>> explicitly allocated:
>> 
>> >  	unsigned char		data[24];
>> 
>> there. :-)
>> 
> 
> Yes, but some other layers can use the same trick so it might collide.
> 
> Inserting the bond field in qdisc_skb_cb (level0) is safer.

Do you suggest that Infiniband does the same thing? :-)

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  6:17 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, netdev
In-Reply-To: <20120607.230216.2014005732863772019.davem@davemloft.net>

On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:

> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
> explicitly allocated:
> 
> >  	unsigned char		data[24];
> 
> there. :-)
> 

By the way, I notice data[] is not aligned on a long on 64bit arches.

This might break net/sched/sch_netem.c on some arches, since
time_to_send is a u64.

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: David Miller @ 2012-06-08  6:22 UTC (permalink / raw)
  To: eric.dumazet; +Cc: therbert, netdev
In-Reply-To: <1339136238.6001.28.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 08 Jun 2012 08:17:18 +0200

> On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:
> 
>> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
>> explicitly allocated:
>> 
>> >  	unsigned char		data[24];
>> 
>> there. :-)
>> 
> 
> By the way, I notice data[] is not aligned on a long on 64bit arches.
> 
> This might break net/sched/sch_netem.c on some arches, since
> time_to_send is a u64.

Looks like we'll get the bonding queue mapping and fix this alignment
bug for free then :-)

^ permalink raw reply

* Re: [PATCH] usbnet: Activate the halt interrupt endpoint to fix endless "XactErr" error
From: Huajun Li @ 2012-06-08  6:24 UTC (permalink / raw)
  To: Ming Lei
  Cc: David S. Miller, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CACVXFVPj-4J2_UuOQ9NCW7wV_SLG50a-pEYqhd7K9sXuGmDmoQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, Jun 8, 2012 at 1:22 PM, Ming Lei <tom.leiming-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, Jun 8, 2012 at 1:04 PM, Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> Ming, thanks for your comments.
>
> Welcome, :-)
>
>>
>> On Fri, Jun 8, 2012 at 11:29 AM, Ming Lei <tom.leiming-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> On Tue, Jun 5, 2012 at 10:12 PM, Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>> There prints endless "XactErr" error msg once switch my device to the
>>>
>>> The "XactErr" error msg means that there are some transfer error in the bus,
>>> such as timeout, bad CRC, wrong PID, etc. Generally, -EPROTO is returned
>>> to driver if this kind of error happened.
>>>
>> Yes, you pointed out the why there printed the error.
>> However, how can this error(-EPROTO) happen? Actually, maybe it is
>> caused by malfunction device, bad usb cable, or other issues. In my
>> case, it is because the endpoint halts.
>
> If so, looks mistaken value is returned from the host controller driver,
> but not sure if your device is buggy. What is your host controller?
>
Nothing related to HC.
I tried to find out the endpoint state, but found it was halt. I think
this is the root cause.

> Also suppose your device is buggy, and the correct solution should
> be addding quirk for the driver to clear halt before the 1st submit status
> urb.
>
I ever worked out a patch just as you said and it could work.
However, if this can be fixed by common framework just like usbnet.c,
and there is no sideeffect, then why not.

>>
>>>> configuration
>>>>  which needs cdc_ether driver, the root cause is the interrupt endpoint halts.
>>>
>>> How do you switch your configuration? by writing to
>>> /sys/.../bConfigurationValue?
>>>
>>
>> Maybe.
>> I were using a third party application to switch the configurations,
>> so I think it should use this way to do it unless there is other
>> approach.
>
> I just have tried to switch configuration by sysfs interface on the g_multi
> and don't trigger the error.
>
The driver is common one, but not just for a specific device.

>>
>>> Is the "XactErr" msg printed just after switching to cdc_ether interface
>>> by changing configuration?
>>>
>>
>> Yes, just as I mentioned in my original email.
>> And it did not work even I removed the driver and re-installed it.
>>
>>>> Maybe this is a common issue, so fix it by activating the endpoint
>>>> once the error occurs.
>>>>
>>>> Signed-off-by: Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>>> ---
>>>>  drivers/net/usb/usbnet.c   |   33 +++++++++++++++++++++++++++++++++
>>>>  include/linux/usb/usbnet.h |   15 ++++++++-------
>>>>  2 files changed, 41 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
>>>> index 9f58330..f13922b 100644
>>>> --- a/drivers/net/usb/usbnet.c
>>>> +++ b/drivers/net/usb/usbnet.c
>>>> @@ -537,6 +537,11 @@ static void intr_complete (struct urb *urb)
>>>>                          "intr shutdown, code %d\n", status);
>>>>                return;
>>>>
>>>> +       case -EPIPE:
>>>> +       case -EPROTO:
>>>
>>> It is good to handle EPIPE error here, but looks it is no sense to
>>> clear halt for
>>> bus transfer failure. At least, no clear halt is done for returning -EPROTO from
>>> rx/tx transfer currently.
>>
>> Just as I said above, there is different issue can cause -EPROTO, at
>> least, for my case it is because the interrupt endpoint is not active.
>> If the error occurs, the driver need try to fix it instead of just
>> printing an error msg.
>
> One problem in your patch is that if the  -EPROTO is caused by bad cable
> or interference, clean halt may not be sent to device successfully, and will
> cause -EPROTO further.

What's your opinion to handle "-EPROTO" error in usbnet.c?
Please check usbnet.c again, when "-EPROTO" occurs, it just pints
error msg and re-submit the interrupt URB, and then causes endless
"EactErr" error msg.

At least, this patch lets the driver try to fix the error before
resubmit the URB.

>
>>
>>>
>>>> +               usbnet_defer_kevent(dev, EVENT_STS_HALT);
>>>> +               return;
>>>> +
>>>>        /* NOTE:  not throttling like RX/TX, since this endpoint
>>>>         * already polls infrequently
>>>>         */
>>>> @@ -967,6 +972,34 @@ fail_halt:
>>>>                }
>>>>        }
>>>>
>>>> +       if (test_bit(EVENT_STS_HALT, &dev->flags)) {
>>>> +               unsigned pipe;
>>>> +               struct usb_endpoint_descriptor *desc;
>>>> +
>>>> +               desc = &dev->status->desc;
>>>> +               pipe = usb_rcvintpipe(dev->udev,
>>>> +                       desc->bEndpointAddress & USB_ENDPOINT_NUMBER_MASK);
>>>> +               status = usb_autopm_get_interface(dev->intf);
>>>> +               if (status < 0)
>>>> +                       goto fail_sts;
>>>> +               status = usb_clear_halt(dev->udev, pipe);
>>>> +               usb_autopm_put_interface(dev->intf);
>>>> +
>>>> +               if (status < 0 && status != -EPIPE && status != -ESHUTDOWN) {
>>>> +fail_sts:
>>>> +                       netdev_err(dev->net,
>>>> +                               "can't clear intr halt, status %d\n", status);
>>>> +               } else {
>>>> +                       clear_bit(EVENT_STS_HALT, &dev->flags);
>>>> +                       memset(dev->interrupt->transfer_buffer, 0,
>>>> +                               dev->interrupt->transfer_buffer_length);
>>>
>>> The above is not necessary.
>>
>> Ming, do you mean the above one line, or others ?
>
> Yes, it is the above line.
>

Then not sure whether the buffer will be tainted without this line.
>
>
> Thanks,
> --
> Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  6:47 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, netdev
In-Reply-To: <20120607.231501.463746858434969001.davem@davemloft.net>

On Thu, 2012-06-07 at 23:15 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 08 Jun 2012 08:11:21 +0200
> 
> > On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:
> >> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
> >> explicitly allocated:
> >> 
> >> >  	unsigned char		data[24];
> >> 
> >> there. :-)
> >> 
> > 
> > Yes, but some other layers can use the same trick so it might collide.
> > 
> > Inserting the bond field in qdisc_skb_cb (level0) is safer.
> 
> Do you suggest that Infiniband does the same thing? :-)

I wonder if another way to solve this is not letting ndo_select_queue()
method the responsibility to call skb_set_queue_mapping() itself ?

(ie removing skb_set_queue_mapping() done in dev_pick_tx())

bonding would not have to save/restore skb queue mapping ?

Partial patch : (we have to audit all ndo_select_queue()

diff --git a/net/core/dev.c b/net/core/dev.c
index cd09819..c6c92d5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2368,6 +2368,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 
 	if (dev->real_num_tx_queues == 1)
 		queue_index = 0;
+		skb_set_queue_mapping(skb, queue_index);
 	else if (ops->ndo_select_queue) {
 		queue_index = ops->ndo_select_queue(dev, skb);
 		queue_index = dev_cap_txqueue(dev, queue_index);
@@ -2391,9 +2392,9 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 					sk_tx_queue_set(sk, queue_index);
 			}
 		}
+		skb_set_queue_mapping(skb, queue_index);
 	}
 
-	skb_set_queue_mapping(skb, queue_index);
 	return netdev_get_tx_queue(dev, queue_index);
 }
 

^ permalink raw reply related

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Michael S. Tsirkin @ 2012-06-08  7:02 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <4FD172FD.5070200@redhat.com>

On Fri, Jun 08, 2012 at 11:35:25AM +0800, Jason Wang wrote:
> >>>  @@ -655,7 +695,17 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >>>    		kfree_skb(skb);
> >>>    		return NETDEV_TX_OK;
> >>>    	}
> >>>  -	virtqueue_kick(vi->svq);
> >>>  +
> >>>  +	kick = virtqueue_kick_prepare(vi->svq);
> >>>  +	if (unlikely(kick))
> >>>  +		virtqueue_notify(vi->svq);
> >>>  +
> >>>  +	u64_stats_update_begin(&stats->syncp);
> >>>  +	if (unlikely(kick))
> >>>  +		stats->data[VIRTNET_TX_KICKS]++;
> >>>  +	stats->data[VIRTNET_TX_Q_BYTES] += skb->len;
> >>>  +	stats->data[VIRTNET_TX_Q_PACKETS]++;
> >is this statistic interesting?
> >how about decrementing when we free?
> >this way we see how many are pending..
> >
> 
> Currently we didn't have per-vq statistics but per-cpu, so the skb
> could be sent by one vcpu and freed by another.
> Pehaps another reason to use per-queue satistics.

For transmit, it could be done easily as we both send and free skbs
under a lock. I'm not sure how acceptable it is to
take a lock in get_stats but send a separate patch like this
and we'll see what others say.

-- 
MST

^ permalink raw reply

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Michael S. Tsirkin @ 2012-06-08  7:04 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <4FD172FD.5070200@redhat.com>

On Fri, Jun 08, 2012 at 11:35:25AM +0800, Jason Wang wrote:
> >>>  @@ -655,7 +695,17 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >>>    		kfree_skb(skb);
> >>>    		return NETDEV_TX_OK;
> >>>    	}
> >>>  -	virtqueue_kick(vi->svq);
> >>>  +
> >>>  +	kick = virtqueue_kick_prepare(vi->svq);
> >>>  +	if (unlikely(kick))
> >>>  +		virtqueue_notify(vi->svq);
> >>>  +
> >>>  +	u64_stats_update_begin(&stats->syncp);
> >>>  +	if (unlikely(kick))
> >>>  +		stats->data[VIRTNET_TX_KICKS]++;
> >>>  +	stats->data[VIRTNET_TX_Q_BYTES] += skb->len;
> >>>  +	stats->data[VIRTNET_TX_Q_PACKETS]++;
> >is this statistic interesting?
> >how about decrementing when we free?
> >this way we see how many are pending..
> >
> 
> Currently we didn't have per-vq statistics but per-cpu, so the skb
> could be sent by one vcpu and freed by another.
> Pehaps another reason to use per-queue satistics.

Just to stress these things do not need to contradict:
you can have per cpu stats for each queue.

-- 
MST

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  7:23 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, netdev
In-Reply-To: <1339138040.6001.39.camel@edumazet-glaptop>

On Fri, 2012-06-08 at 08:47 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 23:15 -0700, David Miller wrote:
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Fri, 08 Jun 2012 08:11:21 +0200
> > 
> > > On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:
> > >> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
> > >> explicitly allocated:
> > >> 
> > >> >  	unsigned char		data[24];
> > >> 
> > >> there. :-)
> > >> 
> > > 
> > > Yes, but some other layers can use the same trick so it might collide.
> > > 
> > > Inserting the bond field in qdisc_skb_cb (level0) is safer.
> > 
> > Do you suggest that Infiniband does the same thing? :-)
> 
> I wonder if another way to solve this is not letting ndo_select_queue()
> method the responsibility to call skb_set_queue_mapping() itself ?
> 
> (ie removing skb_set_queue_mapping() done in dev_pick_tx())
> 
> bonding would not have to save/restore skb queue mapping ?
> 
> Partial patch : (we have to audit all ndo_select_queue()
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index cd09819..c6c92d5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2368,6 +2368,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>  
>  	if (dev->real_num_tx_queues == 1)
>  		queue_index = 0;
> +		skb_set_queue_mapping(skb, queue_index);
>  	else if (ops->ndo_select_queue) {
>  		queue_index = ops->ndo_select_queue(dev, skb);
>  		queue_index = dev_cap_txqueue(dev, queue_index);
> @@ -2391,9 +2392,9 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>  					sk_tx_queue_set(sk, queue_index);
>  			}
>  		}
> +		skb_set_queue_mapping(skb, queue_index);
>  	}
>  
> -	skb_set_queue_mapping(skb, queue_index);
>  	return netdev_get_tx_queue(dev, queue_index);
>  }
>  
> 


I must say I dont understand dev_pick_tx() anymore.

It seems to ignore skb->queue_mapping (unless device provides its own
ndo_select_queue() and this functions is aware of skb->queue_mapping, as
correctly done in ixgbe)

So commit fff3269907897ee (tcp: reflect SYN queue_mapping into SYNACK
packets) works on ixgbe, but probably not on other multiqueue devices.

This sounds like a regression to me.

^ permalink raw reply

* [PATCH] Revert "niu: Add support for byte queue limits."
From: David Miller @ 2012-06-08  7:30 UTC (permalink / raw)
  To: netdev


This reverts commit efa230f2c68abab817f13473077f8d0cc74f43f3.

BQL doesn't work with how this driver currently only takes TX
interrupts every 1/4 of the TX ring.  That behavior needs to be fixed,
but that's a larger non-trivial task and for now we have to revert
BQL support as this makes the device currently completely unusable.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/sun/niu.c |   12 +-----------
 1 files changed, 1 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index 703c8cc..8c726b7 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -3598,7 +3598,6 @@ static int release_tx_packet(struct niu *np, struct tx_ring_info *rp, int idx)
 static void niu_tx_work(struct niu *np, struct tx_ring_info *rp)
 {
 	struct netdev_queue *txq;
-	unsigned int tx_bytes;
 	u16 pkt_cnt, tmp;
 	int cons, index;
 	u64 cs;
@@ -3621,18 +3620,12 @@ static void niu_tx_work(struct niu *np, struct tx_ring_info *rp)
 	netif_printk(np, tx_done, KERN_DEBUG, np->dev,
 		     "%s() pkt_cnt[%u] cons[%d]\n", __func__, pkt_cnt, cons);
 
-	tx_bytes = 0;
-	tmp = pkt_cnt;
-	while (tmp--) {
-		tx_bytes += rp->tx_buffs[cons].skb->len;
+	while (pkt_cnt--)
 		cons = release_tx_packet(np, rp, cons);
-	}
 
 	rp->cons = cons;
 	smp_mb();
 
-	netdev_tx_completed_queue(txq, pkt_cnt, tx_bytes);
-
 out:
 	if (unlikely(netif_tx_queue_stopped(txq) &&
 		     (niu_tx_avail(rp) > NIU_TX_WAKEUP_THRESH(rp)))) {
@@ -4333,7 +4326,6 @@ static void niu_free_channels(struct niu *np)
 			struct tx_ring_info *rp = &np->tx_rings[i];
 
 			niu_free_tx_ring_info(np, rp);
-			netdev_tx_reset_queue(netdev_get_tx_queue(np->dev, i));
 		}
 		kfree(np->tx_rings);
 		np->tx_rings = NULL;
@@ -6739,8 +6731,6 @@ static netdev_tx_t niu_start_xmit(struct sk_buff *skb,
 		prod = NEXT_TX(rp, prod);
 	}
 
-	netdev_tx_sent_queue(txq, skb->len);
-
 	if (prod < rp->prod)
 		rp->wrap_bit ^= TX_RING_KICK_WRAP;
 	rp->prod = prod;
-- 
1.7.6.5

^ permalink raw reply related

* Re: [PATCH] e1000: save skb counts in TX to avoid cache misses
From: Roman Kagan @ 2012-06-08  7:37 UTC (permalink / raw)
  To: Greg KH
  Cc: David Miller, jeffrey.t.kirsher@intel.com, tarbal@gmail.com,
	stable@vger.kernel.org, jesse.brandeburg@intel.com,
	bruce.w.allan@intel.com, carolyn.wyborny@intel.com,
	donald.c.skidmore@intel.com, gregory.v.rose@intel.com,
	peter.p.waskiewicz.jr@intel.com, alexander.h.duyck@intel.com,
	john.ronciak@intel.com, dnelson@redhat.com,
	e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20120608021542.GA10112@kroah.com>

On Fri, 2012-06-08 at 06:15 +0400, Greg KH wrote:
> On Thu, Jun 07, 2012 at 02:43:58PM -0700, David Miller wrote:
> > From: Jeff Kirsher <tarbal@gmail.com>
> > Date: Thu, 07 Jun 2012 14:38:17 -0700
> > 
> > > Thanks! I have applied the patch to my queue
> > 
> > Why?
> > 
> > My impression is that this is a patch already in the tree, and it's
> > being submitted for -stable but such minor performance hacks are
> > absolutely not appropriate for -stable submission.
> 
> The patch description says it is fixing reported oopses,

Exactly.

> but the Subject: isn't all that helpful there.

Well I just preserved the original subject from the upstream commit.
Want me to resubmit with a more alarming one?

> So which is this?  Should I accept it for a stable release or not?

IMO yes ;)

Thanks,
Roman.

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: John Fastabend @ 2012-06-08  7:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, therbert, netdev
In-Reply-To: <1339140238.6001.42.camel@edumazet-glaptop>

On 6/8/2012 12:23 AM, Eric Dumazet wrote:
> On Fri, 2012-06-08 at 08:47 +0200, Eric Dumazet wrote:
>> On Thu, 2012-06-07 at 23:15 -0700, David Miller wrote:
>>> From: Eric Dumazet <eric.dumazet@gmail.com>
>>> Date: Fri, 08 Jun 2012 08:11:21 +0200
>>>
>>>> On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:
>>>>> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
>>>>> explicitly allocated:
>>>>>
>>>>>>   	unsigned char		data[24];
>>>>>
>>>>> there. :-)
>>>>>
>>>>
>>>> Yes, but some other layers can use the same trick so it might collide.
>>>>
>>>> Inserting the bond field in qdisc_skb_cb (level0) is safer.
>>>
>>> Do you suggest that Infiniband does the same thing? :-)
>>
>> I wonder if another way to solve this is not letting ndo_select_queue()
>> method the responsibility to call skb_set_queue_mapping() itself ?
>>
>> (ie removing skb_set_queue_mapping() done in dev_pick_tx())
>>
>> bonding would not have to save/restore skb queue mapping ?
>>
>> Partial patch : (we have to audit all ndo_select_queue()
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index cd09819..c6c92d5 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -2368,6 +2368,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>>
>>   	if (dev->real_num_tx_queues == 1)
>>   		queue_index = 0;
>> +		skb_set_queue_mapping(skb, queue_index);
>>   	else if (ops->ndo_select_queue) {
>>   		queue_index = ops->ndo_select_queue(dev, skb);
>>   		queue_index = dev_cap_txqueue(dev, queue_index);
>> @@ -2391,9 +2392,9 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>>   					sk_tx_queue_set(sk, queue_index);
>>   			}
>>   		}
>> +		skb_set_queue_mapping(skb, queue_index);
>>   	}
>>
>> -	skb_set_queue_mapping(skb, queue_index);
>>   	return netdev_get_tx_queue(dev, queue_index);
>>   }
>>
>>
>
>
> I must say I dont understand dev_pick_tx() anymore.
>
> It seems to ignore skb->queue_mapping (unless device provides its own
> ndo_select_queue() and this functions is aware of skb->queue_mapping, as
> correctly done in ixgbe)
>
> So commit fff3269907897ee (tcp: reflect SYN queue_mapping into SYNACK
> packets) works on ixgbe, but probably not on other multiqueue devices.
>
> This sounds like a regression to me.
>
>
>

Well it would get picked up via skb_tx_hash(),

         else if (ops->ndo_select_queue) {
		[...]
         } else {
                 struct sock *sk = skb->sk;
                 queue_index = sk_tx_queue_get(sk);

                 if (queue_index < 0 || skb->ooo_okay ||
                     queue_index >= dev->real_num_tx_queues) {
                         int old_index = queue_index;

                         queue_index = get_xps_queue(dev, skb);
                         if (queue_index < 0)
                                 queue_index = skb_tx_hash(dev, skb);
	[...]


So think this might be OK.

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  7:46 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, netdev
In-Reply-To: <1339140238.6001.42.camel@edumazet-glaptop>

On Fri, 2012-06-08 at 09:24 +0200, Eric Dumazet wrote:

> 
> I must say I dont understand dev_pick_tx() anymore.
> 
> It seems to ignore skb->queue_mapping (unless device provides its own
> ndo_select_queue() and this functions is aware of skb->queue_mapping, as
> correctly done in ixgbe)
> 
> So commit fff3269907897ee (tcp: reflect SYN queue_mapping into SYNACK
> packets) works on ixgbe, but probably not on other multiqueue devices.
> 
> This sounds like a regression to me.

Oh well, its done in skb_tx_hash(), after a few indirections, and if
skb->sk is NULL.

Which happens to be true in net-next for SYNACKS after commit
90ba9b1986b5ac (tcp: tcp_make_synack() can use alloc_skb())

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  7:48 UTC (permalink / raw)
  To: John Fastabend; +Cc: David Miller, therbert, netdev
In-Reply-To: <4FD1ACE1.4050005@intel.com>

On Fri, 2012-06-08 at 00:42 -0700, John Fastabend wrote:
> On 6/8/2012 12:23 AM, Eric Dumazet wrote:
> > On Fri, 2012-06-08 at 08:47 +0200, Eric Dumazet wrote:
> >> On Thu, 2012-06-07 at 23:15 -0700, David Miller wrote:
> >>> From: Eric Dumazet <eric.dumazet@gmail.com>
> >>> Date: Fri, 08 Jun 2012 08:11:21 +0200
> >>>
> >>>> On Thu, 2012-06-07 at 23:02 -0700, David Miller wrote:
> >>>>> Hmmm, isn't that what qdisc_skb_cb is for?  And even private data is
> >>>>> explicitly allocated:
> >>>>>
> >>>>>>   	unsigned char		data[24];
> >>>>>
> >>>>> there. :-)
> >>>>>
> >>>>
> >>>> Yes, but some other layers can use the same trick so it might collide.
> >>>>
> >>>> Inserting the bond field in qdisc_skb_cb (level0) is safer.
> >>>
> >>> Do you suggest that Infiniband does the same thing? :-)
> >>
> >> I wonder if another way to solve this is not letting ndo_select_queue()
> >> method the responsibility to call skb_set_queue_mapping() itself ?
> >>
> >> (ie removing skb_set_queue_mapping() done in dev_pick_tx())
> >>
> >> bonding would not have to save/restore skb queue mapping ?
> >>
> >> Partial patch : (we have to audit all ndo_select_queue()
> >>
> >> diff --git a/net/core/dev.c b/net/core/dev.c
> >> index cd09819..c6c92d5 100644
> >> --- a/net/core/dev.c
> >> +++ b/net/core/dev.c
> >> @@ -2368,6 +2368,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
> >>
> >>   	if (dev->real_num_tx_queues == 1)
> >>   		queue_index = 0;
> >> +		skb_set_queue_mapping(skb, queue_index);
> >>   	else if (ops->ndo_select_queue) {
> >>   		queue_index = ops->ndo_select_queue(dev, skb);
> >>   		queue_index = dev_cap_txqueue(dev, queue_index);
> >> @@ -2391,9 +2392,9 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
> >>   					sk_tx_queue_set(sk, queue_index);
> >>   			}
> >>   		}
> >> +		skb_set_queue_mapping(skb, queue_index);
> >>   	}
> >>
> >> -	skb_set_queue_mapping(skb, queue_index);
> >>   	return netdev_get_tx_queue(dev, queue_index);
> >>   }
> >>
> >>
> >
> >
> > I must say I dont understand dev_pick_tx() anymore.
> >
> > It seems to ignore skb->queue_mapping (unless device provides its own
> > ndo_select_queue() and this functions is aware of skb->queue_mapping, as
> > correctly done in ixgbe)
> >
> > So commit fff3269907897ee (tcp: reflect SYN queue_mapping into SYNACK
> > packets) works on ixgbe, but probably not on other multiqueue devices.
> >
> > This sounds like a regression to me.
> >
> >
> >
> 
> Well it would get picked up via skb_tx_hash(),
> 
>          else if (ops->ndo_select_queue) {
> 		[...]
>          } else {
>                  struct sock *sk = skb->sk;
>                  queue_index = sk_tx_queue_get(sk);
> 
>                  if (queue_index < 0 || skb->ooo_okay ||
>                      queue_index >= dev->real_num_tx_queues) {
>                          int old_index = queue_index;
> 
>                          queue_index = get_xps_queue(dev, skb);
>                          if (queue_index < 0)
>                                  queue_index = skb_tx_hash(dev, skb);
> 	[...]
> 
> 
> So think this might be OK.

Yes, it sounds like sk setting (sk->sk_tx_queue_mapping) has precedence
over skb->queue_mapping.

Not sure how it works for UDP workload for example.

^ permalink raw reply

* Re: [PATCH] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-08  7:52 UTC (permalink / raw)
  To: John Fastabend; +Cc: David Miller, therbert, netdev
In-Reply-To: <1339141738.6001.49.camel@edumazet-glaptop>

On Fri, 2012-06-08 at 09:49 +0200, Eric Dumazet wrote:

> Yes, it sounds like sk setting (sk->sk_tx_queue_mapping) has precedence
> over skb->queue_mapping.
> 
> Not sure how it works for UDP workload for example.

Unconnected UDP sockets dont have sk_dst_cache, so
sk->sk_tx_queue_mapping stay to -1, so everything seems good.

^ permalink raw reply

* Re: Difficulties to get 1Gbps on be2net ethernet card
From: Jean-Michel Hautbois @ 2012-06-08  8:14 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Sathya.Perla, netdev
In-Reply-To: <1339135709.6001.22.camel@edumazet-glaptop>

2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
> On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:
>
>> eth1      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
>>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)
>
>> qdisc mq 0: dev eth1 root
>>  Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
>> requeues 17480)
>
> OK, and "tc -s -d cl show dev eth1"
>
> (How many queues are really used)
>
>
>

tc -s -d cl show dev eth1
class mq :1 root
 Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
 backlog 0b 0p requeues 2747
class mq :2 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :3 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :4 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :5 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :6 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :7 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :8 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

^ permalink raw reply

* Re: Difficulties to get 1Gbps on be2net ethernet card
From: Eric Dumazet @ 2012-06-08  8:22 UTC (permalink / raw)
  To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev
In-Reply-To: <CAL8zT=jPUEF62-V_0n9o3=n2Mo6y4M_kGgK+OwdXg0aLDazDgQ@mail.gmail.com>

On Fri, 2012-06-08 at 10:14 +0200, Jean-Michel Hautbois wrote:
> 2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
> > On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:
> >
> >> eth1      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
> >>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
> >>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>           TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
> >>           collisions:0 txqueuelen:1000
> >>           RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)
> >
> >> qdisc mq 0: dev eth1 root
> >>  Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
> >> requeues 17480)
> >
> > OK, and "tc -s -d cl show dev eth1"
> >
> > (How many queues are really used)
> >
> >
> >
> 
> tc -s -d cl show dev eth1
> class mq :1 root
>  Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
>  backlog 0b 0p requeues 2747
> class mq :2 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> class mq :3 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> class mq :4 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> class mq :5 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> class mq :6 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> class mq :7 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> class mq :8 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0


Do you have the same distribution on old kernels as well ?
(ie only queue 0 is used)

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox