Netdev List


* Socket send-buffer auto-sizing
From: Ben Greear @ 2012-06-07 17:59 UTC (permalink / raw)
  To: netdev

I'm continuing to test one-way tcp streams in 3.5.0-rc1 on
a wifi network.

When I do not specify a send buffer size, and thus use the kernel
defaults, max speed is about 77Mbps.

When I specify 512KB send-buffer, I get speeds up to 185Mbps.

When set to 1MB, I get about 198Mbps (and setting higher does not
increase the throughput after this).

This is without any 'delack' patches applied.

My question is:  Should the kernel auto-tuner work better?

I seem to recall comments from some years ago that applications
should no longer attempt to tune send/recv buffers because the kernel
was smart enough to get it at least mostly right.
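
For readers who want to reproduce the fixed-size runs above, a minimal sketch of pinning the send buffer follows. The helper name set_send_buffer is made up for illustration. On Linux, setting SO_SNDBUF switches off send-buffer auto-tuning for that socket, the kernel doubles the requested value to leave room for bookkeeping overhead, and the request is capped by net.core.wmem_max, so the value read back is normally not the value asked for.

```c
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: request a fixed send buffer of `bytes` on `fd` and
 * return the size the kernel reports afterwards, or -1 on error.
 */
int set_send_buffer(int fd, int bytes)
{
	int granted = 0;
	socklen_t len = sizeof(granted);

	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) < 0)
		return -1;
	return granted;
}
```

Calling set_send_buffer(fd, 512 * 1024) before connect() would correspond to the 512KB case; leaving the option unset keeps the kernel auto-tuner in charge.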

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: tcp wifi upload performance and lots of ACKs
From: Rick Jones @ 2012-06-07 17:51 UTC (permalink / raw)
  To: David Laight; +Cc: Ben Greear, Daniel Baluta, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6F3C@saturn3.aculab.com>

> Does this delaying of acks have a detrimental effect on the
> sending end?
> I've seen very bad interactions between delayed acks and
> (I believe) the 'slow start' code on connections with
> one-directional data, Nagle disabled and very low RTT.
>
> What I saw was the sender sending 4 data packets, then
> sitting waiting for an ack - in spite of accumulating
> several kB of data to send.
>
> Delaying acks further will only make this worse.

At least two stacks have a reasonable ACK avoidance heuristic.  Those 
would be HP-UX and Solaris (Mac OS 9 had one as well, IIRC).  The 
heuristics are rather similar because the two TCP stacks share a common 
ancestor.  I used to interact with HP-UX's regularly, so my statements will
be based on that, and on an assumption that Solaris is similar.

Both attempt to divine what the sender's congestion window happens to be
and be certain to send an ACK before that is exhausted.  So, at the 
start of a connection, there will be the usual, more rapid 
ACKnowledgement.  As things happen "normally" then the number of 
segments per ACK increases until it hits a configurable limit.  There 
are conditions which will cut the limit in half on a given connection - 
one is the sending of an ACK via the standalone ACK timer (this is from 
memory, so may be a bit off).  There are probably a few other conditions 
that drop the limit by half.  The heuristic attempts to learn in each 
connection what the reasonable limit on ACK avoidance might be so there 
isn't a per-connection control, just the global controls via ndd.  As 
conditions causing the limit to be cut in half arise, the connection 
naturally and irrevocably falls back to the usual "ack-every-other" 
behaviour.
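
The adaptive part of that description can be sketched roughly as below. This is a toy model of the behaviour as recalled above, not HP-UX or Solaris code: the struct, the function names, and the one-step growth per ACK are all assumptions, and the part that estimates the sender's congestion window is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical per-connection state; field names are illustrative only. */
struct ack_avoid {
	unsigned int limit;	/* segments currently covered by one ACK */
	unsigned int ceiling;	/* bound the limit may still grow to */
	unsigned int pending;	/* segments received since the last ACK */
};

void ack_avoid_init(struct ack_avoid *a, unsigned int max_segs_per_ack)
{
	a->limit = 2;		/* start at ack-every-other */
	a->ceiling = max_segs_per_ack < 2 ? 2 : max_segs_per_ack;
	a->pending = 0;
}

/* Called per in-order data segment; returns true when an ACK should go out. */
bool ack_avoid_on_segment(struct ack_avoid *a)
{
	if (++a->pending < a->limit)
		return false;
	a->pending = 0;
	if (a->limit < a->ceiling)
		a->limit++;	/* things look normal: ACK less often */
	return true;
}

/* Called when the standalone ACK timer had to send the ACK instead. */
void ack_avoid_on_timer(struct ack_avoid *a)
{
	a->pending = 0;
	a->ceiling = a->ceiling / 2 < 2 ? 2 : a->ceiling / 2;
	if (a->limit > a->ceiling)
		a->limit = a->ceiling;
}
```

Because the ceiling only ever shrinks, a connection that keeps hitting the timer ends up at two segments per ACK and stays there, which matches the "irrevocably falls back" remark.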

When there is little to no packet loss, and a rather regular stream of 
data, this works rather well indeed.  For example in a LAN or Data 
Center.  You can run netperf TCP_STREAM with the limit at the default, 
and with the limit set to two and see the considerable difference in 
service demand on either side.

This may not work well when the sender has a congestion window growth 
heuristic different from what the ACK avoidance heuristic assumes.  If I 
recall correctly, the heuristic in HP-UX assumes the sender grows cwnd 
by the number of bytes/segments ACKnowledged. If the sender grows the
cwnd by only one segment per ACK rather than by the bytes ACKnowledged
by the ACK, the growth of the cwnd will be slowed.  In a LAN that may be
papered-over a bit, but it will become quite noticeable in a higher RTT
environment.  Probably not as noticeable for a sufficiently short
connection, or a long one, but will be for ones in the middle.  The
short connection doesn't need much cwnd in the first place, and the 
heuristic works its way up to avoiding ACKs, and the long one will be 
long enough to have the ACK avoidance heuristic gravitate down to 
ack-every-other.
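
To put rough numbers on the cwnd-growth mismatch, here is a small slow-start model. It is a deliberately crude sketch: no loss, whole-segment arithmetic, every window fully ACKed within one round trip, and a function name (slow_start_rounds) invented for this illustration.

```c
/*
 * Count the round trips of slow start needed to grow cwnd from `initial`
 * to at least `target` segments when the receiver sends one ACK per
 * `segs_per_ack` segments. If `count_bytes` is nonzero, each ACK grows
 * cwnd by the number of segments it covers; otherwise each ACK grows
 * cwnd by a single segment.
 */
unsigned int slow_start_rounds(unsigned int initial, unsigned int target,
			       unsigned int segs_per_ack, int count_bytes)
{
	unsigned int cwnd = initial ? initial : 1;
	unsigned int rounds = 0;

	if (segs_per_ack == 0)
		segs_per_ack = 1;

	while (cwnd < target) {
		/* ACKs generated by one window's worth of segments. */
		unsigned int acks = (cwnd + segs_per_ack - 1) / segs_per_ack;

		cwnd += count_bytes ? cwnd : acks;
		rounds++;
	}
	return rounds;
}
```

Under these assumptions, growing from 4 to 64 segments with one ACK per 8 segments takes 4 round trips when the sender counts acknowledged data and 19 when it adds one segment per ACK; with ack-every-other and one segment per ACK it takes 7.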

rick jones


* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Ben Hutchings @ 2012-06-07 17:15 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, mst, linux-kernel, virtualization
In-Reply-To: <20120606075217.29081.30713.stgit@amd-6168-8-1.englab.nay.redhat.com>

On Wed, 2012-06-06 at 15:52 +0800, Jason Wang wrote:
> Statistics counters are useful for debugging and performance optimization, so this
> patch lets the virtio_net driver collect the following and export them to userspace
> through "ethtool -S":
> 
> - number of packets sent/received
> - number of bytes sent/received
> - number of callbacks for tx/rx
> - number of kicks for tx/rx
> - number of bytes/packets queued for tx
> 
> As virtnet_stats are per-cpu, both per-cpu and global statistics are
> collected like:
[...]

I would really like to see some sort of convention for presenting
per-queue statistics through ethtool.  At the moment we have a complete
mess of different formats:

bnx2x:    "[${index}]: ${name}"
be2net:   "${qtype}q${index}: ${name}"
ehea:     "PR${index} ${name}"
mlx4_en:  "${qtype}${index}_${name}"
myri10ge: dummy stat names as headings
niu:      dummy stat names as headings
s2io:     "ring_${index}_${name}"
vmxnet3:  dummy stat names as headings
vxge:     "${name}_${index}"; also dummy stat names as headings

And you're introducing yet another format!
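
To make the divergence concrete, the sketch below renders one per-queue counter in four of the naming patterns listed above. It is purely illustrative and is not taken from any of those drivers; the enum and the function format_queue_stat are hypothetical. The 32-byte buffer used in the usage note mirrors the kernel's ETH_GSTRING_LEN limit on stat names.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical labels for four of the patterns in the list above. */
enum stat_style { STYLE_BE2NET, STYLE_MLX4_EN, STYLE_S2IO, STYLE_VXGE };

/* Returns snprintf()'s result, or -1 for an unknown style. */
int format_queue_stat(char *buf, size_t len, enum stat_style style,
		      const char *qtype, unsigned int index, const char *name)
{
	switch (style) {
	case STYLE_BE2NET:	/* "${qtype}q${index}: ${name}" */
		return snprintf(buf, len, "%sq%u: %s", qtype, index, name);
	case STYLE_MLX4_EN:	/* "${qtype}${index}_${name}" */
		return snprintf(buf, len, "%s%u_%s", qtype, index, name);
	case STYLE_S2IO:	/* "ring_${index}_${name}" */
		return snprintf(buf, len, "ring_%u_%s", index, name);
	case STYLE_VXGE:	/* "${name}_${index}" */
		return snprintf(buf, len, "%s_%u", name, index);
	}
	return -1;
}
```

For qtype "rx", index 0 and name "packets" written into a char buf[32], the four styles yield "rxq0: packets", "rx0_packets", "ring_0_packets" and "packets_0", so a tool parsing "ethtool -S" output needs a separate rule per driver.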

(Additionally some of the drivers are playing games with spaces and tabs
to make ethtool indent the stats the way they like.  Ethtool statistics
are inconsistent enough already without drivers pulling that sort of
crap.

I'm inclined to make ethtool start stripping whitespace from stat names,
and *if* people can agree on a common format for per-queue statistic
names then I'll indent them *consistently*.  Also, I would make such
stats optional, so you don't get hundreds of lines of crap by default.)

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* [PATCH] ipv6: fib: Restore NTF_ROUTER exception in fib6_age()
From: Thomas Graf @ 2012-06-07 16:51 UTC (permalink / raw)
  To: davem; +Cc: netdev

Commit 5339ab8b1dd82 (ipv6: fib: Convert fib6_age() to
dst_neigh_lookup().) seems to have mistakenly inverted the
exception for cached NTF_ROUTER routes.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 net/ipv6/ip6_fib.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 0c220a4..74c21b9 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1561,7 +1561,7 @@ static int fib6_age(struct rt6_info *rt, void *arg)
 				neigh_flags = neigh->flags;
 				neigh_release(neigh);
 			}
-			if (neigh_flags & NTF_ROUTER) {
+			if (!(neigh_flags & NTF_ROUTER)) {
 				RT6_TRACE("purging route %p via non-router but gateway\n",
 					  rt);
 				return -1;
-- 
1.7.7.6


* Re: [v4 net-next PATCH 1/3] Added kernel support in EEE Ethtool commands
From: Ben Hutchings @ 2012-06-07 16:28 UTC (permalink / raw)
  To: Yuval Mintz; +Cc: davem, netdev, eilong, peppe.cavallaro
In-Reply-To: <1339038788-3447-2-git-send-email-yuvalmin@broadcom.com>

On Thu, 2012-06-07 at 06:13 +0300, Yuval Mintz wrote:
> This patch extends the kernel's ethtool interface by adding support
> for 2 new EEE commands - get_eee and set_eee.
> 
> Thanks goes to Giuseppe Cavallaro for his original patch adding this support.
> 
> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>

> ---
>  include/linux/ethtool.h |   35 +++++++++++++++++++++++++++++++++++
>  net/core/ethtool.c      |   40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 75 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index e17fa71..a518361 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -137,6 +137,35 @@ struct ethtool_eeprom {
>  };
>  
>  /**
> + * struct ethtool_eee - Energy Efficient Ethernet information
> + * @cmd: ETHTOOL_{G,S}EEE
> + * @supported: Mask of %SUPPORTED_* flags for the speed/duplex combinations
> + *	for which there is EEE support.
> + * @advertised: Mask of %ADVERTISED_* flags for the speed/duplex combinations
> + *	advertised as eee capable.
> + * @lp_advertised: Mask of %ADVERTISED_* flags for the speed/duplex
> + *	combinations advertised by the link partner as eee capable.
> + * @eee_active: Result of the eee auto negotiation.
> + * @eee_enabled: EEE configured mode (enabled/disabled).
> + * @tx_lpi_enabled: Whether the interface should assert its tx lpi, given
> + *	that eee was negotiated.
> + * @tx_lpi_timer: Time in microseconds the interface delays prior to asserting
> + *	its tx lpi (after reaching 'idle' state). Effective only when eee
> + *	was negotiated and tx_lpi_enabled was set.
> + */
> +struct ethtool_eee {
> +	__u32	cmd;
> +	__u32	supported;
> +	__u32	advertised;
> +	__u32	lp_advertised;
> +	__u32	eee_active;
> +	__u32	eee_enabled;
> +	__u32	tx_lpi_enabled;
> +	__u32	tx_lpi_timer;
> +	__u32	reserved[2];
> +};
> +
> +/**
>   * struct ethtool_modinfo - plugin module eeprom information
>   * @cmd: %ETHTOOL_GMODULEINFO
>   * @type: Standard the module information conforms to %ETH_MODULE_SFF_xxxx
> @@ -945,6 +974,8 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
>   * @get_module_info: Get the size and type of the eeprom contained within
>   *	a plug-in module.
>   * @get_module_eeprom: Get the eeprom information from the plug-in module
> + * @get_eee: Get Energy-Efficient (EEE) supported and status.
> + * @set_eee: Set EEE status (enable/disable) as well as LPI timers.
>   *
>   * All operations are optional (i.e. the function pointer may be set
>   * to %NULL) and callers must take this into account.  Callers must
> @@ -1011,6 +1042,8 @@ struct ethtool_ops {
>  				   struct ethtool_modinfo *);
>  	int     (*get_module_eeprom)(struct net_device *,
>  				     struct ethtool_eeprom *, u8 *);
> +	int	(*get_eee)(struct net_device *, struct ethtool_eee *);
> +	int	(*set_eee)(struct net_device *, struct ethtool_eee *);
>  
> 
>  };
> @@ -1089,6 +1122,8 @@ struct ethtool_ops {
>  #define ETHTOOL_GET_TS_INFO	0x00000041 /* Get time stamping and PHC info */
>  #define ETHTOOL_GMODULEINFO	0x00000042 /* Get plug-in module information */
>  #define ETHTOOL_GMODULEEEPROM	0x00000043 /* Get plug-in module eeprom */
> +#define ETHTOOL_GEEE		0x00000044 /* Get EEE settings */
> +#define ETHTOOL_SEEE		0x00000045 /* Set EEE settings */
>  
>  /* compatibility with older code */
>  #define SPARC_ETH_GSET		ETHTOOL_GSET
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index 9c2afb4..5a582da 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -729,6 +729,40 @@ static int ethtool_set_wol(struct net_device *dev, char __user *useraddr)
>  	return dev->ethtool_ops->set_wol(dev, &wol);
>  }
>  
> +static int ethtool_get_eee(struct net_device *dev, char __user *useraddr)
> +{
> +	struct ethtool_eee edata;
> +	int rc;
> +
> +	if (!dev->ethtool_ops->get_eee)
> +		return -EOPNOTSUPP;
> +
> +	memset(&edata, 0, sizeof(struct ethtool_eee));
> +	edata.cmd = ETHTOOL_GEEE;
> +	rc = dev->ethtool_ops->get_eee(dev, &edata);
> +
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(useraddr, &edata, sizeof(edata)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int ethtool_set_eee(struct net_device *dev, char __user *useraddr)
> +{
> +	struct ethtool_eee edata;
> +
> +	if (!dev->ethtool_ops->set_eee)
> +		return -EOPNOTSUPP;
> +
> +	if (copy_from_user(&edata, useraddr, sizeof(edata)))
> +		return -EFAULT;
> +
> +	return dev->ethtool_ops->set_eee(dev, &edata);
> +}
> +
>  static int ethtool_nway_reset(struct net_device *dev)
>  {
>  	if (!dev->ethtool_ops->nway_reset)
> @@ -1471,6 +1505,12 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>  		rc = ethtool_set_value_void(dev, useraddr,
>  				       dev->ethtool_ops->set_msglevel);
>  		break;
> +	case ETHTOOL_GEEE:
> +		rc = ethtool_get_eee(dev, useraddr);
> +		break;
> +	case ETHTOOL_SEEE:
> +		rc = ethtool_set_eee(dev, useraddr);
> +		break;
>  	case ETHTOOL_NWAY_RST:
>  		rc = ethtool_nway_reset(dev);
>  		break;

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* [PATCH] ARM: bpf_jit: BPF_S_ANC_ALU_XOR_X support
From: Mircea Gherzan @ 2012-06-07 15:40 UTC (permalink / raw)
  To: linux; +Cc: mgherzan, netdev, linux-arm-kernel, davem

JIT support for the XOR operation introduced by the commit
ffe06c17afbb.

Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
---
 arch/arm/net/bpf_jit_32.c |    5 +++++
 arch/arm/net/bpf_jit_32.h |    4 ++++
 2 files changed, 9 insertions(+)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 62135849..c641fb6 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -762,6 +762,11 @@ b_epilogue:
 			update_on_xread(ctx);
 			emit(ARM_MOV_R(r_A, r_X), ctx);
 			break;
+		case BPF_S_ANC_ALU_XOR_X:
+			/* A ^= X */
+			update_on_xread(ctx);
+			emit(ARM_EOR_R(r_A, r_A, r_X), ctx);
+			break;
 		case BPF_S_ANC_PROTOCOL:
 			/* A = ntohs(skb->protocol) */
 			ctx->seen |= SEEN_SKB;
diff --git a/arch/arm/net/bpf_jit_32.h b/arch/arm/net/bpf_jit_32.h
index 99ae5e3..7fa2f7d 100644
--- a/arch/arm/net/bpf_jit_32.h
+++ b/arch/arm/net/bpf_jit_32.h
@@ -68,6 +68,8 @@
 #define ARM_INST_CMP_R		0x01500000
 #define ARM_INST_CMP_I		0x03500000
 
+#define ARM_INST_EOR_R		0x00200000
+
 #define ARM_INST_LDRB_I		0x05d00000
 #define ARM_INST_LDRB_R		0x07d00000
 #define ARM_INST_LDRH_I		0x01d000b0
@@ -132,6 +134,8 @@
 #define ARM_CMP_R(rn, rm)	_AL3_R(ARM_INST_CMP, 0, rn, rm)
 #define ARM_CMP_I(rn, imm)	_AL3_I(ARM_INST_CMP, 0, rn, imm)
 
+#define ARM_EOR_R(rd, rn, rm)	_AL3_R(ARM_INST_EOR, rd, rn, rm)
+
 #define ARM_LDR_I(rt, rn, off)	(ARM_INST_LDR_I | (rt) << 12 | (rn) << 16 \
 				 | (off))
 #define ARM_LDRB_I(rt, rn, off)	(ARM_INST_LDRB_I | (rt) << 12 | (rn) << 16 \
-- 
1.7.10
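
As a cross-check of the new opcode constant, the sketch below assembles the full 32-bit encoding of a register-form EOR by hand. The field layout (condition in bits 31-28, Rn in bits 19-16, Rd in bits 15-12, Rm in bits 3-0) is the standard ARM data-processing encoding; how the JIT itself merges the condition code is not visible in the diff, so that part, along with the function name arm_eor_r, is an assumption made for illustration.

```c
#include <stdint.h>

#define ARM_COND_AL	0xeU		/* "always" condition code */
#define ARM_INST_EOR_R	0x00200000U	/* opcode bits, as in the patch */

/* Encode "eor rd, rn, rm" (no shift) as a 32-bit ARM instruction word. */
uint32_t arm_eor_r(unsigned int rd, unsigned int rn, unsigned int rm)
{
	return (ARM_COND_AL << 28) | ARM_INST_EOR_R |
	       ((uint32_t)rn << 16) | ((uint32_t)rd << 12) | (uint32_t)rm;
}
```

With rd = rn = 4 and rm = 5 (register numbers chosen only as an example) this yields 0xe0244005, which disassembles as "eor r4, r4, r5", the shape of the A ^= X operation the patch emits.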


* [PATCH] net: neighbour: fix neigh_dump_info()
From: Eric Dumazet @ 2012-06-07 14:58 UTC (permalink / raw)
  To: Denys Fedoryshchenko, David Miller; +Cc: netdev, Stephen Hemminger
In-Reply-To: <1339078935.5083.13.camel@edumazet-glaptop>

From: Eric Dumazet <edumazet@google.com>

Denys found out "ip neigh" output was truncated to
about 54 neighbours.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
---
 net/core/neighbour.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index eb09f8b..d81d026 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2219,9 +2219,7 @@ static int neigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
 	rcu_read_lock_bh();
 	nht = rcu_dereference_bh(tbl->nht);
 
-	for (h = 0; h < (1 << nht->hash_shift); h++) {
-		if (h < s_h)
-			continue;
+	for (h = s_h; h < (1 << nht->hash_shift); h++) {
 		if (h > s_h)
 			s_idx = 0;
 		for (n = rcu_dereference_bh(nht->hash_buckets[h]), idx = 0;
@@ -2260,9 +2258,7 @@ static int pneigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
 
 	read_lock_bh(&tbl->lock);
 
-	for (h = 0; h <= PNEIGH_HASHMASK; h++) {
-		if (h < s_h)
-			continue;
+	for (h = s_h; h <= PNEIGH_HASHMASK; h++) {
 		if (h > s_h)
 			s_idx = 0;
 		for (n = tbl->phash_buckets[h], idx = 0; n; n = n->next) {
@@ -2297,7 +2293,7 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 	struct neigh_table *tbl;
 	int t, family, s_t;
 	int proxy = 0;
-	int err = 0;
+	int err;
 
 	read_lock(&neigh_tbl_lock);
 	family = ((struct rtgenmsg *) nlmsg_data(cb->nlh))->rtgen_family;
@@ -2311,7 +2307,7 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 
 	s_t = cb->args[0];
 
-	for (tbl = neigh_tables, t = 0; tbl && (err >= 0);
+	for (tbl = neigh_tables, t = 0; tbl;
 	     tbl = tbl->next, t++) {
 		if (t < s_t || (family && tbl->family != family))
 			continue;
@@ -2322,6 +2318,8 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 			err = pneigh_dump_table(tbl, skb, cb);
 		else
 			err = neigh_dump_table(tbl, skb, cb);
+		if (err < 0)
+			break;
 	}
 	read_unlock(&neigh_tbl_lock);
 


* Re: tcp wifi upload performance and lots of ACKs
From: Ben Greear @ 2012-06-07 14:41 UTC (permalink / raw)
  To: David Laight; +Cc: Daniel Baluta, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6F3C@saturn3.aculab.com>

On 06/07/2012 05:20 AM, David Laight wrote:
>
>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org
>> [mailto:netdev-owner@vger.kernel.org] On Behalf Of Ben Greear
>> Sent: 07 June 2012 05:16
>> To: Daniel Baluta
>> Cc: netdev
>> Subject: Re: tcp wifi upload performance and lots of ACKs
>>
>> On 06/04/2012 12:22 PM, Daniel Baluta wrote:
>>> On Mon, Jun 4, 2012 at 9:29 PM, Ben Greear <greearb@candelatech.com> wrote:
>>>> I'm doing some TCP performance testing on wifi -> LAN interface
>>>> connections.  With UDP, we can get around 250Mbps of payload
>>>> throughput.  With TCP, max is about 80Mbps.
>>>>
>>>> I think the problem is that there are way too many ACK packets, and
>>>> bi-directional traffic on wifi interfaces really slows things down.
>>>> (About 7000 pkts per second in upload direction, 2000 pps download.
>>>> And the vast majority of the download pkts are 66 byte ACK pkts from
>>>> what I can tell.)
>>
>>> [1] http://marc.info/?l=linux-netdev&m=131983649130350&w=2
>>
>> After a bit more playing, I did notice a reliable 5% increase in
>> traffic (200Mbps ->  210Mbps) from changing the delack segments
>> to 20 from the default of 1.  That is enough to be useful to me,
>> and there may be more significant gains to be found...
>> I haven't done a full matrix of testing yet.
>
> Does this delaying of acks have a detrimental effect on the
> sending end?
> I've seen very bad interactions between delayed acks and
> (I believe) the 'slow start' code on connections with
> one-directional data, Nagle disabled and very low RTT.
>
> What I saw was the sender sending 4 data packets, then
> sitting waiting for an ack - in spite of accumulating
> several kB of data to send.
>
> Delaying acks further will only make this worse.

I'm sure it's not for everyone in all cases.  In my case, I'm
sending long-running bulk transfers, at high speeds, over a wifi network
which has some latency.  I have only tested one-way traffic so far.

With the patch and delayed acks, I get more sender throughput than
without (200Mbps -> 210Mbps).

Thanks,
Ben

>
> 	David
>
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* [PATCH net-next] be2net: Fix driver load for VFs for Lancer
From: Padmanabh Ratnakar @ 2012-06-07 14:37 UTC (permalink / raw)
  To: netdev; +Cc: Padmanabh Ratnakar

The permanent MAC is wrongly supplied in the create iface command. Call the
command with no MAC address; the MAC address is then queried and applied
afterwards.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_cmds.c |   21 +++---
 drivers/net/ethernet/emulex/benet/be_cmds.h |    8 +-
 drivers/net/ethernet/emulex/benet/be_main.c |   98 ++++++++++++++------------
 3 files changed, 66 insertions(+), 61 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index 8d06ea3..f899752 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -1132,7 +1132,7 @@ err:
  * Uses MCCQ
  */
 int be_cmd_if_create(struct be_adapter *adapter, u32 cap_flags, u32 en_flags,
-		u8 *mac, u32 *if_handle, u32 *pmac_id, u32 domain)
+		     u32 *if_handle, u32 domain)
 {
 	struct be_mcc_wrb *wrb;
 	struct be_cmd_req_if_create *req;
@@ -1152,17 +1152,13 @@ int be_cmd_if_create(struct be_adapter *adapter, u32 cap_flags, u32 en_flags,
 	req->hdr.domain = domain;
 	req->capability_flags = cpu_to_le32(cap_flags);
 	req->enable_flags = cpu_to_le32(en_flags);
-	if (mac)
-		memcpy(req->mac_addr, mac, ETH_ALEN);
-	else
-		req->pmac_invalid = true;
+
+	req->pmac_invalid = true;
 
 	status = be_mcc_notify_wait(adapter);
 	if (!status) {
 		struct be_cmd_resp_if_create *resp = embedded_payload(wrb);
 		*if_handle = le32_to_cpu(resp->interface_id);
-		if (mac)
-			*pmac_id = le32_to_cpu(resp->pmac_id);
 	}
 
 err:
@@ -2330,8 +2326,8 @@ err:
 }
 
 /* Uses synchronous MCCQ */
-int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
-			bool *pmac_id_active, u32 *pmac_id, u8 *mac)
+int be_cmd_get_mac_from_list(struct be_adapter *adapter, u8 *mac,
+			     bool *pmac_id_active, u32 *pmac_id, u8 domain)
 {
 	struct be_mcc_wrb *wrb;
 	struct be_cmd_req_get_mac_list *req;
@@ -2376,8 +2372,9 @@ int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
 						get_mac_list_cmd.va;
 		mac_count = resp->true_mac_count + resp->pseudo_mac_count;
 		/* Mac list returned could contain one or more active mac_ids
-		 * or one or more pseudo permanant mac addresses. If an active
-		 * mac_id is present, return first active mac_id found
+		 * or one or more true or pseudo permanant mac addresses.
+		 * If an active mac_id is present, return first active mac_id
+		 * found.
 		 */
 		for (i = 0; i < mac_count; i++) {
 			struct get_list_macaddr *mac_entry;
@@ -2396,7 +2393,7 @@ int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
 				goto out;
 			}
 		}
-		/* If no active mac_id found, return first pseudo mac addr */
+		/* If no active mac_id found, return first mac addr */
 		*pmac_id_active = false;
 		memcpy(mac, resp->macaddr_list[0].mac_addr_id.macaddr,
 								ETH_ALEN);
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.h b/drivers/net/ethernet/emulex/benet/be_cmds.h
index 9625bf4..2f6bb06 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.h
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.h
@@ -1664,8 +1664,7 @@ extern int be_cmd_pmac_add(struct be_adapter *adapter, u8 *mac_addr,
 extern int be_cmd_pmac_del(struct be_adapter *adapter, u32 if_id,
 			int pmac_id, u32 domain);
 extern int be_cmd_if_create(struct be_adapter *adapter, u32 cap_flags,
-			u32 en_flags, u8 *mac, u32 *if_handle, u32 *pmac_id,
-			u32 domain);
+			    u32 en_flags, u32 *if_handle, u32 domain);
 extern int be_cmd_if_destroy(struct be_adapter *adapter, int if_handle,
 			u32 domain);
 extern int be_cmd_eq_create(struct be_adapter *adapter,
@@ -1751,8 +1750,9 @@ extern int be_cmd_get_cntl_attributes(struct be_adapter *adapter);
 extern int be_cmd_req_native_mode(struct be_adapter *adapter);
 extern int be_cmd_get_reg_len(struct be_adapter *adapter, u32 *log_size);
 extern void be_cmd_get_regs(struct be_adapter *adapter, u32 buf_len, void *buf);
-extern int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
-				bool *pmac_id_active, u32 *pmac_id, u8 *mac);
+extern int be_cmd_get_mac_from_list(struct be_adapter *adapter, u8 *mac,
+				    bool *pmac_id_active, u32 *pmac_id,
+				    u8 domain);
 extern int be_cmd_set_mac_list(struct be_adapter *adapter, u8 *mac_array,
 						u8 mac_count, u32 domain);
 extern int be_cmd_set_hsw_config(struct be_adapter *adapter, u16 pvid,
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index f29827f..896f283 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -2601,8 +2601,8 @@ static int be_vf_setup(struct be_adapter *adapter)
 	cap_flags = en_flags = BE_IF_FLAGS_UNTAGGED | BE_IF_FLAGS_BROADCAST |
 				BE_IF_FLAGS_MULTICAST;
 	for_all_vfs(adapter, vf_cfg, vf) {
-		status = be_cmd_if_create(adapter, cap_flags, en_flags, NULL,
-					  &vf_cfg->if_handle, NULL, vf + 1);
+		status = be_cmd_if_create(adapter, cap_flags, en_flags,
+					  &vf_cfg->if_handle, vf + 1);
 		if (status)
 			goto err;
 	}
@@ -2642,29 +2642,43 @@ static void be_setup_init(struct be_adapter *adapter)
 	adapter->phy.forced_port_speed = -1;
 }
 
-static int be_add_mac_from_list(struct be_adapter *adapter, u8 *mac)
+static int be_get_mac_addr(struct be_adapter *adapter, u8 *mac, u32 if_handle,
+			   bool *active_mac, u32 *pmac_id)
 {
-	u32 pmac_id;
-	int status;
-	bool pmac_id_active;
+	int status = 0;
 
-	status = be_cmd_get_mac_from_list(adapter, 0, &pmac_id_active,
-							&pmac_id, mac);
-	if (status != 0)
-		goto do_none;
+	if (!is_zero_ether_addr(adapter->netdev->perm_addr)) {
+		memcpy(mac, adapter->netdev->dev_addr, ETH_ALEN);
+		if (!lancer_chip(adapter) && !be_physfn(adapter))
+			*active_mac = true;
+		else
+			*active_mac = false;
 
-	if (pmac_id_active) {
-		status = be_cmd_mac_addr_query(adapter, mac,
-				MAC_ADDRESS_TYPE_NETWORK,
-				false, adapter->if_handle, pmac_id);
+		return status;
+	}
 
-		if (!status)
-			adapter->pmac_id[0] = pmac_id;
+	if (lancer_chip(adapter)) {
+		status = be_cmd_get_mac_from_list(adapter, mac,
+						  active_mac, pmac_id, 0);
+		if (*active_mac) {
+			status = be_cmd_mac_addr_query(adapter, mac,
+						       MAC_ADDRESS_TYPE_NETWORK,
+						       false, if_handle,
+						       *pmac_id);
+		}
+	} else if (be_physfn(adapter)) {
+		/* For BE3, for PF get permanent MAC */
+		status = be_cmd_mac_addr_query(adapter, mac,
+					       MAC_ADDRESS_TYPE_NETWORK, true,
+					       0, 0);
+		*active_mac = false;
 	} else {
-		status = be_cmd_pmac_add(adapter, mac,
-				adapter->if_handle, &adapter->pmac_id[0], 0);
+		/* For BE3, for VF get soft MAC assigned by PF*/
+		status = be_cmd_mac_addr_query(adapter, mac,
+					       MAC_ADDRESS_TYPE_NETWORK, false,
+					       if_handle, 0);
+		*active_mac = true;
 	}
-do_none:
 	return status;
 }
 
@@ -2685,12 +2699,12 @@ static int be_get_config(struct be_adapter *adapter)
 
 static int be_setup(struct be_adapter *adapter)
 {
-	struct net_device *netdev = adapter->netdev;
 	struct device *dev = &adapter->pdev->dev;
 	u32 cap_flags, en_flags;
 	u32 tx_fc, rx_fc;
 	int status;
 	u8 mac[ETH_ALEN];
+	bool active_mac;
 
 	be_setup_init(adapter);
 
@@ -2716,14 +2730,6 @@ static int be_setup(struct be_adapter *adapter)
 	if (status)
 		goto err;
 
-	memset(mac, 0, ETH_ALEN);
-	status = be_cmd_mac_addr_query(adapter, mac, MAC_ADDRESS_TYPE_NETWORK,
-			true /*permanent */, 0, 0);
-	if (status)
-		return status;
-	memcpy(adapter->netdev->dev_addr, mac, ETH_ALEN);
-	memcpy(adapter->netdev->perm_addr, mac, ETH_ALEN);
-
 	en_flags = BE_IF_FLAGS_UNTAGGED | BE_IF_FLAGS_BROADCAST |
 			BE_IF_FLAGS_MULTICAST | BE_IF_FLAGS_PASS_L3L4_ERRORS;
 	cap_flags = en_flags | BE_IF_FLAGS_MCAST_PROMISCUOUS |
@@ -2733,27 +2739,29 @@ static int be_setup(struct be_adapter *adapter)
 		cap_flags |= BE_IF_FLAGS_RSS;
 		en_flags |= BE_IF_FLAGS_RSS;
 	}
+
 	status = be_cmd_if_create(adapter, cap_flags, en_flags,
-			netdev->dev_addr, &adapter->if_handle,
-			&adapter->pmac_id[0], 0);
+				  &adapter->if_handle, 0);
 	if (status != 0)
 		goto err;
 
-	 /* The VF's permanent mac queried from card is incorrect.
-	  * For BEx: Query the mac configued by the PF using if_handle
-	  * For Lancer: Get and use mac_list to obtain mac address.
-	  */
-	if (!be_physfn(adapter)) {
-		if (lancer_chip(adapter))
-			status = be_add_mac_from_list(adapter, mac);
-		else
-			status = be_cmd_mac_addr_query(adapter, mac,
-					MAC_ADDRESS_TYPE_NETWORK, false,
-					adapter->if_handle, 0);
-		if (!status) {
-			memcpy(adapter->netdev->dev_addr, mac, ETH_ALEN);
-			memcpy(adapter->netdev->perm_addr, mac, ETH_ALEN);
-		}
+	memset(mac, 0, ETH_ALEN);
+	active_mac = false;
+	status = be_get_mac_addr(adapter, mac, adapter->if_handle,
+				 &active_mac, &adapter->pmac_id[0]);
+	if (status != 0)
+		goto err;
+
+	if (!active_mac) {
+		status = be_cmd_pmac_add(adapter, mac, adapter->if_handle,
+					 &adapter->pmac_id[0], 0);
+		if (status != 0)
+			goto err;
+	}
+
+	if (is_zero_ether_addr(adapter->netdev->dev_addr)) {
+		memcpy(adapter->netdev->dev_addr, mac, ETH_ALEN);
+		memcpy(adapter->netdev->perm_addr, mac, ETH_ALEN);
 	}
 
 	status = be_tx_qs_create(adapter);
-- 
1.6.0.2


* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Eric Dumazet @ 2012-06-07 14:25 UTC (permalink / raw)
  To: Grant Edwards; +Cc: netdev
In-Reply-To: <jqqd4k$i2c$1@dough.gmane.org>

On Thu, 2012-06-07 at 14:16 +0000, Grant Edwards wrote:

> I was merely pointing out that the API was indeed documented that way.

Good, you are right and we were wrong.

Hopefully you can still use linux-2


* Re: ip neigh output are incomplete, 3.4.1
From: Eric Dumazet @ 2012-06-07 14:22 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev, Stephen Hemminger
In-Reply-To: <ea0a16e085389e68b480c4b0baf05645@visp.net.lb>

On Thu, 2012-06-07 at 16:09 +0300, Denys Fedoryshchenko wrote:
> I have a host with large L2 network (around 100 L2TP tunnels bridged to 
> one interface). 3.4.1 kernel, x86, 32bit.
> 
> ip route add 172.16.0.0/16 dev br0
> 
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 2
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 3575
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 4613
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 5117
> 
> And at same time
> GlobalNAT /config # ip neigh |wc -l
> 52

Thanks for the report, I am testing a fix and will send a patch ASAP.


* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Grant Edwards @ 2012-06-07 14:16 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1339077710.5083.12.camel@edumazet-glaptop>

On Thu, Jun 07, 2012 at 04:01:50PM +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 13:23 +0000, Grant Edwards wrote:
> > On 2012-06-06, David Miller <davem@davemloft.net> wrote:

> > > It was never a formal API that we would only allocate 'size'
> > > amount of tailroom.
> >
> > How can you say that?

> Documentation was stale, so what ?

So there _was_ a formal API that said you would only allocate 'size'
amount of tailroom.  That's what.

> kmalloc(99) doesn't allocate 99 bytes but 128, so what?

Doing so violated the documented API.

You said there was never any API definition that said tailroom() ==
requested size, and implied that it was stupid to write code that
expected tailroom() == requested size.

I was merely pointing out that the API was indeed documented that way.

> Grant, what about you fix your code ?

I did.

And the API documentation has now been fixed as well, but don't try to
tell me that the API documentation didn't promise to work the way my
code expected it to work.

-- 
Grant Edwards               grant.b.edwards        Yow! Youth of today!
                                  at               Join me in a mass rally
                              gmail.com            for traditional mental
                                                   attitudes!


* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Eric Dumazet @ 2012-06-07 14:01 UTC (permalink / raw)
  To: Grant Edwards; +Cc: netdev
In-Reply-To: <jqqa1b$kug$1@dough.gmane.org>

On Thu, 2012-06-07 at 13:23 +0000, Grant Edwards wrote:
> On 2012-06-06, David Miller <davem@davemloft.net> wrote:
> > From: Grant Edwards <grant.b.edwards@gmail.com>
> > Date: Wed, 6 Jun 2012 18:59:19 +0000 (UTC)
> >
> >> At the time it was written (probably 10+ years ago) it was relying on
> >> the documented API for alloc_skb() that stated alloc_skb() either
> >> returned an sk_buff of the requested size or it failed.
> >
> > It was never a formal API that we would only allocate 'size'
> > amount of tailroom.
> 
> How can you say that?

Documentation was stale, so what ?

kmalloc(99) doesn't allocate 99 bytes but 128, so what?

Grant, what about you fix your code ?

^ permalink raw reply

* Re: [PATCH (net.git) V2] stmmac: fix driver built w/ w/o both pci and platf modules
From: Fengguang Wu @ 2012-06-07 13:38 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: netdev, davem
In-Reply-To: <1339073322-23093-1-git-send-email-peppe.cavallaro@st.com>

Tested-by: Fengguang Wu <wfg@linux.intel.com>

Thanks!

^ permalink raw reply

* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Grant Edwards @ 2012-06-07 13:23 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20120606.120247.1618312724057709285.davem@davemloft.net>

On 2012-06-06, David Miller <davem@davemloft.net> wrote:
> From: Grant Edwards <grant.b.edwards@gmail.com>
> Date: Wed, 6 Jun 2012 18:59:19 +0000 (UTC)
>
>> At the time it was written (probably 10+ years ago) it was relying on
>> the documented API for alloc_skb() that stated alloc_skb() either
>> returned an sk_buff of the requested size or it failed.
>
> It was never a formal API that we would only allocate 'size'
> amount of tailroom.

How can you say that?

From skbuff.c:

    /**
     * __alloc_skb - allocate a network buffer
     * @size: size to allocate
     * @gfp_mask: allocation mask
     * @fclone: allocate from fclone cache instead of head cache
     *          and allocate a cloned (child) skb
     * @node: numa node to allocate memory on
     *
>>>  * Allocate a new &sk_buff. The returned buffer has no headroom and a
>>>  * tail room of size bytes. The object has a reference count of one.
     * The return is the buffer. On a failure the return is %NULL.
     *
     * Buffers may only be allocated from interrupts using a @gfp_mask of
     * %GFP_ATOMIC.
     */
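
To illustrate why code should not size its writes from the reported
tailroom, here is a minimal userspace sketch. It assumes, as a
simplification, that the allocator rounds every request up to a
power-of-two size class (real kmalloc size classes differ in detail).
model_alloc_size() and model_safe_len() are hypothetical names, not
kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not kernel code: round a request up to the next
 * power-of-two size class, starting at 32 bytes (an assumed minimum).
 */
static size_t model_alloc_size(size_t requested)
{
    size_t cls = 32;

    /* Keep the model's inputs in a range where the shift cannot overflow. */
    assert(requested > 0 && requested <= ((size_t)1 << 30));
    while (cls < requested)
        cls <<= 1;
    return cls;
}

/*
 * Number of bytes a caller should write into the buffer: bounded by what
 * it asked for, never by the (possibly larger) room actually available.
 */
static size_t model_safe_len(size_t requested, size_t tailroom)
{
    return requested < tailroom ? requested : tailroom;
}
```

In this model a request for 99 bytes lands in the 128-byte class, so a
caller that fills "all available tailroom" would write 29 bytes more
than it asked for, while model_safe_len(99, 128) still yields 99.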

-- 
Grant Edwards               grant.b.edwards        Yow! Did you move a lot of
                                  at               KOREAN STEAK KNIVES this
                              gmail.com            trip, Dingy?

^ permalink raw reply

* ip neigh output are incomplete, 3.4.1
From: Denys Fedoryshchenko @ 2012-06-07 13:09 UTC (permalink / raw)
  To: netdev, Stephen Hemminger

I have a host with a large L2 network (around 100 L2TP tunnels bridged to
one interface). Kernel 3.4.1, x86, 32-bit.

ip route add 172.16.0.0/16 dev br0

GlobalNAT ~ # cat /proc/net/arp |wc -l
2
GlobalNAT ~ # cat /proc/net/arp |wc -l
3575
GlobalNAT ~ # cat /proc/net/arp |wc -l
4613
GlobalNAT ~ # cat /proc/net/arp |wc -l
5117

And at the same time:
GlobalNAT /config # ip neigh |wc -l
52
GlobalNAT /config # ip neigh
172.16.1.94 dev br0 lladdr ea:2b:dd:8c:a6:96 REACHABLE
172.16.188.36 dev br0  INCOMPLETE
172.16.21.67 dev br0 lladdr aa:75:b5:04:01:1a REACHABLE
172.16.199.127 dev br0 lladdr 2e:64:5c:61:92:be REACHABLE
172.16.219.100 dev br0 lladdr 82:2f:33:ec:31:a9 REACHABLE
172.16.212.171 dev br0 lladdr be:d0:90:8a:97:35 REACHABLE
172.16.134.232 dev br0  FAILED
172.16.47.155 dev br0 lladdr 52:2e:4b:d1:d7:73 REACHABLE
172.16.67.128 dev br0 lladdr 22:c8:b2:a6:d8:17 REACHABLE
172.16.107.74 dev br0 lladdr 5a:ed:4f:35:32:94 REACHABLE
172.16.176.131 dev br0 lladdr b6:cf:5c:6e:84:88 REACHABLE
172.16.196.104 dev br0 lladdr e6:ec:2a:77:8d:ab REACHABLE
172.16.167.249 dev br0 lladdr be:13:c5:b2:be:f6 REACHABLE
172.16.49.108 dev br0 lladdr ea:97:c7:a9:a5:40 REACHABLE
172.16.71.34 dev br0 lladdr 2e:27:29:da:fc:2e REACHABLE
172.16.42.179 dev br0  INCOMPLETE
172.16.33.41 dev br0 lladdr 2e:64:5c:61:92:be REACHABLE
172.16.4.186 dev br0 lladdr 16:f9:a1:00:9d:c9 REACHABLE
172.16.104.51 dev br0 lladdr c6:1c:7e:2f:fe:1e REACHABLE
172.16.242.165 dev br0 lladdr 5a:ed:4f:35:32:94 REACHABLE
172.16.124.24 dev br0  INCOMPLETE
172.16.57.176 dev br0 lladdr 4e:70:0f:5c:d0:f3 REACHABLE
172.16.206.125 dev br0 lladdr 82:41:c6:78:56:36 REACHABLE
172.16.128.186 dev br0  INCOMPLETE
172.16.10.45 dev br0 lladdr 5e:15:f2:8d:8a:ab REACHABLE
172.16.21.136 dev br0 lladdr aa:75:b5:04:01:1a REACHABLE
172.16.61.82 dev br0 lladdr be:13:c5:b2:be:f6 REACHABLE
172.16.248.24 dev br0 lladdr 1a:f7:d5:2f:98:36 REACHABLE
172.16.239.142 dev br0 lladdr c6:1c:7e:2f:fe:1e REACHABLE
172.16.14.207 dev br0 lladdr 5e:15:f2:8d:8a:ab REACHABLE
172.16.134.45 dev br0 lladdr 8e:5a:26:d7:e6:ba REACHABLE
172.16.252.186 dev br0 lladdr 62:79:87:21:b4:46 REACHABLE
172.16.154.18 dev br0 lladdr 2a:39:28:65:80:37 REACHABLE
172.16.194.220 dev br0 lladdr ba:18:38:24:86:07 REACHABLE
172.16.76.79 dev br0 lladdr 62:25:4b:70:9e:f2 REACHABLE
172.16.234.166 dev br0 lladdr 6e:67:51:cd:a6:d4 REACHABLE
172.16.67.197 dev br0 lladdr 22:c8:b2:a6:d8:17 REACHABLE
172.16.49.177 dev br0 lladdr ea:97:c7:a9:a5:40 REACHABLE
172.16.118.234 dev br0 lladdr e2:3a:a3:0d:02:4d REACHABLE
172.16.169.15 dev br0  INCOMPLETE
172.16.100.214 dev br0 lladdr 56:a4:8f:1f:46:58 REACHABLE
172.16.71.103 dev br0 lladdr 2e:27:29:da:fc:2e REACHABLE
172.16.91.76 dev br0 lladdr d6:a0:18:9f:a3:21 REACHABLE
172.16.229.190 dev br0 lladdr d6:38:bd:40:02:4f REACHABLE
172.16.93.29 dev br0 lladdr 82:35:37:e8:11:c2 REACHABLE
172.16.113.2 dev br0 lladdr c6:5e:c6:f5:44:5e REACHABLE
172.16.104.120 dev br0 lladdr c6:1c:7e:2f:fe:1e REACHABLE
172.16.95.238 dev br0  INCOMPLETE
172.16.106.73 dev br0 lladdr 5a:ed:4f:35:32:94 REACHABLE
172.16.146.19 dev br0 lladdr ea:2b:dd:8c:a6:96 REACHABLE
172.16.235.49 dev br0 lladdr be:d0:90:8a:97:35 REACHABLE
172.16.255.22 dev br0  INCOMPLETE

"ip neigh show dev br0" shows the same, only 52 hosts.
Setting a larger rcvbuf does not help, for example:
ip -rcvbuf 1000000 ip neigh

Short sample from /proc/net/arp
172.16.41.227    0x1         0x2         5a:ed:4f:35:32:94     *        br0
172.16.61.200    0x1         0x2         be:13:c5:b2:be:f6     *        br0
172.16.239.4     0x1         0x2         c6:1c:7e:2f:fe:1e     *        br0
172.16.3.234     0x1         0x2         5a:ed:4f:35:32:94     *        br0
172.16.161.65    0x1         0x2         96:cf:d5:8a:85:7f     *        br0
172.16.14.69     0x1         0x2         5e:15:f2:8d:8a:ab     *        br0
172.16.212.102   0x1         0x2         be:d0:90:8a:97:35     *        br0
172.16.252.48    0x1         0x2         62:79:87:21:b4:46     *        br0
172.16.125.25    0x1         0x2         d6:3f:35:6c:69:a5     *        br0
172.16.56.224    0x1         0x2         d6:a0:18:9f:a3:21     *        br0
172.16.76.197    0x1         0x2         62:25:4b:70:9e:f2     *        br0
172.16.67.59     0x1         0x2         22:c8:b2:a6:d8:17     *        br0
172.16.156.89    0x1         0x2         6e:bd:24:97:4f:fb     *        br0
172.16.107.5     0x1         0x2         5a:ed:4f:35:32:94     *        br0
172.16.245.119   0x1         0x2         5a:1d:72:ba:39:1c     *        br0
172.16.176.62    0x1         0x2         b6:cf:5c:6e:84:88     *        br0
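
When comparing the two counts, note that "wc -l" on /proc/net/arp also
counts the header line. A minimal sketch of counting only the entries,
assuming the usual layout of one header line followed by one line per
neighbour; count_arp_entries() is a hypothetical helper, not part of
iproute2 or the kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count neighbour entries in text laid out like /proc/net/arp.
 * Assumption: the first non-empty line is the column header and every
 * further non-empty line describes exactly one neighbour.
 */
static size_t count_arp_entries(const char *text)
{
    size_t lines = 0;
    const char *p = text;

    assert(text != NULL);
    while (*p) {
        const char *nl = strchr(p, '\n');
        size_t len = nl ? (size_t)(nl - p) : strlen(p);

        if (len > 0)
            lines++;        /* non-empty line */
        p += len;           /* now at the newline or at the end */
        if (nl)
            p++;            /* step over the newline */
    }
    /* Drop the header line, if there was one. */
    return lines > 0 ? lines - 1 : 0;
}
```

By that reading, the 5117 lines reported above would correspond to 5116
entries, still far more than the 52 that "ip neigh" prints.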

---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.

^ permalink raw reply

* Re: NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out
From: Thomas Meyer @ 2012-06-07 12:37 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Eric Dumazet, Linux Kernel Mailing List, jcliburn, chris.snook,
	netdev, Josh Boyer
In-Reply-To: <20120606003856.GA7839@burratino>

On Tuesday, 05.06.2012 at 19:38 -0500, Jonathan Nieder wrote:
> In February, 2012, Thomas Meyer wrote:
> > On Friday, 24.02.2012 at 20:20 +0100, Eric Dumazet wrote:
> 
> >> Here is a cumulative patch to hopefully remove the races in this driver,
> >> could you please test it?
> [...]
> > just building a 3.2.7 kernel with your patch applied. I will watch out
> > for the warning in the next days.
> 
> Well, did it work? :)

Hi Jonathan,

no it didn't. I still get these warnings.

with kind regards
thomas

> 
> In suspense,
> Jonathan

^ permalink raw reply

* Re: Difficulties to get 1Gbps on be2net ethernet card
From: Jean-Michel Hautbois @ 2012-06-07 12:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Sathya.Perla, netdev
In-Reply-To: <1339072289.3494.3.camel@edumazet-glaptop>

2012/6/7 Eric Dumazet <eric.dumazet@gmail.com>:
> On Thu, 2012-06-07 at 14:27 +0200, Jean-Michel Hautbois wrote:
>
>> I ran some tests, and I forgot to mention that I am using the bonding
>> driver on top of my Ethernet drivers (be2net, mlx4, etc.).
>> When I am using bonding, I need a big txqueuelen in order to send 2.4Gbps.
>> When I disable bonding and use the NIC directly, I don't see any
>> drops in the qdisc and it works well.
>> So I think something changed in the bonding driver between 2.6.26 and
>> 3.0 that causes this issue.
>>
>
> What does your bond configuration look like?
>
> cat /proc/net/bonding/bond0
cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 100
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 4000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 68:b5:99:b9:8d:d4
Slave queue ID: 0

Slave Interface: eth9
MII Status: up
Speed: 4000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 78:e7:d1:68:bb:38
Slave queue ID: 0

> ifconfig -a
bond0     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d0
          inet addr:192.168.250.11  Bcast:192.168.250.255  Mask:255.255.255.0
          inet6 addr: fe80::6ab5:99ff:feb9:8dd0/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:6570 errors:0 dropped:74 overruns:0 frame:0
          TX packets:5208 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:900993 (879.8 KiB)  TX bytes:863735 (843.4 KiB)

bond1     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
          inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::6ab5:99ff:feb9:8dd4/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:4096  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)

bond2     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d1
          inet addr:10.11.17.190  Bcast:10.11.17.255  Mask:255.255.255.128
          inet6 addr: fe80::6ab5:99ff:feb9:8dd1/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:1301996 errors:0 dropped:27 overruns:0 frame:0
          TX packets:959 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1760302182 (1.6 GiB)  TX bytes:502828 (491.0 KiB)

bond3     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d5
          inet6 addr: fe80::6ab5:99ff:feb9:8dd5/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:942641 errors:0 dropped:0 overruns:0 frame:0
          TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1278313720 (1.1 GiB)  TX bytes:2616 (2.5 KiB)

bond4     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d2
          inet addr:192.168.202.1  Bcast:192.168.202.255  Mask:255.255.255.0
          inet6 addr: fe80::6ab5:99ff:feb9:8dd2/64 Scope:Link
          UP BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:90 (90.0 B)

bond5     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d6
          inet addr:192.168.203.1  Bcast:192.168.203.255  Mask:255.255.255.0
          inet6 addr: fe80::6ab5:99ff:feb9:8dd6/64 Scope:Link
          UP BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:90 (90.0 B)

bond6     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d3
          inet6 addr: fe80::6ab5:99ff:feb9:8dd3/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:269 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:25531 (24.9 KiB)  TX bytes:1046 (1.0 KiB)

bond7     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d7
          UP BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

bond3.4   Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d5
          inet addr:192.168.201.1  Bcast:192.168.201.255  Mask:255.255.255.0
          inet6 addr: fe80::6ab5:99ff:feb9:8dd5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:942641 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1265116746 (1.1 GiB)  TX bytes:1980 (1.9 KiB)

bond6.7   Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d3
          inet addr:192.168.204.1  Bcast:192.168.204.255  Mask:255.255.255.0
          inet6 addr: fe80::6ab5:99ff:feb9:8dd3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:269 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:21765 (21.2 KiB)  TX bytes:468 (468.0 B)

eth0      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:6496 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5208 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:896849 (875.8 KiB)  TX bytes:863735 (843.4 KiB)

eth1      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)

eth2      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d1
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:1301996 errors:0 dropped:27 overruns:0 frame:0
          TX packets:959 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1760302182 (1.6 GiB)  TX bytes:502828 (491.0 KiB)

eth3      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d5
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth4      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d2
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth5      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d6
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth6      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d3
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:269 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:25531 (24.9 KiB)  TX bytes:1046 (1.0 KiB)

eth7      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d7
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth8      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:74 errors:0 dropped:74 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4144 (4.0 KiB)  TX bytes:0 (0.0 B)

eth9      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth10     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d1
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth11     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d5
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:942641 errors:0 dropped:0 overruns:0 frame:0
          TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1278313720 (1.1 GiB)  TX bytes:2616 (2.5 KiB)

eth12     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d2
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:90 (90.0 B)

eth13     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d6
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:90 (90.0 B)

eth14     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d3
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth15     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d7
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:667883 errors:0 dropped:0 overruns:0 frame:0
          TX packets:667883 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:109537849 (104.4 MiB)  TX bytes:109537849 (104.4 MiB)


> tc -s -d qdisc
 tc -s -d qdisc
qdisc mq 0: dev eth0 root
 Sent 873668 bytes 5267 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth1 root
 Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0 requeues 17480)
 backlog 0b 0p requeues 17480
qdisc mq 0: dev eth2 root
 Sent 516248 bytes 983 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth3 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth4 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth5 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth6 root
 Sent 1022 bytes 13 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth7 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth8 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth9 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth10 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth11 root
 Sent 2448 bytes 40 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth12 root
 Sent 90 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth13 root
 Sent 90 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth14 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev eth15 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
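
The eth1 counters above can be put in proportion with a little
arithmetic. This is only a sketch: the helper names are hypothetical,
and it assumes that "offered" packets are the sum of the sent and
dropped counters reported by tc.

```c
#include <assert.h>

/*
 * Percentage (rounded down) of offered packets that the qdisc dropped.
 * Assumption: offered = sent + dropped.
 */
static unsigned int qdisc_drop_percent(unsigned long long sent_pkts,
                                       unsigned long long dropped_pkts)
{
    unsigned long long offered = sent_pkts + dropped_pkts;

    if (offered == 0)
        return 0;
    return (unsigned int)(dropped_pkts * 100ULL / offered);
}

/* Average size in bytes (rounded down) of the packets that were sent. */
static unsigned long long qdisc_avg_pkt_bytes(unsigned long long sent_bytes,
                                              unsigned long long sent_pkts)
{
    assert(sent_pkts > 0);
    return sent_bytes / sent_pkts;
}
```

With the eth1 numbers (15215387 packets sent, 45683472 dropped,
61476524359 bytes) this gives a drop rate of about 75% and an average
packet size of about 4040 bytes, in line with the MTU of 4096 on bond1.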

^ permalink raw reply

* Re: NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out
From: Eric Dumazet @ 2012-06-07 12:52 UTC (permalink / raw)
  To: Thomas Meyer
  Cc: Jonathan Nieder, Linux Kernel Mailing List, jcliburn, chris.snook,
	netdev, Josh Boyer
In-Reply-To: <1339072653.3018.9.camel@localhost.localdomain>

On Thu, 2012-06-07 at 14:37 +0200, Thomas Meyer wrote:
> On Tuesday, 05.06.2012 at 19:38 -0500, Jonathan Nieder wrote:
> > In February, 2012, Thomas Meyer wrote:
> > > On Friday, 24.02.2012 at 20:20 +0100, Eric Dumazet wrote:
> > 
> > >> Here is a cumulative patch to hopefully remove the races in this driver,
> > >> could you please test it?
> > [...]
> > > just building a 3.2.7 kernel with your patch applied. I will watch out
> > > for the warning in the next days.
> > 
> > Well, did it work? :)
> 
> Hi Jonathan,
> 
> no it didn't. I still get these warnings.

I sent another patch today, you might try it ;)

https://lkml.org/lkml/2012/6/7/143

^ permalink raw reply

* [PATCH] e1000: save skb counts in TX to avoid cache misses
From: Roman Kagan @ 2012-06-07 12:49 UTC (permalink / raw)
  To: stable
  Cc: Roman Kagan, Jeff Kirsher, Jesse Brandeburg, Bruce Allan,
	Carolyn Wyborny, Don Skidmore, Greg Rose, PJ Waskiewicz,
	Alex Duyck, John Ronciak, Dean Nelson, David S. Miller,
	e1000-devel, netdev, linux-kernel

[Upstream commit 31c15a2f24ebdab14333d9bf5df49757842ae2ec with paths
adjusted to compensate for the drivers/net/ethernet/intel reorg in
dee1ad47f2ee75f5146d83ca757c1b7861c34c3b]

Author: Dean Nelson <dnelson@redhat.com>
Date:   Thu Aug 25 14:39:24 2011 +0000

    e1000: save skb counts in TX to avoid cache misses

    Virtual Machines with emulated e1000 network adapter running on Parallels'
    server were seeing kernel panics due to the e1000 driver dereferencing an
    unexpected NULL pointer retrieved from buffer_info->skb.

    The problem has been addressed for the e1000e driver, but not for the e1000.
    Since the two drivers share similar code in the affected area, a port of the
    following e1000e driver commit solves the issue for the e1000 driver:

    commit 9ed318d546a29d7a591dbe648fd1a2efe3be1180
    Author: Tom Herbert <therbert@google.com>
    Date:   Wed May 5 14:02:27 2010 +0000

        e1000e: save skb counts in TX to avoid cache misses

        In e1000_tx_map, precompute number of segments and bytecounts which
        are derived from fields in skb; these are stored in buffer_info.  When
        cleaning tx in e1000_clean_tx_irq use the values in the associated
        buffer_info for statistics counting, this eliminates cache misses
        on skb fields.

    Signed-off-by: Dean Nelson <dnelson@redhat.com>
    Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
---
 drivers/net/e1000/e1000.h      |    2 ++
 drivers/net/e1000/e1000_main.c |   18 +++++++++---------
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/e1000/e1000.h b/drivers/net/e1000/e1000.h
index 8676899..2c71884 100644
--- a/drivers/net/e1000/e1000.h
+++ b/drivers/net/e1000/e1000.h
@@ -150,6 +150,8 @@ struct e1000_buffer {
 	unsigned long time_stamp;
 	u16 length;
 	u16 next_to_watch;
+	unsigned int segs;
+	unsigned int bytecount;
 	u16 mapped_as_page;
 };
 
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 76e8af0..99525f9 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2798,7 +2798,7 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 	struct e1000_buffer *buffer_info;
 	unsigned int len = skb_headlen(skb);
 	unsigned int offset = 0, size, count = 0, i;
-	unsigned int f;
+	unsigned int f, bytecount, segs;
 
 	i = tx_ring->next_to_use;
 
@@ -2899,7 +2899,13 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 		}
 	}
 
+	segs = skb_shinfo(skb)->gso_segs ?: 1;
+	/* multiply data chunks by size of headers */
+	bytecount = ((segs - 1) * skb_headlen(skb)) + skb->len;
+
 	tx_ring->buffer_info[i].skb = skb;
+	tx_ring->buffer_info[i].segs = segs;
+	tx_ring->buffer_info[i].bytecount = bytecount;
 	tx_ring->buffer_info[first].next_to_watch = i;
 
 	return count;
@@ -3573,14 +3579,8 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter,
 			cleaned = (i == eop);
 
 			if (cleaned) {
-				struct sk_buff *skb = buffer_info->skb;
-				unsigned int segs, bytecount;
-				segs = skb_shinfo(skb)->gso_segs ?: 1;
-				/* multiply data chunks by size of headers */
-				bytecount = ((segs - 1) * skb_headlen(skb)) +
-				            skb->len;
-				total_tx_packets += segs;
-				total_tx_bytes += bytecount;
+				total_tx_packets += buffer_info->segs;
+				total_tx_bytes += buffer_info->bytecount;
 			}
 			e1000_unmap_and_free_tx_resource(adapter, buffer_info);
 			tx_desc->upper.data = 0;
-- 
1.7.10.2
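
As a worked example of the accounting that the patch moves into
e1000_tx_map(), here is a minimal userspace sketch. struct
model_tx_counts and model_tx_counts_from() are hypothetical names; the
arithmetic mirrors the two added lines, with plain integers standing in
for skb_shinfo(skb)->gso_segs, skb_headlen(skb) and skb->len.

```c
#include <assert.h>

/* Hypothetical stand-in for the two fields added to struct e1000_buffer. */
struct model_tx_counts {
    unsigned int segs;
    unsigned int bytecount;
};

/*
 * Compute the per-skb statistics once, at map time, so that the clean-up
 * path does not need to touch the skb again.
 */
static struct model_tx_counts model_tx_counts_from(unsigned int gso_segs,
                                                   unsigned int headlen,
                                                   unsigned int len)
{
    struct model_tx_counts c;

    assert(len >= headlen);             /* the linear part is part of len */
    c.segs = gso_segs ? gso_segs : 1;   /* non-GSO skbs count as 1 segment */
    /* every segment after the first repeats the header bytes on the wire */
    c.bytecount = (c.segs - 1) * headlen + len;
    return c;
}
```

For a non-GSO frame of 1514 bytes this yields 1 segment and 1514 bytes.
For a GSO skb with 3 segments, a 66-byte linear part and a total length
of 4410, it yields 3 segments and 2 * 66 + 4410 = 4542 bytes.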

^ permalink raw reply related

* [PATCH (net.git) V2] stmmac: fix driver built w/ w/o both pci and platf modules
From: Giuseppe CAVALLARO @ 2012-06-07 12:48 UTC (permalink / raw)
  To: netdev; +Cc: wfg, davem, Giuseppe Cavallaro
In-Reply-To: <1339062803-24273-1-git-send-email-peppe.cavallaro@st.com>

Commit ba27ec66ffeb78cbf fixed the driver's Kconfig for modular
builds, so that the PCI and Platform modules can be selected
independently and are no longer mutually exclusive. This patch
makes sure the driver builds on all platforms, with or without
PCI, whichever of the two stmmac bus supports are selected.
module_init fails only if both the PCI and the platform
registration fail, whether because of the configuration or
because of a registration error.

v2: enable CONFIG_STMMAC_PLATFORM by default.
I've just noticed that this helps on some
configurations that don't enable any STMMAC
options by default (e.g. SPEAr).

Reported-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig       |    1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac.h      |   60 ++++++++++++++++++++-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |   23 +++++----
 3 files changed, 72 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 0076f77..9f44827 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -15,6 +15,7 @@ if STMMAC_ETH
 config STMMAC_PLATFORM
 	bool "STMMAC Platform bus support"
 	depends on STMMAC_ETH
+	default y
 	---help---
 	  This selects the platform specific bus support for
 	  the stmmac device driver. This is the driver used
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 6d07ba2..e8afe7b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -26,6 +26,7 @@
 #include <linux/clk.h>
 #include <linux/stmmac.h>
 #include <linux/phy.h>
+#include <linux/pci.h>
 #include "common.h"
 #ifdef CONFIG_STMMAC_TIMER
 #include "stmmac_timer.h"
@@ -95,8 +96,6 @@ extern int stmmac_mdio_register(struct net_device *ndev);
 extern void stmmac_set_ethtool_ops(struct net_device *netdev);
 extern const struct stmmac_desc_ops enh_desc_ops;
 extern const struct stmmac_desc_ops ndesc_ops;
-extern struct pci_driver stmmac_pci_driver;
-extern struct platform_driver stmmac_pltfr_driver;
 int stmmac_freeze(struct net_device *ndev);
 int stmmac_restore(struct net_device *ndev);
 int stmmac_resume(struct net_device *ndev);
@@ -144,3 +143,60 @@ static inline int stmmac_clk_get(struct stmmac_priv *priv)
 	return 0;
 }
 #endif /* CONFIG_HAVE_CLK */
+
+
+#ifdef CONFIG_STMMAC_PLATFORM
+extern struct platform_driver stmmac_pltfr_driver;
+static inline int stmmac_register_platform(void)
+{
+	int err;
+
+	err = platform_driver_register(&stmmac_pltfr_driver);
+	if (err)
+		pr_err("stmmac: failed to register the platform driver\n");
+
+	return err;
+}
+static inline void stmmac_unregister_platform(void)
+{
+	platform_driver_unregister(&stmmac_pltfr_driver);
+}
+#else
+static inline int stmmac_register_platform(void)
+{
+	pr_err("stmmac: do not register the platf driver\n");
+
+	return -EINVAL;
+}
+static inline void stmmac_unregister_platform(void)
+{
+}
+#endif /* CONFIG_STMMAC_PLATFORM */
+
+#ifdef CONFIG_STMMAC_PCI
+extern struct pci_driver stmmac_pci_driver;
+static inline int stmmac_register_pci(void)
+{
+	int err;
+
+	err = pci_register_driver(&stmmac_pci_driver);
+	if (err)
+		pr_err("stmmac: failed to register the PCI driver\n");
+
+	return err;
+}
+static inline void stmmac_unregister_pci(void)
+{
+	pci_unregister_driver(&stmmac_pci_driver);
+}
+#else
+static inline int stmmac_register_pci(void)
+{
+	pr_err("stmmac: do not register the PCI driver\n");
+
+	return -EINVAL;
+}
+static inline void stmmac_unregister_pci(void)
+{
+}
+#endif /* CONFIG_STMMAC_PCI */
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3638569..51b3b68 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -42,7 +42,6 @@
 #include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/prefetch.h>
-#include <linux/pci.h>
 #ifdef CONFIG_STMMAC_DEBUG_FS
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
@@ -2094,25 +2093,29 @@ int stmmac_restore(struct net_device *ndev)
 }
 #endif /* CONFIG_PM */
 
+/* Driver can be built with the PCI driver, the Platform driver, or both,
+ * depending on the configuration selected.
+ */
 static int __init stmmac_init(void)
 {
-	int err = 0;
+	int err_plt = 0;
+	int err_pci = 0;
 
-	err = platform_driver_register(&stmmac_pltfr_driver);
+	err_plt = stmmac_register_platform();
+	err_pci = stmmac_register_pci();
 
-	if (!err) {
-		err = pci_register_driver(&stmmac_pci_driver);
-		if (err)
-			platform_driver_unregister(&stmmac_pltfr_driver);
+	if ((err_pci) && (err_plt)) {
+		pr_err("stmmac: driver registration failed\n");
+		return -EINVAL;
 	}
 
-	return err;
+	return 0;
 }
 
 static void __exit stmmac_exit(void)
 {
-	pci_unregister_driver(&stmmac_pci_driver);
-	platform_driver_unregister(&stmmac_pltfr_driver);
+	stmmac_unregister_platform();
+	stmmac_unregister_pci();
 }
 
 module_init(stmmac_init);
-- 
1.7.4.4

* Re: [PATCH] sky2: fix checksum bit management on some chips
From: Kirill Smelkov @ 2012-06-07 12:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, Mirko Lindner, netdev
In-Reply-To: <20120606130130.5a86f94a@nehalam.linuxnetplumber.net>

On Wed, Jun 06, 2012 at 01:01:30PM -0700, Stephen Hemminger wrote:
> The newer flavors of Yukon II use a different method for receive
> checksum offload. This is indicated in the driver by the SKY2_HW_NEW_LE
> flag. On these newer chips, the BMU_ENA_RX_CHKSUM should not be set.
> 
> The driver would incorrectly toggle the bit, enabling the old
> checksum logic on these chips and causing a BUG_ON() assertion if
> receive checksum was toggled via ethtool.
> 
> Reported-by: Kirill Smelkov <kirr@mns.spb.ru>
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> ---
> Patch against net-next, please apply to net and stable kernels.
> 
> --- a/drivers/net/ethernet/marvell/sky2.c	2012-06-06 11:09:38.288440819 -0700
> +++ b/drivers/net/ethernet/marvell/sky2.c	2012-06-06 11:25:01.275782462 -0700
> @@ -4381,10 +4381,12 @@ static int sky2_set_features(struct net_
>  	struct sky2_port *sky2 = netdev_priv(dev);
>  	netdev_features_t changed = dev->features ^ features;
>  
> -	if (changed & NETIF_F_RXCSUM) {
> -		bool on = features & NETIF_F_RXCSUM;
> -		sky2_write32(sky2->hw, Q_ADDR(rxqaddr[sky2->port], Q_CSR),
> -			     on ? BMU_ENA_RX_CHKSUM : BMU_DIS_RX_CHKSUM);
> +	if ((changed & NETIF_F_RXCSUM) &&
> +	    !(sky2->hw->flags & SKY2_HW_NEW_LE)) {
> +		sky2_write32(sky2->hw,
> +			     Q_ADDR(rxqaddr[sky2->port], Q_CSR),
> +			     (features & NETIF_F_RXCSUM)
> +			     ? BMU_ENA_RX_CHKSUM : BMU_DIS_RX_CHKSUM);
>  	}
>  
>  	if (changed & NETIF_F_RXHASH)


Thanks Stephen, now that BUG_ON is gone.

* iwlwifi: kernel panic during boot due to module load order
From: Sasha Levin @ 2012-06-07 12:34 UTC (permalink / raw)
  To: donald.h.fry, emmanuel.grumbach, johannes.berg, linville,
	wey-yi.w.guy
  Cc: ilw, linux-wireless, netdev, linux-kernel@vger.kernel.org

Hi all,

Commit cc5f7e397 ("iwlwifi: implement dynamic opmode loading") causes a
kernel panic during boot in the following scenario:

1. All drivers are built-in.
2. Due to their build order, iwl_init gets called before iwl_drv_init.
3. iwl_init will call iwl_opmode_register which will iterate the new op
list, and cause a NULL ptr deref when trying to list_for_each_entry the
dev list, which won't be empty since it wasn't initialized (iwl_drv_init
wasn't called yet).

While it's possible to easily fix the actual deref, I suspect that the
init function call order is wrong. I've looked at getting it right in
the Makefile, but it seems to have specific ordering behind it, so I'd
rather not try patching it myself.
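
For illustration only, here is a minimal userspace sketch of the failure mode described above. It is not the iwlwifi code and not a proposed fix: struct list_head is re-declared locally, and every sketch_* name, runtime_list and static_list are hypothetical. It assumes the list head sits in zero-filled static storage until an init function runs, so its next pointer is NULL rather than pointing back at the head.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's struct list_head (userspace sketch). */
struct list_head {
	struct list_head *next, *prev;
};

/* Same idea as a static LIST_HEAD(): the head points at itself at build time. */
#define SKETCH_LIST_HEAD_INIT(name) { &(name), &(name) }

/* Hypothetical lists: one relies on a runtime init call, one does not. */
struct list_head runtime_list;	/* zero-filled until sketch_init_list() runs */
struct list_head static_list = SKETCH_LIST_HEAD_INIT(static_list);

/* Runtime initialisation, as an init function would perform it. */
void sketch_init_list(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Walk the list the way a list_for_each loop does and count the entries.
 * Returns -1 when the head was never initialised; the real kernel macros
 * have no such check and would dereference the NULL pointer instead.
 */
int sketch_count(const struct list_head *head)
{
	const struct list_head *pos;
	int n = 0;

	if (head->next == NULL)
		return -1;
	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/* Small self-check showing both states of a head. */
void sketch_demo(void)
{
	struct list_head local = { NULL, NULL };

	assert(sketch_count(&local) == -1);	/* walked before init */
	sketch_init_list(&local);
	assert(sketch_count(&local) == 0);	/* empty once initialised */
}
```

In this sketch the statically initialised head is safe to walk regardless of which init function runs first, while the runtime-initialised one is only safe after its init call, which is why the call order matters in the built-in case.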

Thanks,
Sasha.

* Re: Difficulties to get 1Gbps on be2net ethernet card
From: Eric Dumazet @ 2012-06-07 12:31 UTC (permalink / raw)
  To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev
In-Reply-To: <CAL8zT=g67JLZZjLF81+wxe9pmuPZ4EdpY1fHJMQNgcy9R6oBzg@mail.gmail.com>

On Thu, 2012-06-07 at 14:27 +0200, Jean-Michel Hautbois wrote:

> I made some tests, and I didn't mention it: I am using the bonding
> driver over my ethernet drivers (be2net/mlx4 etc.).
> When I am using bonding, I need a big txqueuelen in order to send 2.4Gbps.
> When I disable bonding, and use directly the NIC then I don't see any
> drops in qdisc and it works well.
> So, I think there is something between 2.6.26 and 3.0 in the bonding
> driver which causes this issue.
> 

What does your bond configuration look like?

cat /proc/net/bonding/bond0
ifconfig -a
tc -s -d qdisc
