Netdev List
* [PATCH] netdev: pasemi: fix return value check in pasemi_mac_phy_init()
From: Wei Yongjun @ 2012-09-27  5:51 UTC (permalink / raw)
  To: olof, grant.likely, rob.herring; +Cc: yongjun_wei, netdev, devicetree-discuss

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

In case of error, the function of_phy_connect() returns a NULL
pointer, not ERR_PTR(). The IS_ERR() test in the return value
check should be replaced with a NULL test.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
 drivers/net/ethernet/pasemi/pasemi_mac.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/pasemi/pasemi_mac.c b/drivers/net/ethernet/pasemi/pasemi_mac.c
index e559dfa..6fa74d5 100644
--- a/drivers/net/ethernet/pasemi/pasemi_mac.c
+++ b/drivers/net/ethernet/pasemi/pasemi_mac.c
@@ -1101,9 +1101,9 @@ static int pasemi_mac_phy_init(struct net_device *dev)
 	phydev = of_phy_connect(dev, phy_dn, &pasemi_adjust_link, 0,
 				PHY_INTERFACE_MODE_SGMII);
 
-	if (IS_ERR(phydev)) {
+	if (!phydev) {
 		printk(KERN_ERR "%s: Could not attach to phy\n", dev->name);
-		return PTR_ERR(phydev);
+		return -ENODEV;
 	}
 
 	mac->phydev = phydev;

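Editorial sketch: the patch above rests on the fact that IS_ERR() does
not treat NULL as an error. The following userspace approximation of the
kernel helpers illustrates this. The helper connect_returning_null() and
both init_* functions are hypothetical names for illustration only, and
the macros are simplified stand-ins for the kernel's err.h, not the real
definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Only the top MAX_ERRNO addresses count as encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical helper that reports failure by returning NULL,
 * as the commit message says of_phy_connect() does. */
static void *connect_returning_null(int fail)
{
	static int phy;
	return fail ? NULL : &phy;
}

/* Wrong: IS_ERR(NULL) is false, so the failure goes unnoticed. */
static int init_with_is_err(int fail)
{
	void *phydev = connect_returning_null(fail);

	if (IS_ERR(phydev))
		return (int)PTR_ERR(phydev);
	return 0;
}

/* Right: a NULL-returning function needs a NULL test. */
static int init_with_null_check(int fail)
{
	void *phydev = connect_returning_null(fail);

	if (!phydev)
		return -ENODEV;
	return 0;
}
```

With a failing connect, init_with_is_err() reports success while
init_with_null_check() returns -ENODEV, which is the behaviour the patch
restores.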

* Re: Problems with tg3 on BCM5720
From: Dirkjan Ochtman @ 2012-09-27  6:45 UTC (permalink / raw)
  To: Nithin Nayak Sujir; +Cc: netdev
In-Reply-To: <50636FEA.7080509@broadcom.com>

On Wed, Sep 26, 2012 at 11:13 PM, Nithin Nayak Sujir
<nsujir@broadcom.com> wrote:
> 1. Can you tell me the last patch that is included in the tg3 driver in
> 3.4.9 on your distro?

There are no tg3-specific patches in my distro's 3.4.9 package.

> 2. Can you give more info about the working setup?

The working setup is a simple small VLAN with a 192.168.1.0/24 subnet
and a few other Linux boxes on it (some of them also have BCM5720,
others have BCM5722 or BCM5709 networking). Not sure what other
information you'd want about this?

> 3. Was there any system reset or driver reload between the working and not
> working setups? Or was it just a cable switch?

Just a cable switch suffices to reproduce the problem we're seeing.

> 4. Please give the output of
> ethtool eth0
> ethtool -i eth0
> ethtool -k eth0

djc@jansky ~ $ sudo ethtool eth0
Settings for eth0:
	Supported ports: [ TP ]
	Supported link modes:   10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Half 1000baseT/Full
	Supported pause frame use: No
	Supports auto-negotiation: Yes
	Advertised link modes:  10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Half 1000baseT/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: Yes
	Link partner advertised link modes:  10baseT/Half 10baseT/Full
	                                     100baseT/Half 100baseT/Full
	Link partner advertised pause frame use: No
	Link partner advertised auto-negotiation: Yes
	Speed: 100Mb/s
	Duplex: Full
	Port: Twisted Pair
	PHYAD: 1
	Transceiver: internal
	Auto-negotiation: on
	MDI-X: off
	Supports Wake-on: g
	Wake-on: d
	Current message level: 0x000000ff (255)
			       drv probe link timer ifdown ifup rx_err tx_err
	Link detected: yes
djc@jansky ~ $ sudo ethtool -i eth0
driver: tg3
version: 3.124
firmware-version: FFV7.2.14 bc 5720-v1.25
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
djc@jansky ~ $ sudo ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: off

(Note that this is with my hacked up driver from the 3.6 tree, taken
from 185d4c8bf579322e1c2835d70729bc30f6f80f55, with
8d4057a938481351dc690fbe23e8c72af08d5890,
d3836f21b0af5513ef55701dd3f50b8c42e44c7a,
a1e8b307986ab27b7608f107aec71d3569650f46,
118008784965003307ea164370094c7d0810546e,
3f84749004925dd1e94025292fed5c76ce418516 reverted to make it compile
on 3.4.9.)

> 5. Can you run ethtool --test in the working setup?

Here's the ethtool --test result from eth1, which is currently plugged
into the VLAN (eth0 was plugged into it before; we also tried plugging
the external network into eth1, but that gave the same results as
plugging it into eth0).

djc@jansky ~ $ sudo ethtool --test eth1
The test result is PASS
The test extra info:
nvram test        (online) 	 0
link test         (online) 	 0
register test     (offline)	 0
memory test       (offline)	 0
mac loopback test (offline)	 0
phy loopback test (offline)	 0
ext loopback test (offline)	 0
interrupt test    (offline)	 0

> 6. I noticed in the syslog, the link is coming up at 100 Mbps. Is this
> expected?

No, I don't think so, it should be a Gbit line.

> 7. Does it fail immediately on connect to the data center switch? Or is it
> after some traffic goes through?

The vendor whose switch we're connecting to says they see that the
link is up, but they don't see a MAC attached. One of the things that
look weird to me is the ifconfig output saying "RX packets:513
errors:0 dropped:0 overruns:0 frame:0" but also "TX packets:0 errors:0
dropped:0 overruns:0 carrier:0".

On Wed, Sep 26, 2012 at 11:40 PM, Michael Chan <mchan@broadcom.com> wrote:
> It is most likely that the device eth0 is down.  The device needs to be
> up in order to perform all the tests that failed.  Please bring up the
> device and run the test again.  Thanks.

Right, sorry about that. Here's the results again, with the interface up:

djc@jansky ~ $ sudo ethtool --test eth0
The test result is FAIL
The test extra info:
nvram test        (online) 	 0
link test         (online) 	 0
register test     (offline)	 0
memory test       (offline)	 0
mac loopback test (offline)	 0
phy loopback test (offline)	 5
ext loopback test (offline)	 0
interrupt test    (offline)	 0

Hope that helps,

Dirkjan


* Linux Kernel  not responding to ARP requests
From: Nitin Yadav @ 2012-09-27  6:51 UTC (permalink / raw)
  To: linux-kernel, kernelnewbies, netdev


Hi All,

I am facing a loss of connectivity between a Linux system (2.6.18
kernel) and a Cisco switch (6509) when HSRP is enabled. The ARP queue of
the standby Cisco switch was flushed and new MAC addresses were
requested, but the kernel did not answer these requests.

After investigation I found that the standby Cisco switch is flushing
the MAC address of the kernel port. According to Cisco, they flush the
MAC address of an inactive port every 8 minutes.

A small note about the protocol (HSRP):

Hot Standby Router Protocol (HSRP) is a Cisco proprietary redundancy
protocol for establishing a fault-tolerant default gateway, and has been
described in detail in RFC 2281.

The protocol establishes a framework between network routers in order to
achieve default gateway failover if the primary gateway becomes
inaccessible, in close association with a rapid-converging routing
protocol like EIGRP or OSPF. By multicasting packets, HSRP sends its
hello messages to the multicast address 224.0.0.2 (all routers) for
version 1, or 224.0.0.102 for version 2, using UDP port 1985, to
other HSRP-enabled routers, defining priority between the routers. The
primary router with the highest configured priority will act as a
virtual router with a pre-defined gateway IP address and will respond to
the ARP request from machines connected to the LAN with the MAC address
0000.0c07.acXX where XX is the group ID in hex. If the primary router
should fail, the router with the next-highest priority would take over
the gateway IP address and answer ARP requests with the same mac
address, thus achieving transparent default gateway fail-over. A HSRP
Basics Simulation visualizes Active/Standby election and link failover
with Hello, Coup, ARP Reply packets and timers.
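
Editorial sketch: the virtual MAC format described above (0000.0c07.acXX,
with XX the group ID in hex) can be written out as a small helper. The
function names below are hypothetical and only cover the version 1
format quoted in the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Fill mac[6] with the HSRP version 1 virtual MAC for a group ID:
 * 00:00:0c:07:ac:XX, where XX is the group ID. */
static void hsrp_v1_virtual_mac(uint8_t group_id, uint8_t mac[6])
{
	mac[0] = 0x00;
	mac[1] = 0x00;
	mac[2] = 0x0c;
	mac[3] = 0x07;
	mac[4] = 0xac;
	mac[5] = group_id;
}

/* Format the same address in Cisco dotted notation.
 * buf should hold at least 15 bytes ("0000.0c07.acXX" plus the NUL).
 * Returns the snprintf() result. */
static int hsrp_v1_virtual_mac_str(uint8_t group_id, char *buf, size_t len)
{
	uint8_t mac[6];

	hsrp_v1_virtual_mac(group_id, mac);
	return snprintf(buf, len, "%02x%02x.%02x%02x.%02x%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

When looking at a packet capture, ARP replies carrying this source MAC
come from the active HSRP router rather than from a physical interface
address.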

 

My queries:

Is there any way the kernel could be dropping the ARP requests (from the
standby router to the kernel)?

If it is not dropping them, is there any other reason for not replying to ARP?

 

Thanks!

Nitin Yadav

 



* [net-next PATCH 0/5] be2net: fixes
From: Sathya Perla @ 2012-09-27  6:32 UTC (permalink / raw)
  To: netdev; +Cc: Sathya Perla

Pls apply. Thanks.

Sathya Perla (5):
  be2net: remove type argument of be_cmd_mac_addr_query()
  be2net: fix wrong handling of be_setup() failure in be_probe()
  be2net: cleanup code related to be_link_status_query()
  be2net: get rid of AMAP_SET/GET macros in TX path
  be2net: fixup log messages

 drivers/net/ethernet/emulex/benet/be.h         |    1 -
 drivers/net/ethernet/emulex/benet/be_cmds.c    |   53 ++++++++---
 drivers/net/ethernet/emulex/benet/be_cmds.h    |    6 +-
 drivers/net/ethernet/emulex/benet/be_ethtool.c |   57 +++---------
 drivers/net/ethernet/emulex/benet/be_hw.h      |   62 ++++---------
 drivers/net/ethernet/emulex/benet/be_main.c    |  118 ++++++++++++------------
 6 files changed, 130 insertions(+), 167 deletions(-)

-- 
1.7.4


* [net-next PATCH 1/5] be2net: remove type argument of be_cmd_mac_addr_query()
From: Sathya Perla @ 2012-09-27  6:32 UTC (permalink / raw)
  To: netdev; +Cc: Sathya Perla
In-Reply-To: <1348727568-2011-1-git-send-email-sathya.perla@emulex.com>

All invocations of this routine use the same type value.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_cmds.c |    4 ++--
 drivers/net/ethernet/emulex/benet/be_cmds.h |    2 +-
 drivers/net/ethernet/emulex/benet/be_main.c |   18 ++++++------------
 3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index 701b3e9..6fbfb20 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -717,7 +717,7 @@ int be_cmd_eq_create(struct be_adapter *adapter,
 
 /* Use MCC */
 int be_cmd_mac_addr_query(struct be_adapter *adapter, u8 *mac_addr,
-			u8 type, bool permanent, u32 if_handle, u32 pmac_id)
+			  bool permanent, u32 if_handle, u32 pmac_id)
 {
 	struct be_mcc_wrb *wrb;
 	struct be_cmd_req_mac_query *req;
@@ -734,7 +734,7 @@ int be_cmd_mac_addr_query(struct be_adapter *adapter, u8 *mac_addr,
 
 	be_wrb_cmd_hdr_prepare(&req->hdr, CMD_SUBSYSTEM_COMMON,
 		OPCODE_COMMON_NTWK_MAC_QUERY, sizeof(*req), wrb, NULL);
-	req->type = type;
+	req->type = MAC_ADDRESS_TYPE_NETWORK;
 	if (permanent) {
 		req->permanent = 1;
 	} else {
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.h b/drivers/net/ethernet/emulex/benet/be_cmds.h
index 250f19b..1f5b839 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.h
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.h
@@ -1687,7 +1687,7 @@ struct be_cmd_req_set_ext_fat_caps {
 extern int be_pci_fnum_get(struct be_adapter *adapter);
 extern int be_fw_wait_ready(struct be_adapter *adapter);
 extern int be_cmd_mac_addr_query(struct be_adapter *adapter, u8 *mac_addr,
-			u8 type, bool permanent, u32 if_handle, u32 pmac_id);
+				 bool permanent, u32 if_handle, u32 pmac_id);
 extern int be_cmd_pmac_add(struct be_adapter *adapter, u8 *mac_addr,
 			u32 if_id, u32 *pmac_id, u32 domain);
 extern int be_cmd_pmac_del(struct be_adapter *adapter, u32 if_id,
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 84379f4..fa17430 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -241,9 +241,8 @@ static int be_mac_addr_set(struct net_device *netdev, void *p)
 	if (!is_valid_ether_addr(addr->sa_data))
 		return -EADDRNOTAVAIL;
 
-	status = be_cmd_mac_addr_query(adapter, current_mac,
-				MAC_ADDRESS_TYPE_NETWORK, false,
-				adapter->if_handle, 0);
+	status = be_cmd_mac_addr_query(adapter, current_mac, false,
+				       adapter->if_handle, 0);
 	if (status)
 		goto err;
 
@@ -2693,21 +2692,16 @@ static int be_get_mac_addr(struct be_adapter *adapter, u8 *mac, u32 if_handle,
 		status = be_cmd_get_mac_from_list(adapter, mac,
 						  active_mac, pmac_id, 0);
 		if (*active_mac) {
-			status = be_cmd_mac_addr_query(adapter, mac,
-						       MAC_ADDRESS_TYPE_NETWORK,
-						       false, if_handle,
-						       *pmac_id);
+			status = be_cmd_mac_addr_query(adapter, mac, false,
+						       if_handle, *pmac_id);
 		}
 	} else if (be_physfn(adapter)) {
 		/* For BE3, for PF get permanent MAC */
-		status = be_cmd_mac_addr_query(adapter, mac,
-					       MAC_ADDRESS_TYPE_NETWORK, true,
-					       0, 0);
+		status = be_cmd_mac_addr_query(adapter, mac, true, 0, 0);
 		*active_mac = false;
 	} else {
 		/* For BE3, for VF get soft MAC assigned by PF*/
-		status = be_cmd_mac_addr_query(adapter, mac,
-					       MAC_ADDRESS_TYPE_NETWORK, false,
+		status = be_cmd_mac_addr_query(adapter, mac, false,
 					       if_handle, 0);
 		*active_mac = true;
 	}
-- 
1.7.4


* [net-next PATCH 2/5] be2net: fix wrong handling of be_setup() failure in be_probe()
From: Sathya Perla @ 2012-09-27  6:32 UTC (permalink / raw)
  To: netdev; +Cc: Sathya Perla
In-Reply-To: <1348727568-2011-1-git-send-email-sathya.perla@emulex.com>


Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index fa17430..b712091 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3889,7 +3889,7 @@ static int __devinit be_probe(struct pci_dev *pdev,
 
 	status = be_setup(adapter);
 	if (status)
-		goto msix_disable;
+		goto stats_clean;
 
 	be_netdev_init(netdev);
 	status = register_netdev(netdev);
@@ -3910,8 +3910,6 @@ static int __devinit be_probe(struct pci_dev *pdev,
 
 unsetup:
 	be_clear(adapter);
-msix_disable:
-	be_msix_disable(adapter);
 stats_clean:
 	be_stats_cleanup(adapter);
 ctrl_clean:
-- 
1.7.4

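Editorial sketch: the patch above has no commit message, but the diff
changes which cleanup label a be_setup() failure jumps to. The general
pattern is a chain of goto labels in reverse order of initialization,
where each failure jumps to the label that undoes only what has already
succeeded. The stage names and the assumption that the failing stage
unwinds its own partial work are illustrative and not taken from the
driver.

```c
/* Records which teardown steps ran, in order, for inspection. */
static char cleanup_log[8];
static int cleanup_len;

static void undo(char tag)
{
	cleanup_log[cleanup_len++] = tag;
	cleanup_log[cleanup_len] = '\0';
}

/* fail_at selects which stage fails (0 means none fails). */
static int probe_sketch(int fail_at)
{
	int status;

	cleanup_len = 0;
	cleanup_log[0] = '\0';

	status = (fail_at == 1) ? -1 : 0;	/* stage 1: ctrl init */
	if (status)
		goto out;

	status = (fail_at == 2) ? -1 : 0;	/* stage 2: stats init */
	if (status)
		goto ctrl_clean;

	status = (fail_at == 3) ? -1 : 0;	/* stage 3: setup */
	if (status)
		goto stats_clean;	/* assumed: setup undoes its own partial work */

	return 0;

stats_clean:
	undo('S');	/* undo stage 2 */
ctrl_clean:
	undo('C');	/* undo stage 1 */
out:
	return status;
}
```

Jumping to a label that tears down something the failed stage already
released (or never acquired) is the kind of mismatch this ordering is
meant to avoid.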

* [net-next PATCH 3/5] be2net: cleanup code related to be_link_status_query()
From: Sathya Perla @ 2012-09-27  6:32 UTC (permalink / raw)
  To: netdev; +Cc: Sathya Perla
In-Reply-To: <1348727568-2011-1-git-send-email-sathya.perla@emulex.com>

1) link_status_query() is always called to query the link-speed (speed
after applying qos). When there is no qos setting, link-speed is derived from
port-speed. Do all this inside this routine and hide this from the callers.

2) adapter->phy.forced_port_speed is not being set anywhere after being
initialized. Get rid of this variable.

3) Ignore async link_speed notifications till the initial value has been
fetched from FW.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be.h         |    1 -
 drivers/net/ethernet/emulex/benet/be_cmds.c    |   46 ++++++++++++++-----
 drivers/net/ethernet/emulex/benet/be_cmds.h    |    4 +-
 drivers/net/ethernet/emulex/benet/be_ethtool.c |   57 +++++-------------------
 drivers/net/ethernet/emulex/benet/be_main.c    |    4 +-
 5 files changed, 48 insertions(+), 64 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 5b622993..cf4c05b 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -337,7 +337,6 @@ struct phy_info {
 	u16 auto_speeds_supported;
 	u16 fixed_speeds_supported;
 	int link_speed;
-	int forced_port_speed;
 	u32 dac_cable_len;
 	u32 advertising;
 	u32 supported;
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index 6fbfb20..46a19af 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -165,14 +165,13 @@ static void be_async_grp5_cos_priority_process(struct be_adapter *adapter,
 	}
 }
 
-/* Grp5 QOS Speed evt */
+/* Grp5 QOS Speed evt: qos_link_speed is in units of 10 Mbps */
 static void be_async_grp5_qos_speed_process(struct be_adapter *adapter,
 		struct be_async_event_grp5_qos_link_speed *evt)
 {
-	if (evt->physical_port == adapter->port_num) {
-		/* qos_link_speed is in units of 10 Mbps */
-		adapter->phy.link_speed = evt->qos_link_speed * 10;
-	}
+	if (adapter->phy.link_speed >= 0 &&
+	    evt->physical_port == adapter->port_num)
+		adapter->phy.link_speed = le16_to_cpu(evt->qos_link_speed) * 10;
 }
 
 /*Grp5 PVID evt*/
@@ -1326,9 +1325,28 @@ err:
 	return status;
 }
 
-/* Uses synchronous mcc */
-int be_cmd_link_status_query(struct be_adapter *adapter, u8 *mac_speed,
-			     u16 *link_speed, u8 *link_status, u32 dom)
+static int be_mac_to_link_speed(int mac_speed)
+{
+	switch (mac_speed) {
+	case PHY_LINK_SPEED_ZERO:
+		return 0;
+	case PHY_LINK_SPEED_10MBPS:
+		return 10;
+	case PHY_LINK_SPEED_100MBPS:
+		return 100;
+	case PHY_LINK_SPEED_1GBPS:
+		return 1000;
+	case PHY_LINK_SPEED_10GBPS:
+		return 10000;
+	}
+	return 0;
+}
+
+/* Uses synchronous mcc
+ * Returns link_speed in Mbps
+ */
+int be_cmd_link_status_query(struct be_adapter *adapter, u16 *link_speed,
+			     u8 *link_status, u32 dom)
 {
 	struct be_mcc_wrb *wrb;
 	struct be_cmd_req_link_status *req;
@@ -1357,11 +1375,13 @@ int be_cmd_link_status_query(struct be_adapter *adapter, u8 *mac_speed,
 	status = be_mcc_notify_wait(adapter);
 	if (!status) {
 		struct be_cmd_resp_link_status *resp = embedded_payload(wrb);
-		if (resp->mac_speed != PHY_LINK_SPEED_ZERO) {
-			if (link_speed)
-				*link_speed = le16_to_cpu(resp->link_speed);
-			if (mac_speed)
-				*mac_speed = resp->mac_speed;
+		if (link_speed) {
+			*link_speed = resp->link_speed ?
+				      le16_to_cpu(resp->link_speed) * 10 :
+				      be_mac_to_link_speed(resp->mac_speed);
+
+			if (!resp->logical_link_status)
+				*link_speed = 0;
 		}
 		if (link_status)
 			*link_status = resp->logical_link_status;
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.h b/drivers/net/ethernet/emulex/benet/be_cmds.h
index 1f5b839..0936e21 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.h
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.h
@@ -1714,8 +1714,8 @@ extern int be_cmd_q_destroy(struct be_adapter *adapter, struct be_queue_info *q,
 			int type);
 extern int be_cmd_rxq_destroy(struct be_adapter *adapter,
 			struct be_queue_info *q);
-extern int be_cmd_link_status_query(struct be_adapter *adapter, u8 *mac_speed,
-				    u16 *link_speed, u8 *link_status, u32 dom);
+extern int be_cmd_link_status_query(struct be_adapter *adapter, u16 *link_speed,
+				    u8 *link_status, u32 dom);
 extern int be_cmd_reset(struct be_adapter *adapter);
 extern int be_cmd_get_stats(struct be_adapter *adapter,
 			struct be_dma_mem *nonemb_cmd);
diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
index c0e7006..8e6fb0b 100644
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
+++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
@@ -512,28 +512,6 @@ static u32 convert_to_et_setting(u32 if_type, u32 if_speeds)
 	return val;
 }
 
-static int convert_to_et_speed(u32 be_speed)
-{
-	int et_speed = SPEED_10000;
-
-	switch (be_speed) {
-	case PHY_LINK_SPEED_10MBPS:
-		et_speed = SPEED_10;
-		break;
-	case PHY_LINK_SPEED_100MBPS:
-		et_speed = SPEED_100;
-		break;
-	case PHY_LINK_SPEED_1GBPS:
-		et_speed = SPEED_1000;
-		break;
-	case PHY_LINK_SPEED_10GBPS:
-		et_speed = SPEED_10000;
-		break;
-	}
-
-	return et_speed;
-}
-
 bool be_pause_supported(struct be_adapter *adapter)
 {
 	return (adapter->phy.interface_type == PHY_TYPE_SFP_PLUS_10GB ||
@@ -544,27 +522,16 @@ bool be_pause_supported(struct be_adapter *adapter)
 static int be_get_settings(struct net_device *netdev, struct ethtool_cmd *ecmd)
 {
 	struct be_adapter *adapter = netdev_priv(netdev);
-	u8 port_speed = 0;
-	u16 link_speed = 0;
 	u8 link_status;
-	u32 et_speed = 0;
+	u16 link_speed = 0;
 	int status;
 
-	if (adapter->phy.link_speed < 0 || !(netdev->flags & IFF_UP)) {
-		if (adapter->phy.forced_port_speed < 0) {
-			status = be_cmd_link_status_query(adapter, &port_speed,
-						&link_speed, &link_status, 0);
-			if (!status)
-				be_link_status_update(adapter, link_status);
-			if (link_speed)
-				et_speed = link_speed * 10;
-			else if (link_status)
-				et_speed = convert_to_et_speed(port_speed);
-		} else {
-			et_speed = adapter->phy.forced_port_speed;
-		}
-
-		ethtool_cmd_speed_set(ecmd, et_speed);
+	if (adapter->phy.link_speed < 0) {
+		status = be_cmd_link_status_query(adapter, &link_speed,
+						  &link_status, 0);
+		if (!status)
+			be_link_status_update(adapter, link_status);
+		ethtool_cmd_speed_set(ecmd, link_speed);
 
 		status = be_cmd_get_phy_info(adapter);
 		if (status)
@@ -773,8 +740,8 @@ static void
 be_self_test(struct net_device *netdev, struct ethtool_test *test, u64 *data)
 {
 	struct be_adapter *adapter = netdev_priv(netdev);
-	u8 mac_speed = 0;
-	u16 qos_link_speed = 0;
+	int status;
+	u8 link_status = 0;
 
 	memset(data, 0, sizeof(u64) * ETHTOOL_TESTS_NUM);
 
@@ -798,11 +765,11 @@ be_self_test(struct net_device *netdev, struct ethtool_test *test, u64 *data)
 		test->flags |= ETH_TEST_FL_FAILED;
 	}
 
-	if (be_cmd_link_status_query(adapter, &mac_speed,
-				     &qos_link_speed, NULL, 0) != 0) {
+	status = be_cmd_link_status_query(adapter, NULL, &link_status, 0);
+	if (status) {
 		test->flags |= ETH_TEST_FL_FAILED;
 		data[4] = -1;
-	} else if (!mac_speed) {
+	} else if (!link_status) {
 		test->flags |= ETH_TEST_FL_FAILED;
 		data[4] = 1;
 	}
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index b712091..4855dd6 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -2440,8 +2440,7 @@ static int be_open(struct net_device *netdev)
 		be_eq_notify(adapter, eqo->q.id, true, false, 0);
 	}
 
-	status = be_cmd_link_status_query(adapter, NULL, NULL,
-					  &link_status, 0);
+	status = be_cmd_link_status_query(adapter, NULL, &link_status, 0);
 	if (!status)
 		be_link_status_update(adapter, link_status);
 
@@ -2670,7 +2669,6 @@ static void be_setup_init(struct be_adapter *adapter)
 	adapter->be3_native = false;
 	adapter->promiscuous = false;
 	adapter->eq_next_idx = 0;
-	adapter->phy.forced_port_speed = -1;
 }
 
 static int be_get_mac_addr(struct be_adapter *adapter, u8 *mac, u32 if_handle,
-- 
1.7.4

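Editorial sketch: point 1 of the commit message above describes how the
reported link speed is derived. A standalone approximation of that rule
follows. The enum values are hypothetical placeholders (the real
PHY_LINK_SPEED_* encodings live in the driver headers), and the
little-endian conversion done in the driver is omitted.

```c
#include <stdint.h>

/* Hypothetical encodings standing in for PHY_LINK_SPEED_*. */
enum sk_mac_speed {
	SK_SPEED_ZERO,
	SK_SPEED_10MBPS,
	SK_SPEED_100MBPS,
	SK_SPEED_1GBPS,
	SK_SPEED_10GBPS,
};

/* Map a port (MAC) speed code to Mbps; unknown codes map to 0. */
static int sk_mac_to_link_speed(int mac_speed)
{
	switch (mac_speed) {
	case SK_SPEED_10MBPS:
		return 10;
	case SK_SPEED_100MBPS:
		return 100;
	case SK_SPEED_1GBPS:
		return 1000;
	case SK_SPEED_10GBPS:
		return 10000;
	}
	return 0;
}

/* qos_speed is in units of 10 Mbps and is 0 when no QoS limit is set.
 * Returns the link speed in Mbps, or 0 when the link is down. */
static int sk_link_speed_mbps(uint16_t qos_speed, int mac_speed, int link_up)
{
	if (!link_up)
		return 0;
	return qos_speed ? qos_speed * 10 : sk_mac_to_link_speed(mac_speed);
}
```

For example, a 10G port with a QoS value of 250 reports 2500 Mbps, and
the same port with no QoS setting reports 10000 Mbps.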

* [net-next PATCH 4/5] be2net: get rid of AMAP_SET/GET macros in TX path
From: Sathya Perla @ 2012-09-27  6:32 UTC (permalink / raw)
  To: netdev; +Cc: Sathya Perla
In-Reply-To: <1348727568-2011-1-git-send-email-sathya.perla@emulex.com>

The AMAP macros are used in be2net for setting and parsing bits in HW
descriptors. The macros do this by calculating the mask & offset of each
field from the AMAP structure definition.
In the TX path, replace the usage of these macros with code to explicitly
shift & mask each field. Doing this reduces instructions and improves
readability.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_hw.h   |   62 ++++++++-------------------
 drivers/net/ethernet/emulex/benet/be_main.c |   58 ++++++++++---------------
 2 files changed, 41 insertions(+), 79 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_hw.h b/drivers/net/ethernet/emulex/benet/be_hw.h
index b755f70..32a84ab 100644
--- a/drivers/net/ethernet/emulex/benet/be_hw.h
+++ b/drivers/net/ethernet/emulex/benet/be_hw.h
@@ -273,56 +273,30 @@ struct be_eth_wrb {
 	u32 frag_len;		/* dword 3: bits 0 - 15 */
 } __packed;
 
-/* Pseudo amap definition for eth_hdr_wrb in which each bit of the
- * actual structure is defined as a byte : used to calculate
- * offset/shift/mask of each field */
-struct amap_eth_hdr_wrb {
-	u8 rsvd0[32];		/* dword 0 */
-	u8 rsvd1[32];		/* dword 1 */
-	u8 complete;		/* dword 2 */
-	u8 event;
-	u8 crc;
-	u8 forward;
-	u8 lso6;
-	u8 mgmt;
-	u8 ipcs;
-	u8 udpcs;
-	u8 tcpcs;
-	u8 lso;
-	u8 vlan;
-	u8 gso[2];
-	u8 num_wrb[5];
-	u8 lso_mss[14];
-	u8 len[16];		/* dword 3 */
-	u8 vlan_tag[16];
-} __packed;
+#define TX_HDR_WRB_COMPL		1		/* word 2 */
+#define TX_HDR_WRB_EVT			(1 << 1)	/* word 2 */
+#define TX_HDR_WRB_CRC			(1 << 2)	/* word 2 */
+#define TX_HDR_WRB_LSO6			(1 << 4)	/* word 2 */
+#define TX_HDR_WRB_IPCS			(1 << 6)	/* word 2 */
+#define TX_HDR_WRB_UDPCS		(1 << 7)	/* word 2 */
+#define TX_HDR_WRB_TCPCS		(1 << 8)	/* word 2 */
+#define TX_HDR_WRB_LSO			(1 << 9)	/* word 2 */
+#define TX_HDR_WRB_VLAN			(1 << 10)	/* word 2 */
+#define TX_HDR_WRB_NUM_SHIFT		13		/* word 2: bits 13:17 */
+#define TX_HDR_WRB_NUM_MASK		0x1F		/* word 2: bits 13:17 */
+#define TX_HDR_WRB_MSS_SHIFT		18		/* word 2: bits 18:31 */
+#define TX_HDR_WRB_MSS_MASK		0x3FFF		/* word 2: bits 18:31 */
+#define TX_HDR_WRB_LEN_MASK		0xFFFF		/* word 3: bits 0:15 */
+#define TX_HDR_WRB_VLAN_TCI_SHIFT	16		/* word 3: bits 16:31 */
+#define TX_HDR_WRB_VLAN_TCI_MASK	0xFFFF		/* word 3: bits 16:31 */
 
 struct be_eth_hdr_wrb {
 	u32 dw[4];
 };
 
 /* TX Compl Queue Descriptor */
-
-/* Pseudo amap definition for eth_tx_compl in which each bit of the
- * actual structure is defined as a byte: used to calculate
- * offset/shift/mask of each field */
-struct amap_eth_tx_compl {
-	u8 wrb_index[16];	/* dword 0 */
-	u8 ct[2]; 		/* dword 0 */
-	u8 port[2];		/* dword 0 */
-	u8 rsvd0[8];		/* dword 0 */
-	u8 status[4];		/* dword 0 */
-	u8 user_bytes[16];	/* dword 1 */
-	u8 nwh_bytes[8];	/* dword 1 */
-	u8 lso;			/* dword 1 */
-	u8 cast_enc[2];		/* dword 1 */
-	u8 rsvd1[5];		/* dword 1 */
-	u8 rsvd2[32];		/* dword 2 */
-	u8 pkts[16];		/* dword 3 */
-	u8 ringid[11];		/* dword 3 */
-	u8 hash_val[4];		/* dword 3 */
-	u8 valid;		/* dword 3 */
-} __packed;
+#define TX_COMPL_WRB_IDX_MASK		0xFFFF		/* word 0: bits 0:15 */
+#define TX_COMPL_VALID			(1 << 31)	/* word 3 */
 
 struct be_eth_tx_compl {
 	u32 dw[4];
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 4855dd6..c74906d 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -585,45 +585,34 @@ static int be_vlan_tag_chk(struct be_adapter *adapter, struct sk_buff *skb)
 static void wrb_fill_hdr(struct be_adapter *adapter, struct be_eth_hdr_wrb *hdr,
 		struct sk_buff *skb, u32 wrb_cnt, u32 len)
 {
-	u16 vlan_tag;
-
-	memset(hdr, 0, sizeof(*hdr));
-
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, crc, hdr, 1);
+	u32 dw2 = 0, dw3 = 0;
 
 	if (skb_is_gso(skb)) {
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, lso, hdr, 1);
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, lso_mss,
-			hdr, skb_shinfo(skb)->gso_size);
+		dw2 |= TX_HDR_WRB_LSO;
+		dw2 |= (skb_shinfo(skb)->gso_size & TX_HDR_WRB_MSS_MASK) <<
+			TX_HDR_WRB_MSS_SHIFT;
 		if (skb_is_gso_v6(skb) && !lancer_chip(adapter))
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, lso6, hdr, 1);
-		if (lancer_chip(adapter) && adapter->sli_family  ==
-							LANCER_A0_SLI_FAMILY) {
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, ipcs, hdr, 1);
-			if (is_tcp_pkt(skb))
-				AMAP_SET_BITS(struct amap_eth_hdr_wrb,
-								tcpcs, hdr, 1);
-			else if (is_udp_pkt(skb))
-				AMAP_SET_BITS(struct amap_eth_hdr_wrb,
-								udpcs, hdr, 1);
-		}
+			dw2 |= TX_HDR_WRB_LSO6;
 	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		if (is_tcp_pkt(skb))
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, tcpcs, hdr, 1);
+			dw2 |= TX_HDR_WRB_TCPCS;
 		else if (is_udp_pkt(skb))
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, udpcs, hdr, 1);
+			dw2 |= TX_HDR_WRB_UDPCS;
 	}
 
 	if (vlan_tx_tag_present(skb)) {
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, vlan, hdr, 1);
-		vlan_tag = be_get_tx_vlan_tag(adapter, skb);
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, vlan_tag, hdr, vlan_tag);
+		dw2 |= TX_HDR_WRB_VLAN;
+		dw3 = (be_get_tx_vlan_tag(adapter, skb) & 0xFFFF) <<
+				TX_HDR_WRB_VLAN_TCI_SHIFT;
 	}
+	dw2 |= TX_HDR_WRB_CRC | TX_HDR_WRB_EVT | TX_HDR_WRB_COMPL |
+		(wrb_cnt & TX_HDR_WRB_NUM_MASK) << TX_HDR_WRB_NUM_SHIFT;
+	dw3 |= len & TX_HDR_WRB_LEN_MASK;
 
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, event, hdr, 1);
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, complete, hdr, 1);
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, num_wrb, hdr, wrb_cnt);
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, len, hdr, len);
+	hdr->dw[2] = dw2;
+	hdr->dw[3] = dw3;
+	hdr->dw[0] = 0;
+	hdr->dw[1] = 0;
 }
 
 static void unmap_tx_frag(struct device *dev, struct be_eth_wrb *wrb,
@@ -1554,13 +1543,14 @@ static struct be_eth_tx_compl *be_tx_compl_get(struct be_queue_info *tx_cq)
 {
 	struct be_eth_tx_compl *txcp = queue_tail_node(tx_cq);
 
-	if (txcp->dw[offsetof(struct amap_eth_tx_compl, valid) / 32] == 0)
+	/* valid bit is bit 31 of dw[3] */
+	if (!txcp->dw[3])
 		return NULL;
 
 	rmb();
 	be_dws_le_to_cpu(txcp, sizeof(*txcp));
 
-	txcp->dw[offsetof(struct amap_eth_tx_compl, valid) / 32] = 0;
+	txcp->dw[3] = 0;
 
 	queue_tail_inc(tx_cq);
 	return txcp;
@@ -1686,9 +1676,7 @@ static void be_tx_compl_clean(struct be_adapter *adapter)
 		for_all_tx_queues(adapter, txo, i) {
 			txq = &txo->q;
 			while ((txcp = be_tx_compl_get(&txo->cq))) {
-				end_idx =
-					AMAP_GET_BITS(struct amap_eth_tx_compl,
-						      wrb_index, txcp);
+				end_idx = txcp->dw[0] & TX_COMPL_WRB_IDX_MASK;
 				num_wrbs += be_tx_compl_process(adapter, txo,
 								end_idx);
 				cmpl++;
@@ -2040,8 +2028,8 @@ static bool be_process_tx(struct be_adapter *adapter, struct be_tx_obj *txo,
 		if (!txcp)
 			break;
 		num_wrbs += be_tx_compl_process(adapter, txo,
-				AMAP_GET_BITS(struct amap_eth_tx_compl,
-					wrb_index, txcp));
+						txcp->dw[0] &
+						TX_COMPL_WRB_IDX_MASK);
 	}
 
 	if (work_done) {
-- 
1.7.4

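Editorial sketch: the explicit shift-and-mask approach described in the
commit message above can be tried outside the kernel. The bit positions
below mirror the TX_HDR_WRB_* definitions in the diff, but the SK_*
names, struct sk_hdr and both functions are hypothetical, only a subset
of the fields is shown, and values are kept in CPU byte order.

```c
#include <stdint.h>

#define SK_HDR_COMPL		(1u << 0)	/* word 2 */
#define SK_HDR_EVT		(1u << 1)	/* word 2 */
#define SK_HDR_CRC		(1u << 2)	/* word 2 */
#define SK_HDR_VLAN		(1u << 10)	/* word 2 */
#define SK_HDR_NUM_SHIFT	13		/* word 2: bits 13:17 */
#define SK_HDR_NUM_MASK		0x1Fu
#define SK_HDR_LEN_MASK		0xFFFFu		/* word 3: bits 0:15 */
#define SK_HDR_VLAN_TCI_SHIFT	16		/* word 3: bits 16:31 */

struct sk_hdr {
	uint32_t dw[4];
};

/* Build words 2 and 3 in locals, then store each word once. */
static void sk_fill_hdr(struct sk_hdr *hdr, uint32_t wrb_cnt, uint32_t len,
			int has_vlan, uint16_t vlan_tci)
{
	uint32_t dw2 = SK_HDR_CRC | SK_HDR_EVT | SK_HDR_COMPL |
		       ((wrb_cnt & SK_HDR_NUM_MASK) << SK_HDR_NUM_SHIFT);
	uint32_t dw3 = len & SK_HDR_LEN_MASK;

	if (has_vlan) {
		dw2 |= SK_HDR_VLAN;
		dw3 |= (uint32_t)vlan_tci << SK_HDR_VLAN_TCI_SHIFT;
	}

	hdr->dw[0] = 0;
	hdr->dw[1] = 0;
	hdr->dw[2] = dw2;
	hdr->dw[3] = dw3;
}

/* Extract the WRB count again by reversing the shift and mask. */
static uint32_t sk_hdr_num_wrb(const struct sk_hdr *hdr)
{
	return (hdr->dw[2] >> SK_HDR_NUM_SHIFT) & SK_HDR_NUM_MASK;
}
```

Masking before shifting keeps an out-of-range value from spilling into
neighbouring fields, and unsigned constants avoid shifting into the sign
bit of an int.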

* [net-next PATCH 5/5] be2net: fixup log messages
From: Sathya Perla @ 2012-09-27  6:32 UTC (permalink / raw)
  To: netdev; +Cc: Sathya Perla
In-Reply-To: <1348727568-2011-1-git-send-email-sathya.perla@emulex.com>

Added and modified a few log messages, mostly in the probe path.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_cmds.c |    3 ++
 drivers/net/ethernet/emulex/benet/be_main.c |   34 ++++++++++++++++++++++----
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index 46a19af..af60bb2 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -2425,6 +2425,9 @@ int be_cmd_req_native_mode(struct be_adapter *adapter)
 		struct be_cmd_resp_set_func_cap *resp = embedded_payload(wrb);
 		adapter->be3_native = le32_to_cpu(resp->cap_flags) &
 					CAPABILITY_BE3_NATIVE_ERX_API;
+		if (!adapter->be3_native)
+			dev_warn(&adapter->pdev->dev,
+				 "adapter not in advanced mode\n");
 	}
 err:
 	mutex_unlock(&adapter->mbox_lock);
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index c74906d..b7f9bf7 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1884,6 +1884,8 @@ static int be_tx_qs_create(struct be_adapter *adapter)
 			return status;
 	}
 
+	dev_info(&adapter->pdev->dev, "created %d TX queue(s)\n",
+		 adapter->num_tx_qs);
 	return 0;
 }
 
@@ -1934,10 +1936,9 @@ static int be_rx_cqs_create(struct be_adapter *adapter)
 			return rc;
 	}
 
-	if (adapter->num_rx_qs != MAX_RX_QS)
-		dev_info(&adapter->pdev->dev,
-			"Created only %d receive queues\n", adapter->num_rx_qs);
-
+	dev_info(&adapter->pdev->dev,
+		 "created %d RSS queue(s) and 1 default RX queue\n",
+		 adapter->num_rx_qs - 1);
 	return 0;
 }
 
@@ -2175,6 +2176,7 @@ static void be_msix_enable(struct be_adapter *adapter)
 {
 #define BE_MIN_MSIX_VECTORS		1
 	int i, status, num_vec, num_roce_vec = 0;
+	struct device *dev = &adapter->pdev->dev;
 
 	/* If RSS queues are not used, need a vec for default RX Q */
 	num_vec = min(be_num_rss_want(adapter), num_online_cpus());
@@ -2199,6 +2201,8 @@ static void be_msix_enable(struct be_adapter *adapter)
 				num_vec) == 0)
 			goto done;
 	}
+
+	dev_warn(dev, "MSIx enable failed\n");
 	return;
 done:
 	if (be_roce_supported(adapter)) {
@@ -2212,6 +2216,7 @@ done:
 		}
 	} else
 		adapter->num_msix_vec = num_vec;
+	dev_info(dev, "enabled %d MSI-x vector(s)\n", adapter->num_msix_vec);
 	return;
 }
 
@@ -3785,6 +3790,23 @@ static bool be_reset_required(struct be_adapter *adapter)
 	return be_find_vfs(adapter, ENABLED) > 0 ? false : true;
 }
 
+static char *mc_name(struct be_adapter *adapter)
+{
+	if (adapter->function_mode & FLEX10_MODE)
+		return "FLEX10";
+	else if (adapter->function_mode & VNIC_MODE)
+		return "vNIC";
+	else if (adapter->function_mode & UMC_ENABLED)
+		return "UMC";
+	else
+		return "";
+}
+
+static inline char *func_name(struct be_adapter *adapter)
+{
+	return be_physfn(adapter) ? "PF" : "VF";
+}
+
 static int __devinit be_probe(struct pci_dev *pdev,
 			const struct pci_device_id *pdev_id)
 {
@@ -3889,8 +3911,8 @@ static int __devinit be_probe(struct pci_dev *pdev,
 
 	be_cmd_query_port_name(adapter, &port_name);
 
-	dev_info(&pdev->dev, "%s: %s port %c\n", netdev->name, nic_name(pdev),
-		 port_name);
+	dev_info(&pdev->dev, "%s: %s %s port %c\n", nic_name(pdev),
+		 func_name(adapter), mc_name(adapter), port_name);
 
 	return 0;
 
-- 
1.7.4

^ permalink raw reply related

* Re: [PATCH 5/5] smsc95xx: enable power saving mode during system suspend
From: Steve Glendinning @ 2012-09-27  8:04 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: netdev
In-Reply-To: <87mx0c92s0.fsf@nemi.mork.no>

On 26 September 2012 17:17, Bjørn Mork <bjorn@mork.no> wrote:
> Yes, but you are a lot less likely to know about it if you BUG out.  The
> user will be left with no other choice than hitting reset or poweroff.
> What's the point of that?
>
> If your driver crashes but the machine is left running, then the user
> may forward the Oops to you.  That's much more useful.

Good point, I hadn't considered that.

So for user reportability am I better off to use WARN_ON in this case,
or simply remove the check and let the null pointer dereference
happen?

-Steve

^ permalink raw reply

* Re: bnx2x: link detected up at startup even when it should be down
From: Dmitry Kravkov @ 2012-09-27  8:21 UTC (permalink / raw)
  To: Jean-Michel Hautbois
  Cc: netdev, Barak Witkowski, Eilon Greenstein, davem@davemloft.net
In-Reply-To: <CAL8zT=i4tdneZVopZGtHVhMArGMH5n=BVrtkaNSVJPX=RtR6OQ@mail.gmail.com>

On Tue, 2012-09-25 at 16:00 +0200, Jean-Michel Hautbois wrote:
> 2012/9/25 Jean-Michel Hautbois <jhautbois@gmail.com>:
> > 2012/9/25 Dmitry Kravkov <dmitry@broadcom.com>:
> >>> -----Original Message-----
> >>> From: Jean-Michel Hautbois [mailto:jhautbois@gmail.com]
> >>> Sent: Tuesday, September 25, 2012 2:54 PM
> >>> To: Dmitry Kravkov
> >>> Cc: netdev; Barak Witkowski; Eilon Greenstein; davem@davemloft.net
> >>> Subject: Re: bnx2x: link detected up at startup even when it should be down
> >>
> >>
> FYI, eth4 and eth5 are seen UP and they should be down.
It looks like the driver misses the DCC "disable/enable" update.

Can you please test this simple patch?

From 1efa0314f4b912f474089f5f8375d37bc265a502 Mon Sep 17 00:00:00 2001
From: Dmitry Kravkov <dmitry@broadcom.com>
Date: Fri, 22 Mar 2013 05:12:02 +0200
Subject: [PATCH] bnx2x: DCC disable/enable update can be missed by driver

As a result of missed update, OS may be updated with wrong link status.
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 2f6361e..3962d57 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -3445,6 +3445,7 @@ static inline void bnx2x_attn_int_deasserted3(struct bnx2x *bp, u32 attn)
 			int func = BP_FUNC(bp);
 
 			REG_WR(bp, MISC_REG_AEU_GENERAL_ATTN_12 + func*4, 0);
+			bnx2x_read_mf_cfg(bp);
 			bp->mf_config[BP_VN(bp)] = MF_CFG_RD(bp,
 					func_mf_config[BP_ABS_FUNC(bp)].config);
 			val = SHMEM_RD(bp,
-- 
1.7.1



> JM
> 

^ permalink raw reply related

* Re: ixgbe unstable performance at 1Gb/s
From: Charles Vejnar @ 2012-09-27  8:32 UTC (permalink / raw)
  To: Tantilov, Emil S; +Cc: netdev@vger.kernel.org, Kirsher, Jeffrey T
In-Reply-To: <87618083B2453E4A8714035B62D6799216E25F9C@FMSMSX102.amr.corp.intel.com>

Le 26/09/2012 22:22, Tantilov, Emil S a écrit :
>> -----Original Message-----
>> From: Charles Vejnar [mailto:Charles.Vejnar@unige.ch]
>> Sent: Wednesday, September 26, 2012 11:33 AM
>> To: Tantilov, Emil S; netdev@vger.kernel.org
>> Subject: Re: ixgbe unstable performance at 1Gb/s
>>
>> Le 25/09/2012 19:58, Tantilov, Emil S a écrit :
>>>> -----Original Message-----
>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>> On
>>>> Behalf Of Charles
>>>> Sent: Monday, September 24, 2012 10:47 AM
>>>> To: netdev@vger.kernel.org
>>>> Subject: ixgbe unstable performance at 1Gb/s
>>>>
>>>> Hi,
>>>>
>>>> I hope I am posting on the right mailing-list. If not, sorry; please
>>>> redirect me
>>>> to the right place. Thanks.
>>>>
>>>> I have a new motherboard with integrated Intel X540 10GBase-T. For now,
>> I
>>>> want
>>>> to use it at 1Gb/s.
>>>>
>>>> The bandwidth is only of ~300 Mbit/s (with Iperf). It's actually very
>>>> unstable
>>>> (always varies between 100 to 800 Mbit/s during the transfer).
>>> Do you by any chance have CONFIG_IXGBE_PTP set in your kernel config?
>>>
>>> If so, try disabling it and see if it fixes your performance.
>>>
>>> Thanks,
>>> Emil
>>>
>> Hi,
>>
>> Thanks for your reply.
>>
>> I compiled manually the ixgbe module with the default options of my
>> distribution (Archlinux). I had the same problem.
>>
>> I then changed the CONFIG_IXGBE_PTP to no (default is yes) as you
>> suggested, and recompiled. The problem disappeared; normal transfer.
>>
>> Could you please explain why this PTP is causing a problem? How can this
>> be fixed without having to recompile the module (ethtool, /sys, bios...
>> )? Thanks
> This is actually a bug in the driver. We should have a patch out very soon to address it.
>
>> Regards,
>>
>> Charles
> Thanks,
> Emil
Hi Emil,

> This is actually a bug in the driver. We should have a patch out very
> soon to address it.

OK. Please send a reply to this thread when it's available. I can test 
the patch.

Thanks

Charles

^ permalink raw reply

* [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specify the time of TIME-WAIT
From: Cong Wang @ 2012-09-27  8:41 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Alexey Kuznetsov, Patrick McHardy, Eric Dumazet,
	Neil Horman, Cong Wang

Some customers have requested this feature, as they stated:

	"This parameter is necessary, especially for software that continually 
        creates many ephemeral processes which open sockets, to avoid socket 
        exhaustion. In many cases, the risk of the exhaustion can be reduced by 
        tuning reuse interval to allow sockets to be reusable earlier.

        In commercial Unix systems, this kind of parameters, such as 
        tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
        already been available. Their implementations allow users to tune 
        how long they keep TCP connection as TIME-WAIT state on the 
        millisecond time scale."

We do have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
are not equivalent: they cannot be tuned directly on a time scale,
nor in a safe way, as some combinations of them can still cause
problems behind NAT. Also, I think a resolution of seconds is enough;
we don't have to make it tunable in milliseconds.

See also: https://lkml.org/lkml/2008/11/15/80

Any comments?

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Cong Wang <amwang@redhat.com>

---
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index c7fc107..4b24398 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -520,6 +520,12 @@ tcp_tw_reuse - BOOLEAN
 	It should not be changed without advice/request of technical
 	experts.
 
+tcp_tw_interval - INTEGER
+	Specify the timeout, in seconds, of TIME-WAIT sockets.
+	It should not be changed without advice/request of technical
+	experts.
+	Default: 60
+
 tcp_window_scaling - BOOLEAN
 	Enable window scaling as defined in RFC1323.
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6feeccd..72f92a1 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -114,9 +114,10 @@ extern void tcp_time_wait(struct sock *sk, int state, int timeo);
 				 * initial RTO.
 				 */
 
-#define TCP_TIMEWAIT_LEN (60*HZ) /* how long to wait to destroy TIME-WAIT
-				  * state, about 60 seconds	*/
-#define TCP_FIN_TIMEOUT	TCP_TIMEWAIT_LEN
+#define TCP_TIMEWAIT_LEN (sysctl_tcp_tw_interval * HZ)
+				 /* how long to wait to destroy TIME-WAIT
+				  * state, default 60 seconds	*/
+#define TCP_FIN_TIMEOUT	(60*HZ)
                                  /* BSD style FIN_WAIT2 deadlock breaker.
 				  * It used to be 3min, new value is 60sec,
 				  * to combine FIN-WAIT-2 timeout with
@@ -292,6 +293,7 @@ extern int sysctl_tcp_thin_dupack;
 extern int sysctl_tcp_early_retrans;
 extern int sysctl_tcp_limit_output_bytes;
 extern int sysctl_tcp_challenge_ack_limit;
+extern int sysctl_tcp_tw_interval;
 
 extern atomic_long_t tcp_memory_allocated;
 extern struct percpu_counter tcp_sockets_allocated;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 9205e49..f99cacf 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -27,6 +27,7 @@
 #include <net/tcp_memcontrol.h>
 
 static int zero;
+static int one = 1;
 static int two = 2;
 static int tcp_retr1_max = 255;
 static int ip_local_port_range_min[] = { 1, 1 };
@@ -271,6 +272,28 @@ bad_key:
 	return ret;
 }
 
+static int proc_tcp_tw_interval(ctl_table *ctl, int write,
+				void __user *buffer, size_t *lenp,
+				loff_t *ppos)
+{
+	int ret;
+	ctl_table tmp = {
+		.data = &sysctl_tcp_tw_interval,
+		.maxlen = sizeof(int),
+		.mode = ctl->mode,
+		.extra1 = &one,
+	};
+
+	ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+	if (ret)
+		return ret;
+	if (write)
+		tcp_death_row.period = (HZ / INET_TWDR_TWKILL_SLOTS)
+				       * sysctl_tcp_tw_interval;
+
+	return 0;
+}
+
 static struct ctl_table ipv4_table[] = {
 	{
 		.procname	= "tcp_timestamps",
@@ -794,6 +817,13 @@ static struct ctl_table ipv4_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &zero
 	},
+	{
+		.procname	= "tcp_tw_interval",
+		.data		= &sysctl_tcp_tw_interval,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_tcp_tw_interval,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 93406c5..64af0b6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -86,6 +86,7 @@
 #include <linux/scatterlist.h>
 
 int sysctl_tcp_tw_reuse __read_mostly;
+int sysctl_tcp_tw_interval __read_mostly = 60;
 int sysctl_tcp_low_latency __read_mostly;
 EXPORT_SYMBOL(sysctl_tcp_low_latency);
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 27536ba..e16f524 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -34,7 +34,7 @@ int sysctl_tcp_abort_on_overflow __read_mostly;
 
 struct inet_timewait_death_row tcp_death_row = {
 	.sysctl_max_tw_buckets = NR_FILE * 2,
-	.period		= TCP_TIMEWAIT_LEN / INET_TWDR_TWKILL_SLOTS,
+	.period		= (60 * HZ) / INET_TWDR_TWKILL_SLOTS,
 	.death_lock	= __SPIN_LOCK_UNLOCKED(tcp_death_row.death_lock),
 	.hashinfo	= &tcp_hashinfo,
 	.tw_timer	= TIMER_INITIALIZER(inet_twdr_hangman, 0,

^ permalink raw reply related

* Re: [PATCH V4 0/7] ipvs: IPv6 fragment handling for IPVS
From: Simon Horman @ 2012-09-27  8:46 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Jesper Dangaard Brouer, Hans Schillstrom, Hans Schillstrom,
	netdev, Pablo Neira Ayuso, lvs-devel, Patrick McHardy,
	Thomas Graf, Wensong Zhang, netfilter-devel
In-Reply-To: <alpine.LFD.2.00.1209262347280.2156@ja.ssi.bg>

On Wed, Sep 26, 2012 at 11:50:02PM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Wed, 26 Sep 2012, Jesper Dangaard Brouer wrote:
> 
> > The following patchset implement IPv6 fragment handling for IPVS.
> > 
> > This work is based upon patches from Hans Schillstrom.  I have taken
> > over the patchset, in close agreement with Hans, because he don't have
> > (gotten allocated) time to complete his work.
> > 
> > I have cleaned up the patchset significantly, and split the patchset
> > up into seven patches.
> > 
> > The first 3 patches, are ready to be merged
> > 
> >  Patch01: Trivial changes, use compressed IPv6 address in output
> >  Patch02: IPv6 extend ICMPv6 handling for future types
> >  Patch03: Use config macro IS_ENABLED()
> > 
> > The next 4 patches, is V4 of the patches I have submitted earlier.
> > Where I have incorporated Julian's recent feedback.
> > 
> > - Notice that patch04 of patchset V3, have been dropped.
> > 
> > I have also tried to make the patches easier to review, by
> > reorganizing the changes, to be more strictly split (exthdr
> > vs. fragment handling).
> > 
> > I have also removed the API changes, and moved those to patch06.  This
> > is done, (1) to make it easier to review the patches, and (2) to allow
> > easier integration of Patricks idea and my RFC patch of caching exthdr
> > info in skb->cb[].  Thus, we can get these patches applied (and later
> > go back and apply the caching scheme easier).
> > 
> >  Patch04: Fix faulty IPv6 extension header handling in IPVS
> >  Patch05: Complete IPv6 fragment handling for IPVS
> >  Patch06: IPVS API change to avoid rescan of IPv6 exthdr
> >  Patch07: IPVS SIP fragment handling
> > 
> > The SIP frag handling have been split into its own patch, as I have
> > not been able to test this part my self.
> > 
> > This patchset is based upon:
> >   Pablo's nf-next tree:  git://1984.lsi.us.es/nf-next
> >   On top of:
> >     commit 2cbc78a29e76a2e92c172651204f3117491877d2
> >     (netfilter: combine ipt_REDIRECT and ip6t_REDIRECT)
> > 
> > ---
> > 
> > Jesper Dangaard Brouer (7):
> >       ipvs: SIP fragment handling
> >       ipvs: API change to avoid rescan of IPv6 exthdr
> >       ipvs: Complete IPv6 fragment handling for IPVS
> >       ipvs: Fix faulty IPv6 extension header handling in IPVS
> >       ipvs: Use config macro IS_ENABLED()
> >       ipvs: IPv6 extend ICMPv6 handling for future types
> >       ipvs: Trivial changes, use compressed IPv6 address in output
> 
> 	All 7 patches look good to me. Thanks!
> 
> Acked-by: Julian Anastasov <ja@ssi.bg>

Thanks, I aim to review these tomorrow.

^ permalink raw reply

* Re: bnx2x: link detected up at startup even when it should be down
From: Jean-Michel Hautbois @ 2012-09-27 10:07 UTC (permalink / raw)
  To: Dmitry Kravkov
  Cc: netdev, Barak Witkowski, Eilon Greenstein, davem@davemloft.net
In-Reply-To: <1348734095.7217.46.camel@lb-tlvb-dmitry>

2012/9/27 Dmitry Kravkov <dmitry@broadcom.com>:
> On Tue, 2012-09-25 at 16:00 +0200, Jean-Michel Hautbois wrote:
>> 2012/9/25 Jean-Michel Hautbois <jhautbois@gmail.com>:
>> > 2012/9/25 Dmitry Kravkov <dmitry@broadcom.com>:
>> >>> -----Original Message-----
>> >>> From: Jean-Michel Hautbois [mailto:jhautbois@gmail.com]
>> >>> Sent: Tuesday, September 25, 2012 2:54 PM
>> >>> To: Dmitry Kravkov
>> >>> Cc: netdev; Barak Witkowski; Eilon Greenstein; davem@davemloft.net
>> >>> Subject: Re: bnx2x: link detected up at startup even when it should be down
>> >>
>> >>
>> FYI, eth4 and eth5 are seen UP and they should be down.
> It looks driver misses DCC "disable/enable" update
>
> Can you pls test this simple patch?
>
> From 1efa0314f4b912f474089f5f8375d37bc265a502 Mon Sep 17 00:00:00 2001
> From: Dmitry Kravkov <dmitry@broadcom.com>
> Date: Fri, 22 Mar 2013 05:12:02 +0200
> Subject: [PATCH] bnx2x: DCC disable/enable update can be missed by driver
>
> As a result of missed update, OS may be updated with wrong link status.
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index 2f6361e..3962d57 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -3445,6 +3445,7 @@ static inline void bnx2x_attn_int_deasserted3(struct bnx2x *bp, u32 attn)
>                         int func = BP_FUNC(bp);
>
>                         REG_WR(bp, MISC_REG_AEU_GENERAL_ATTN_12 + func*4, 0);
> +                       bnx2x_read_mf_cfg(bp);
>                         bp->mf_config[BP_VN(bp)] = MF_CFG_RD(bp,
>                                         func_mf_config[BP_ABS_FUNC(bp)].config);
>                         val = SHMEM_RD(bp,
> --
> 1.7.1
>
>
>
>> JM
>>

This patch does not work (no change at all).

Regards,
JM

^ permalink raw reply

* Re: [PATCH 2/2] net: ti cpsw ethernet: set IFCTL_{A,B} bits for RMII mode
From: Daniel Mack @ 2012-09-27 11:42 UTC (permalink / raw)
  To: N, Mugunthan V
  Cc: netdev@vger.kernel.org, devicetree-discuss@lists.ozlabs.org,
	Hiremath, Vaibhav, David S. Miller
In-Reply-To: <EB1619762EAF8B4E97A227FB77B7E0293E9EBB74@DBDE01.ent.ti.com>

On 26.09.2012 20:50, N, Mugunthan V wrote:
>> For RMII mode operation in 100Mbps, the CPSW needs to set the
>> IFCTL_A / IFCTL_B bits in the MACCONTROL register.
>>
>> Signed-off-by: Daniel Mack <zonque@gmail.com>
>> Cc: Mugunthan V N <mugunthanvnm@ti.com>
>> Cc: Vaibhav Hiremath <hvaibhav@ti.com>
>> Cc: David S. Miller <davem@davemloft.net>
>> ---
>>  drivers/net/ethernet/ti/cpsw.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/ti/cpsw.c
>> b/drivers/net/ethernet/ti/cpsw.c
>> index 3d7594e..d88dbfa 100644
>> --- a/drivers/net/ethernet/ti/cpsw.c
>> +++ b/drivers/net/ethernet/ti/cpsw.c
>> @@ -386,6 +386,12 @@ static void _cpsw_adjust_link(struct cpsw_slave
>> *slave,
>>  			mac_control |= BIT(7);	/* GIGABITEN	*/
>>  		if (phy->duplex)
>>  			mac_control |= BIT(0);	/* FULLDUPLEXEN	*/
>> +
>> +		/* set speed_in input in case RMII mode is used in >10Mbps
>> */
>> +		if (phy->speed > 10 && slave->slave_num < 2 &&
>> +		    phy->interface == PHY_INTERFACE_MODE_RMII)
>> +			mac_control |= BIT(15 + slave->slave_num);
> 
> Mac control register is separate for both the slaves and has same bit definitions,
> Bit 15 has to be set for 100Mbps link for RMII and RGMII Phy interface to control
> the RMII/RGMII gasket and in GMII this bit is Un-used by CPSW.
> For slave 1, Bit 16 is set with the above code which is not used control the
> RMII/RGMII gasket control. So it is not required to pass the Phy mode from DT.
> This patch has to be reworked to set Bit 15 with any Phy mode connected.

Hmm, that's interesting. I read the datasheet differently, but I believe
you're right.

> The original driver present was tested with GMII (Beagle Bone A5) and
> RGMII (AM3358 EVM) phy , but CPSW works fine without setting this bit in
> RGMII phymode so this issue was not caught in testing.

Yes, it used to work fine for me too until the hardware was reworked
from RGMII to RMII :)

Thanks a lot for the review - I just tested that setting bit 15 for all
PHY interface modes works for me as well, so I'm fine with that
solution. Will repost a new patch.


Daniel

^ permalink raw reply

* Re: bnx2x: link detected up at startup even when it should be down
From: Dmitry Kravkov @ 2012-09-27 11:43 UTC (permalink / raw)
  To: Jean-Michel Hautbois
  Cc: netdev, Barak Witkowski, Eilon Greenstein, davem@davemloft.net
In-Reply-To: <CAL8zT=jFRJt5OT_nJWWocKjWJJ3LajKrA1f7Zy5wMKz9V-DHXA@mail.gmail.com>

On Thu, 2012-09-27 at 12:07 +0200, Jean-Michel Hautbois wrote:
> 2012/9/27 Dmitry Kravkov <dmitry@broadcom.com>:
> > On Tue, 2012-09-25 at 16:00 +0200, Jean-Michel Hautbois wrote:
> >> 2012/9/25 Jean-Michel Hautbois <jhautbois@gmail.com>:
> >> > 2012/9/25 Dmitry Kravkov <dmitry@broadcom.com>:
> >> >>> -----Original Message-----
> >> >>> From: Jean-Michel Hautbois [mailto:jhautbois@gmail.com]
> >> >>> Sent: Tuesday, September 25, 2012 2:54 PM
> >> >>> To: Dmitry Kravkov
> >> >>> Cc: netdev; Barak Witkowski; Eilon Greenstein; davem@davemloft.net
> >> >>> Subject: Re: bnx2x: link detected up at startup even when it should be down
> >> >>
> >> >>
> >> FYI, eth4 and eth5 are seen UP and they should be down.
> > It looks driver misses DCC "disable/enable" update
> >
> > Can you pls test this simple patch?
> >
> > From 1efa0314f4b912f474089f5f8375d37bc265a502 Mon Sep 17 00:00:00 2001
> > From: Dmitry Kravkov <dmitry@broadcom.com>
> > Date: Fri, 22 Mar 2013 05:12:02 +0200
> > Subject: [PATCH] bnx2x: DCC disable/enable update can be missed by driver
> >
> > As a result of missed update, OS may be updated with wrong link status.
> > ---
> >  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> > index 2f6361e..3962d57 100644
> > --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> > +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> > @@ -3445,6 +3445,7 @@ static inline void bnx2x_attn_int_deasserted3(struct bnx2x *bp, u32 attn)
> >                         int func = BP_FUNC(bp);
> >
> >                         REG_WR(bp, MISC_REG_AEU_GENERAL_ATTN_12 + func*4, 0);
> > +                       bnx2x_read_mf_cfg(bp);
> >                         bp->mf_config[BP_VN(bp)] = MF_CFG_RD(bp,
> >                                         func_mf_config[BP_ABS_FUNC(bp)].config);
> >                         val = SHMEM_RD(bp,
> > --
> > 1.7.1
> >
> >
> >
> >> JM
> >>
> 
> This patch does not work (no change at all).
> 

Since bnx2x.debug didn't work for you, let's try to force some debug
messages from inside the driver:

---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 3962d57..c6aeac0 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -10791,7 +10791,7 @@ static int __devinit bnx2x_init_one(struct pci_dev *pdev,
 			  tx_count, rx_count);
 
 	bp->igu_sb_cnt = max_non_def_sbs;
-	bp->msg_enable = debug;
+	bp->msg_enable = debug | BNX2X_MSG_MCP | NETIF_MSG_PROBE | NETIF_MSG_LINK;
 	pci_set_drvdata(pdev, dev);
 
 	rc = bnx2x_init_dev(pdev, dev, ent->driver_data);
-- 
1.7.1

Can you post the output then?

Thanks

> Regards,
> JM
> 

^ permalink raw reply related

* [PATCH] net: ti cpsw ethernet: set IFCTL_A bit in MACCONTROL
From: Daniel Mack @ 2012-09-27 11:50 UTC (permalink / raw)
  To: netdev; +Cc: Daniel Mack, Mugunthan V N, Vaibhav Hiremath, David S. Miller

For RMII/RGMII mode operation at 100Mbps, the CPSW needs to set the
IFCTL_A bit in the MACCONTROL register. For all other PHY modes, this
bit is unused, so setting it unconditionally shouldn't cause any
trouble.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: Mugunthan V N <mugunthanvnm@ti.com>
Cc: Vaibhav Hiremath <hvaibhav@ti.com>
Cc: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/ti/cpsw.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index aa78168..b764f75 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -386,6 +386,11 @@ static void _cpsw_adjust_link(struct cpsw_slave *slave,
 			mac_control |= BIT(7);	/* GIGABITEN	*/
 		if (phy->duplex)
 			mac_control |= BIT(0);	/* FULLDUPLEXEN	*/
+
+		/* set speed_in input in case RMII mode is used in >10Mbps */
+		if (phy->speed > 10)
+			mac_control |= BIT(15);
+
 		*link = true;
 	} else {
 		mac_control = 0;
-- 
1.7.11.4

^ permalink raw reply related

* Re: Possible networking regression in 3.6.0
From: Chris Clayton @ 2012-09-27 11:50 UTC (permalink / raw)
  To: Chris Clayton; +Cc: Eric Dumazet, netdev, gpiez
In-Reply-To: <505D5A18.2080507@googlemail.com>

Just for information - I've pulled Linus' tree this morning and the 
problem is still present. Also, Gunther Piaz has reported, via the 
bugzilla entry, that he too has hit this regression.

On 09/22/12 07:26, Chris Clayton wrote:
> I guess you network developer folks are either very busy or this
> regression is proving a bit troublesome to identify, so I've opened a
> bugzilla report to keep track of it. The report number is 47761.
>
> Chris
>
> On 09/19/12 16:26, Chris Clayton wrote:
>>>
>>> It would help to have some traffic sample, maybe.
>>>
>>> Especially if the problem is not easily reproductible for us.
>>>
>>
>> OK, I've used an netsniff-ng to capture the traffic on all interfaces on
>> the host (that would be tap0 and eth0, I guess) whilst attempting to
>> ping the router from the WinXP KVM client. The result is a pcap file
>> that I processed with tcpdump to produce:
>>
>> reading from file net-trace.pcap, link-type EN10MB (Ethernet)
>> 14:56:31.406336 ARP, Request who-has 192.168.200.254 tell 192.168.200.1,
>> length 28
>>          0x0000:  0001 0800 0604 0001 5254 0c3b 1728 c0a8
>>          0x0010:  c801 0000 0000 0000 c0a8 c8fe
>> 14:56:31.406357 ARP, Reply 192.168.200.254 is-at 46:83:93:8f:f0:7e,
>> length 28
>>          0x0000:  0001 0800 0604 0002 4683 938f f07e c0a8
>>          0x0010:  c8fe 5254 0c3b 1728 c0a8 c801
>> 14:56:31.406534 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id
>> 512, seq 4352, length 40
>>          0x0000:  4500 003c 0195 0000 8001 efd8 c0a8 c801
>>          0x0010:  c0a8 0001 0800 3a5c 0200 1100 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:31.406566 ARP, Request who-has 192.168.0.1 tell 192.168.0.40,
>> length 28
>>          0x0000:  0001 0800 0604 0001 5c9a d85c 6331 c0a8
>>          0x0010:  0028 0000 0000 0000 c0a8 0001
>> 14:56:31.410830 ARP, Reply 192.168.0.1 is-at 00:1f:33:80:09:44, length 46
>>          0x0000:  0001 0800 0604 0002 001f 3380 0944 c0a8
>>          0x0010:  0001 5c9a d85c 6331 c0a8 0028 c0a8 0001
>>          0x0020:  e000 0001 1164 ee9b 0000 0000 4500
>> 14:56:31.410851 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id
>> 512, seq 4352, length 40
>>          0x0000:  4500 003c 0195 0000 7f01 b8b2 c0a8 0028
>>          0x0010:  c0a8 0001 0800 3a5c 0200 1100 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:31.414474 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512,
>> seq 4352, length 40
>>          0x0000:  4500 003c cf4f 0000 ff01 6af7 c0a8 0001
>>          0x0010:  c0a8 0028 0000 425c 0200 1100 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:36.404781 ARP, Request who-has 192.168.0.40 tell 192.168.0.1,
>> length 46
>>          0x0000:  0001 0800 0604 0001 001f 3380 0944 c0a8
>>          0x0010:  0001 0000 0000 0000 c0a8 0028 c0a8 0001
>>          0x0020:  c0a8 0028 0000 425c 0200 1100 6162
>> 14:56:36.404806 ARP, Reply 192.168.0.40 is-at 5c:9a:d8:5c:63:31,
>> length 28
>>          0x0000:  0001 0800 0604 0002 5c9a d85c 6331 c0a8
>>          0x0010:  0028 001f 3380 0944 c0a8 0001
>> 14:56:36.689750 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id
>> 512, seq 4608, length 40
>>          0x0000:  4500 003c 0196 0000 8001 efd7 c0a8 c801
>>          0x0010:  c0a8 0001 0800 395c 0200 1200 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:36.689774 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id
>> 512, seq 4608, length 40
>>          0x0000:  4500 003c 0196 0000 7f01 b8b1 c0a8 0028
>>          0x0010:  c0a8 0001 0800 395c 0200 1200 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:36.693330 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512,
>> seq 4608, length 40
>>          0x0000:  4500 003c cf50 0000 ff01 6af6 c0a8 0001
>>          0x0010:  c0a8 0028 0000 415c 0200 1200 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:42.189424 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id
>> 512, seq 4864, length 40
>>          0x0000:  4500 003c 0197 0000 8001 efd6 c0a8 c801
>>          0x0010:  c0a8 0001 0800 385c 0200 1300 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:42.189447 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id
>> 512, seq 4864, length 40
>>          0x0000:  4500 003c 0197 0000 7f01 b8b0 c0a8 0028
>>          0x0010:  c0a8 0001 0800 385c 0200 1300 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:42.193029 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512,
>> seq 4864, length 40
>>          0x0000:  4500 003c cf51 0000 ff01 6af5 c0a8 0001
>>          0x0010:  c0a8 0028 0000 405c 0200 1300 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:47.689414 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id
>> 512, seq 5120, length 40
>>          0x0000:  4500 003c 0198 0000 8001 efd5 c0a8 c801
>>          0x0010:  c0a8 0001 0800 375c 0200 1400 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:47.689439 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id
>> 512, seq 5120, length 40
>>          0x0000:  4500 003c 0198 0000 7f01 b8af c0a8 0028
>>          0x0010:  c0a8 0001 0800 375c 0200 1400 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>> 14:56:47.693661 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512,
>> seq 5120, length 40
>>          0x0000:  4500 003c cf52 0000 ff01 6af4 c0a8 0001
>>          0x0010:  c0a8 0028 0000 3f5c 0200 1400 6162 6364
>>          0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
>>          0x0030:  7576 7761 6263 6465 6667 6869
>>
>> Is this what you asked for?
>>
>> Chris
>>
>
>

^ permalink raw reply

* Re: Possible networking regression in 3.6.0
From: Eric Dumazet @ 2012-09-27 12:14 UTC (permalink / raw)
  To: Chris Clayton; +Cc: netdev, gpiez
In-Reply-To: <50643DA1.7070306@googlemail.com>

On Thu, 2012-09-27 at 12:50 +0100, Chris Clayton wrote:
> Just for information - I've pulled Linus' tree this morning and the 
> problem is still present. Also, Gunther Piaz has reported, via the 
> bugzilla entry, that he too has hit this regression.

I tried to reproduce the bug, and my kvm guests have no problem.

I guess you need to describe precisely how you set up your network, so
that I can reproduce the problem and eventually fix it.

Thanks

^ permalink raw reply

* [PATCH net-next 1/3] tcp: gro: add checksuming helpers
From: Eric Dumazet @ 2012-09-27 12:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

An skb with CHECKSUM_NONE can't currently be handled by GRO, and
we only notice this deep in the GRO stack, in tcp[46]_gro_receive().

But there are cases where GRO can be a benefit, even with a lack
of checksums.

This preliminary work is needed to add GRO support
to tunnels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_ipv4.c |   19 ++++++++++++++++---
 net/ipv6/tcp_ipv6.c |   20 +++++++++++++++++---
 2 files changed, 33 insertions(+), 6 deletions(-)
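
Note (not part of the patch): a minimal userspace sketch of the check the
new CHECKSUM_NONE branch performs for IPv4. Seed the sum with the
pseudo-header, add the whole TCP segment, fold, and treat a non-zero result
as a bad checksum (the "goto flush" case). All sketch_* names are
hypothetical stand-ins for illustration; the kernel uses
csum_tcpudp_nofold(), skb_checksum() and csum_fold() on the skb itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit big-endian words of buf to a running 32-bit sum. */
static uint32_t sketch_sum(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)			/* odd trailing byte is zero-padded */
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Fold carries back into 16 bits, then complement (role of csum_fold()). */
static uint16_t sketch_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Unfolded IPv4 pseudo-header sum (role of csum_tcpudp_nofold()). */
static uint32_t sketch_pseudo(const uint8_t saddr[4], const uint8_t daddr[4],
			      uint8_t proto, uint16_t len)
{
	uint32_t sum = sketch_sum(saddr, 4, 0);

	sum = sketch_sum(daddr, 4, sum);
	return sum + proto + len;
}

/* Returns 1 when the segment verifies, 0 when GRO would flush it. */
static int sketch_tcp_csum_ok(const uint8_t saddr[4], const uint8_t daddr[4],
			      const uint8_t *seg, uint16_t len)
{
	uint32_t seed = sketch_pseudo(saddr, daddr, 6 /* IPPROTO_TCP */, len);

	return sketch_fold(sketch_sum(seg, len, seed)) == 0;
}
```

A segment whose checksum field was filled in correctly folds to zero, which
is the case where the patch sets CHECKSUM_UNNECESSARY and continues into
tcp_gro_receive().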

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 93406c5..27a235c 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2804,6 +2804,8 @@ void tcp4_proc_exit(void)
 struct sk_buff **tcp4_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 {
 	const struct iphdr *iph = skb_gro_network_header(skb);
+	__wsum wsum;
+	__sum16 sum;
 
 	switch (skb->ip_summed) {
 	case CHECKSUM_COMPLETE:
@@ -2812,11 +2814,22 @@ struct sk_buff **tcp4_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 			break;
 		}
-
-		/* fall through */
-	case CHECKSUM_NONE:
+flush:
 		NAPI_GRO_CB(skb)->flush = 1;
 		return NULL;
+
+	case CHECKSUM_NONE:
+		wsum = csum_tcpudp_nofold(iph->saddr, iph->daddr,
+					  skb_gro_len(skb), IPPROTO_TCP, 0);
+		sum = csum_fold(skb_checksum(skb,
+					     skb_gro_offset(skb),
+					     skb_gro_len(skb),
+					     wsum));
+		if (sum)
+			goto flush;
+
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+		break;
 	}
 
 	return tcp_gro_receive(head, skb);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index d6212d6..49c8903 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -763,6 +763,8 @@ static struct sk_buff **tcp6_gro_receive(struct sk_buff **head,
 					 struct sk_buff *skb)
 {
 	const struct ipv6hdr *iph = skb_gro_network_header(skb);
+	__wsum wsum;
+	__sum16 sum;
 
 	switch (skb->ip_summed) {
 	case CHECKSUM_COMPLETE:
@@ -771,11 +773,23 @@ static struct sk_buff **tcp6_gro_receive(struct sk_buff **head,
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 			break;
 		}
-
-		/* fall through */
-	case CHECKSUM_NONE:
+flush:
 		NAPI_GRO_CB(skb)->flush = 1;
 		return NULL;
+
+	case CHECKSUM_NONE:
+		wsum = ~csum_unfold(csum_ipv6_magic(&iph->saddr, &iph->daddr,
+						    skb_gro_len(skb),
+						    IPPROTO_TCP, 0));
+		sum = csum_fold(skb_checksum(skb,
+					     skb_gro_offset(skb),
+					     skb_gro_len(skb),
+					     wsum));
+		if (sum)
+			goto flush;
+
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+		break;
 	}
 
 	return tcp_gro_receive(head, skb);

^ permalink raw reply related

* [PATCH] inetpeer: ensure to set the maximum tokens the first time
From: Nicolas Dichtel @ 2012-09-27 12:33 UTC (permalink / raw)
  To: netdev, davem; +Cc: Nicolas Dichtel

When jiffies wraps around (for example, 5 minutes after boot, see
INITIAL_JIFFIES) and a peer has just been created, now - peer->rate_last can be
< XRLIM_BURST_FACTOR * timeout, so token is not set to the maximum value and
some ICMP packets can be unexpectedly dropped.

With this patch, it is still possible for rate_last and rate_tokens to both be 0
after jiffies wraps around, but the probability is very low and the only
consequence is that some ICMP packets bypass the filter.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/ipv4/inetpeer.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
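
Note (not part of the patch): a minimal userspace sketch of the token logic
before and after this change, with "now" passed in instead of read from
jiffies. The sketch_* names are hypothetical, and the burst factor of 6 is
an assumption taken to mirror XRLIM_BURST_FACTOR.

```c
#include <stdbool.h>

#define SKETCH_BURST_FACTOR 6UL	/* assumed value of XRLIM_BURST_FACTOR */

struct sketch_peer {
	unsigned long rate_tokens;
	unsigned long rate_last;
};

/* Logic before the patch: tokens always grow by the elapsed time. */
static bool sketch_allow_old(struct sketch_peer *p, unsigned long now,
			     unsigned long timeout)
{
	unsigned long token = p->rate_tokens + (now - p->rate_last);
	bool rc = false;

	p->rate_last = now;
	if (token > SKETCH_BURST_FACTOR * timeout)
		token = SKETCH_BURST_FACTOR * timeout;
	if (token >= timeout) {
		token -= timeout;
		rc = true;
	}
	p->rate_tokens = token;
	return rc;
}

/* Logic after the patch: a never-used peer starts with a full bucket. */
static bool sketch_allow_new(struct sketch_peer *p, unsigned long now,
			     unsigned long timeout)
{
	unsigned long token = p->rate_tokens;
	bool rc = false;

	if (!p->rate_last && !token) {
		token = SKETCH_BURST_FACTOR * timeout;
	} else {
		token += now - p->rate_last;
		if (token > SKETCH_BURST_FACTOR * timeout)
			token = SKETCH_BURST_FACTOR * timeout;
	}
	p->rate_last = now;
	if (token >= timeout) {
		token -= timeout;
		rc = true;
	}
	p->rate_tokens = token;
	return rc;
}
```

With a freshly zeroed peer and a small "now" (jiffies shortly after the
wrap), the old logic credits only "now" tokens and can refuse the first
packet, while the new logic grants the full burst.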

diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index e1e0a4e..92fec02 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -559,10 +559,14 @@ bool inet_peer_xrlim_allow(struct inet_peer *peer, int timeout)
 
 	token = peer->rate_tokens;
 	now = jiffies;
-	token += now - peer->rate_last;
-	peer->rate_last = now;
-	if (token > XRLIM_BURST_FACTOR * timeout)
+	if (!peer->rate_last && !token)
 		token = XRLIM_BURST_FACTOR * timeout;
+	else {
+		token += now - peer->rate_last;
+		if (token > XRLIM_BURST_FACTOR * timeout)
+			token = XRLIM_BURST_FACTOR * timeout;
+	}
+	peer->rate_last = now;
 	if (token >= timeout) {
 		token -= timeout;
 		rc = true;
-- 
1.7.12

^ permalink raw reply related

* [PATCH net-next 2/3] net: add gro_cells infrastructure
From: Eric Dumazet @ 2012-09-27 12:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

This adds a new include file (include/net/gro_cells.h), to bring GRO
(Generic Receive Offload) capability to tunnels, in a modular way.

Because the tunnel receive path is lockless, and GRO adds serialization
through a napi_struct, I chose to add an array of up to 8 cells,
so that multiqueue devices won't be slowed down by the GRO layer.

skb_get_rx_queue() is used as selector.

In the future, we might add optional fanout capabilities, using rxhash
for example.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/gro_cells.h |  103 ++++++++++++++++++++++++++++++++++++++
 net/core/dev.c          |    2 
 2 files changed, 105 insertions(+)
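
Note (not part of the patch): a minimal userspace sketch of how the cell
count and the per-packet cell index are derived. The sketch_* names are
hypothetical; in the patch the CPU count comes from nr_cpu_ids and the
queue number from skb_get_rx_queue().

```c
/* Smallest power of two >= n (role of roundup_pow_of_two()). */
static unsigned int sketch_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Mask for an array of at most 8 cells; the array holds mask + 1 cells. */
static unsigned int sketch_cells_mask(unsigned int nr_cpus)
{
	unsigned int n = nr_cpus < 8 ? nr_cpus : 8;

	return sketch_roundup_pow_of_two(n) - 1;
}

/* Index of the cell that receives a packet from the given rx queue. */
static unsigned int sketch_pick_cell(unsigned int rx_queue, unsigned int mask)
{
	return rx_queue & mask;
}
```

Because the cell count is a power of two, the selector is a single AND, and
packets from one rx queue always land in the same cell.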

diff --git a/include/net/gro_cells.h b/include/net/gro_cells.h
new file mode 100644
index 0000000..ba93b1b
--- /dev/null
+++ b/include/net/gro_cells.h
@@ -0,0 +1,103 @@
+#ifndef _NET_GRO_CELLS_H
+#define _NET_GRO_CELLS_H
+
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/netdevice.h>
+
+struct gro_cell {
+	struct sk_buff_head	napi_skbs;
+	struct napi_struct	napi;
+} ____cacheline_aligned_in_smp;
+
+struct gro_cells {
+	unsigned int		gro_cells_mask;
+	struct gro_cell		*cells;
+};
+
+static inline void gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
+{
+	unsigned long flags;
+	struct gro_cell *cell = gcells->cells;
+	struct net_device *dev = skb->dev;
+
+	if (!cell || skb_cloned(skb) || !(dev->features & NETIF_F_GRO)) {
+		netif_rx(skb);
+		return;
+	}
+
+	if (skb_rx_queue_recorded(skb))
+		cell += skb_get_rx_queue(skb) & gcells->gro_cells_mask;
+
+	if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
+		atomic_long_inc(&dev->rx_dropped);
+		kfree_skb(skb);
+		return;
+	}
+
+	spin_lock_irqsave(&cell->napi_skbs.lock, flags);
+
+	__skb_queue_tail(&cell->napi_skbs, skb);
+	if (skb_queue_len(&cell->napi_skbs) == 1)
+		napi_schedule(&cell->napi);
+
+	spin_unlock_irqrestore(&cell->napi_skbs.lock, flags);
+}
+
+static inline int gro_cell_poll(struct napi_struct *napi, int budget)
+{
+	struct gro_cell *cell = container_of(napi, struct gro_cell, napi);
+	struct sk_buff *skb;
+	int work_done = 0;
+
+	while (work_done < budget) {
+		skb = skb_dequeue(&cell->napi_skbs);
+		if (!skb)
+			break;
+
+		napi_gro_receive(napi, skb);
+		work_done++;
+	}
+
+	if (work_done < budget)
+		napi_complete(napi);
+	return work_done;
+}
+
+static inline int gro_cells_init(struct gro_cells *gcells, struct net_device *dev)
+{
+	int i;
+
+	gcells->gro_cells_mask = roundup_pow_of_two(min_t(unsigned int, 8, nr_cpu_ids)) - 1;
+	gcells->cells = kcalloc(gcells->gro_cells_mask + 1,
+				sizeof(struct gro_cell),
+				GFP_KERNEL);
+	if (!gcells->cells)
+		return -ENOMEM;
+
+	for (i = 0; i <= gcells->gro_cells_mask; i++) {
+		struct gro_cell *cell = gcells->cells + i;
+
+		skb_queue_head_init(&cell->napi_skbs);
+		netif_napi_add(dev, &cell->napi, gro_cell_poll, 64);
+		napi_enable(&cell->napi);
+	}
+	return 0;
+}
+
+static inline void gro_cells_destroy(struct gro_cells *gcells)
+{
+	struct gro_cell *cell = gcells->cells;
+	int i;
+
+	if (!cell)
+		return;
+	for (i = 0; i <= gcells->gro_cells_mask; i++, cell++) {
+		netif_napi_del(&cell->napi);
+		skb_queue_purge(&cell->napi_skbs);
+	}
+	kfree(gcells->cells);
+	gcells->cells = NULL;
+}
+
+#endif
diff --git a/net/core/dev.c b/net/core/dev.c
index 707b124..9f63660 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2644,6 +2644,8 @@ EXPORT_SYMBOL(dev_queue_xmit);
   =======================================================================*/
 
 int netdev_max_backlog __read_mostly = 1000;
+EXPORT_SYMBOL(netdev_max_backlog);
+
 int netdev_tstamp_prequeue __read_mostly = 1;
 int netdev_budget __read_mostly = 300;
 int weight_p __read_mostly = 64;            /* old backlog weight */

^ permalink raw reply related

* [PATCH net-next 3/3] ipv4: gre: add GRO capability
From: Eric Dumazet @ 2012-09-27 12:48 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

Add GRO capability to IPv4 GRE tunnels, using the gro_cells
infrastructure.

Tested using IPv4 and IPv6 TCP traffic inside this tunnel, and
checking GRO is building large packets.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/ipip.h |    3 +++
 net/ipv4/ip_gre.c  |   13 +++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/net/ipip.h b/include/net/ipip.h
index a93cf6d..ddc077c 100644
--- a/include/net/ipip.h
+++ b/include/net/ipip.h
@@ -2,6 +2,7 @@
 #define __NET_IPIP_H 1
 
 #include <linux/if_tunnel.h>
+#include <net/gro_cells.h>
 #include <net/ip.h>
 
 /* Keep error state on tunnel for 30 sec */
@@ -36,6 +37,8 @@ struct ip_tunnel {
 #endif
 	struct ip_tunnel_prl_entry __rcu *prl;		/* potential router list */
 	unsigned int			prl_count;	/* # of entries in PRL */
+
+	struct gro_cells		gro_cells;
 };
 
 struct ip_tunnel_prl_entry {
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index f233c1d..1f00b30 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -714,8 +714,7 @@ static int ipgre_rcv(struct sk_buff *skb)
 		skb_reset_network_header(skb);
 		ipgre_ecn_decapsulate(iph, skb);
 
-		netif_rx(skb);
-
+		gro_cells_receive(&tunnel->gro_cells, skb);
 		rcu_read_unlock();
 		return 0;
 	}
@@ -1296,6 +1295,9 @@ static const struct net_device_ops ipgre_netdev_ops = {
 
 static void ipgre_dev_free(struct net_device *dev)
 {
+	struct ip_tunnel *tunnel = netdev_priv(dev);
+
+	gro_cells_destroy(&tunnel->gro_cells);
 	free_percpu(dev->tstats);
 	free_netdev(dev);
 }
@@ -1327,6 +1329,7 @@ static int ipgre_tunnel_init(struct net_device *dev)
 {
 	struct ip_tunnel *tunnel;
 	struct iphdr *iph;
+	int err;
 
 	tunnel = netdev_priv(dev);
 	iph = &tunnel->parms.iph;
@@ -1353,6 +1356,12 @@ static int ipgre_tunnel_init(struct net_device *dev)
 	if (!dev->tstats)
 		return -ENOMEM;
 
+	err = gro_cells_init(&tunnel->gro_cells, dev);
+	if (err) {
+		free_percpu(dev->tstats);
+		return err;
+	}
+
 	return 0;
 }
 

^ permalink raw reply related

