* RE: [E1000-devel] [Bugme-new] [Bug 12570] New: Bonding does not work over e1000e.
[not found] ` <EA929A9653AAE14F841771FB1DE5A1365F60321411@rrsmsx501.amr.corp.intel.com>
@ 2009-02-03 1:42 ` Graham, David
2009-02-06 10:18 ` Konstantin Khorenko
0 siblings, 1 reply; 2+ messages in thread
From: Graham, David @ 2009-02-03 1:42 UTC
To: netdev@vger.kernel.org, e1000-devel@lists.sourceforge.net,
bonding-devel@lists.sourceforge.net
Cc: khorenko@parallels.com, bugme-daemon@bugzilla.kernel.org
[-- Attachment #1: Type: text/plain, Size: 10254 bytes --]
Hi Konstantin.
I have been trying, but have so far failed, to reproduce the reported problem.
I have a few questions.
1) While I can't repro your problem, I can see something very similar if I don't load the bonding module with the parameter miimon=100. From your /proc/net/bonding/bond1 dumps it looks like you do the right thing, but would you please confirm by listing exactly which bonding module params you load with.
2) At 11:56:01 in the report you "turn on eth2 uplink on the virtual connect bay5", and immediately after that /proc/net/bonding/bond1 still shows eth2 with MII status *down*, which would be incorrect. Can you confirm that this snippet of the file really is in the correct place in the reported sequence? That is, is there already a problem at this step, and is that where we should look for it?
3) Could you send me your network scripts for the two slaves and the bonding interface itself (on RH systems, I think that's /etc/sysconfig/network-scripts/ifcfg-*)? They should be modeled on the sample info in <kernel>/Documentation/networking/bonding.txt, and I'd like to check them.
4) I'm probably not controlling the slave link state in the same way that you are, because in the NEC bladeserver that I'm using, I am bringing the e1000e link-partner ports up & down from an admin console, using SW I don't understand. I have (so far) not been able to physically disconnect one of the (serdes) links connecting the 82571 without also disconnecting the other, as I have to pull an entire switch module to make the disconnect. Can you give me more information on what your system is, and how you can physically disconnect one link at a time? Then I might be able to get hold of a similar setup and see the problem.
5) I have been testing on 2.6.29-rc3 and on 2.6.28 kernels, not 2.6.29-rc1, which is what you reported the problem on. I think it's unlikely that the problem is only in the 2.6.29-rc1 build, but I would like to know if you've had a chance to try any other build, and what the results were. Also please let me know if you have tested with any non-Intel 1Gb interfaces, and if you have *ever* seen bonding work properly on the system you are testing.
6) While I can't repro your issue yet, I have very recently made some changes to the serdes link detect logic in the e1000e driver. They were written to address a separate issue, and are NOT in the kernel that you have been testing. I also can't see how fixing that issue would fix yours. However, because the fixes concern serdes link detection, and so does your problem, it's probably worth a (long) shot. If you are comfortable trying them out, I have attached them to this email. They are also being queued for upstream, but only after some further local testing.
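For readers who want the gist of the serdes change in item 6 without reading the whole diff, the state transitions in the attached SerdesSM.patch can be sketched in user space roughly as below. This is only an illustrative model, not driver code: the `rx_sample` struct and the `serdes_step()` helper are hypothetical names that do not appear in the patch, the register writes (TXCW, CTRL) and the flow-control setup are left out, and `invalid` is assumed to mean that the IV bit was still set on the re-read that the patch performs after a short delay.

```c
#include <stdbool.h>

/* State names mirror enum e1000_serdes_link_state in the attached patch. */
enum serdes_link_state {
	SERDES_LINK_DOWN = 0,
	SERDES_LINK_AUTONEG_PROGRESS,
	SERDES_LINK_AUTONEG_COMPLETE,
	SERDES_LINK_FORCED_UP
};

/* Hypothetical snapshot of the bits the patch reads from RXCW and STATUS. */
struct rx_sample {
	bool synch;	/* E1000_RXCW_SYNCH: receiver is synchronized */
	bool invalid;	/* E1000_RXCW_IV: still set on the re-read (assumption) */
	bool config;	/* E1000_RXCW_C: receiving /C/ ordered sets */
	bool link_up;	/* E1000_STATUS_LU: hardware reports link up */
};

/* Hypothetical helper: returns the next state for one poll of the link. */
static enum serdes_link_state serdes_step(enum serdes_link_state state,
					  struct rx_sample in)
{
	/* No sync, or persistent invalid codewords: any state goes to DOWN. */
	if (!in.synch || in.invalid)
		return SERDES_LINK_DOWN;

	switch (state) {
	case SERDES_LINK_AUTONEG_COMPLETE:
		/* Link lost: retry autoneg before reporting failure. */
		return in.link_up ? state : SERDES_LINK_AUTONEG_PROGRESS;
	case SERDES_LINK_FORCED_UP:
		/* Partner is sending /C/ sets: go back to autoneg. */
		return in.config ? SERDES_LINK_AUTONEG_PROGRESS : state;
	case SERDES_LINK_AUTONEG_PROGRESS:
		/* Autoneg finished if LU is set, otherwise force the link. */
		return in.link_up ? SERDES_LINK_AUTONEG_COMPLETE
				  : SERDES_LINK_FORCED_UP;
	case SERDES_LINK_DOWN:
	default:
		/* Receiver has gained valid sync: start autoneg. */
		return SERDES_LINK_AUTONEG_PROGRESS;
	}
}
```

Under these assumptions, a link that regains sync moves from DOWN to AN_PROG on the first poll, and on the next poll either to AN_UP (if STATUS.LU is set) or to FORCED_UP (if the far end does not autonegotiate). Losing sync from any state drops straight back to DOWN.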
Thanks
Dave
>-----Original Message-----
>From: Andrew Morton [mailto:akpm@linux-foundation.org]
>Sent: Thursday, January 29, 2009 9:53 AM
>To: netdev@vger.kernel.org; e1000-devel@lists.sourceforge.net; bonding-
>devel@lists.sourceforge.net
>Cc: khorenko@parallels.com; bugme-daemon@bugzilla.kernel.org
>Subject: Re: [E1000-devel] [Bugme-new] [Bug 12570] New: Bonding does not
>work over e1000e.
>
>
>(switched to email. Please respond via emailed reply-to-all, not via the
>bugzilla web interface).
>
>On Thu, 29 Jan 2009 03:12:01 -0800 (PST) bugme-daemon@bugzilla.kernel.org
>wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=12570
>>
>> Summary: Bonding does not work over e1000e.
>> Product: Drivers
>> Version: 2.5
>> KernelVersion: 2.6.29-rc1
>> Platform: All
>> OS/Version: Linux
>> Tree: Mainline
>> Status: NEW
>> Severity: normal
>> Priority: P1
>> Component: Network
>> AssignedTo: jgarzik@pobox.com
>> ReportedBy: khorenko@parallels.com
>>
>>
>> Checked (failing) kernel: 2.6.29-rc1
>> Latest working kernel version: unknown
>> Earliest failing kernel version: not checked but probably any. RHEL5
>kernels
>> are also affected.
>>
>> Distribution: Enterprise Linux Enterprise Linux Server release 5.1
>(Carthage)
>>
>> Hardware Environment:
>> lspci:
>> 15:00.0 Ethernet controller: Intel Corporation 82571EB Quad Port Gigabit
>> Mezzanine Adapter (rev 06)
>> 15:00.1 Ethernet controller: Intel Corporation 82571EB Quad Port Gigabit
>> Mezzanine Adapter (rev 06)
>>
>> 15:00.0 0200: 8086:10da (rev 06)
>> Subsystem: 103c:1717
>> Flags: bus master, fast devsel, latency 0, IRQ 154
>> Memory at fdde0000 (32-bit, non-prefetchable) [size=128K]
>> Memory at fdd00000 (32-bit, non-prefetchable) [size=512K]
>> I/O ports at 6000 [size=32]
>> [virtual] Expansion ROM at d1300000 [disabled] [size=512K]
>> Capabilities: [c8] Power Management version 2
>> Capabilities: [d0] Message Signalled Interrupts: 64bit+
>Queue=0/0
>> Enable+
>> Capabilities: [e0] Express Endpoint IRQ 0
>> Capabilities: [100] Advanced Error Reporting
>> Capabilities: [140] Device Serial Number 24-d1-78-ff-ff-78-1b-00
>>
>> 15:00.1 0200: 8086:10da (rev 06)
>> Subsystem: 103c:1717
>> Flags: bus master, fast devsel, latency 0, IRQ 162
>> Memory at fdce0000 (32-bit, non-prefetchable) [size=128K]
>> Memory at fdc00000 (32-bit, non-prefetchable) [size=512K]
>> I/O ports at 6020 [size=32]
>> [virtual] Expansion ROM at d1380000 [disabled] [size=512K]
>> Capabilities: [c8] Power Management version 2
>> Capabilities: [d0] Message Signalled Interrupts: 64bit+
>Queue=0/0
>> Enable+
>> Capabilities: [e0] Express Endpoint IRQ 0
>> Capabilities: [100] Advanced Error Reporting
>> Capabilities: [140] Device Serial Number 24-d1-78-ff-ff-78-1b-00
>>
>> Problem Description: Bonding does not work over NICs supported by
>e1000e: if
>> you brake/restore physical links of bonding slaves one by one - network
>won't
>> work anymore.
>>
>> Steps to reproduce:
>> 2 NICs supported by e1000e put into bond device (Bonding Mode: fault-
>tolerance
>> (active-backup)).
>> * ping to the outside node is ok
>> * physically brake the link of active bond slave (1)
>> * bond detects the failure, makes another slave (2) active.
>> * ping works fine
>> * restore the connection of (1)
>> * ping works fine
>> * brake the link of (2)
>> * bond detects it, reports that it makes active (1), but
>> * ping _does not_ work anymore
>>
>> Logs:
>> /var/log/messages:
>> Jan 27 11:53:29 host kernel: 0000:15:00.0: eth2: Link is Down
>> Jan 27 11:53:29 host kernel: bonding: bond1: link status definitely down
>for
>> interface eth2, disabling it
>> Jan 27 11:53:29 host kernel: bonding: bond1: making interface eth3 the
>new
>> active one.
>> Jan 27 11:56:37 host kernel: 0000:15:00.0: eth2: Link is Up 1000 Mbps
>Full
>> Duplex, Flow Control: RX/TX
>> Jan 27 11:56:37 host kernel: bonding: bond1: link status definitely up
>for
>> interface eth2.
>> Jan 27 11:57:39 host kernel: 0000:15:00.1: eth3: Link is Down
>> Jan 27 11:57:39 host kernel: bonding: bond1: link status definitely down
>for
>> interface eth3, disabling it
>> Jan 27 11:57:39 host kernel: bonding: bond1: making interface eth2 the
>new
>> active one.
>>
>> What was done + dumps of /proc/net/bonding/bond1:
>> ## 11:52:42
>> ##cat /proc/net/bonding/bond1
>> Ethernet Channel Bonding Driver: v3.3.0 (June 10, 2008)
>>
>> Bonding Mode: fault-tolerance (active-backup)
>> Primary Slave: None
>> Currently Active Slave: eth2
>> MII Status: up
>> MII Polling Interval (ms): 100
>> Up Delay (ms): 0
>> Down Delay (ms): 0
>>
>> Slave Interface: eth2
>> MII Status: up
>> Link Failure Count: 0
>> Permanent HW addr: 00:17:a4:77:00:1c
>>
>> Slave Interface: eth3
>> MII Status: up
>> Link Failure Count: 0
>> Permanent HW addr: 00:17:a4:77:00:1e
>>
>> ## 11:53:05 shutdown eth2 uplink on the virtual connect bay5
>> ##cat /proc/net/bonding/bond1
>> Ethernet Channel Bonding Driver: v3.3.0 (June 10, 2008)
>>
>> Bonding Mode: fault-tolerance (active-backup)
>> Primary Slave: None
>> Currently Active Slave: eth3
>> MII Status: up
>> MII Polling Interval (ms): 100
>> Up Delay (ms): 0
>> Down Delay (ms): 0
>>
>> Slave Interface: eth2
>> MII Status: down
>> Link Failure Count: 1
>> Permanent HW addr: 00:17:a4:77:00:1c
>>
>> Slave Interface: eth3
>> MII Status: up
>> Link Failure Count: 0
>> Permanent HW addr: 00:17:a4:77:00:1e
>>
>> ## 11:56:01 turn on eth2 uplink on the virtual connect bay5
>> ##cat /proc/net/bonding/bond1
>> Ethernet Channel Bonding Driver: v3.3.0 (June 10, 2008)
>>
>> Bonding Mode: fault-tolerance (active-backup)
>> Primary Slave: None
>> Currently Active Slave: eth3
>> MII Status: up
>> MII Polling Interval (ms): 100
>> Up Delay (ms): 0
>> Down Delay (ms): 0
>>
>> Slave Interface: eth2
>> MII Status: down
>> Link Failure Count: 1
>> Permanent HW addr: 00:17:a4:77:00:1c
>>
>> Slave Interface: eth3
>> MII Status: up
>> Link Failure Count: 0
>> Permanent HW addr: 00:17:a4:77:00:1e
>>
>> ## 11:57:22 turn off eth3 uplink on the virtual connect bay5
>> ##cat /proc/net/bonding/bond1
>> Ethernet Channel Bonding Driver: v3.3.0 (June 10, 2008)
>>
>> Bonding Mode: fault-tolerance (active-backup)
>> Primary Slave: None
>> Currently Active Slave: eth2
>> MII Status: up
>> MII Polling Interval (ms): 100
>> Up Delay (ms): 0
>> Down Delay (ms): 0
>>
>> Slave Interface: eth2
>> MII Status: up
>> Link Failure Count: 1
>> Permanent HW addr: 00:17:a4:77:00:1c
>>
>> Slave Interface: eth3
>> MII Status: down
>> Link Failure Count: 1
>> Permanent HW addr: 00:17:a4:77:00:1e
>>
>
>
[-- Attachment #2: SerdesSM.patch --]
[-- Type: application/octet-stream, Size: 6332 bytes --]
commit 64621828ee2071dd70740541950c04cd488e2ff7
Author: drgraha1 <drgraha1@drgraha1-t60p.(none)>
Date: Fri Jan 30 10:41:40 2009 -0800
Changes the serdes link detection for 82571 & 82572 adapters to a
state machine. Testing has shown that this both stabilises link
reporting (no false positives even on disconnected serdes links)
and properly restarts autonegotiation when a link partner is returned
to service.
diff --git a/drivers/net/e1000e/82571.c b/drivers/net/e1000e/82571.c
index 0890162..2f3cb5f 100644
--- a/drivers/net/e1000e/82571.c
+++ b/drivers/net/e1000e/82571.c
@@ -61,6 +61,7 @@
static s32 e1000_get_phy_id_82571(struct e1000_hw *hw);
static s32 e1000_setup_copper_link_82571(struct e1000_hw *hw);
static s32 e1000_setup_fiber_serdes_link_82571(struct e1000_hw *hw);
+static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw);
static s32 e1000_write_nvm_eewr_82571(struct e1000_hw *hw, u16 offset,
u16 words, u16 *data);
static s32 e1000_fix_nvm_checksum_82571(struct e1000_hw *hw);
@@ -250,7 +251,7 @@ static s32 e1000_init_mac_params_82571(struct e1000_adapter *adapter)
case e1000_media_type_internal_serdes:
func->setup_physical_interface =
e1000_setup_fiber_serdes_link_82571;
- func->check_for_link = e1000e_check_for_serdes_link;
+ func->check_for_link = e1000_check_for_serdes_link_82571;
func->get_link_up_info =
e1000e_get_speed_and_duplex_fiber_serdes;
break;
@@ -830,6 +831,10 @@ static s32 e1000_reset_hw_82571(struct e1000_hw *hw)
hw->dev_spec.e82571.alt_mac_addr_is_present)
e1000e_set_laa_state_82571(hw, true);
+ /* Reinitialize the 82571 serdes link state machine */
+ if (hw->phy.media_type == e1000_media_type_internal_serdes)
+ hw->mac.serdes_link_state = e1000_serdes_link_down;
+
return 0;
}
@@ -1203,6 +1208,131 @@ static s32 e1000_setup_fiber_serdes_link_82571(struct e1000_hw *hw)
}
/**
+ * e1000_check_for_serdes_link_82571 - Check for link (Serdes)
+ * @hw: pointer to the HW structure
+ *
+ * Checks for link up on the hardware. If link is not up and we have
+ * a signal, then we need to force link up.
+ **/
+s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw)
+{
+ struct e1000_mac_info *mac = &hw->mac;
+ u32 rxcw;
+ u32 ctrl;
+ u32 status;
+ s32 ret_val = 0;
+
+ ctrl = er32(CTRL);
+ status = er32(STATUS);
+ rxcw = er32(RXCW);
+
+ if ((rxcw & E1000_RXCW_SYNCH) && !(rxcw & E1000_RXCW_IV)) {
+
+ /* Receiver is synchronized with no invalid bits. */
+ switch (mac->serdes_link_state) {
+ case e1000_serdes_link_autoneg_complete:
+ if (!(status & E1000_STATUS_LU)) {
+ /*
+ * We have lost link, retry autoneg before
+ * reporting link failure
+ */
+ mac->serdes_link_state =
+ e1000_serdes_link_autoneg_progress;
+ hw_dbg(hw, "AN_UP -> AN_PROG\n");
+ }
+ break;
+
+ case e1000_serdes_link_forced_up:
+ /*
+ * If we are receiving /C/ ordered sets, re-enable
+ * auto-negotiation in the TXCW register and disable
+ * forced link in the Device Control register in an
+ * attempt to auto-negotiate with our link partner.
+ */
+ if (rxcw & E1000_RXCW_C) {
+ /* Enable autoneg, and unforce link up */
+ ew32(TXCW, mac->txcw);
+ ew32(CTRL,
+ (ctrl & ~E1000_CTRL_SLU));
+ mac->serdes_link_state =
+ e1000_serdes_link_autoneg_progress;
+ hw_dbg(hw, "FORCED_UP -> AN_PROG\n");
+ }
+ break;
+
+ case e1000_serdes_link_autoneg_progress:
+ /*
+ * If the LU bit is set in the STATUS register,
+ * autoneg has completed successfully. If not,
+ * try forcing the link because the far end may be
+ * available but not capable of autonegotiation.
+ */
+ if (status & E1000_STATUS_LU) {
+ mac->serdes_link_state =
+ e1000_serdes_link_autoneg_complete;
+ hw_dbg(hw, "AN_PROG -> AN_UP\n");
+ } else {
+ /*
+ * Disable autoneg, force link up and
+ * full duplex, and change state to forced
+ */
+ ew32(TXCW,
+ (mac->txcw & ~E1000_TXCW_ANE));
+ ctrl |= (E1000_CTRL_SLU | E1000_CTRL_FD);
+ ew32(CTRL, ctrl);
+
+ /* Configure Flow Control after link up. */
+ ret_val =
+ e1000e_config_fc_after_link_up(hw);
+ if (ret_val) {
+ hw_dbg(hw, "Error config flow control\n");
+ break;
+ }
+ mac->serdes_link_state =
+ e1000_serdes_link_forced_up;
+ hw_dbg(hw, "AN_PROG -> FORCED_UP\n");
+ }
+ mac->serdes_has_link = true;
+ break;
+
+ case e1000_serdes_link_down:
+ default:
+ /* The link was down but the receiver has now gained
+ * valid sync, so let's see if we can bring the link
+ * up. */
+ ew32(TXCW, mac->txcw);
+ ew32(CTRL,
+ (ctrl & ~E1000_CTRL_SLU));
+ mac->serdes_link_state =
+ e1000_serdes_link_autoneg_progress;
+ hw_dbg(hw, "DOWN -> AN_PROG\n");
+ break;
+ }
+ } else {
+ if (!(rxcw & E1000_RXCW_SYNCH)) {
+ mac->serdes_has_link = false;
+ mac->serdes_link_state = e1000_serdes_link_down;
+ hw_dbg(hw, "ANYSTATE -> DOWN\n");
+ } else {
+ /*
+ * We have sync, and can tolerate one
+ * invalid (IV) codeword before declaring
+ * link down, so reread to look again
+ */
+ udelay(10);
+ rxcw = er32(RXCW);
+ if (rxcw & E1000_RXCW_IV) {
+ mac->serdes_link_state = e1000_serdes_link_down;
+ mac->serdes_has_link = false;
+ hw_dbg(hw, "ANYSTATE -> DOWN\n");
+ }
+ }
+ }
+
+ return ret_val;
+}
+
+/**
* e1000_valid_led_default_82571 - Verify a valid default LED config
* @hw: pointer to the HW structure
* @data: pointer to the NVM (EEPROM)
diff --git a/drivers/net/e1000e/hw.h b/drivers/net/e1000e/hw.h
index 2d4ce04..de2be1f 100644
--- a/drivers/net/e1000e/hw.h
+++ b/drivers/net/e1000e/hw.h
@@ -459,6 +459,13 @@ enum e1000_smart_speed {
e1000_smart_speed_off
};
+enum e1000_serdes_link_state {
+ e1000_serdes_link_down = 0,
+ e1000_serdes_link_autoneg_progress,
+ e1000_serdes_link_autoneg_complete,
+ e1000_serdes_link_forced_up
+};
+
/* Receive Descriptor */
struct e1000_rx_desc {
__le64 buffer_addr; /* Address of the descriptor's data buffer */
@@ -786,7 +793,8 @@ struct e1000_mac_info {
bool get_link_status;
bool in_ifs_mode;
bool serdes_has_link;
- bool tx_pkt_filtering;
+ enum e1000_serdes_link_state serdes_link_state;
+ bool tx_pkt_filtering;
};
struct e1000_phy_info {
[-- Attachment #3: disable_dmaclkgating.patch --]
[-- Type: application/octet-stream, Size: 1776 bytes --]
commit f189b02ce5bf7233e9aa217b561d0b4ec0a818b3
Author: drgraha1 <drgraha1@drgraha1-t60p.(none)>
Date: Fri Jan 30 10:55:23 2009 -0800
Some 82571 adapters were shipped with the EEPROM setting of
DMA dynamic clock gating enabled. It should be disabled, and
this patch force-disables the setting in the CTRL_EXT
register (initially mapped from the EEPROM).
diff --git a/drivers/net/e1000e/82571.c b/drivers/net/e1000e/82571.c
index 2f3cb5f..565fd4e 100644
--- a/drivers/net/e1000e/82571.c
+++ b/drivers/net/e1000e/82571.c
@@ -985,6 +985,18 @@ static void e1000_initialize_hw_bits_82571(struct e1000_hw *hw)
reg |= E1000_PBA_ECC_CORR_EN;
ew32(PBA_ECC, reg);
}
+ /*
+ * Workaround for hardware errata.
+ * Ensure that DMA Dynamic Clock gating is disabled on 82571 and 82572
+ */
+
+ if ((hw->mac.type == e1000_82571) ||
+ (hw->mac.type == e1000_82572)) {
+ reg = er32(CTRL_EXT);
+ reg &= ~E1000_CTRL_EXT_DMA_DYN_CLK_EN;
+ ew32(CTRL_EXT, reg);
+ }
+
/* PCI-Ex Control Registers */
if (hw->mac.type == e1000_82574) {
diff --git a/drivers/net/e1000e/defines.h b/drivers/net/e1000e/defines.h
index e6caf29..243aa49 100644
--- a/drivers/net/e1000e/defines.h
+++ b/drivers/net/e1000e/defines.h
@@ -69,6 +69,7 @@
#define E1000_CTRL_EXT_SDP7_DATA 0x00000080 /* Value of SW Definable Pin 7 */
#define E1000_CTRL_EXT_EE_RST 0x00002000 /* Reinitialize from EEPROM */
#define E1000_CTRL_EXT_RO_DIS 0x00020000 /* Relaxed Ordering disable */
+#define E1000_CTRL_EXT_DMA_DYN_CLK_EN 0x00080000 /* DMA Dynamic Clock Gating */
#define E1000_CTRL_EXT_LINK_MODE_MASK 0x00C00000
#define E1000_CTRL_EXT_LINK_MODE_PCIE_SERDES 0x00C00000
#define E1000_CTRL_EXT_EIAME 0x01000000
[-- Attachment #4: RemoveRXSEQ.patch --]
[-- Type: application/octet-stream, Size: 1223 bytes --]
commit f8658955ae28415f1d83ed4c72ed4632a7368e26
Author: drgraha1 <drgraha1@drgraha1-t60p.(none)>
Date: Fri Jan 30 10:48:17 2009 -0800
Remove the LSC-like treatment that is given to RXSEQ interrupts.
It is thought that this was initially added to assist in serdes
link detection, which would not always indicate a link-down interrupt
when a link partner was removed from service. The new link detect
mechanism for serdes no longer requires this.
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 91817d0..55b2f6b 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -1152,7 +1152,7 @@ static irqreturn_t e1000_intr_msi(int irq, void *data)
* read ICR disables interrupts using IAM
*/
- if (icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) {
+ if (icr & E1000_ICR_LSC) {
hw->mac.get_link_status = 1;
/*
* ICH8 workaround-- Call gig speed drop workaround on cable
@@ -1218,7 +1218,7 @@ static irqreturn_t e1000_intr(int irq, void *data)
* IMC write
*/
- if (icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) {
+ if (icr & E1000_ICR_LSC) {
hw->mac.get_link_status = 1;
/*
* ICH8 workaround-- Call gig speed drop workaround on cable