* [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
@ 2024-05-21 0:21 Jacob Keller
2024-05-21 16:41 ` Simon Horman
2024-05-23 9:00 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 14+ messages in thread
From: Jacob Keller @ 2024-05-21 0:21 UTC (permalink / raw)
To: netdev; +Cc: Jacob Keller, Jeff Daly, kernel.org-fo5k2w
This reverts commit 565736048bd5f9888990569993c6b6bfdf6dcb6d.
According to the commit, it implements a manual AN-37 for some
"troublesome" Juniper MX5 switches. This appears to be a workaround for a
particular switch.
It has been reported that this causes a severe breakage for other switches,
including a Cisco 3560CX-12PD-S.
The code appears to be a workaround for a specific switch which fails to
link in SFI mode. It expects to see AN-37 auto negotiation in order to
link. The Cisco switch is not expecting AN-37 auto negotiation. When the
device starts the manual AN-37, the Cisco switch decides that the port is
confused and stops attempting to link with it. This persists until a power
cycle. A simple driver unload and reload does not resolve the issue, even
if loading with a version of the driver which lacks this workaround.
The authors of the workaround commit have not responded with
clarifications, and the result of the workaround is complete failure to
connect with other switches.
This appears to be a case where the driver can either "correctly" link with
the Juniper MX5 switch, at the cost of bricking the link with the Cisco
switch, or it can behave properly for the Cisco switch, but fail to link
with the Junipir MX5 switch. I do not know enough about the standards
involved to clearly determine whether either switch is at fault or behaving
incorrectly. Nor do I know whether there exists some alternative fix which
corrects behavior with both switches.
Revert the workaround for the Juniper switch.
Fixes: 565736048bd5 ("ixgbe: Manual AN-37 for troublesome link partners for X550 SFI")
Link: https://lore.kernel.org/netdev/cbe874db-9ac9-42b8-afa0-88ea910e1e99@intel.com/T/
Link: https://forum.proxmox.com/threads/intel-x553-sfp-ixgbe-no-go-on-pve8.135129/#post-612291
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jeff Daly <jeffd@silicom-usa.com>
Cc: kernel.org-fo5k2w@ycharbi.fr
---
drivers/net/ethernet/intel/ixgbe/ixgbe_type.h | 3 --
drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 56 ++-------------------------
2 files changed, 3 insertions(+), 56 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index 897fe357b65b..346e3d9114a8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -3675,9 +3675,7 @@ struct ixgbe_info {
#define IXGBE_KRM_LINK_S1(P) ((P) ? 0x8200 : 0x4200)
#define IXGBE_KRM_LINK_CTRL_1(P) ((P) ? 0x820C : 0x420C)
#define IXGBE_KRM_AN_CNTL_1(P) ((P) ? 0x822C : 0x422C)
-#define IXGBE_KRM_AN_CNTL_4(P) ((P) ? 0x8238 : 0x4238)
#define IXGBE_KRM_AN_CNTL_8(P) ((P) ? 0x8248 : 0x4248)
-#define IXGBE_KRM_PCS_KX_AN(P) ((P) ? 0x9918 : 0x5918)
#define IXGBE_KRM_SGMII_CTRL(P) ((P) ? 0x82A0 : 0x42A0)
#define IXGBE_KRM_LP_BASE_PAGE_HIGH(P) ((P) ? 0x836C : 0x436C)
#define IXGBE_KRM_DSP_TXFFE_STATE_4(P) ((P) ? 0x8634 : 0x4634)
@@ -3687,7 +3685,6 @@ struct ixgbe_info {
#define IXGBE_KRM_PMD_FLX_MASK_ST20(P) ((P) ? 0x9054 : 0x5054)
#define IXGBE_KRM_TX_COEFF_CTRL_1(P) ((P) ? 0x9520 : 0x5520)
#define IXGBE_KRM_RX_ANA_CTL(P) ((P) ? 0x9A00 : 0x5A00)
-#define IXGBE_KRM_FLX_TMRS_CTRL_ST31(P) ((P) ? 0x9180 : 0x5180)
#define IXGBE_KRM_PMD_FLX_MASK_ST20_SFI_10G_DA ~(0x3 << 20)
#define IXGBE_KRM_PMD_FLX_MASK_ST20_SFI_10G_SR BIT(20)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index 2decb0710b6e..a5f644934445 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -1722,59 +1722,9 @@ static int ixgbe_setup_sfi_x550a(struct ixgbe_hw *hw, ixgbe_link_speed *speed)
return -EINVAL;
}
- (void)mac->ops.write_iosf_sb_reg(hw,
- IXGBE_KRM_PMD_FLX_MASK_ST20(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
-
- /* change mode enforcement rules to hybrid */
- (void)mac->ops.read_iosf_sb_reg(hw,
- IXGBE_KRM_FLX_TMRS_CTRL_ST31(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, ®_val);
- reg_val |= 0x0400;
-
- (void)mac->ops.write_iosf_sb_reg(hw,
- IXGBE_KRM_FLX_TMRS_CTRL_ST31(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
-
- /* manually control the config */
- (void)mac->ops.read_iosf_sb_reg(hw,
- IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, ®_val);
- reg_val |= 0x20002240;
-
- (void)mac->ops.write_iosf_sb_reg(hw,
- IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
-
- /* move the AN base page values */
- (void)mac->ops.read_iosf_sb_reg(hw,
- IXGBE_KRM_PCS_KX_AN(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, ®_val);
- reg_val |= 0x1;
-
- (void)mac->ops.write_iosf_sb_reg(hw,
- IXGBE_KRM_PCS_KX_AN(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
-
- /* set the AN37 over CB mode */
- (void)mac->ops.read_iosf_sb_reg(hw,
- IXGBE_KRM_AN_CNTL_4(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, ®_val);
- reg_val |= 0x20000000;
-
- (void)mac->ops.write_iosf_sb_reg(hw,
- IXGBE_KRM_AN_CNTL_4(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
-
- /* restart AN manually */
- (void)mac->ops.read_iosf_sb_reg(hw,
- IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, ®_val);
- reg_val |= IXGBE_KRM_LINK_CTRL_1_TETH_AN_RESTART;
-
- (void)mac->ops.write_iosf_sb_reg(hw,
- IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
+ status = mac->ops.write_iosf_sb_reg(hw,
+ IXGBE_KRM_PMD_FLX_MASK_ST20(hw->bus.lan_id),
+ IXGBE_SB_IOSF_TARGET_KR_PHY, reg_val);
/* Toggle port SW reset by AN reset. */
status = ixgbe_restart_an_internal_phy_x550em(hw);
---
base-commit: e4a87abf588536d1cdfb128595e6e680af5cf3ed
change-id: 20240520-net-2024-05-20-revert-silicom-switch-workaround-48a6a7a9b0a0
Best regards,
--
Jacob Keller <jacob.e.keller@intel.com>
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-21 0:21 [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI" Jacob Keller
@ 2024-05-21 16:41 ` Simon Horman
2024-05-21 16:49 ` Jeff Daly
2024-05-23 9:00 ` patchwork-bot+netdevbpf
1 sibling, 1 reply; 14+ messages in thread
From: Simon Horman @ 2024-05-21 16:41 UTC (permalink / raw)
To: Jacob Keller; +Cc: netdev, Jeff Daly, kernel.org-fo5k2w
On Mon, May 20, 2024 at 05:21:27PM -0700, Jacob Keller wrote:
> This reverts commit 565736048bd5f9888990569993c6b6bfdf6dcb6d.
>
> According to the commit, it implements a manual AN-37 for some
> "troublesome" Juniper MX5 switches. This appears to be a workaround for a
> particular switch.
>
> It has been reported that this causes a severe breakage for other switches,
> including a Cisco 3560CX-12PD-S.
>
> The code appears to be a workaround for a specific switch which fails to
> link in SFI mode. It expects to see AN-37 auto negotiation in order to
> link. The Cisco switch is not expecting AN-37 auto negotiation. When the
> device starts the manual AN-37, the Cisco switch decides that the port is
> confused and stops attempting to link with it. This persists until a power
> cycle. A simple driver unload and reload does not resolve the issue, even
> if loading with a version of the driver which lacks this workaround.
>
> The authors of the workaround commit have not responded with
> clarifications, and the result of the workaround is complete failure to
> connect with other switches.
>
> This appears to be a case where the driver can either "correctly" link with
> the Juniper MX5 switch, at the cost of bricking the link with the Cisco
> switch, or it can behave properly for the Cisco switch, but fail to link
> with the Junipir MX5 switch. I do not know enough about the standards
> involved to clearly determine whether either switch is at fault or behaving
> incorrectly. Nor do I know whether there exists some alternative fix which
> corrects behavior with both switches.
>
> Revert the workaround for the Juniper switch.
>
> Fixes: 565736048bd5 ("ixgbe: Manual AN-37 for troublesome link partners for X550 SFI")
> Link: https://lore.kernel.org/netdev/cbe874db-9ac9-42b8-afa0-88ea910e1e99@intel.com/T/
> Link: https://forum.proxmox.com/threads/intel-x553-sfp-ixgbe-no-go-on-pve8.135129/#post-612291
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Jeff Daly <jeffd@silicom-usa.com>
> Cc: kernel.org-fo5k2w@ycharbi.fr
One of those awkward situations where the only known (in this case to Jacob
and me) resolution to a regression is itself a regression (for a different
setup).
I think that in these kind of situations it's best to go back to how things
were.
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 14+ messages in thread* RE: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-21 16:41 ` Simon Horman
@ 2024-05-21 16:49 ` Jeff Daly
2024-05-21 17:12 ` kernel.org-fo5k2w
0 siblings, 1 reply; 14+ messages in thread
From: Jeff Daly @ 2024-05-21 16:49 UTC (permalink / raw)
To: Simon Horman, Jacob Keller
Cc: netdev@vger.kernel.org, kernel.org-fo5k2w@ycharbi.fr
> -----Original Message-----
> From: Simon Horman <horms@kernel.org>
> Sent: Tuesday, May 21, 2024 12:42 PM
> To: Jacob Keller <jacob.e.keller@intel.com>
> Cc: netdev@vger.kernel.org; Jeff Daly <jeffd@silicom-usa.com>; kernel.org-
> fo5k2w@ycharbi.fr
> Subject: Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link
> partners for X550 SFI"
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments.
>
>
> On Mon, May 20, 2024 at 05:21:27PM -0700, Jacob Keller wrote:
> > This reverts commit 565736048bd5f9888990569993c6b6bfdf6dcb6d.
> >
> > According to the commit, it implements a manual AN-37 for some
> > "troublesome" Juniper MX5 switches. This appears to be a workaround
> > for a particular switch.
> >
> > It has been reported that this causes a severe breakage for other
> > switches, including a Cisco 3560CX-12PD-S.
> >
> > The code appears to be a workaround for a specific switch which fails
> > to link in SFI mode. It expects to see AN-37 auto negotiation in order
> > to link. The Cisco switch is not expecting AN-37 auto negotiation.
> > When the device starts the manual AN-37, the Cisco switch decides that
> > the port is confused and stops attempting to link with it. This
> > persists until a power cycle. A simple driver unload and reload does
> > not resolve the issue, even if loading with a version of the driver which lacks
> this workaround.
> >
> > The authors of the workaround commit have not responded with
> > clarifications, and the result of the workaround is complete failure
> > to connect with other switches.
> >
> > This appears to be a case where the driver can either "correctly" link
> > with the Juniper MX5 switch, at the cost of bricking the link with the
> > Cisco switch, or it can behave properly for the Cisco switch, but fail
> > to link with the Junipir MX5 switch. I do not know enough about the
> > standards involved to clearly determine whether either switch is at
> > fault or behaving incorrectly. Nor do I know whether there exists some
> > alternative fix which corrects behavior with both switches.
> >
> > Revert the workaround for the Juniper switch.
> >
> > Fixes: 565736048bd5 ("ixgbe: Manual AN-37 for troublesome link
> > partners for X550 SFI")
> > Link:
> > https://lore.kernel.org/netdev/cbe874db-9ac9-42b8-afa0-88ea910e1e99@in
> > tel.com/T/
> > Link:
> > https://forum.proxmox.com/threads/intel-x553-sfp-ixgbe-no-go-on-pve8.1
> > 35129/#post-612291
> > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> > Cc: Jeff Daly <jeffd@silicom-usa.com>
> > Cc: kernel.org-fo5k2w@ycharbi.fr
>
> One of those awkward situations where the only known (in this case to Jacob
> and me) resolution to a regression is itself a regression (for a different setup).
>
> I think that in these kind of situations it's best to go back to how things were.
>
> Reviewed-by: Simon Horman <horms@kernel.org>
In principle, I don't disagree..... However, our customer was very sensitive to having any patches/workarounds needed for their configuration be part of the upstream. Aside from maintaining our own patchset (or figuring out whether there's a patch that works for everyone) is there a better solution?
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-21 16:49 ` Jeff Daly
@ 2024-05-21 17:12 ` kernel.org-fo5k2w
2024-05-21 21:05 ` Jacob Keller
0 siblings, 1 reply; 14+ messages in thread
From: kernel.org-fo5k2w @ 2024-05-21 17:12 UTC (permalink / raw)
To: Jeff Daly, Simon Horman, Jacob Keller; +Cc: netdev, kernel.org-fo5k2w
If any of you have the skills to develop a patch that tries to satisfy everyone, please know that I'm always available for testing on my hardware. If Jeff also has the possibilities, it's not impossible that we could come to a consensus. All we'd have to do would be to test the behavior of our equipment in the problematic situation.
Isn't there someone at Intel who can contribute their expertise on the underlying technical reasons for the problem (obviously level 1 OSI) in order to guide us towards a state-of-the-art solution?
Best regards.
21 mai 2024 à 18:49 "Jeff Daly" <jeffd@silicom-usa.com> a écrit:
>
> >
> > -----Original Message-----
> > From: Simon Horman <horms@kernel.org>
> > Sent: Tuesday, May 21, 2024 12:42 PM
> > To: Jacob Keller <jacob.e.keller@intel.com>
> > Cc: netdev@vger.kernel.org; Jeff Daly <jeffd@silicom-usa.com>; kernel.org-
> > fo5k2w@ycharbi.fr
> > Subject: Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link
> > partners for X550 SFI"
> >
> > Caution: This is an external email. Please take care when clicking links or
> > opening attachments.
> >
> >
> > On Mon, May 20, 2024 at 05:21:27PM -0700, Jacob Keller wrote:
> > This reverts commit 565736048bd5f9888990569993c6b6bfdf6dcb6d.
> >
> > According to the commit, it implements a manual AN-37 for some
> > "troublesome" Juniper MX5 switches. This appears to be a workaround
> > for a particular switch.
> >
> > It has been reported that this causes a severe breakage for other
> > switches, including a Cisco 3560CX-12PD-S.
> >
> > The code appears to be a workaround for a specific switch which fails
> > to link in SFI mode. It expects to see AN-37 auto negotiation in order
> > to link. The Cisco switch is not expecting AN-37 auto negotiation.
> > When the device starts the manual AN-37, the Cisco switch decides that
> > the port is confused and stops attempting to link with it. This
> > persists until a power cycle. A simple driver unload and reload does
> > not resolve the issue, even if loading with a version of the driver which lacks
> > this workaround.
> >
> > The authors of the workaround commit have not responded with
> > clarifications, and the result of the workaround is complete failure
> > to connect with other switches.
> >
> > This appears to be a case where the driver can either "correctly" link
> > with the Juniper MX5 switch, at the cost of bricking the link with the
> > Cisco switch, or it can behave properly for the Cisco switch, but fail
> > to link with the Junipir MX5 switch. I do not know enough about the
> > standards involved to clearly determine whether either switch is at
> > fault or behaving incorrectly. Nor do I know whether there exists some
> > alternative fix which corrects behavior with both switches.
> >
> > Revert the workaround for the Juniper switch.
> >
> > Fixes: 565736048bd5 ("ixgbe: Manual AN-37 for troublesome link
> > partners for X550 SFI")
> > Link:
> > https://lore.kernel.org/netdev/cbe874db-9ac9-42b8-afa0-88ea910e1e99@in
> > tel.com/T/
> > Link:
> > https://forum.proxmox.com/threads/intel-x553-sfp-ixgbe-no-go-on-pve8.1
> > 35129/#post-612291
> > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> > Cc: Jeff Daly <jeffd@silicom-usa.com>
> > Cc: kernel.org-fo5k2w@ycharbi.fr
> >
> > One of those awkward situations where the only known (in this case to Jacob
> > and me) resolution to a regression is itself a regression (for a different setup).
> >
> > I think that in these kind of situations it's best to go back to how things were.
> >
> > Reviewed-by: Simon Horman <horms@kernel.org>
> >
>
> In principle, I don't disagree..... However, our customer was very sensitive to having any patches/workarounds needed for their configuration be part of the upstream. Aside from maintaining our own patchset (or figuring out whether there's a patch that works for everyone) is there a better solution?
>
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-21 17:12 ` kernel.org-fo5k2w
@ 2024-05-21 21:05 ` Jacob Keller
2024-05-23 16:15 ` Jacob Keller
0 siblings, 1 reply; 14+ messages in thread
From: Jacob Keller @ 2024-05-21 21:05 UTC (permalink / raw)
To: kernel.org-fo5k2w, Jeff Daly, Simon Horman; +Cc: netdev
On 5/21/2024 10:12 AM, kernel.org-fo5k2w@ycharbi.fr wrote:
> If any of you have the skills to develop a patch that tries to satisfy everyone, please know that I'm always available for testing on my hardware. If Jeff also has the possibilities, it's not impossible that we could come to a consensus. All we'd have to do would be to test the behavior of our equipment in the problematic situation.
>
I would love a solution which fixes both cases. I don't currently have
any idea what it would be.
> Isn't there someone at Intel who can contribute their expertise on the underlying technical reasons for the problem (obviously level 1 OSI) in order to guide us towards a state-of-the-art solution?
>
> Best regards.
>
Unfortunately I do not know anyone still here who has the expertise to
solve this. The out-of-tree ixgbe driver does not have the fix from
Silicom, so from Intel's perspective the correct implementation matches
the need for the Cisco switch...
> On 5/21/2024 9:49 AM, Jeff Daly wrote:
>>
>>
>>> -----Original Message-----
>>> From: Simon Horman <horms@kernel.org>
>>> Sent: Tuesday, May 21, 2024 12:42 PM
>>> To: Jacob Keller <jacob.e.keller@intel.com>
>>> Cc: netdev@vger.kernel.org; Jeff Daly <jeffd@silicom-usa.com>; kernel.org-
>>> fo5k2w@ycharbi.fr
>>> Subject: Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link
>>> partners for X550 SFI"
>>>
>>> One of those awkward situations where the only known (in this case to Jacob
>>> and me) resolution to a regression is itself a regression (for a different setup).
>>>
>>> I think that in these kind of situations it's best to go back to how things were.
>>>
>>> Reviewed-by: Simon Horman <horms@kernel.org>
>>
>> In principle, I don't disagree..... However, our customer was very sensitive to having any patches/workarounds needed for their configuration be part of the upstream. Aside from maintaining our own patchset (or figuring out whether there's a patch that works for everyone) is there a better solution?
>>
>>
We're somewhat stuck between a rock and a hard place here. I don't have
full context for the problem, however I did manage to get a little more
info about this from internal bugs.
Here is the facts as i understand it:
1. The Juniper MX5 switch appears to require clause 37 auto negotiation
(AN-37) to link.
2. The Cisco 3560CX-12PD-S appears to reject AN-37 as invalid and stops
trying to link if it sees it for this case.
3. As far as I understand, AN-37 is intended for 1G links, and is not
generally supported or used in 10GB? It looks like the way this fix
applies affects all 10GB SFP links, which results in the issues with the
Cisco switch.
For context, this document was the best I found from a quick google
search: https://www.ieee802.org/3/by/public/Mar15/booth_3by_01_0315.pdf
It appears the Cisco device is linking at some form of 10GB according to
the bug report here:
> show interface status | include Te1/0/1
> Te1/0/1 --- Vers Qotom --- connected trunk full 10G
> SFP-10GBase-CX1
The link is an SFP-10GBase-CX1?
@Jeff, can you provide any further details about the Juniper MX5 switch
case that the original change fixes?
The function being modified here is the ixgbe_setup_sfi_x550a, which is
called for setting up SFI, and the description says "Used to connect the
internal PHY directly to an SFP cage without auto-negotiation"
It is only called by ixgbe_setup_mac_link_sfp_n which is supposed to
configure the PHY for native SFP support for IXGBE_DEV_ID_X550EM_A_SFP_N
(0x15C4). No other device type is changed with this.
Every comment here implies that this has no auto negotiation, which
makes it extremely weird to me that we try to enable AN-37 in this flow.
Without more context, my gut instinct is that the Cisco switch is likely
following the general expectations here compared to the Juniper switch.
I also don't see a good way currently to have the driver select between
the options, if both cases are standard SFP. It can't know what its
linked against. If we try the AN-37 flow with Cisco, it essentially
bricks the link until a reboot. Even reloading to the out-of-tree driver
which doesn't do this AN-37 flow fails to recover link. This makes any
sort of "fallback" mechanism unlikely to work.
Unless we can find some obvious way to distinguish the two cases, or
there is fundamentally a different fix for the Juniper case, I don't see
how we can support both flows.
I guess there is the option of some sort of toggle via ethtool/otherwise
to allow selection... But users might try to enable this when link is
faulty and end up hitting the case where once we try the AN-37, the
remote switch refuses to try again until a cycle.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-21 21:05 ` Jacob Keller
@ 2024-05-23 16:15 ` Jacob Keller
2024-05-23 16:18 ` Jeff Daly
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Jacob Keller @ 2024-05-23 16:15 UTC (permalink / raw)
To: kernel.org-fo5k2w, Jeff Daly, Simon Horman; +Cc: netdev
On 5/21/2024 2:05 PM, Jacob Keller wrote:
>
>
> On 5/21/2024 10:12 AM, kernel.org-fo5k2w@ycharbi.fr wrote:
>> If any of you have the skills to develop a patch that tries to satisfy everyone, please know that I'm always available for testing on my hardware. If Jeff also has the possibilities, it's not impossible that we could come to a consensus. All we'd have to do would be to test the behavior of our equipment in the problematic situation.
>>
>
> I would love a solution which fixes both cases. I don't currently have
> any idea what it would be.
>
It looks like netdev pulled the revert. Given the lack of a full
understanding of the original fix posted from Jeff, I think this is the
right decision.
>> Isn't there someone at Intel who can contribute their expertise on the underlying technical reasons for the problem (obviously level 1 OSI) in order to guide us towards a state-of-the-art solution?
>>
I did create an internal ticket here to track and try to get a
reproduction so that some of our link experts can diagnose the issue
being seen.
I hope to have news I can report on this soon.
> I guess there is the option of some sort of toggle via ethtool/otherwise
> to allow selection... But users might try to enable this when link is
> faulty and end up hitting the case where once we try the AN-37, the
> remote switch refuses to try again until a cycle.
>
Given that we have two cases where our current understanding is a need
for mutually exclusive behavior, we (Intel) would be open to some sort
of config option, flag, or otherwise toggle to enable the Silicom folks
without breaking everything else. I don't know what the acceptance for
such an idea is with the rest of the community.
I hope that internal reproduction task above may lead to a better
understanding and possibly a fix that can resolve both cases.
^ permalink raw reply [flat|nested] 14+ messages in thread* RE: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-23 16:15 ` Jacob Keller
@ 2024-05-23 16:18 ` Jeff Daly
2024-05-23 16:49 ` kernel.org-fo5k2w
2024-06-26 14:30 ` Jeff Daly
2 siblings, 0 replies; 14+ messages in thread
From: Jeff Daly @ 2024-05-23 16:18 UTC (permalink / raw)
To: Jacob Keller, kernel.org-fo5k2w@ycharbi.fr, Simon Horman
Cc: netdev@vger.kernel.org
Jacob, initially a config flag option was put forward but because of the maturity of the driver, that option was shot down by the maintainers.
> -----Original Message-----
> From: Jacob Keller <jacob.e.keller@intel.com>
> Sent: Thursday, May 23, 2024 12:15 PM
> To: kernel.org-fo5k2w@ycharbi.fr; Jeff Daly <jeffd@silicom-usa.com>; Simon
> Horman <horms@kernel.org>
> Cc: netdev@vger.kernel.org
> Subject: Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link
> partners for X550 SFI"
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments.
>
>
> On 5/21/2024 2:05 PM, Jacob Keller wrote:
> >
> >
> > On 5/21/2024 10:12 AM, kernel.org-fo5k2w@ycharbi.fr wrote:
> >> If any of you have the skills to develop a patch that tries to satisfy everyone,
> please know that I'm always available for testing on my hardware. If Jeff also
> has the possibilities, it's not impossible that we could come to a consensus. All
> we'd have to do would be to test the behavior of our equipment in the
> problematic situation.
> >>
> >
> > I would love a solution which fixes both cases. I don't currently have
> > any idea what it would be.
> >
>
> It looks like netdev pulled the revert. Given the lack of a full understanding of the
> original fix posted from Jeff, I think this is the right decision.
>
> >> Isn't there someone at Intel who can contribute their expertise on the
> underlying technical reasons for the problem (obviously level 1 OSI) in order to
> guide us towards a state-of-the-art solution?
> >>
>
> I did create an internal ticket here to track and try to get a reproduction so that
> some of our link experts can diagnose the issue being seen.
>
> I hope to have news I can report on this soon.
>
> > I guess there is the option of some sort of toggle via
> > ethtool/otherwise to allow selection... But users might try to enable
> > this when link is faulty and end up hitting the case where once we try
> > the AN-37, the remote switch refuses to try again until a cycle.
> >
>
> Given that we have two cases where our current understanding is a need for
> mutually exclusive behavior, we (Intel) would be open to some sort of config
> option, flag, or otherwise toggle to enable the Silicom folks without breaking
> everything else. I don't know what the acceptance for such an idea is with the
> rest of the community.
>
> I hope that internal reproduction task above may lead to a better understanding
> and possibly a fix that can resolve both cases.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-23 16:15 ` Jacob Keller
2024-05-23 16:18 ` Jeff Daly
@ 2024-05-23 16:49 ` kernel.org-fo5k2w
2024-05-23 18:47 ` Jacob Keller
2024-05-23 18:49 ` Jacob Keller
2024-06-26 14:30 ` Jeff Daly
2 siblings, 2 replies; 14+ messages in thread
From: kernel.org-fo5k2w @ 2024-05-23 16:49 UTC (permalink / raw)
To: Jacob Keller, kernel.org-fo5k2w, Jeff Daly, Simon Horman; +Cc: netdev
> It looks like netdev pulled the revert. Given the lack of a full
> understanding of the original fix posted from Jeff, I think this is the
> right decision.
Thank you very much for your investment.
I hope a solution can be found for Jeff.
> I did create an internal ticket here to track and try to get a
> reproduction so that some of our link experts can diagnose the issue
> being seen.
>
> I hope to have news I can report on this soon.
Super. Cela va peut-être faire remonter d'autres problèmes sous-jacent dans l'implémentation de la norme.
>
> >
> > I guess there is the option of some sort of toggle via ethtool/otherwise
> > to allow selection... But users might try to enable this when link is
> > faulty and end up hitting the case where once we try the AN-37, the
> > remote switch refuses to try again until a cycle.
> >
>
> Given that we have two cases where our current understanding is a need
> for mutually exclusive behavior, we (Intel) would be open to some sort
> of config option, flag, or otherwise toggle to enable the Silicom folks
> without breaking everything else. I don't know what the acceptance for
> such an idea is with the rest of the community.
>
> I hope that internal reproduction task above may lead to a better
> understanding and possibly a fix that can resolve both cases.
>
> The link is an SFP-10GBase-CX1?
As I understand it, CX1 is the name given to Twinax copper cables such as the one I used in the experiments in this thread. It's therefore a priori the right value to display for this kind of connection (instead of “10000baseT/Full”).
Thanks again for your hard work in finding a solution. You can always contact me later so that I can carry out tests if you need. The machine is at least available for 1 year for testing purposes.
Best regards,
Yohan.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-23 16:49 ` kernel.org-fo5k2w
@ 2024-05-23 18:47 ` Jacob Keller
2024-05-23 18:49 ` Jacob Keller
1 sibling, 0 replies; 14+ messages in thread
From: Jacob Keller @ 2024-05-23 18:47 UTC (permalink / raw)
To: kernel.org-fo5k2w, Jeff Daly, Simon Horman; +Cc: netdev
On 5/23/2024 9:49 AM, kernel.org-fo5k2w@ycharbi.fr wrote:
>> The link is an SFP-10GBase-CX1?
>
> As I understand it, CX1 is the name given to Twinax copper cables such as the one I used in the experiments in this thread. It's therefore a priori the right value to display for this kind of connection (instead of “10000baseT/Full”).
>
I'm filing an internal tracking bug regarding this as well.
> Thanks again for your hard work in finding a solution. You can always contact me later so that I can carry out tests if you need. The machine is at least available for 1 year for testing purposes.
>
> Best regards,
> Yohan.
Appreciate it. Hopefully we can find something that works for both cases.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-23 16:49 ` kernel.org-fo5k2w
2024-05-23 18:47 ` Jacob Keller
@ 2024-05-23 18:49 ` Jacob Keller
2024-05-23 20:19 ` kernel.org-fo5k2w
1 sibling, 1 reply; 14+ messages in thread
From: Jacob Keller @ 2024-05-23 18:49 UTC (permalink / raw)
To: kernel.org-fo5k2w, Jeff Daly, Simon Horman; +Cc: netdev
On 5/23/2024 9:49 AM, kernel.org-fo5k2w@ycharbi.fr wrote:
>>
>
>> The link is an SFP-10GBase-CX1?
>
> As I understand it, CX1 is the name given to Twinax copper cables such as the one I used in the experiments in this thread. It's therefore a priori the right value to display for this kind of connection (instead of “10000baseT/Full”).
>
One more thing: Could you confirm if this behavior appears in the 5.19.9
driver from the Intel website or source forge? I'm curious if this is a
case of a fix that never got published to the netdev community.
Thanks,
Jake
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-23 18:49 ` Jacob Keller
@ 2024-05-23 20:19 ` kernel.org-fo5k2w
2024-05-23 20:35 ` Jacob Keller
0 siblings, 1 reply; 14+ messages in thread
From: kernel.org-fo5k2w @ 2024-05-23 20:19 UTC (permalink / raw)
To: Jacob Keller, kernel.org-fo5k2w, Jeff Daly, Simon Horman; +Cc: netdev
> One more thing: Could you confirm if this behavior appears in the 5.19.9
> driver from the Intel website or source forge? I'm curious if this is a
> case of a fix that never got published to the netdev community.
>
> Thanks,
> Jake
I can confirm that the result of the “Supported link modes:” section is identical with the Intel ixgbe-5.19.9 driver:
uname -r
5.10.0-29-amd64
apt install linux-headers-$(uname -r) gcc make
wget https://downloadmirror.intel.com/812532/ixgbe-5.19.9.tar.gz
tar xf ixgbe-5.19.9.tar.gz -C /usr/local/src/
make -j 8
modinfo ./ixgbe.ko
rmmod ixgbe
modprobe dca
insmod ./ixgbe.ko
# eno1 is up
ethtool eno1
Settings for eno1:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 10000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 10000Mb/s
Duplex: Full
Auto-negotiation: off
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-23 20:19 ` kernel.org-fo5k2w
@ 2024-05-23 20:35 ` Jacob Keller
0 siblings, 0 replies; 14+ messages in thread
From: Jacob Keller @ 2024-05-23 20:35 UTC (permalink / raw)
To: kernel.org-fo5k2w, Jeff Daly, Simon Horman; +Cc: netdev
On 5/23/2024 1:19 PM, kernel.org-fo5k2w@ycharbi.fr wrote:
>> One more thing: Could you confirm if this behavior appears in the 5.19.9
>> driver from the Intel website or source forge? I'm curious if this is a
>> case of a fix that never got published to the netdev community.
>>
>> Thanks,
>> Jake
>
> I can confirm that the result of the “Supported link modes:” section is identical with the Intel ixgbe-5.19.9 driver:
> uname -r
> 5.10.0-29-amd64
>
> apt install linux-headers-$(uname -r) gcc make
> wget https://downloadmirror.intel.com/812532/ixgbe-5.19.9.tar.gz
> tar xf ixgbe-5.19.9.tar.gz -C /usr/local/src/
> make -j 8
>
> modinfo ./ixgbe.ko
> rmmod ixgbe
> modprobe dca
> insmod ./ixgbe.ko
>
> # eno1 is up
> ethtool eno1
> Settings for eno1:
> Supported ports: [ FIBRE ]
> Supported link modes: 10000baseT/Full
> Supported pause frame use: Symmetric
> Supports auto-negotiation: No
> Supported FEC modes: Not reported
> Advertised link modes: 10000baseT/Full
> Advertised pause frame use: Symmetric
> Advertised auto-negotiation: No
> Advertised FEC modes: Not reported
> Speed: 10000Mb/s
> Duplex: Full
> Auto-negotiation: off
> Port: Direct Attach Copper
> PHYAD: 0
> Transceiver: internal
> Supports Wake-on: d
> Wake-on: d
> Current message level: 0x00000007 (7)
> drv probe link
> Link detected: yes
Thanks. At a glance from reviewing code, it looks like ixgbe simply
collapses all non-backplane links into 10000baseT/Full.
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-23 16:15 ` Jacob Keller
2024-05-23 16:18 ` Jeff Daly
2024-05-23 16:49 ` kernel.org-fo5k2w
@ 2024-06-26 14:30 ` Jeff Daly
2 siblings, 0 replies; 14+ messages in thread
From: Jeff Daly @ 2024-06-26 14:30 UTC (permalink / raw)
To: Jacob Keller, kernel.org-fo5k2w@ycharbi.fr, Simon Horman
Cc: netdev@vger.kernel.org
Looking for a solution that would satisfy everyone....
> -----Original Message-----
> From: Jacob Keller <jacob.e.keller@intel.com>
> Sent: Thursday, May 23, 2024 12:15 PM
> To: kernel.org-fo5k2w@ycharbi.fr; Jeff Daly <jeffd@silicom-usa.com>; Simon
> Horman <horms@kernel.org>
> Cc: netdev@vger.kernel.org
> Subject: Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link
> partners for X550 SFI"
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments.
>
>
> On 5/21/2024 2:05 PM, Jacob Keller wrote:
> >
> >
> > On 5/21/2024 10:12 AM, kernel.org-fo5k2w@ycharbi.fr wrote:
> >> If any of you have the skills to develop a patch that tries to satisfy everyone,
> please know that I'm always available for testing on my hardware. If Jeff also
> has the possibilities, it's not impossible that we could come to a consensus. All
> we'd have to do would be to test the behavior of our equipment in the
> problematic situation.
> >>
> >
> > I would love a solution which fixes both cases. I don't currently have
> > any idea what it would be.
> >
>
> It looks like netdev pulled the revert. Given the lack of a full understanding of the
> original fix posted from Jeff, I think this is the right decision.
>
> >> Isn't there someone at Intel who can contribute their expertise on the
> underlying technical reasons for the problem (obviously level 1 OSI) in order to
> guide us towards a state-of-the-art solution?
> >>
>
> I did create an internal ticket here to track and try to get a reproduction so that
> some of our link experts can diagnose the issue being seen.
>
> I hope to have news I can report on this soon.
>
> > I guess there is the option of some sort of toggle via
> > ethtool/otherwise to allow selection... But users might try to enable
> > this when link is faulty and end up hitting the case where once we try
> > the AN-37, the remote switch refuses to try again until a cycle.
> >
>
> Given that we have two cases where our current understanding is a need for
> mutually exclusive behavior, we (Intel) would be open to some sort of config
> option, flag, or otherwise toggle to enable the Silicom folks without breaking
> everything else. I don't know what the acceptance for such an idea is with the
> rest of the community.
>
Originally, this was the idea that was put forward, and if I recall it was quashed by the maintainers due to the maturity of the driver. I was told that introducing new config options were a no-go. If there's a possibility of reworking the patch to introduce a new config option during module loading that would be specific to our fix, I'm sure we can come up with something appropriate.... I just don't want to try to push something that will never get accepted, but I think in this case it's something that could be warranted.
I understand the desire to not have special workaround code for a singular case, but in this instance I think it's the only option.
> I hope that internal reproduction task above may lead to a better understanding
> and possibly a fix that can resolve both cases.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
2024-05-21 0:21 [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI" Jacob Keller
2024-05-21 16:41 ` Simon Horman
@ 2024-05-23 9:00 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-05-23 9:00 UTC (permalink / raw)
To: Jacob Keller; +Cc: netdev, jeffd, kernel.org-fo5k2w
Hello:
This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Mon, 20 May 2024 17:21:27 -0700 you wrote:
> This reverts commit 565736048bd5f9888990569993c6b6bfdf6dcb6d.
>
> According to the commit, it implements a manual AN-37 for some
> "troublesome" Juniper MX5 switches. This appears to be a workaround for a
> particular switch.
>
> It has been reported that this causes a severe breakage for other switches,
> including a Cisco 3560CX-12PD-S.
>
> [...]
Here is the summary with links:
- [net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"
https://git.kernel.org/netdev/net/c/b35b1c0b4e16
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-06-26 14:30 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-21 0:21 [PATCH net] Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI" Jacob Keller
2024-05-21 16:41 ` Simon Horman
2024-05-21 16:49 ` Jeff Daly
2024-05-21 17:12 ` kernel.org-fo5k2w
2024-05-21 21:05 ` Jacob Keller
2024-05-23 16:15 ` Jacob Keller
2024-05-23 16:18 ` Jeff Daly
2024-05-23 16:49 ` kernel.org-fo5k2w
2024-05-23 18:47 ` Jacob Keller
2024-05-23 18:49 ` Jacob Keller
2024-05-23 20:19 ` kernel.org-fo5k2w
2024-05-23 20:35 ` Jacob Keller
2024-06-26 14:30 ` Jeff Daly
2024-05-23 9:00 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).