From: Olivier MATZ
Subject: Re: i40e_aq_get_phy_capabilities() fails when using SFP+ with no link
Date: Tue, 10 Jan 2017 16:28:49 +0100
Message-ID: <20170110162849.2256dc6e@glumotte.dev.6wind.com>
In-Reply-To: <2BF7FCC7-B2DF-43EE-B5F8-2F3271FB3DA1@gmail.com>
References: <2BF7FCC7-B2DF-43EE-B5F8-2F3271FB3DA1@gmail.com>
To: Christos Ricudis
Cc: dev@dpdk.org, helin.zhang@intel.com, jingjing.wu@intel.com

Hi Christos,

+CC i40e maintainers

On Tue, 10 Jan 2017 20:32:26 +0800, Christos Ricudis wrote:
> Hello,
>
> Using a X710 based 4-port 4x10Gbit NIC, I have came across the
> following issue on the i40e PMD:
>
> When an optical SFP+ (Finisar FTLX8571D3BCL) is used with no active
> link partner on the other end of the link (or fiber completely
> disconnected from the SFP+), i40e_aq_get_phy_capabilities() (called
> by i40e_dev_sync_phy_type() on port initialization), fails with a
> 0x05 return value (EIO) on the AQ response structure.
> The struct i40e_aq_get_phy_abilities_resp buffer passed to the Get Phy
> Abilities command is unmodified upon return.
>
> This prevents DPDK 16.11 from initializing the port, and ultimately
> fails with the following error:
>
> PMD: eth_i40e_dev_init(): Failed to sync phy type: -95
>
> The change introducing this issue was
> http://dpdk.org/ml/archives/dev/2016-September/047663.html
>
> Reading the X710 datasheet, I notice that no specific mention is
> given on the meaning of EIO as a response to Get PHY Abilities
> command (opcode 0x0600), whereas in most other commands, an explicit
> mention of the meaning of the possible error status responses is
> given.
>
> This behaviour is the same across the following NVM releases:
>
> FW 4.33 API 1.2 NVM 04.04.02 eetrack 800018a6
> FW 4.40 API 1.4 NVM 04.05.03 eetrack 80001cd8
> FW 5.0 API 1.5 NVM 05.00.04 eetrack 800024da
>
> I will try to get around the issue by falling back to PHY
> capabilities detection using the device ID in the case
> i40e_aq_get_phy_capabilities() fails, but conceptually the
> capabilities of the PHY should not be dependent on whether PHY
> detects an active link or not.
>
> I’d be happy to do more testing on this issue per your
> recommendations.
>
> Moreover, while trying to debug this issue, I managed to get both 3
> NIC adapters on my test system on a state where the PHY has
> apparently died - no link indication at all on any ports. A reboot
> solved this, and I am now trying to replicate this behaviour under
> more controlled conditions.
>

I'm currently running into a similar issue (I think). I can reproduce
it with testpmd with the following case:

  set link_check off
  port stop 0
  # don't wait between these 2 commands
  port start 0

I added some logs that are displayed after the port start:

  PMD: i40e_set_tx_function(): Vector tx finally be used.
  PMD: i40e_set_rx_function(): Vector rx enabled, please make sure RX burst size no less than 4 (port=0).
  PMD: i40e_dev_rx_queue_start(): >>
  PMD: i40e_dev_tx_queue_start(): >>
  PMD: i40e_dev_start(): applying link settings...
  PMD: i40e_apply_link_speed(): abilities = 38, speed = 0
  PMD: i40e_phy_conf_link(): i40e_aq_get_phy_capabilities failed -7
  PMD: i40e_dev_start(): Fail to apply link setting
  PMD: i40e_dev_clear_queues(): >>

The -7 corresponds to I40E_ERR_UNKNOWN_PHY. This happens in
i40e_aq_get_phy_capabilities() in the following code, which makes me
think it's the same problem as yours:

	if (hw->aq.asq_last_status == I40E_AQ_RC_EIO)
		status = I40E_ERR_UNKNOWN_PHY;

A workaround in my use case is to wait a bit between the stop and the
start.

Any help is welcome.

Regards,
Olivier
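
For illustration, the wait-and-retry idea behind the workaround above
could be sketched roughly as follows. This is not i40e driver code: the
constants only mirror the two values reported in this thread (AQ status
0x05 = EIO, driver status -7 = I40E_ERR_UNKNOWN_PHY), and every other
name (ERR_OTHER, phy_query_fn, delay_fn, get_phy_caps_with_retry,
struct fake_phy and its helpers) is hypothetical and exists only so the
sketch compiles on its own.

```c
#include <stddef.h>

/* Stand-ins for the two codes quoted in this thread. */
enum { AQ_RC_OK = 0, AQ_RC_EIO = 5 };        /* admin queue status */
enum { ERR_OK = 0, ERR_UNKNOWN_PHY = -7 };   /* driver status */
enum { ERR_OTHER = -1 };                     /* hypothetical catch-all */

/* Hypothetical callbacks: one PHY capability query, one delay. */
typedef int (*phy_query_fn)(void *ctx);
typedef void (*delay_fn)(unsigned int ms);

/* Same mapping as the snippet quoted above: EIO becomes "unknown PHY". */
int map_aq_status(int aq_rc)
{
	if (aq_rc == AQ_RC_OK)
		return ERR_OK;
	if (aq_rc == AQ_RC_EIO)
		return ERR_UNKNOWN_PHY;
	return ERR_OTHER;
}

/*
 * Repeat the query while it reports "unknown PHY", waiting delay_ms
 * between attempts. Any other status (success or a different error)
 * is returned immediately.
 */
int get_phy_caps_with_retry(phy_query_fn query, void *ctx, delay_fn wait,
			    unsigned int max_tries, unsigned int delay_ms)
{
	int status = ERR_UNKNOWN_PHY;
	unsigned int i;

	for (i = 0; i < max_tries; i++) {
		status = map_aq_status(query(ctx));
		if (status != ERR_UNKNOWN_PHY)
			return status;
		if (i + 1 < max_tries && wait != NULL)
			wait(delay_ms);
	}
	return status;
}

/* Simulated PHY that answers EIO for its first busy_polls queries. */
struct fake_phy {
	unsigned int busy_polls;
	unsigned int polls;
};

int fake_phy_query(void *ctx)
{
	struct fake_phy *phy = ctx;

	phy->polls++;
	return phy->polls <= phy->busy_polls ? AQ_RC_EIO : AQ_RC_OK;
}

/* No-op delay for the simulation; a real caller would sleep here. */
void fake_delay(unsigned int ms)
{
	(void)ms;
}
```

With a simulated PHY that is busy for two polls,
get_phy_caps_with_retry(fake_phy_query, &phy, fake_delay, 5, 10)
succeeds on the third query. Whether retrying helps on real hardware,
and how long to wait, is not established in this thread.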