* [PATCH net 1/3] net: ena: unmask MSI-X only after device initialization is completed
2017-12-28 21:30 [PATCH net 0/3] bug fixes for ENA Ethernet driver netanel
@ 2017-12-28 21:30 ` netanel
2017-12-28 21:30 ` [PATCH net 2/3] net: ena: fix error handling in ena_down() sequence netanel
2017-12-28 21:30 ` [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered netanel
2 siblings, 0 replies; 9+ messages in thread
From: netanel @ 2017-12-28 21:30 UTC (permalink / raw)
To: davem, netdev
Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
nafea, evgenys, gtzalik
From: Netanel Belgazal <netanel@amazon.com>
Under certain conditions MSI-X interrupt might arrive right after it
was unmasked in ena_up(). There is a chance it would be processed by
the driver before device ENA_FLAG_DEV_UP flag is set. In such a case
the interrupt is ignored.
ENA device operates in auto-masked mode, therefore ignoring
interrupt leaves it masked for good.
Moving unmask of interrupt to be the last step in ena_up().
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 97c5a89a9cf7..6fb28fd43eb3 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1565,7 +1565,7 @@ static int ena_rss_configure(struct ena_adapter *adapter)
static int ena_up_complete(struct ena_adapter *adapter)
{
- int rc, i;
+ int rc;
rc = ena_rss_configure(adapter);
if (rc)
@@ -1584,17 +1584,6 @@ static int ena_up_complete(struct ena_adapter *adapter)
ena_napi_enable_all(adapter);
- /* Enable completion queues interrupt */
- for (i = 0; i < adapter->num_queues; i++)
- ena_unmask_interrupt(&adapter->tx_ring[i],
- &adapter->rx_ring[i]);
-
- /* schedule napi in case we had pending packets
- * from the last time we disable napi
- */
- for (i = 0; i < adapter->num_queues; i++)
- napi_schedule(&adapter->ena_napi[i].napi);
-
return 0;
}
@@ -1731,7 +1720,7 @@ static int ena_create_all_io_rx_queues(struct ena_adapter *adapter)
static int ena_up(struct ena_adapter *adapter)
{
- int rc;
+ int rc, i;
netdev_dbg(adapter->netdev, "%s\n", __func__);
@@ -1774,6 +1763,17 @@ static int ena_up(struct ena_adapter *adapter)
set_bit(ENA_FLAG_DEV_UP, &adapter->flags);
+ /* Enable completion queues interrupt */
+ for (i = 0; i < adapter->num_queues; i++)
+ ena_unmask_interrupt(&adapter->tx_ring[i],
+ &adapter->rx_ring[i]);
+
+ /* schedule napi in case we had pending packets
+ * from the last time we disable napi
+ */
+ for (i = 0; i < adapter->num_queues; i++)
+ napi_schedule(&adapter->ena_napi[i].napi);
+
return rc;
err_up:
--
2.7.3.AMZN
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH net 2/3] net: ena: fix error handling in ena_down() sequence
2017-12-28 21:30 [PATCH net 0/3] bug fixes for ENA Ethernet driver netanel
2017-12-28 21:30 ` [PATCH net 1/3] net: ena: unmask MSI-X only after device initialization is completed netanel
@ 2017-12-28 21:30 ` netanel
2017-12-28 21:30 ` [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered netanel
2 siblings, 0 replies; 9+ messages in thread
From: netanel @ 2017-12-28 21:30 UTC (permalink / raw)
To: davem, netdev
Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
nafea, evgenys, gtzalik
From: Netanel Belgazal <netanel@amazon.com>
ENA admin command queue errors are not handled as part of ena_down().
As a result, in case of error admin queue transitions to non-running
state and aborts all subsequent commands including those coming from
ena_up(). Reset scheduled by the driver from the timer service
context would not proceed due to sharing rtnl with ena_up()/ena_down()
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 6fb28fd43eb3..fbe21a817bd8 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -75,6 +75,9 @@ static struct workqueue_struct *ena_wq;
MODULE_DEVICE_TABLE(pci, ena_pci_tbl);
static int ena_rss_init_default(struct ena_adapter *adapter);
+static void check_for_admin_com_state(struct ena_adapter *adapter);
+static void ena_destroy_device(struct ena_adapter *adapter);
+static int ena_restore_device(struct ena_adapter *adapter);
static void ena_tx_timeout(struct net_device *dev)
{
@@ -1884,6 +1887,17 @@ static int ena_close(struct net_device *netdev)
if (test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
ena_down(adapter);
+ /* Check for device status and issue reset if needed*/
+ check_for_admin_com_state(adapter);
+ if (unlikely(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
+ netif_err(adapter, ifdown, adapter->netdev,
+ "Destroy failure, restarting device\n");
+ ena_dump_stats_to_dmesg(adapter);
+ /* rtnl lock already obtained in dev_ioctl() layer */
+ ena_destroy_device(adapter);
+ ena_restore_device(adapter);
+ }
+
return 0;
}
@@ -2544,11 +2558,12 @@ static void ena_destroy_device(struct ena_adapter *adapter)
ena_com_set_admin_running_state(ena_dev, false);
- ena_close(netdev);
+ if (test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
+ ena_down(adapter);
/* Before releasing the ENA resources, a device reset is required.
* (to prevent the device from accessing them).
- * In case the reset flag is set and the device is up, ena_close
+ * In case the reset flag is set and the device is up, ena_down()
* already perform the reset, so it can be skipped.
*/
if (!(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags) && dev_up))
--
2.7.3.AMZN
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered
2017-12-28 21:30 [PATCH net 0/3] bug fixes for ENA Ethernet driver netanel
2017-12-28 21:30 ` [PATCH net 1/3] net: ena: unmask MSI-X only after device initialization is completed netanel
2017-12-28 21:30 ` [PATCH net 2/3] net: ena: fix error handling in ena_down() sequence netanel
@ 2017-12-28 21:30 ` netanel
2017-12-29 7:46 ` Jakub Kicinski
2018-01-02 19:08 ` David Miller
2 siblings, 2 replies; 9+ messages in thread
From: netanel @ 2017-12-28 21:30 UTC (permalink / raw)
To: davem, netdev
Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
nafea, evgenys, gtzalik
From: Netanel Belgazal <netanel@amazon.com>
netif_carrier_off() should be called only after register netdev.
Move the function's call after the registration.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index fbe21a817bd8..ee50c56765a4 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -3276,14 +3276,14 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
memcpy(adapter->netdev->perm_addr, adapter->mac_addr, netdev->addr_len);
- netif_carrier_off(netdev);
-
rc = register_netdev(netdev);
if (rc) {
dev_err(&pdev->dev, "Cannot register net device\n");
goto err_rss;
}
+ netif_carrier_off(netdev);
+
INIT_WORK(&adapter->reset_task, ena_fw_reset_device);
adapter->last_keep_alive_jiffies = jiffies;
--
2.7.3.AMZN
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered
2017-12-28 21:30 ` [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered netanel
@ 2017-12-29 7:46 ` Jakub Kicinski
2017-12-29 8:00 ` Belgazal, Netanel
2018-01-02 19:08 ` David Miller
1 sibling, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2017-12-29 7:46 UTC (permalink / raw)
To: netanel
Cc: davem, netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea,
evgenys, gtzalik
On Thu, 28 Dec 2017 21:30:20 +0000, netanel@amazon.com wrote:
> From: Netanel Belgazal <netanel@amazon.com>
>
> netif_carrier_off() should be called only after register netdev.
> Move the function's call after the registration.
By "should" you mean in your driver, right? I think calling
netif_carrier_off() on an unregistered netdev is a pretty standard
thing to do for drivers which manage carrier state.
> Signed-off-by: Netanel Belgazal <netanel@amazon.com>
> ---
> drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index fbe21a817bd8..ee50c56765a4 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -3276,14 +3276,14 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> memcpy(adapter->netdev->perm_addr, adapter->mac_addr, netdev->addr_len);
>
> - netif_carrier_off(netdev);
> -
> rc = register_netdev(netdev);
> if (rc) {
> dev_err(&pdev->dev, "Cannot register net device\n");
> goto err_rss;
> }
>
> + netif_carrier_off(netdev);
> +
> INIT_WORK(&adapter->reset_task, ena_fw_reset_device);
This looks suspicious. After you call register_netdev() someone can
open the device and link may come up before you clear it again with
carrier off. Leading to netdev without a carrier until it's reopened.
> adapter->last_keep_alive_jiffies = jiffies;
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered
2017-12-29 7:46 ` Jakub Kicinski
@ 2017-12-29 8:00 ` Belgazal, Netanel
2017-12-29 8:09 ` Jakub Kicinski
0 siblings, 1 reply; 9+ messages in thread
From: Belgazal, Netanel @ 2017-12-29 8:00 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem@davemloft.net, netdev@vger.kernel.org, Woodhouse, David,
Machulsky, Zorik, Matushevsky, Alexander, Bshara, Saeed,
Wilson, Matt, Liguori, Anthony, Bshara, Nafea, Schmeilin, Evgeny,
Tzalik, Guy
Yes, I mean in my driver.
netif_carrier_off() have no effect when netdev is uninitialized.
So I must call it after register_netdev().
On 12/29/17, 9:46 AM, "Jakub Kicinski" <kubakici@wp.pl> wrote:
By "should" you mean in your driver, right? I think calling
netif_carrier_off() on an unregistered netdev is a pretty standard
thing to do for drivers which manage carrier state.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered
2017-12-29 8:00 ` Belgazal, Netanel
@ 2017-12-29 8:09 ` Jakub Kicinski
0 siblings, 0 replies; 9+ messages in thread
From: Jakub Kicinski @ 2017-12-29 8:09 UTC (permalink / raw)
To: Belgazal, Netanel
Cc: davem@davemloft.net, netdev@vger.kernel.org, Woodhouse, David,
Machulsky, Zorik, Matushevsky, Alexander, Bshara, Saeed,
Wilson, Matt, Liguori, Anthony, Bshara, Nafea, Schmeilin, Evgeny,
Tzalik, Guy
On Fri, 29 Dec 2017 08:00:33 +0000, Belgazal, Netanel wrote:
> Yes, I mean in my driver.
> netif_carrier_off() have no effect when netdev is uninitialized.
Please look at the implementation again, test_*and_set*_bit().
> So I must call it after register_netdev().
Is there a user-visible problem you're trying to solve here?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered
2017-12-28 21:30 ` [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered netanel
2017-12-29 7:46 ` Jakub Kicinski
@ 2018-01-02 19:08 ` David Miller
2018-01-02 20:38 ` Belgazal, Netanel
1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2018-01-02 19:08 UTC (permalink / raw)
To: netanel
Cc: netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea, evgenys,
gtzalik
From: <netanel@amazon.com>
Date: Thu, 28 Dec 2017 21:30:20 +0000
> From: Netanel Belgazal <netanel@amazon.com>
>
> netif_carrier_off() should be called only after register netdev.
> Move the function's call after the registration.
>
> Signed-off-by: Netanel Belgazal <netanel@amazon.com>
> ---
> drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index fbe21a817bd8..ee50c56765a4 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -3276,14 +3276,14 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> memcpy(adapter->netdev->perm_addr, adapter->mac_addr, netdev->addr_len);
>
> - netif_carrier_off(netdev);
> -
> rc = register_netdev(netdev);
> if (rc) {
> dev_err(&pdev->dev, "Cannot register net device\n");
> goto err_rss;
> }
>
> + netif_carrier_off(netdev);
> +
You cannot invoke this after register_netdev(), asynchronous activity can cause this
call to lose information and lose a link up event.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered
2018-01-02 19:08 ` David Miller
@ 2018-01-02 20:38 ` Belgazal, Netanel
0 siblings, 0 replies; 9+ messages in thread
From: Belgazal, Netanel @ 2018-01-02 20:38 UTC (permalink / raw)
To: David Miller
Cc: netdev@vger.kernel.org, Woodhouse, David, Machulsky, Zorik,
Matushevsky, Alexander, Bshara, Saeed, Wilson, Matt,
Liguori, Anthony, Bshara, Nafea, Schmeilin, Evgeny, Tzalik, Guy
Right.
I’ll remove this patch.
On 1/2/18, 9:08 PM, "David Miller" <davem@davemloft.net> wrote:
From: <netanel@amazon.com>
Date: Thu, 28 Dec 2017 21:30:20 +0000
> From: Netanel Belgazal <netanel@amazon.com>
>
> netif_carrier_off() should be called only after register netdev.
> Move the function's call after the registration.
>
> Signed-off-by: Netanel Belgazal <netanel@amazon.com>
> ---
> drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index fbe21a817bd8..ee50c56765a4 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -3276,14 +3276,14 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> memcpy(adapter->netdev->perm_addr, adapter->mac_addr, netdev->addr_len);
>
> - netif_carrier_off(netdev);
> -
> rc = register_netdev(netdev);
> if (rc) {
> dev_err(&pdev->dev, "Cannot register net device\n");
> goto err_rss;
> }
>
> + netif_carrier_off(netdev);
> +
You cannot invoke this after register_netdev(), asynchronous activity can cause this
call to lose information and lose a link up event.
^ permalink raw reply [flat|nested] 9+ messages in thread