From: <akiyano@amazon.com>
To: <davem@davemloft.net>, <netdev@vger.kernel.org>
Cc: Arthur Kiyanovski <akiyano@amazon.com>, <dwmw@amazon.com>,
<zorik@amazon.com>, <matua@amazon.com>, <saeedb@amazon.com>,
<msw@amazon.com>, <aliguori@amazon.com>, <nafea@amazon.com>,
<gtzalik@amazon.com>, <netanel@amazon.com>, <alisaidi@amazon.com>
Subject: [PATCH V1 net 2/3] net: ena: fix crash during ena_remove()
Date: Mon, 19 Nov 2018 12:05:21 +0200
Message-ID: <1542621922-31484-3-git-send-email-akiyano@amazon.com>
In-Reply-To: <1542621922-31484-1-git-send-email-akiyano@amazon.com>
From: Arthur Kiyanovski <akiyano@amazon.com>
In ena_remove() we have the following call sequence:
ena_remove()
  unregister_netdev()
  ena_destroy_device()
    netif_carrier_off()
Calling netif_carrier_off() causes linkwatch to try to handle the
link change event on the already unregistered netdev, which leads
to a read from an unreadable memory address.
This patch swaps the order of unregister_netdev() and
ena_destroy_device(), so that netif_carrier_off() is called on a
registered netdev.
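The ordering problem can be illustrated with a small user-space model.
This is only a sketch: struct mock_netdev and every mock_* function are
hypothetical stand-ins, not kernel APIs, and the "late event" counter
merely marks the point where the real linkwatch code would touch an
unregistered netdev.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct mock_netdev {
	bool registered;
	bool carrier_ok;
	int late_events; /* link events raised after unregistration */
};

/* Models unregister_netdev(): the netdev is no longer registered. */
static void mock_unregister(struct mock_netdev *nd)
{
	nd->registered = false;
}

/* Models netif_carrier_off(): a carrier change raises a link event. */
static void mock_carrier_off(struct mock_netdev *nd)
{
	if (!nd->carrier_ok)
		return;
	nd->carrier_ok = false;
	if (!nd->registered)
		nd->late_events++; /* the real driver crashes here */
}

/* Models ena_destroy_device(), reduced to the carrier change. */
static void mock_destroy_device(struct mock_netdev *nd)
{
	mock_carrier_off(nd);
}

/* Order before this patch: unregister first, then destroy. */
int remove_old_order(void)
{
	struct mock_netdev nd = { .registered = true, .carrier_ok = true };

	mock_unregister(&nd);
	mock_destroy_device(&nd);
	return nd.late_events;
}

/* Order after this patch: destroy first, then unregister. */
int remove_new_order(void)
{
	struct mock_netdev nd = { .registered = true, .carrier_ok = true };

	mock_destroy_device(&nd);
	mock_unregister(&nd);
	return nd.late_events;
}
```

In this model remove_old_order() reports one link event on an
unregistered netdev, while remove_new_order() reports none.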
To accomplish this fix we also had to:
1. Stop setting the ENA_FLAG_TRIGGER_RESET bit in ena_remove()
2. Add a sanity check in ena_close()
Both changes prevent a double device reset (unregister_netdev() calls
ena_close(), but by then the device was already deleted in
ena_destroy_device()).
3. Set the admin_queue running state to false to avoid using it after
the device was reset (for example, when calling
ena_destroy_all_io_queues() right after ena_com_dev_reset() in
ena_down())
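The effect of the sanity check and of the admin queue state can be
sketched in user space as follows. This is a heavily simplified model
under stated assumptions: struct mock_adapter and the mock_* helpers are
hypothetical, and the real ena_close() and ena_down() do considerably
more than shown here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct ena_adapter. */
struct mock_adapter {
	bool device_running;  /* models ENA_FLAG_DEVICE_RUNNING */
	bool admin_running;   /* models the admin queue running state */
	int resets;           /* number of device resets performed */
	int admin_cmds_sent;  /* admin commands that reached the device */
};

/* Admin commands are refused once the queue is marked not running. */
static int mock_admin_cmd(struct mock_adapter *a)
{
	if (!a->admin_running)
		return -1;
	a->admin_cmds_sent++;
	return 0;
}

/* Reset the device and stop further admin commands (item 3). */
static void mock_reset(struct mock_adapter *a)
{
	a->resets++;
	a->admin_running = false;
}

/* Models ena_destroy_device(): reset once, then mark as not running. */
static void mock_destroy_device(struct mock_adapter *a)
{
	if (!a->device_running)
		return;
	mock_reset(a);
	(void)mock_admin_cmd(a); /* queue teardown after reset: refused */
	a->device_running = false;
}

/* Models ena_close(); use_guard selects the new sanity check (item 2). */
static int mock_close(struct mock_adapter *a, bool use_guard)
{
	if (use_guard && !a->device_running)
		return 0;
	mock_reset(a);
	return 0;
}

/* Destroy the device, then close it as unregister_netdev() would. */
int resets_after_remove(bool use_guard)
{
	struct mock_adapter a = { .device_running = true, .admin_running = true };

	mock_destroy_device(&a);
	mock_close(&a, use_guard);
	return a.resets;
}

/* Count admin commands that reach the device after a reset. */
int admin_cmds_after_reset(void)
{
	struct mock_adapter a = { .device_running = true, .admin_running = true };

	mock_reset(&a);
	(void)mock_admin_cmd(&a);
	return a.admin_cmds_sent;
}
```

With the guard the model performs a single reset during removal; without
it the close path resets the device a second time. No admin command
reaches the device once the queue is marked as not running.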
Fixes: 944b28aa2982 ("net: ena: fix missing lock during device destruction")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 1d3cead..a70bb1b 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1848,6 +1848,8 @@ static void ena_down(struct ena_adapter *adapter)
rc = ena_com_dev_reset(adapter->ena_dev, adapter->reset_reason);
if (rc)
dev_err(&adapter->pdev->dev, "Device reset failed\n");
+ /* stop submitting admin commands on a device that was reset */
+ ena_com_set_admin_running_state(adapter->ena_dev, false);
}
ena_destroy_all_io_queues(adapter);
@@ -1914,6 +1916,9 @@ static int ena_close(struct net_device *netdev)
netif_dbg(adapter, ifdown, netdev, "%s\n", __func__);
+ if (!test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
+ return 0;
+
if (test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
ena_down(adapter);
@@ -2613,9 +2618,7 @@ static void ena_destroy_device(struct ena_adapter *adapter, bool graceful)
ena_down(adapter);
/* Stop the device from sending AENQ events (in case reset flag is set
- * and device is up, ena_close already reset the device
- * In case the reset flag is set and the device is up, ena_down()
- * already perform the reset, so it can be skipped.
+ * and device is up, ena_down() already reset the device.
*/
if (!(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags) && dev_up))
ena_com_dev_reset(adapter->ena_dev, adapter->reset_reason);
@@ -3452,6 +3455,8 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
ena_com_rss_destroy(ena_dev);
err_free_msix:
ena_com_dev_reset(ena_dev, ENA_REGS_RESET_INIT_ERR);
+ /* stop submitting admin commands on a device that was reset */
+ ena_com_set_admin_running_state(ena_dev, false);
ena_free_mgmnt_irq(adapter);
ena_disable_msix(adapter);
err_worker_destroy:
@@ -3498,18 +3503,12 @@ static void ena_remove(struct pci_dev *pdev)
cancel_work_sync(&adapter->reset_task);
- unregister_netdev(netdev);
-
- /* If the device is running then we want to make sure the device will be
- * reset to make sure no more events will be issued by the device.
- */
- if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
- set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
-
rtnl_lock();
ena_destroy_device(adapter, true);
rtnl_unlock();
+ unregister_netdev(netdev);
+
free_netdev(netdev);
ena_com_rss_destroy(ena_dev);
--
2.7.4