netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: <netanel@amazon.com>
To: <davem@davemloft.net>, <netdev@vger.kernel.org>
Cc: Netanel Belgazal <netanel@amazon.com>, <dwmw@amazon.com>,
	<zorik@amazon.com>, <matua@amazon.com>, <saeedb@amazon.com>,
	<msw@amazon.com>, <aliguori@amazon.com>, <nafea@amazon.com>,
	<evgenys@amazon.com>
Subject: [PATCH net 1/9] net: ena: fix rare uncompleted admin command false alarm
Date: Sun, 11 Jun 2017 15:42:43 +0300	[thread overview]
Message-ID: <1497184971-28178-2-git-send-email-netanel@amazon.com> (raw)
In-Reply-To: <1497184971-28178-1-git-send-email-netanel@amazon.com>

From: Netanel Belgazal <netanel@amazon.com>

The current flow to detect admin completion is:
while (command_not_completed) {
	if (timeout)
		error

	check_for_completion()
		sleep()
   }
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 08d11cede9c9..e1c2fab6292f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -508,15 +508,20 @@ static int ena_com_comp_status_to_errno(u8 comp_status)
 static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_ctx,
 						     struct ena_com_admin_queue *admin_queue)
 {
-	unsigned long flags;
-	u32 start_time;
+	unsigned long flags, timeout;
 	int ret;
 
-	start_time = ((u32)jiffies_to_usecs(jiffies));
+	timeout = jiffies + ADMIN_CMD_TIMEOUT_US;
+
+	while (1) {
+		spin_lock_irqsave(&admin_queue->q_lock, flags);
+		ena_com_handle_admin_completion(admin_queue);
+		spin_unlock_irqrestore(&admin_queue->q_lock, flags);
 
-	while (comp_ctx->status == ENA_CMD_SUBMITTED) {
-		if ((((u32)jiffies_to_usecs(jiffies)) - start_time) >
-		    ADMIN_CMD_TIMEOUT_US) {
+		if (comp_ctx->status != ENA_CMD_SUBMITTED)
+			break;
+
+		if (time_is_before_jiffies(timeout)) {
 			pr_err("Wait for completion (polling) timeout\n");
 			/* ENA didn't have any completion */
 			spin_lock_irqsave(&admin_queue->q_lock, flags);
@@ -528,10 +533,6 @@ static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_c
 			goto err;
 		}
 
-		spin_lock_irqsave(&admin_queue->q_lock, flags);
-		ena_com_handle_admin_completion(admin_queue);
-		spin_unlock_irqrestore(&admin_queue->q_lock, flags);
-
 		msleep(100);
 	}
 
-- 
2.7.4

  reply	other threads:[~2017-06-11 12:53 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-11 12:42 [PATCH net 0/9] Bugs fixes in ena ethernet driver netanel
2017-06-11 12:42 ` netanel [this message]
2017-06-11 12:42 ` [PATCH net 2/9] net: ena: fix bug that might cause hang after consecutive open/close interface netanel
2017-06-11 12:42 ` [PATCH net 3/9] net: ena: add missing return when ena_com_get_io_handlers() fails netanel
2017-06-11 12:42 ` [PATCH net 4/9] net: ena: fix race condition between submit and completion admin command netanel
2017-06-11 12:42 ` [PATCH net 5/9] net: ena: add missing unmap bars on device removal netanel
2017-06-11 12:42 ` [PATCH net 6/9] net: ena: fix theoretical Rx hang on low memory systems netanel
2017-06-11 12:42 ` [PATCH net 7/9] net: ena: disable admin msix while working in polling mode netanel
2017-06-11 12:42 ` [PATCH net 8/9] net: ena: bug fix in lost tx packets detection mechanism netanel
2017-06-11 12:42 ` [PATCH net 9/9] net: ena: update ena driver to version 1.1.7 netanel
2017-06-11 20:38 ` [PATCH net 0/9] Bugs fixes in ena ethernet driver David Miller
2017-06-12 14:09   ` Belgazal, Netanel
2017-06-12 14:24     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1497184971-28178-2-git-send-email-netanel@amazon.com \
    --to=netanel@amazon.com \
    --cc=aliguori@amazon.com \
    --cc=davem@davemloft.net \
    --cc=dwmw@amazon.com \
    --cc=evgenys@amazon.com \
    --cc=matua@amazon.com \
    --cc=msw@amazon.com \
    --cc=nafea@amazon.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedb@amazon.com \
    --cc=zorik@amazon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).