All of lore.kernel.org
 help / color / mirror / Atom feed
From: <netanel@amazon.com>
To: <davem@davemloft.net>, <netdev@vger.kernel.org>
Cc: Netanel Belgazal <netanel@amazon.com>, <dwmw@amazon.com>,
	<zorik@amazon.com>, <matua@amazon.com>, <saeedb@amazon.com>,
	<msw@amazon.com>, <aliguori@amazon.com>, <nafea@amazon.com>,
	<evgenys@amazon.com>
Subject: [PATCH net 1/9] net: ena: fix rare uncompleted admin command false alarm
Date: Sun, 11 Jun 2017 15:42:43 +0300	[thread overview]
Message-ID: <1497184971-28178-2-git-send-email-netanel@amazon.com> (raw)
In-Reply-To: <1497184971-28178-1-git-send-email-netanel@amazon.com>

From: Netanel Belgazal <netanel@amazon.com>

The current flow to detect admin completion is:
while (command_not_completed) {
	if (timeout)
		error

	check_for_completion()
		sleep()
   }
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 08d11cede9c9..e1c2fab6292f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -508,15 +508,20 @@ static int ena_com_comp_status_to_errno(u8 comp_status)
 static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_ctx,
 						     struct ena_com_admin_queue *admin_queue)
 {
-	unsigned long flags;
-	u32 start_time;
+	unsigned long flags, timeout;
 	int ret;
 
-	start_time = ((u32)jiffies_to_usecs(jiffies));
+	timeout = jiffies + ADMIN_CMD_TIMEOUT_US;
+
+	while (1) {
+		spin_lock_irqsave(&admin_queue->q_lock, flags);
+		ena_com_handle_admin_completion(admin_queue);
+		spin_unlock_irqrestore(&admin_queue->q_lock, flags);
 
-	while (comp_ctx->status == ENA_CMD_SUBMITTED) {
-		if ((((u32)jiffies_to_usecs(jiffies)) - start_time) >
-		    ADMIN_CMD_TIMEOUT_US) {
+		if (comp_ctx->status != ENA_CMD_SUBMITTED)
+			break;
+
+		if (time_is_before_jiffies(timeout)) {
 			pr_err("Wait for completion (polling) timeout\n");
 			/* ENA didn't have any completion */
 			spin_lock_irqsave(&admin_queue->q_lock, flags);
@@ -528,10 +533,6 @@ static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_c
 			goto err;
 		}
 
-		spin_lock_irqsave(&admin_queue->q_lock, flags);
-		ena_com_handle_admin_completion(admin_queue);
-		spin_unlock_irqrestore(&admin_queue->q_lock, flags);
-
 		msleep(100);
 	}
 
-- 
2.7.4

  reply	other threads:[~2017-06-11 12:53 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-11 12:42 [PATCH net 0/9] Bugs fixes in ena ethernet driver netanel
2017-06-11 12:42 ` netanel [this message]
2017-06-11 12:42 ` [PATCH net 2/9] net: ena: fix bug that might cause hang after consecutive open/close interface netanel
2017-06-11 12:42 ` [PATCH net 3/9] net: ena: add missing return when ena_com_get_io_handlers() fails netanel
2017-06-11 12:42 ` [PATCH net 4/9] net: ena: fix race condition between submit and completion admin command netanel
2017-06-11 12:42 ` [PATCH net 5/9] net: ena: add missing unmap bars on device removal netanel
2017-06-11 12:42 ` [PATCH net 6/9] net: ena: fix theoretical Rx hang on low memory systems netanel
2017-06-11 12:42 ` [PATCH net 7/9] net: ena: disable admin msix while working in polling mode netanel
2017-06-11 12:42 ` [PATCH net 8/9] net: ena: bug fix in lost tx packets detection mechanism netanel
2017-06-11 12:42 ` [PATCH net 9/9] net: ena: update ena driver to version 1.1.7 netanel
2017-06-11 20:38 ` [PATCH net 0/9] Bugs fixes in ena ethernet driver David Miller
2017-06-12 14:09   ` Belgazal, Netanel
2017-06-12 14:24     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1497184971-28178-2-git-send-email-netanel@amazon.com \
    --to=netanel@amazon.com \
    --cc=aliguori@amazon.com \
    --cc=davem@davemloft.net \
    --cc=dwmw@amazon.com \
    --cc=evgenys@amazon.com \
    --cc=matua@amazon.com \
    --cc=msw@amazon.com \
    --cc=nafea@amazon.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedb@amazon.com \
    --cc=zorik@amazon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.