netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/8] Bug fixes in ena ethernet driver
@ 2017-06-08 21:46 netanel
  2017-06-08 21:46 ` [PATCH net-next 1/8] net: ena: fix rare uncompleted admin command false alarm netanel
                   ` (17 more replies)
  0 siblings, 18 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

This patchset contains fixes for the bugs that were discovered so far.

Netanel Belgazal (8):
  net: ena: fix rare uncompleted admin command false alarm
  net: ena: fix bug that might cause hang after consecutive open/close
    interface.
  net: ena: add missing return when ena_com_get_io_handlers() fails
  net: ena: fix race condition between submit and completion admin
    command
  net: ena: add missing unmap bars on device removal
  net: ena: fix theoretical Rx hang on low memory systems
  net: ena: disable admin msix while working in polling mode
  net: ena: bug fix in lost tx packets detection mechanism

 drivers/net/ethernet/amazon/ena/ena_com.c     |  35 +++--
 drivers/net/ethernet/amazon/ena/ena_ethtool.c |   2 +-
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 179 +++++++++++++++++++-------
 drivers/net/ethernet/amazon/ena/ena_netdev.h  |  16 ++-
 4 files changed, 168 insertions(+), 64 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next 1/8] net: ena: fix rare uncompleted admin command false alarm
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 2/8] net: ena: fix bug that might cause hang after consecutive open/close interface netanel
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

The current flow to detect admin completion is:
while (command_not_completed) {
	if (timeout)
		error

	check_for_completion()
		sleep()
   }
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 08d11ce..e1c2fab 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -508,15 +508,20 @@ static int ena_com_comp_status_to_errno(u8 comp_status)
 static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_ctx,
 						     struct ena_com_admin_queue *admin_queue)
 {
-	unsigned long flags;
-	u32 start_time;
+	unsigned long flags, timeout;
 	int ret;
 
-	start_time = ((u32)jiffies_to_usecs(jiffies));
+	timeout = jiffies + ADMIN_CMD_TIMEOUT_US;
+
+	while (1) {
+		spin_lock_irqsave(&admin_queue->q_lock, flags);
+		ena_com_handle_admin_completion(admin_queue);
+		spin_unlock_irqrestore(&admin_queue->q_lock, flags);
 
-	while (comp_ctx->status == ENA_CMD_SUBMITTED) {
-		if ((((u32)jiffies_to_usecs(jiffies)) - start_time) >
-		    ADMIN_CMD_TIMEOUT_US) {
+		if (comp_ctx->status != ENA_CMD_SUBMITTED)
+			break;
+
+		if (time_is_before_jiffies(timeout)) {
 			pr_err("Wait for completion (polling) timeout\n");
 			/* ENA didn't have any completion */
 			spin_lock_irqsave(&admin_queue->q_lock, flags);
@@ -528,10 +533,6 @@ static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_c
 			goto err;
 		}
 
-		spin_lock_irqsave(&admin_queue->q_lock, flags);
-		ena_com_handle_admin_completion(admin_queue);
-		spin_unlock_irqrestore(&admin_queue->q_lock, flags);
-
 		msleep(100);
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 2/8] net: ena: fix bug that might cause hang after consecutive open/close interface.
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
  2017-06-08 21:46 ` [PATCH net-next 1/8] net: ena: fix rare uncompleted admin command false alarm netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 3/8] net: ena: add missing return when ena_com_get_io_handlers() fails netanel
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

Fixing a bug that the driver does not unmask the IO interrupts
in ndo_open():
occasionally, the MSI-X interrupt (for one or more IO queues)
can be masked when ndo_close() was called.
If that is followed by ndo open(),
then the MSI-X will be still masked so no interrupt
will be received by the driver.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 41 ++++++++++++++++++----------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 7c1214d..0e3c60c7 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1078,6 +1078,26 @@ inline void ena_adjust_intr_moderation(struct ena_ring *rx_ring,
 	rx_ring->per_napi_bytes = 0;
 }
 
+static inline void ena_unmask_interrupt(struct ena_ring *tx_ring,
+					struct ena_ring *rx_ring)
+{
+	struct ena_eth_io_intr_reg intr_reg;
+
+	/* Update intr register: rx intr delay,
+	 * tx intr delay and interrupt unmask
+	 */
+	ena_com_update_intr_reg(&intr_reg,
+				rx_ring->smoothed_interval,
+				tx_ring->smoothed_interval,
+				true);
+
+	/* It is a shared MSI-X.
+	 * Tx and Rx CQ have pointer to it.
+	 * So we use one of them to reach the intr reg
+	 */
+	ena_com_unmask_intr(rx_ring->ena_com_io_cq, &intr_reg);
+}
+
 static inline void ena_update_ring_numa_node(struct ena_ring *tx_ring,
 					     struct ena_ring *rx_ring)
 {
@@ -1108,7 +1128,6 @@ static int ena_io_poll(struct napi_struct *napi, int budget)
 {
 	struct ena_napi *ena_napi = container_of(napi, struct ena_napi, napi);
 	struct ena_ring *tx_ring, *rx_ring;
-	struct ena_eth_io_intr_reg intr_reg;
 
 	u32 tx_work_done;
 	u32 rx_work_done;
@@ -1149,22 +1168,9 @@ static int ena_io_poll(struct napi_struct *napi, int budget)
 			if (ena_com_get_adaptive_moderation_enabled(rx_ring->ena_dev))
 				ena_adjust_intr_moderation(rx_ring, tx_ring);
 
-			/* Update intr register: rx intr delay,
-			 * tx intr delay and interrupt unmask
-			 */
-			ena_com_update_intr_reg(&intr_reg,
-						rx_ring->smoothed_interval,
-						tx_ring->smoothed_interval,
-						true);
-
-			/* It is a shared MSI-X.
-			 * Tx and Rx CQ have pointer to it.
-			 * So we use one of them to reach the intr reg
-			 */
-			ena_com_unmask_intr(rx_ring->ena_com_io_cq, &intr_reg);
+			ena_unmask_interrupt(tx_ring, rx_ring);
 		}
 
-
 		ena_update_ring_numa_node(tx_ring, rx_ring);
 
 		ret = rx_work_done;
@@ -1485,6 +1491,11 @@ static int ena_up_complete(struct ena_adapter *adapter)
 
 	ena_napi_enable_all(adapter);
 
+	/* Enable completion queues interrupt */
+	for (i = 0; i < adapter->num_queues; i++)
+		ena_unmask_interrupt(&adapter->tx_ring[i],
+				     &adapter->rx_ring[i]);
+
 	/* schedule napi in case we had pending packets
 	 * from the last time we disable napi
 	 */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 3/8] net: ena: add missing return when ena_com_get_io_handlers() fails
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
  2017-06-08 21:46 ` [PATCH net-next 1/8] net: ena: fix rare uncompleted admin command false alarm netanel
  2017-06-08 21:46 ` [PATCH net-next 2/8] net: ena: fix bug that might cause hang after consecutive open/close interface netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 4/8] net: ena: fix race condition between submit and completion admin command netanel
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 0e3c60c7..1e71e89 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1543,6 +1543,7 @@ static int ena_create_io_tx_queue(struct ena_adapter *adapter, int qid)
 			  "Failed to get TX queue handlers. TX queue num %d rc: %d\n",
 			  qid, rc);
 		ena_com_destroy_io_queue(ena_dev, ena_qid);
+		return rc;
 	}
 
 	ena_com_update_numa_node(tx_ring->ena_com_io_cq, ctx.numa_node);
@@ -1607,6 +1608,7 @@ static int ena_create_io_rx_queue(struct ena_adapter *adapter, int qid)
 			  "Failed to get RX queue handlers. RX queue num %d rc: %d\n",
 			  qid, rc);
 		ena_com_destroy_io_queue(ena_dev, ena_qid);
+		return rc;
 	}
 
 	ena_com_update_numa_node(rx_ring->ena_com_io_cq, ctx.numa_node);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 4/8] net: ena: fix race condition between submit and completion admin command
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (2 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 3/8] net: ena: add missing return when ena_com_get_io_handlers() fails netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 5/8] net: ena: add missing unmap bars on device removal netanel
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).

Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.

This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.

Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index e1c2fab..ea60b9e 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -232,11 +232,9 @@ static struct ena_comp_ctx *__ena_com_submit_admin_cmd(struct ena_com_admin_queu
 	tail_masked = admin_queue->sq.tail & queue_size_mask;
 
 	/* In case of queue FULL */
-	cnt = admin_queue->sq.tail - admin_queue->sq.head;
+	cnt = atomic_read(&admin_queue->outstanding_cmds);
 	if (cnt >= admin_queue->q_depth) {
-		pr_debug("admin queue is FULL (tail %d head %d depth: %d)\n",
-			 admin_queue->sq.tail, admin_queue->sq.head,
-			 admin_queue->q_depth);
+		pr_debug("admin queue is full.\n");
 		admin_queue->stats.out_of_space++;
 		return ERR_PTR(-ENOSPC);
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 5/8] net: ena: add missing unmap bars on device removal
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (3 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 4/8] net: ena: fix race condition between submit and completion admin command netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 6/8] net: ena: fix theoretical Rx hang on low memory systems netanel
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

This patch also change the mapping functions to devm_ functions

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 1e71e89..4e9fbdd 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2853,6 +2853,11 @@ static void ena_release_bars(struct ena_com_dev *ena_dev, struct pci_dev *pdev)
 {
 	int release_bars;
 
+	if (ena_dev->mem_bar)
+		devm_iounmap(&pdev->dev, ena_dev->mem_bar);
+
+	devm_iounmap(&pdev->dev, ena_dev->reg_bar);
+
 	release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & ENA_BAR_MASK;
 	pci_release_selected_regions(pdev, release_bars);
 }
@@ -2940,8 +2945,9 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_free_ena_dev;
 	}
 
-	ena_dev->reg_bar = ioremap(pci_resource_start(pdev, ENA_REG_BAR),
-				   pci_resource_len(pdev, ENA_REG_BAR));
+	ena_dev->reg_bar = devm_ioremap(&pdev->dev,
+					pci_resource_start(pdev, ENA_REG_BAR),
+					pci_resource_len(pdev, ENA_REG_BAR));
 	if (!ena_dev->reg_bar) {
 		dev_err(&pdev->dev, "failed to remap regs bar\n");
 		rc = -EFAULT;
@@ -2961,8 +2967,9 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	ena_set_push_mode(pdev, ena_dev, &get_feat_ctx);
 
 	if (ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) {
-		ena_dev->mem_bar = ioremap_wc(pci_resource_start(pdev, ENA_MEM_BAR),
-					      pci_resource_len(pdev, ENA_MEM_BAR));
+		ena_dev->mem_bar = devm_ioremap_wc(&pdev->dev,
+						   pci_resource_start(pdev, ENA_MEM_BAR),
+						   pci_resource_len(pdev, ENA_MEM_BAR));
 		if (!ena_dev->mem_bar) {
 			rc = -EFAULT;
 			goto err_device_destroy;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 6/8] net: ena: fix theoretical Rx hang on low memory systems
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (4 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 5/8] net: ena: add missing unmap bars on device removal netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 6/8] net: ena: fix theoretical Rx stuck " netanel
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel  memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more

The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_ethtool.c |  1 +
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 55 +++++++++++++++++++++++++++
 drivers/net/ethernet/amazon/ena/ena_netdev.h  |  2 +
 3 files changed, 58 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index 67b2338f..533b2fb 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -94,6 +94,7 @@ static const struct ena_stats ena_stats_rx_strings[] = {
 	ENA_STAT_RX_ENTRY(dma_mapping_err),
 	ENA_STAT_RX_ENTRY(bad_desc_num),
 	ENA_STAT_RX_ENTRY(rx_copybreak_pkt),
+	ENA_STAT_RX_ENTRY(empty_rx_ring),
 };
 
 static const struct ena_stats ena_stats_ena_com_strings[] = {
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 4e9fbdd..3c366bf 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -190,6 +190,7 @@ static void ena_init_io_rings(struct ena_adapter *adapter)
 		rxr->sgl_size = adapter->max_rx_sgl_size;
 		rxr->smoothed_interval =
 			ena_com_get_nonadaptive_moderation_interval_rx(ena_dev);
+		rxr->empty_rx_queue = 0;
 	}
 }
 
@@ -2619,6 +2620,58 @@ static void check_for_missing_tx_completions(struct ena_adapter *adapter)
 	adapter->last_monitored_tx_qid = i % adapter->num_queues;
 }
 
+/* trigger napi schedule after 2 consecutive detections */
+#define EMPTY_RX_REFILL 2
+/* For the rare case where the device runs out of Rx descriptors and the
+ * napi handler failed to refill new Rx descriptors (due to a lack of memory
+ * for example).
+ * This case will lead to a deadlock:
+ * The device won't send interrupts since all the new Rx packets will be dropped
+ * The napi handler won't allocate new Rx descriptors so the device will be
+ * able to send new packets.
+ *
+ * This scenario can happen when the kernel's vm.min_free_kbytes is too small.
+ * It is recommended to have at least 512MB, with a minimum of 128MB for
+ * constrained environment).
+ *
+ * When such a situation is detected - Reschedule napi
+ */
+static void check_for_empty_rx_ring(struct ena_adapter *adapter)
+{
+	struct ena_ring *rx_ring;
+	int i, refill_required;
+
+	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
+		return;
+
+	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
+	for (i = 0; i < adapter->num_queues; i++) {
+		rx_ring = &adapter->rx_ring[i];
+
+		refill_required =
+			ena_com_sq_empty_space(rx_ring->ena_com_io_sq);
+		if (unlikely(refill_required == (rx_ring->ring_size - 1))) {
+			rx_ring->empty_rx_queue++;
+
+			if (rx_ring->empty_rx_queue >= EMPTY_RX_REFILL) {
+				u64_stats_update_begin(&rx_ring->syncp);
+				rx_ring->rx_stats.empty_rx_ring++;
+				u64_stats_update_end(&rx_ring->syncp);
+
+				netif_err(adapter, drv, adapter->netdev,
+					  "trigger refill for ring %d\n", i);
+
+				napi_schedule(rx_ring->napi);
+				rx_ring->empty_rx_queue = 0;
+			}
+		} else {
+			rx_ring->empty_rx_queue = 0;
+		}
+	}
+}
+
 /* Check for keep alive expiration */
 static void check_for_missing_keep_alive(struct ena_adapter *adapter)
 {
@@ -2673,6 +2726,8 @@ static void ena_timer_service(unsigned long data)
 
 	check_for_missing_tx_completions(adapter);
 
+	check_for_empty_rx_ring(adapter);
+
 	if (debug_area)
 		ena_dump_stats_to_buf(adapter, debug_area);
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 0e22bce..8828f1d 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -184,6 +184,7 @@ struct ena_stats_rx {
 	u64 dma_mapping_err;
 	u64 bad_desc_num;
 	u64 rx_copybreak_pkt;
+	u64 empty_rx_ring;
 };
 
 struct ena_ring {
@@ -231,6 +232,7 @@ struct ena_ring {
 		struct ena_stats_tx tx_stats;
 		struct ena_stats_rx rx_stats;
 	};
+	int empty_rx_queue;
 } ____cacheline_aligned;
 
 struct ena_stats_dev {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 6/8] net: ena: fix theoretical Rx stuck on low memory systems
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (5 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 6/8] net: ena: fix theoretical Rx hang on low memory systems netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 7/8] net: ena: disable admin msix while working in polling mode netanel
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel  memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more

The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_ethtool.c |  1 +
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 55 +++++++++++++++++++++++++++
 drivers/net/ethernet/amazon/ena/ena_netdev.h  |  2 +
 3 files changed, 58 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index 67b2338f..533b2fb 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -94,6 +94,7 @@ static const struct ena_stats ena_stats_rx_strings[] = {
 	ENA_STAT_RX_ENTRY(dma_mapping_err),
 	ENA_STAT_RX_ENTRY(bad_desc_num),
 	ENA_STAT_RX_ENTRY(rx_copybreak_pkt),
+	ENA_STAT_RX_ENTRY(empty_rx_ring),
 };
 
 static const struct ena_stats ena_stats_ena_com_strings[] = {
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 4e9fbdd..3c366bf 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -190,6 +190,7 @@ static void ena_init_io_rings(struct ena_adapter *adapter)
 		rxr->sgl_size = adapter->max_rx_sgl_size;
 		rxr->smoothed_interval =
 			ena_com_get_nonadaptive_moderation_interval_rx(ena_dev);
+		rxr->empty_rx_queue = 0;
 	}
 }
 
@@ -2619,6 +2620,58 @@ static void check_for_missing_tx_completions(struct ena_adapter *adapter)
 	adapter->last_monitored_tx_qid = i % adapter->num_queues;
 }
 
+/* trigger napi schedule after 2 consecutive detections */
+#define EMPTY_RX_REFILL 2
+/* For the rare case where the device runs out of Rx descriptors and the
+ * napi handler failed to refill new Rx descriptors (due to a lack of memory
+ * for example).
+ * This case will lead to a deadlock:
+ * The device won't send interrupts since all the new Rx packets will be dropped
+ * The napi handler won't allocate new Rx descriptors so the device will be
+ * able to send new packets.
+ *
+ * This scenario can happen when the kernel's vm.min_free_kbytes is too small.
+ * It is recommended to have at least 512MB, with a minimum of 128MB for
+ * constrained environment).
+ *
+ * When such a situation is detected - Reschedule napi
+ */
+static void check_for_empty_rx_ring(struct ena_adapter *adapter)
+{
+	struct ena_ring *rx_ring;
+	int i, refill_required;
+
+	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
+		return;
+
+	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
+	for (i = 0; i < adapter->num_queues; i++) {
+		rx_ring = &adapter->rx_ring[i];
+
+		refill_required =
+			ena_com_sq_empty_space(rx_ring->ena_com_io_sq);
+		if (unlikely(refill_required == (rx_ring->ring_size - 1))) {
+			rx_ring->empty_rx_queue++;
+
+			if (rx_ring->empty_rx_queue >= EMPTY_RX_REFILL) {
+				u64_stats_update_begin(&rx_ring->syncp);
+				rx_ring->rx_stats.empty_rx_ring++;
+				u64_stats_update_end(&rx_ring->syncp);
+
+				netif_err(adapter, drv, adapter->netdev,
+					  "trigger refill for ring %d\n", i);
+
+				napi_schedule(rx_ring->napi);
+				rx_ring->empty_rx_queue = 0;
+			}
+		} else {
+			rx_ring->empty_rx_queue = 0;
+		}
+	}
+}
+
 /* Check for keep alive expiration */
 static void check_for_missing_keep_alive(struct ena_adapter *adapter)
 {
@@ -2673,6 +2726,8 @@ static void ena_timer_service(unsigned long data)
 
 	check_for_missing_tx_completions(adapter);
 
+	check_for_empty_rx_ring(adapter);
+
 	if (debug_area)
 		ena_dump_stats_to_buf(adapter, debug_area);
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 0e22bce..8828f1d 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -184,6 +184,7 @@ struct ena_stats_rx {
 	u64 dma_mapping_err;
 	u64 bad_desc_num;
 	u64 rx_copybreak_pkt;
+	u64 empty_rx_ring;
 };
 
 struct ena_ring {
@@ -231,6 +232,7 @@ struct ena_ring {
 		struct ena_stats_tx tx_stats;
 		struct ena_stats_rx rx_stats;
 	};
+	int empty_rx_queue;
 } ____cacheline_aligned;
 
 struct ena_stats_dev {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 7/8] net: ena: disable admin msix while working in polling mode
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (6 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 6/8] net: ena: fix theoretical Rx stuck " netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 8/8] net: ena: bug fix in lost tx packets detection mechanism netanel
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index ea60b9e..f5b237e 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -61,6 +61,8 @@
 
 #define ENA_MMIO_READ_TIMEOUT 0xFFFFFFFF
 
+#define ENA_REGS_ADMIN_INTR_MASK 1
+
 /*****************************************************************************/
 /*****************************************************************************/
 /*****************************************************************************/
@@ -1454,6 +1456,12 @@ void ena_com_admin_destroy(struct ena_com_dev *ena_dev)
 
 void ena_com_set_admin_polling_mode(struct ena_com_dev *ena_dev, bool polling)
 {
+	u32 mask_value = 0;
+
+	if (polling)
+		mask_value = ENA_REGS_ADMIN_INTR_MASK;
+
+	writel(mask_value, ena_dev->reg_bar + ENA_REGS_INTR_MASK_OFF);
 	ena_dev->admin_queue.polling = polling;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 8/8] net: ena: bug fix in lost tx packets detection mechanism
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (7 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 7/8] net: ena: disable admin msix while working in polling mode netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 00/13] update ena ethernet driver to version 1.2.0 netanel
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_ethtool.c |  1 -
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 66 +++++++++++++++------------
 drivers/net/ethernet/amazon/ena/ena_netdev.h  | 14 +++++-
 3 files changed, 50 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index 533b2fb..3ee55e2 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -80,7 +80,6 @@ static const struct ena_stats ena_stats_tx_strings[] = {
 	ENA_STAT_TX_ENTRY(tx_poll),
 	ENA_STAT_TX_ENTRY(doorbells),
 	ENA_STAT_TX_ENTRY(prepare_ctx_err),
-	ENA_STAT_TX_ENTRY(missing_tx_comp),
 	ENA_STAT_TX_ENTRY(bad_req_id),
 };
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 3c366bf..4f16ed3 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1995,6 +1995,7 @@ static netdev_tx_t ena_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	tx_info->tx_descs = nb_hw_desc;
 	tx_info->last_jiffies = jiffies;
+	tx_info->print_once = 0;
 
 	tx_ring->next_to_use = ENA_TX_RING_IDX_NEXT(next_to_use,
 		tx_ring->ring_size);
@@ -2564,13 +2565,44 @@ static void ena_fw_reset_device(struct work_struct *work)
 		"Reset attempt failed. Can not reset the device\n");
 }
 
-static void check_for_missing_tx_completions(struct ena_adapter *adapter)
+static int check_missing_comp_in_queue(struct ena_adapter *adapter,
+				       struct ena_ring *tx_ring)
 {
 	struct ena_tx_buffer *tx_buf;
 	unsigned long last_jiffies;
+	u32 missed_tx = 0;
+	int i;
+
+	for (i = 0; i < tx_ring->ring_size; i++) {
+		tx_buf = &tx_ring->tx_buffer_info[i];
+		last_jiffies = tx_buf->last_jiffies;
+		if (unlikely(last_jiffies &&
+			     time_is_before_jiffies(last_jiffies + TX_TIMEOUT))) {
+			if (!tx_buf->print_once)
+				netif_notice(adapter, tx_err, adapter->netdev,
+					     "Found a Tx that wasn't completed on time, qid %d, index %d.\n",
+					     tx_ring->qid, i);
+
+			tx_buf->print_once = 1;
+			missed_tx++;
+
+			if (unlikely(missed_tx > MAX_NUM_OF_TIMEOUTED_PACKETS)) {
+				netif_err(adapter, tx_err, adapter->netdev,
+					  "The number of lost tx completions is above the threshold (%d > %d). Reset the device\n",
+					  missed_tx, MAX_NUM_OF_TIMEOUTED_PACKETS);
+				set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
+				return -EIO;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static void check_for_missing_tx_completions(struct ena_adapter *adapter)
+{
 	struct ena_ring *tx_ring;
-	int i, j, budget;
-	u32 missed_tx;
+	int i, budget, rc;
 
 	/* Make sure the driver doesn't turn the device in other process */
 	smp_rmb();
@@ -2586,31 +2618,9 @@ static void check_for_missing_tx_completions(struct ena_adapter *adapter)
 	for (i = adapter->last_monitored_tx_qid; i < adapter->num_queues; i++) {
 		tx_ring = &adapter->tx_ring[i];
 
-		for (j = 0; j < tx_ring->ring_size; j++) {
-			tx_buf = &tx_ring->tx_buffer_info[j];
-			last_jiffies = tx_buf->last_jiffies;
-			if (unlikely(last_jiffies && time_is_before_jiffies(last_jiffies + TX_TIMEOUT))) {
-				netif_notice(adapter, tx_err, adapter->netdev,
-					     "Found a Tx that wasn't completed on time, qid %d, index %d.\n",
-					     tx_ring->qid, j);
-
-				u64_stats_update_begin(&tx_ring->syncp);
-				missed_tx = tx_ring->tx_stats.missing_tx_comp++;
-				u64_stats_update_end(&tx_ring->syncp);
-
-				/* Clear last jiffies so the lost buffer won't
-				 * be counted twice.
-				 */
-				tx_buf->last_jiffies = 0;
-
-				if (unlikely(missed_tx > MAX_NUM_OF_TIMEOUTED_PACKETS)) {
-					netif_err(adapter, tx_err, adapter->netdev,
-						  "The number of lost tx completion is above the threshold (%d > %d). Reset the device\n",
-						  missed_tx, MAX_NUM_OF_TIMEOUTED_PACKETS);
-					set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
-				}
-			}
-		}
+		rc = check_missing_comp_in_queue(adapter, tx_ring);
+		if (unlikely(rc))
+			return;
 
 		budget--;
 		if (!budget)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 8828f1d..88b5e56 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -146,7 +146,18 @@ struct ena_tx_buffer {
 	u32 tx_descs;
 	/* num of buffers used by this skb */
 	u32 num_of_bufs;
-	/* Save the last jiffies to detect missing tx packets */
+
+	/* Used for detect missing tx packets to limit the number of prints */
+	u32 print_once;
+	/* Save the last jiffies to detect missing tx packets
+	 *
+	 * sets to non zero value on ena_start_xmit and set to zero on
+	 * napi and timer_Service_routine.
+	 *
+	 * while this value is not protected by lock,
+	 * a given packet is not expected to be handled by ena_start_xmit
+	 * and by napi/timer_service at the same time.
+	 */
 	unsigned long last_jiffies;
 	struct ena_com_buf bufs[ENA_PKT_MAX_BUFS];
 } ____cacheline_aligned;
@@ -170,7 +181,6 @@ struct ena_stats_tx {
 	u64 napi_comp;
 	u64 tx_poll;
 	u64 doorbells;
-	u64 missing_tx_comp;
 	u64 bad_req_id;
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 00/13] update ena ethernet driver to version 1.2.0
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (8 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 8/8] net: ena: bug fix in lost tx packets detection mechanism netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 01/13] net: ena: change return value for unsupported features unsupported return value netanel
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

This patchset contains some new features/improvements that were added
to the ENA driver to increase its robustness and are based on
experience of wide ENA deployment.

Depends on:
[PATCH net-next 0/8] Bug fixes in ena ethernet driver

Netanel Belgazal (13):
  net: ena: change return value for unsupported features unsupported
    return value
  net: ena: add hardware hints capability to the driver
  net: ena: change sizeof() argument to be the type pointer
  net: ena: add reset reason for each device FLR
  net: ena: add support for out of order rx buffers refill
  net: ena: allow the driver to work with small number of msix vectors
  net: ena: use napi_schedule_irqoff when possible
  net: ena: separate skb allocation to dedicated function
  net: ena: adding missing cast in ena_com_mem_addr_set()
  net: ena: add mtu limitation in ena_change_mtu()
  net: ena: update driver's rx drop statistics
  net: ena: change validate_tx_req_id() to be inline function
  net: ena: update ena driver to version 1.2.0

 drivers/net/ethernet/amazon/ena/ena_admin_defs.h |  31 +++
 drivers/net/ethernet/amazon/ena/ena_com.c        |  83 ++++--
 drivers/net/ethernet/amazon/ena/ena_com.h        |  10 +-
 drivers/net/ethernet/amazon/ena/ena_eth_com.c    |   5 +
 drivers/net/ethernet/amazon/ena/ena_ethtool.c    |  11 +-
 drivers/net/ethernet/amazon/ena/ena_netdev.c     | 317 +++++++++++++++++------
 drivers/net/ethernet/amazon/ena/ena_netdev.h     |  31 ++-
 drivers/net/ethernet/amazon/ena/ena_regs_defs.h  |  34 +++
 8 files changed, 407 insertions(+), 115 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next 01/13] net: ena: change return value for unsupported features unsupported return value
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (9 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 00/13] update ena ethernet driver to version 1.2.0 netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 02/13] net: ena: add hardware hints capability to the driver netanel
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

return -EOPNOTSUPP instead of -EPERM.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c     | 22 +++++++++++-----------
 drivers/net/ethernet/amazon/ena/ena_ethtool.c | 10 +++-------
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 20 ++++++++++----------
 3 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index f5b237e..02752d5 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -494,7 +494,7 @@ static int ena_com_comp_status_to_errno(u8 comp_status)
 	case ENA_ADMIN_RESOURCE_ALLOCATION_FAILURE:
 		return -ENOMEM;
 	case ENA_ADMIN_UNSUPPORTED_OPCODE:
-		return -EPERM;
+		return -EOPNOTSUPP;
 	case ENA_ADMIN_BAD_OPCODE:
 	case ENA_ADMIN_MALFORMED_REQUEST:
 	case ENA_ADMIN_ILLEGAL_PARAMETER:
@@ -786,7 +786,7 @@ static int ena_com_get_feature_ex(struct ena_com_dev *ena_dev,
 
 	if (!ena_com_check_supported_feature_id(ena_dev, feature_id)) {
 		pr_debug("Feature %d isn't supported\n", feature_id);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	memset(&get_cmd, 0x0, sizeof(get_cmd));
@@ -1324,7 +1324,7 @@ int ena_com_set_aenq_config(struct ena_com_dev *ena_dev, u32 groups_flag)
 	if ((get_resp.u.aenq.supported_groups & groups_flag) != groups_flag) {
 		pr_warn("Trying to set unsupported aenq events. supported flag: %x asked flag: %x\n",
 			get_resp.u.aenq.supported_groups, groups_flag);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	memset(&cmd, 0x0, sizeof(cmd));
@@ -1909,7 +1909,7 @@ int ena_com_set_dev_mtu(struct ena_com_dev *ena_dev, int mtu)
 
 	if (!ena_com_check_supported_feature_id(ena_dev, ENA_ADMIN_MTU)) {
 		pr_debug("Feature %d isn't supported\n", ENA_ADMIN_MTU);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	memset(&cmd, 0x0, sizeof(cmd));
@@ -1963,7 +1963,7 @@ int ena_com_set_hash_function(struct ena_com_dev *ena_dev)
 						ENA_ADMIN_RSS_HASH_FUNCTION)) {
 		pr_debug("Feature %d isn't supported\n",
 			 ENA_ADMIN_RSS_HASH_FUNCTION);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	/* Validate hash function is supported */
@@ -1975,7 +1975,7 @@ int ena_com_set_hash_function(struct ena_com_dev *ena_dev)
 	if (get_resp.u.flow_hash_func.supported_func & (1 << rss->hash_func)) {
 		pr_err("Func hash %d isn't supported by device, abort\n",
 		       rss->hash_func);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	memset(&cmd, 0x0, sizeof(cmd));
@@ -2034,7 +2034,7 @@ int ena_com_fill_hash_function(struct ena_com_dev *ena_dev,
 
 	if (!((1 << func) & get_resp.u.flow_hash_func.supported_func)) {
 		pr_err("Flow hash function %d isn't supported\n", func);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	switch (func) {
@@ -2127,7 +2127,7 @@ int ena_com_set_hash_ctrl(struct ena_com_dev *ena_dev)
 						ENA_ADMIN_RSS_HASH_INPUT)) {
 		pr_debug("Feature %d isn't supported\n",
 			 ENA_ADMIN_RSS_HASH_INPUT);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	memset(&cmd, 0x0, sizeof(cmd));
@@ -2208,7 +2208,7 @@ int ena_com_set_default_hash_ctrl(struct ena_com_dev *ena_dev)
 			pr_err("hash control doesn't support all the desire configuration. proto %x supported %x selected %x\n",
 			       i, hash_ctrl->supported_fields[i].fields,
 			       hash_ctrl->selected_fields[i].fields);
-			return -EPERM;
+			return -EOPNOTSUPP;
 		}
 	}
 
@@ -2286,7 +2286,7 @@ int ena_com_indirect_table_set(struct ena_com_dev *ena_dev)
 		    ena_dev, ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG)) {
 		pr_debug("Feature %d isn't supported\n",
 			 ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG);
-		return -EPERM;
+		return -EOPNOTSUPP;
 	}
 
 	ret = ena_com_ind_tbl_convert_to_device(ena_dev);
@@ -2553,7 +2553,7 @@ int ena_com_init_interrupt_moderation(struct ena_com_dev *ena_dev)
 				 ENA_ADMIN_INTERRUPT_MODERATION);
 
 	if (rc) {
-		if (rc == -EPERM) {
+		if (rc == -EOPNOTSUPP) {
 			pr_debug("Feature %d isn't supported\n",
 				 ENA_ADMIN_INTERRUPT_MODERATION);
 			rc = 0;
diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index 3ee55e2..d51a67f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -539,12 +539,8 @@ static int ena_get_rss_hash(struct ena_com_dev *ena_dev,
 	}
 
 	rc = ena_com_get_hash_ctrl(ena_dev, proto, &hash_fields);
-	if (rc) {
-		/* If device don't have permission, return unsupported */
-		if (rc == -EPERM)
-			rc = -EOPNOTSUPP;
+	if (rc)
 		return rc;
-	}
 
 	cmd->data = ena_flow_hash_to_flow_type(hash_fields);
 
@@ -612,7 +608,7 @@ static int ena_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *info)
 		rc = -EOPNOTSUPP;
 	}
 
-	return (rc == -EPERM) ? -EOPNOTSUPP : rc;
+	return rc;
 }
 
 static int ena_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *info,
@@ -638,7 +634,7 @@ static int ena_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *info,
 		rc = -EOPNOTSUPP;
 	}
 
-	return (rc == -EPERM) ? -EOPNOTSUPP : rc;
+	return rc;
 }
 
 static u32 ena_get_rxfh_indir_size(struct net_device *netdev)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 4f16ed3..8d4f65e 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1446,7 +1446,7 @@ static int ena_rss_configure(struct ena_adapter *adapter)
 	/* In case the RSS table wasn't initialized by probe */
 	if (!ena_dev->rss.tbl_log_size) {
 		rc = ena_rss_init_default(adapter);
-		if (rc && (rc != -EPERM)) {
+		if (rc && (rc != -EOPNOTSUPP)) {
 			netif_err(adapter, ifup, adapter->netdev,
 				  "Failed to init RSS rc: %d\n", rc);
 			return rc;
@@ -1455,17 +1455,17 @@ static int ena_rss_configure(struct ena_adapter *adapter)
 
 	/* Set indirect table */
 	rc = ena_com_indirect_table_set(ena_dev);
-	if (unlikely(rc && rc != -EPERM))
+	if (unlikely(rc && rc != -EOPNOTSUPP))
 		return rc;
 
 	/* Configure hash function (if supported) */
 	rc = ena_com_set_hash_function(ena_dev);
-	if (unlikely(rc && (rc != -EPERM)))
+	if (unlikely(rc && (rc != -EOPNOTSUPP)))
 		return rc;
 
 	/* Configure hash inputs (if supported) */
 	rc = ena_com_set_hash_ctrl(ena_dev);
-	if (unlikely(rc && (rc != -EPERM)))
+	if (unlikely(rc && (rc != -EOPNOTSUPP)))
 		return rc;
 
 	return 0;
@@ -2144,7 +2144,7 @@ static void ena_config_host_info(struct ena_com_dev *ena_dev)
 
 	rc = ena_com_set_host_attributes(ena_dev);
 	if (rc) {
-		if (rc == -EPERM)
+		if (rc == -EOPNOTSUPP)
 			pr_warn("Cannot set host attributes\n");
 		else
 			pr_err("Cannot set host attributes\n");
@@ -2181,7 +2181,7 @@ static void ena_config_debug_area(struct ena_adapter *adapter)
 
 	rc = ena_com_set_host_attributes(adapter->ena_dev);
 	if (rc) {
-		if (rc == -EPERM)
+		if (rc == -EOPNOTSUPP)
 			netif_warn(adapter, drv, adapter->netdev,
 				   "Cannot set host attributes\n");
 		else
@@ -2886,7 +2886,7 @@ static int ena_rss_init_default(struct ena_adapter *adapter)
 		val = ethtool_rxfh_indir_default(i, adapter->num_queues);
 		rc = ena_com_indirect_table_fill_entry(ena_dev, i,
 						       ENA_IO_RXQ_IDX(val));
-		if (unlikely(rc && (rc != -EPERM))) {
+		if (unlikely(rc && (rc != -EOPNOTSUPP))) {
 			dev_err(dev, "Cannot fill indirect table\n");
 			goto err_fill_indir;
 		}
@@ -2894,13 +2894,13 @@ static int ena_rss_init_default(struct ena_adapter *adapter)
 
 	rc = ena_com_fill_hash_function(ena_dev, ENA_ADMIN_CRC32, NULL,
 					ENA_HASH_KEY_SIZE, 0xFFFFFFFF);
-	if (unlikely(rc && (rc != -EPERM))) {
+	if (unlikely(rc && (rc != -EOPNOTSUPP))) {
 		dev_err(dev, "Cannot fill hash function\n");
 		goto err_fill_indir;
 	}
 
 	rc = ena_com_set_default_hash_ctrl(ena_dev);
-	if (unlikely(rc && (rc != -EPERM))) {
+	if (unlikely(rc && (rc != -EOPNOTSUPP))) {
 		dev_err(dev, "Cannot fill hash control\n");
 		goto err_fill_indir;
 	}
@@ -3114,7 +3114,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_worker_destroy;
 	}
 	rc = ena_rss_init_default(adapter);
-	if (rc && (rc != -EPERM)) {
+	if (rc && (rc != -EOPNOTSUPP)) {
 		dev_err(&pdev->dev, "Cannot init RSS rc: %d\n", rc);
 		goto err_free_msix;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 02/13] net: ena: add hardware hints capability to the driver
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (10 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 01/13] net: ena: change return value for unsupported features unsupported return value netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 03/13] net: ena: change sizeof() argument to be the type pointer netanel
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

With this patch, ENA device can update the ena driver about
the desired timeout values:
These values are part of the "hardware hints" which are transmitted
to the driver as Asynchronous event through ENA async
event notification queue.

In case the ENA device does not support this capability,
the driver will use its own default values.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_admin_defs.h | 31 +++++++++++
 drivers/net/ethernet/amazon/ena/ena_com.c        | 38 +++++++++++---
 drivers/net/ethernet/amazon/ena/ena_com.h        |  6 +++
 drivers/net/ethernet/amazon/ena/ena_netdev.c     | 66 ++++++++++++++++++++++--
 drivers/net/ethernet/amazon/ena/ena_netdev.h     |  5 ++
 drivers/net/ethernet/amazon/ena/ena_regs_defs.h  |  2 +
 6 files changed, 137 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
index 5b6509d..305dc19 100644
--- a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
@@ -70,6 +70,8 @@ enum ena_admin_aq_feature_id {
 
 	ENA_ADMIN_MAX_QUEUES_NUM		= 2,
 
+	ENA_ADMIN_HW_HINTS			= 3,
+
 	ENA_ADMIN_RSS_HASH_FUNCTION		= 10,
 
 	ENA_ADMIN_STATELESS_OFFLOAD_CONFIG	= 11,
@@ -749,6 +751,31 @@ struct ena_admin_feature_rss_ind_table {
 	struct ena_admin_rss_ind_table_entry inline_entry;
 };
 
+/* When hint value is 0, driver should use it's own predefined value */
+struct ena_admin_ena_hw_hints {
+	/* value in ms */
+	u16 mmio_read_timeout;
+
+	/* value in ms */
+	u16 driver_watchdog_timeout;
+
+	/* Per packet tx completion timeout. value in ms */
+	u16 missing_tx_completion_timeout;
+
+	u16 missed_tx_completion_count_threshold_to_reset;
+
+	/* value in ms */
+	u16 admin_completion_tx_timeout;
+
+	u16 netdev_wd_timeout;
+
+	u16 max_tx_sgl_size;
+
+	u16 max_rx_sgl_size;
+
+	u16 reserved[8];
+};
+
 struct ena_admin_get_feat_cmd {
 	struct ena_admin_aq_common_desc aq_common_descriptor;
 
@@ -782,6 +809,8 @@ struct ena_admin_get_feat_resp {
 		struct ena_admin_feature_rss_ind_table ind_table;
 
 		struct ena_admin_feature_intr_moder_desc intr_moderation;
+
+		struct ena_admin_ena_hw_hints hw_hints;
 	} u;
 };
 
@@ -857,6 +886,8 @@ enum ena_admin_aenq_notification_syndrom {
 	ENA_ADMIN_SUSPEND	= 0,
 
 	ENA_ADMIN_RESUME	= 1,
+
+	ENA_ADMIN_UPDATE_HINTS	= 2,
 };
 
 struct ena_admin_aenq_entry {
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 02752d5..65d53d5 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -511,7 +511,7 @@ static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_c
 	unsigned long flags, timeout;
 	int ret;
 
-	timeout = jiffies + ADMIN_CMD_TIMEOUT_US;
+	timeout = jiffies + usecs_to_jiffies(admin_queue->completion_timeout);
 
 	while (1) {
 		spin_lock_irqsave(&admin_queue->q_lock, flags);
@@ -561,7 +561,8 @@ static int ena_com_wait_and_process_admin_cq_interrupts(struct ena_comp_ctx *com
 	int ret;
 
 	wait_for_completion_timeout(&comp_ctx->wait_event,
-				    usecs_to_jiffies(ADMIN_CMD_TIMEOUT_US));
+				    usecs_to_jiffies(
+					    admin_queue->completion_timeout));
 
 	/* In case the command wasn't completed find out the root cause.
 	 * There might be 2 kinds of errors
@@ -601,12 +602,15 @@ static u32 ena_com_reg_bar_read32(struct ena_com_dev *ena_dev, u16 offset)
 	struct ena_com_mmio_read *mmio_read = &ena_dev->mmio_read;
 	volatile struct ena_admin_ena_mmio_req_read_less_resp *read_resp =
 		mmio_read->read_resp;
-	u32 mmio_read_reg, ret;
+	u32 mmio_read_reg, ret, i;
 	unsigned long flags;
-	int i;
+	u32 timeout = mmio_read->reg_read_to;
 
 	might_sleep();
 
+	if (timeout == 0)
+		timeout = ENA_REG_READ_TIMEOUT;
+
 	/* If readless is disabled, perform regular read */
 	if (!mmio_read->readless_supported)
 		return readl(ena_dev->reg_bar + offset);
@@ -627,14 +631,14 @@ static u32 ena_com_reg_bar_read32(struct ena_com_dev *ena_dev, u16 offset)
 
 	writel(mmio_read_reg, ena_dev->reg_bar + ENA_REGS_MMIO_REG_READ_OFF);
 
-	for (i = 0; i < ENA_REG_READ_TIMEOUT; i++) {
+	for (i = 0; i < timeout; i++) {
 		if (read_resp->req_id == mmio_read->seq_num)
 			break;
 
 		udelay(1);
 	}
 
-	if (unlikely(i == ENA_REG_READ_TIMEOUT)) {
+	if (unlikely(i == timeout)) {
 		pr_err("reading reg failed for timeout. expected: req id[%hu] offset[%hu] actual: req id[%hu] offset[%hu]\n",
 		       mmio_read->seq_num, offset, read_resp->req_id,
 		       read_resp->reg_off);
@@ -1730,6 +1734,20 @@ int ena_com_get_dev_attr_feat(struct ena_com_dev *ena_dev,
 	memcpy(&get_feat_ctx->offload, &get_resp.u.offload,
 	       sizeof(get_resp.u.offload));
 
+	/* Driver hints isn't mandatory admin command. So in case the
+	 * command isn't supported set driver hints to 0
+	 */
+	rc = ena_com_get_feature(ena_dev, &get_resp, ENA_ADMIN_HW_HINTS);
+
+	if (!rc)
+		memcpy(&get_feat_ctx->hw_hints, &get_resp.u.hw_hints,
+		       sizeof(get_resp.u.hw_hints));
+	else if (rc == -EOPNOTSUPP)
+		memset(&get_feat_ctx->hw_hints, 0x0,
+		       sizeof(get_feat_ctx->hw_hints));
+	else
+		return rc;
+
 	return 0;
 }
 
@@ -1855,6 +1873,14 @@ int ena_com_dev_reset(struct ena_com_dev *ena_dev)
 		return rc;
 	}
 
+	timeout = (cap & ENA_REGS_CAPS_ADMIN_CMD_TO_MASK) >>
+		ENA_REGS_CAPS_ADMIN_CMD_TO_SHIFT;
+	if (timeout)
+		/* the resolution of timeout reg is 100ms */
+		ena_dev->admin_queue.completion_timeout = timeout * 100000;
+	else
+		ena_dev->admin_queue.completion_timeout = ADMIN_CMD_TIMEOUT_US;
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index c9b33ee..630c09a 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -97,6 +97,8 @@
 #define ENA_INTR_MODER_LEVEL_STRIDE			2
 #define ENA_INTR_BYTE_COUNT_NOT_SUPPORTED		0xFFFFFF
 
+#define ENA_HW_HINTS_NO_TIMEOUT				0xFFFF
+
 enum ena_intr_moder_level {
 	ENA_INTR_MODER_LOWEST = 0,
 	ENA_INTR_MODER_LOW,
@@ -232,7 +234,9 @@ struct ena_com_stats_admin {
 struct ena_com_admin_queue {
 	void *q_dmadev;
 	spinlock_t q_lock; /* spinlock for the admin queue */
+
 	struct ena_comp_ctx *comp_ctx;
+	u32 completion_timeout;
 	u16 q_depth;
 	struct ena_com_admin_cq cq;
 	struct ena_com_admin_sq sq;
@@ -267,6 +271,7 @@ struct ena_com_aenq {
 struct ena_com_mmio_read {
 	struct ena_admin_ena_mmio_req_read_less_resp *read_resp;
 	dma_addr_t read_resp_dma_addr;
+	u32 reg_read_to; /* in us */
 	u16 seq_num;
 	bool readless_supported;
 	/* spin lock to ensure a single outstanding read */
@@ -336,6 +341,7 @@ struct ena_com_dev_get_features_ctx {
 	struct ena_admin_device_attr_feature_desc dev_attr;
 	struct ena_admin_feature_aenq_desc aenq;
 	struct ena_admin_feature_offload_desc offload;
+	struct ena_admin_ena_hw_hints hw_hints;
 };
 
 struct ena_com_create_io_ctx {
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 8d4f65e..1ee06e1 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2577,7 +2577,7 @@ static int check_missing_comp_in_queue(struct ena_adapter *adapter,
 		tx_buf = &tx_ring->tx_buffer_info[i];
 		last_jiffies = tx_buf->last_jiffies;
 		if (unlikely(last_jiffies &&
-			     time_is_before_jiffies(last_jiffies + TX_TIMEOUT))) {
+			     time_is_before_jiffies(last_jiffies + adapter->missing_tx_completion_to))) {
 			if (!tx_buf->print_once)
 				netif_notice(adapter, tx_err, adapter->netdev,
 					     "Found a Tx that wasn't completed on time, qid %d, index %d.\n",
@@ -2586,10 +2586,11 @@ static int check_missing_comp_in_queue(struct ena_adapter *adapter,
 			tx_buf->print_once = 1;
 			missed_tx++;
 
-			if (unlikely(missed_tx > MAX_NUM_OF_TIMEOUTED_PACKETS)) {
+			if (unlikely(missed_tx > adapter->missing_tx_completion_threshold)) {
 				netif_err(adapter, tx_err, adapter->netdev,
 					  "The number of lost tx completions is above the threshold (%d > %d). Reset the device\n",
-					  missed_tx, MAX_NUM_OF_TIMEOUTED_PACKETS);
+					  missed_tx,
+					  adapter->missing_tx_completion_threshold);
 				set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 				return -EIO;
 			}
@@ -2613,6 +2614,9 @@ static void check_for_missing_tx_completions(struct ena_adapter *adapter)
 	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
 		return;
 
+	if (adapter->missing_tx_completion_to == ENA_HW_HINTS_NO_TIMEOUT)
+		return;
+
 	budget = ENA_MONITORED_TX_QUEUES;
 
 	for (i = adapter->last_monitored_tx_qid; i < adapter->num_queues; i++) {
@@ -2690,8 +2694,11 @@ static void check_for_missing_keep_alive(struct ena_adapter *adapter)
 	if (!adapter->wd_state)
 		return;
 
-	keep_alive_expired = round_jiffies(adapter->last_keep_alive_jiffies
-					   + ENA_DEVICE_KALIVE_TIMEOUT);
+	if (adapter->keep_alive_timeout == ENA_HW_HINTS_NO_TIMEOUT)
+		return;
+
+	keep_alive_expired = round_jiffies(adapter->last_keep_alive_jiffies +
+					   adapter->keep_alive_timeout);
 	if (unlikely(time_is_before_jiffies(keep_alive_expired))) {
 		netif_err(adapter, drv, adapter->netdev,
 			  "Keep alive watchdog timeout.\n");
@@ -2714,6 +2721,44 @@ static void check_for_admin_com_state(struct ena_adapter *adapter)
 	}
 }
 
+static void ena_update_hints(struct ena_adapter *adapter,
+			     struct ena_admin_ena_hw_hints *hints)
+{
+	struct net_device *netdev = adapter->netdev;
+
+	if (hints->admin_completion_tx_timeout)
+		adapter->ena_dev->admin_queue.completion_timeout =
+			hints->admin_completion_tx_timeout * 1000;
+
+	if (hints->mmio_read_timeout)
+		/* convert to usec */
+		adapter->ena_dev->mmio_read.reg_read_to =
+			hints->mmio_read_timeout * 1000;
+
+	if (hints->missed_tx_completion_count_threshold_to_reset)
+		adapter->missing_tx_completion_threshold =
+			hints->missed_tx_completion_count_threshold_to_reset;
+
+	if (hints->missing_tx_completion_timeout) {
+		if (hints->missing_tx_completion_timeout == ENA_HW_HINTS_NO_TIMEOUT)
+			adapter->missing_tx_completion_to = ENA_HW_HINTS_NO_TIMEOUT;
+		else
+			adapter->missing_tx_completion_to =
+				msecs_to_jiffies(hints->missing_tx_completion_timeout);
+	}
+
+	if (hints->netdev_wd_timeout)
+		netdev->watchdog_timeo = msecs_to_jiffies(hints->netdev_wd_timeout);
+
+	if (hints->driver_watchdog_timeout) {
+		if (hints->driver_watchdog_timeout == ENA_HW_HINTS_NO_TIMEOUT)
+			adapter->keep_alive_timeout = ENA_HW_HINTS_NO_TIMEOUT;
+		else
+			adapter->keep_alive_timeout =
+				msecs_to_jiffies(hints->driver_watchdog_timeout);
+	}
+}
+
 static void ena_update_host_info(struct ena_admin_host_info *host_info,
 				 struct net_device *netdev)
 {
@@ -3136,6 +3181,11 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	INIT_WORK(&adapter->reset_task, ena_fw_reset_device);
 
 	adapter->last_keep_alive_jiffies = jiffies;
+	adapter->keep_alive_timeout = ENA_DEVICE_KALIVE_TIMEOUT;
+	adapter->missing_tx_completion_to = TX_TIMEOUT;
+	adapter->missing_tx_completion_threshold = MAX_NUM_OF_TIMEOUTED_PACKETS;
+
+	ena_update_hints(adapter, &get_feat_ctx.hw_hints);
 
 	setup_timer(&adapter->timer_service, ena_timer_service,
 		    (unsigned long)adapter);
@@ -3337,6 +3387,7 @@ static void ena_notification(void *adapter_data,
 			     struct ena_admin_aenq_entry *aenq_e)
 {
 	struct ena_adapter *adapter = (struct ena_adapter *)adapter_data;
+	struct ena_admin_ena_hw_hints *hints;
 
 	WARN(aenq_e->aenq_common_desc.group != ENA_ADMIN_NOTIFICATION,
 	     "Invalid group(%x) expected %x\n",
@@ -3354,6 +3405,11 @@ static void ena_notification(void *adapter_data,
 	case ENA_ADMIN_RESUME:
 		queue_work(ena_wq, &adapter->resume_io_task);
 		break;
+	case ENA_ADMIN_UPDATE_HINTS:
+		hints = (struct ena_admin_ena_hw_hints *)
+			(&aenq_e->inline_data_w4);
+		ena_update_hints(adapter, hints);
+		break;
 	default:
 		netif_err(adapter, drv, adapter->netdev,
 			  "Invalid aenq notification link state %d\n",
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 88b5e56..56637be 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -280,6 +280,8 @@ struct ena_adapter {
 
 	int msix_vecs;
 
+	u32 missing_tx_completion_threshold;
+
 	u32 tx_usecs, rx_usecs; /* interrupt moderation */
 	u32 tx_frames, rx_frames; /* interrupt moderation */
 
@@ -293,6 +295,9 @@ struct ena_adapter {
 
 	u8 mac_addr[ETH_ALEN];
 
+	unsigned long keep_alive_timeout;
+	unsigned long missing_tx_completion_to;
+
 	char name[ENA_NAME_MAX_LEN];
 
 	unsigned long flags;
diff --git a/drivers/net/ethernet/amazon/ena/ena_regs_defs.h b/drivers/net/ethernet/amazon/ena/ena_regs_defs.h
index 26097a2..c3891c5 100644
--- a/drivers/net/ethernet/amazon/ena/ena_regs_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_regs_defs.h
@@ -78,6 +78,8 @@
 #define ENA_REGS_CAPS_RESET_TIMEOUT_MASK		0x3e
 #define ENA_REGS_CAPS_DMA_ADDR_WIDTH_SHIFT		8
 #define ENA_REGS_CAPS_DMA_ADDR_WIDTH_MASK		0xff00
+#define ENA_REGS_CAPS_ADMIN_CMD_TO_SHIFT		16
+#define ENA_REGS_CAPS_ADMIN_CMD_TO_MASK		0xf0000
 
 /* aq_caps register */
 #define ENA_REGS_AQ_CAPS_AQ_DEPTH_MASK		0xffff
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 03/13] net: ena: change sizeof() argument to be the type pointer
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (11 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 02/13] net: ena: add hardware hints capability to the driver netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 04/13] net: ena: add reset reason for each device FLR netanel
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

Instead of using:
memset(ptr, 0x0, sizeof(struct ...))
use:
memset(ptr, 0x0, sizeor(*ptr))

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 65d53d5..2721c70 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -329,7 +329,7 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,
 	size_t size;
 	int dev_node = 0;
 
-	memset(&io_sq->desc_addr, 0x0, sizeof(struct ena_com_io_desc_addr));
+	memset(&io_sq->desc_addr, 0x0, sizeof(io_sq->desc_addr));
 
 	io_sq->desc_entry_size =
 		(io_sq->direction == ENA_COM_IO_QUEUE_DIRECTION_TX) ?
@@ -383,7 +383,7 @@ static int ena_com_init_io_cq(struct ena_com_dev *ena_dev,
 	size_t size;
 	int prev_node = 0;
 
-	memset(&io_cq->cdesc_addr, 0x0, sizeof(struct ena_com_io_desc_addr));
+	memset(&io_cq->cdesc_addr, 0x0, sizeof(io_cq->cdesc_addr));
 
 	/* Use the basic completion descriptor for Rx */
 	io_cq->cdesc_entry_size_in_bytes =
@@ -685,7 +685,7 @@ static int ena_com_destroy_io_sq(struct ena_com_dev *ena_dev,
 	u8 direction;
 	int ret;
 
-	memset(&destroy_cmd, 0x0, sizeof(struct ena_admin_aq_destroy_sq_cmd));
+	memset(&destroy_cmd, 0x0, sizeof(destroy_cmd));
 
 	if (io_sq->direction == ENA_COM_IO_QUEUE_DIRECTION_TX)
 		direction = ENA_ADMIN_SQ_DIRECTION_TX;
@@ -967,7 +967,7 @@ static int ena_com_create_io_sq(struct ena_com_dev *ena_dev,
 	u8 direction;
 	int ret;
 
-	memset(&create_cmd, 0x0, sizeof(struct ena_admin_aq_create_sq_cmd));
+	memset(&create_cmd, 0x0, sizeof(create_cmd));
 
 	create_cmd.aq_common_descriptor.opcode = ENA_ADMIN_CREATE_SQ;
 
@@ -1159,7 +1159,7 @@ int ena_com_create_io_cq(struct ena_com_dev *ena_dev,
 	struct ena_admin_acq_create_cq_resp_desc cmd_completion;
 	int ret;
 
-	memset(&create_cmd, 0x0, sizeof(struct ena_admin_aq_create_cq_cmd));
+	memset(&create_cmd, 0x0, sizeof(create_cmd));
 
 	create_cmd.aq_common_descriptor.opcode = ENA_ADMIN_CREATE_CQ;
 
@@ -1267,7 +1267,7 @@ int ena_com_destroy_io_cq(struct ena_com_dev *ena_dev,
 	struct ena_admin_acq_destroy_cq_resp_desc destroy_resp;
 	int ret;
 
-	memset(&destroy_cmd, 0x0, sizeof(struct ena_admin_aq_destroy_sq_cmd));
+	memset(&destroy_cmd, 0x0, sizeof(destroy_cmd));
 
 	destroy_cmd.cq_idx = io_cq->idx;
 	destroy_cmd.aq_common_descriptor.opcode = ENA_ADMIN_DESTROY_CQ;
@@ -1623,8 +1623,8 @@ int ena_com_create_io_queue(struct ena_com_dev *ena_dev,
 	io_sq = &ena_dev->io_sq_queues[ctx->qid];
 	io_cq = &ena_dev->io_cq_queues[ctx->qid];
 
-	memset(io_sq, 0x0, sizeof(struct ena_com_io_sq));
-	memset(io_cq, 0x0, sizeof(struct ena_com_io_cq));
+	memset(io_sq, 0x0, sizeof(*io_sq));
+	memset(io_cq, 0x0, sizeof(*io_cq));
 
 	/* Init CQ */
 	io_cq->q_depth = ctx->queue_size;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 04/13] net: ena: add reset reason for each device FLR
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (12 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 03/13] net: ena: change sizeof() argument to be the type pointer netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 05/13] net: ena: add support for out of order rx buffers refill netanel
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

For each device reset, log to the device what is the cause
the reset occur.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c       |  5 +++-
 drivers/net/ethernet/amazon/ena/ena_com.h       |  4 +++-
 drivers/net/ethernet/amazon/ena/ena_netdev.c    | 17 +++++++++----
 drivers/net/ethernet/amazon/ena/ena_netdev.h    |  2 ++
 drivers/net/ethernet/amazon/ena/ena_regs_defs.h | 32 +++++++++++++++++++++++++
 5 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 2721c70..f6e1d30 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -1825,7 +1825,8 @@ void ena_com_aenq_intr_handler(struct ena_com_dev *dev, void *data)
 	writel((u32)aenq->head, dev->reg_bar + ENA_REGS_AENQ_HEAD_DB_OFF);
 }
 
-int ena_com_dev_reset(struct ena_com_dev *ena_dev)
+int ena_com_dev_reset(struct ena_com_dev *ena_dev,
+		      enum ena_regs_reset_reason_types reset_reason)
 {
 	u32 stat, timeout, cap, reset_val;
 	int rc;
@@ -1853,6 +1854,8 @@ int ena_com_dev_reset(struct ena_com_dev *ena_dev)
 
 	/* start reset */
 	reset_val = ENA_REGS_DEV_CTL_DEV_RESET_MASK;
+	reset_val |= (reset_reason << ENA_REGS_DEV_CTL_RESET_REASON_SHIFT) &
+		     ENA_REGS_DEV_CTL_RESET_REASON_MASK;
 	writel(reset_val, ena_dev->reg_bar + ENA_REGS_DEV_CTL_OFF);
 
 	/* Write again the MMIO read request address */
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index 630c09a..7b784f8 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -420,10 +420,12 @@ void ena_com_admin_destroy(struct ena_com_dev *ena_dev);
 
 /* ena_com_dev_reset - Perform device FLR to the device.
  * @ena_dev: ENA communication layer struct
+ * @reset_reason: Specify what is the trigger for the reset in case of an error.
  *
  * @return - 0 on success, negative value on failure.
  */
-int ena_com_dev_reset(struct ena_com_dev *ena_dev);
+int ena_com_dev_reset(struct ena_com_dev *ena_dev,
+		      enum ena_regs_reset_reason_types reset_reason);
 
 /* ena_com_create_io_queue - Create io queue.
  * @ena_dev: ENA communication layer struct
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 1ee06e1..0d35a4cc 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -87,6 +87,7 @@ static void ena_tx_timeout(struct net_device *dev)
 	if (test_and_set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
 		return;
 
+	adapter->reset_reason = ENA_REGS_RESET_OS_NETDEV_WD;
 	u64_stats_update_begin(&adapter->syncp);
 	adapter->dev_stats.tx_timeout++;
 	u64_stats_update_end(&adapter->syncp);
@@ -670,6 +671,7 @@ static int validate_tx_req_id(struct ena_ring *tx_ring, u16 req_id)
 	u64_stats_update_end(&tx_ring->syncp);
 
 	/* Trigger device reset */
+	tx_ring->adapter->reset_reason = ENA_REGS_RESET_INV_TX_REQ_ID;
 	set_bit(ENA_FLAG_TRIGGER_RESET, &tx_ring->adapter->flags);
 	return -EFAULT;
 }
@@ -1055,6 +1057,7 @@ static int ena_clean_rx_irq(struct ena_ring *rx_ring, struct napi_struct *napi,
 	u64_stats_update_end(&rx_ring->syncp);
 
 	/* Too many desc from the device. Trigger reset */
+	adapter->reset_reason = ENA_REGS_RESET_TOO_MANY_RX_DESCS;
 	set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 
 	return 0;
@@ -1720,7 +1723,7 @@ static void ena_down(struct ena_adapter *adapter)
 	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags)) {
 		int rc;
 
-		rc = ena_com_dev_reset(adapter->ena_dev);
+		rc = ena_com_dev_reset(adapter->ena_dev, adapter->reset_reason);
 		if (rc)
 			dev_err(&adapter->pdev->dev, "Device reset failed\n");
 	}
@@ -2353,7 +2356,7 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev,
 	readless_supported = !(pdev->revision & ENA_MMIO_DISABLE_REG_READ);
 	ena_com_set_mmio_read_mode(ena_dev, readless_supported);
 
-	rc = ena_com_dev_reset(ena_dev);
+	rc = ena_com_dev_reset(ena_dev, ENA_REGS_RESET_NORMAL);
 	if (rc) {
 		dev_err(dev, "Can not reset device\n");
 		goto err_mmio_read_less;
@@ -2512,6 +2515,7 @@ static void ena_fw_reset_device(struct work_struct *work)
 
 	ena_com_mmio_reg_read_request_destroy(ena_dev);
 
+	adapter->reset_reason = ENA_REGS_RESET_NORMAL;
 	clear_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 
 	/* Finish with the destroy part. Start the init part */
@@ -2591,6 +2595,8 @@ static int check_missing_comp_in_queue(struct ena_adapter *adapter,
 					  "The number of lost tx completions is above the threshold (%d > %d). Reset the device\n",
 					  missed_tx,
 					  adapter->missing_tx_completion_threshold);
+				adapter->reset_reason =
+					ENA_REGS_RESET_MISS_TX_CMPL;
 				set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 				return -EIO;
 			}
@@ -2705,6 +2711,7 @@ static void check_for_missing_keep_alive(struct ena_adapter *adapter)
 		u64_stats_update_begin(&adapter->syncp);
 		adapter->dev_stats.wd_expired++;
 		u64_stats_update_end(&adapter->syncp);
+		adapter->reset_reason = ENA_REGS_RESET_KEEP_ALIVE_TO;
 		set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 	}
 }
@@ -2717,6 +2724,7 @@ static void check_for_admin_com_state(struct ena_adapter *adapter)
 		u64_stats_update_begin(&adapter->syncp);
 		adapter->dev_stats.admin_q_pause++;
 		u64_stats_update_end(&adapter->syncp);
+		adapter->reset_reason = ENA_REGS_RESET_ADMIN_TO;
 		set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 	}
 }
@@ -3121,6 +3129,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	ena_set_conf_feat_params(adapter, &get_feat_ctx);
 
 	adapter->msg_enable = netif_msg_init(debug, DEFAULT_MSG_ENABLE);
+	adapter->reset_reason = ENA_REGS_RESET_NORMAL;
 
 	adapter->tx_ring_size = queue_size;
 	adapter->rx_ring_size = queue_size;
@@ -3205,7 +3214,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	ena_com_delete_debug_area(ena_dev);
 	ena_com_rss_destroy(ena_dev);
 err_free_msix:
-	ena_com_dev_reset(ena_dev);
+	ena_com_dev_reset(ena_dev, ENA_REGS_RESET_INIT_ERR);
 	ena_free_mgmnt_irq(adapter);
 	pci_free_irq_vectors(adapter->pdev);
 err_worker_destroy:
@@ -3288,7 +3297,7 @@ static void ena_remove(struct pci_dev *pdev)
 
 	/* Reset the device only if the device is running. */
 	if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
-		ena_com_dev_reset(ena_dev);
+		ena_com_dev_reset(ena_dev, adapter->reset_reason);
 
 	ena_free_mgmnt_irq(adapter);
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 56637be..bbae971 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -327,6 +327,8 @@ struct ena_adapter {
 
 	/* last queue index that was checked for uncompleted tx packets */
 	u32 last_monitored_tx_qid;
+
+	enum ena_regs_reset_reason_types reset_reason;
 };
 
 void ena_set_ethtool_ops(struct net_device *netdev);
diff --git a/drivers/net/ethernet/amazon/ena/ena_regs_defs.h b/drivers/net/ethernet/amazon/ena/ena_regs_defs.h
index c3891c5..9aec43c 100644
--- a/drivers/net/ethernet/amazon/ena/ena_regs_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_regs_defs.h
@@ -32,6 +32,36 @@
 #ifndef _ENA_REGS_H_
 #define _ENA_REGS_H_
 
+enum ena_regs_reset_reason_types {
+	ENA_REGS_RESET_NORMAL			= 0,
+
+	ENA_REGS_RESET_KEEP_ALIVE_TO		= 1,
+
+	ENA_REGS_RESET_ADMIN_TO			= 2,
+
+	ENA_REGS_RESET_MISS_TX_CMPL		= 3,
+
+	ENA_REGS_RESET_INV_RX_REQ_ID		= 4,
+
+	ENA_REGS_RESET_INV_TX_REQ_ID		= 5,
+
+	ENA_REGS_RESET_TOO_MANY_RX_DESCS	= 6,
+
+	ENA_REGS_RESET_INIT_ERR			= 7,
+
+	ENA_REGS_RESET_DRIVER_INVALID_STATE	= 8,
+
+	ENA_REGS_RESET_OS_TRIGGER		= 9,
+
+	ENA_REGS_RESET_OS_NETDEV_WD		= 10,
+
+	ENA_REGS_RESET_SHUTDOWN			= 11,
+
+	ENA_REGS_RESET_USER_TRIGGER		= 12,
+
+	ENA_REGS_RESET_GENERIC			= 13,
+};
+
 /* ena_registers offsets */
 #define ENA_REGS_VERSION_OFF		0x0
 #define ENA_REGS_CONTROLLER_VERSION_OFF		0x4
@@ -104,6 +134,8 @@
 #define ENA_REGS_DEV_CTL_QUIESCENT_MASK		0x4
 #define ENA_REGS_DEV_CTL_IO_RESUME_SHIFT		3
 #define ENA_REGS_DEV_CTL_IO_RESUME_MASK		0x8
+#define ENA_REGS_DEV_CTL_RESET_REASON_SHIFT		28
+#define ENA_REGS_DEV_CTL_RESET_REASON_MASK		0xf0000000
 
 /* dev_sts register */
 #define ENA_REGS_DEV_STS_READY_MASK		0x1
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 05/13] net: ena: add support for out of order rx buffers refill
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (13 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 04/13] net: ena: add reset reason for each device FLR netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 06/13] net: ena: allow the driver to work with small number of msix vectors netanel
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

ENA driver post Rx buffers through the Rx submission queue
for the ENA device to fill them with receive packets.
Each Rx buffer is marked with req_id in the Rx descriptor.

Newer ENA devices could consume the posted Rx buffer in out of order,
and as result the corresponding Rx completion queue will have Rx
completion descriptors with non contiguous req_id(s)

In this change the driver holds two rings.
The first ring (called free_rx_ids) is a mapping ring.
It holds all the unused request ids.
The values in this ring are from 0 to ring_size -1.

When the driver wants to allocate a new Rx buffer it uses the head of
free_rx_ids and uses it's value as the index for rx_buffer_info ring.
The req_id is also written to the Rx descriptor

Upon Rx completion,
The driver took the req_id from the completion descriptor and uses it
as index in rx_buffer_info.
The req_id is then return to the free_rx_ids ring.

This patch also adds statistics to inform when the driver receive out
of range or unused req_id.

Note:
free_rx_ids is only accessible from the napi handler, so no locking is
required

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_eth_com.c |  5 ++
 drivers/net/ethernet/amazon/ena/ena_ethtool.c |  1 +
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 83 ++++++++++++++++++++++-----
 drivers/net/ethernet/amazon/ena/ena_netdev.h  | 11 +++-
 4 files changed, 83 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_eth_com.c b/drivers/net/ethernet/amazon/ena/ena_eth_com.c
index f999305..b11e573 100644
--- a/drivers/net/ethernet/amazon/ena/ena_eth_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_eth_com.c
@@ -493,6 +493,11 @@ int ena_com_tx_comp_req_id_get(struct ena_com_io_cq *io_cq, u16 *req_id)
 	if (cdesc_phase != expected_phase)
 		return -EAGAIN;
 
+	if (unlikely(cdesc->req_id >= io_cq->q_depth)) {
+		pr_err("Invalid req id %d\n", cdesc->req_id);
+		return -EINVAL;
+	}
+
 	ena_com_cq_inc_head(io_cq);
 
 	*req_id = READ_ONCE(cdesc->req_id);
diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index d51a67f..b1212de 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -93,6 +93,7 @@ static const struct ena_stats ena_stats_rx_strings[] = {
 	ENA_STAT_RX_ENTRY(dma_mapping_err),
 	ENA_STAT_RX_ENTRY(bad_desc_num),
 	ENA_STAT_RX_ENTRY(rx_copybreak_pkt),
+	ENA_STAT_RX_ENTRY(bad_req_id),
 	ENA_STAT_RX_ENTRY(empty_rx_ring),
 };
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 0d35a4cc..fcbcd18 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -304,6 +304,24 @@ static void ena_free_all_io_tx_resources(struct ena_adapter *adapter)
 		ena_free_tx_resources(adapter, i);
 }
 
+static inline int validate_rx_req_id(struct ena_ring *rx_ring, u16 req_id)
+{
+	if (likely(req_id < rx_ring->ring_size))
+		return 0;
+
+	netif_err(rx_ring->adapter, rx_err, rx_ring->netdev,
+		  "Invalid rx req_id: %hu\n", req_id);
+
+	u64_stats_update_begin(&rx_ring->syncp);
+	rx_ring->rx_stats.bad_req_id++;
+	u64_stats_update_end(&rx_ring->syncp);
+
+	/* Trigger device reset */
+	rx_ring->adapter->reset_reason = ENA_REGS_RESET_INV_RX_REQ_ID;
+	set_bit(ENA_FLAG_TRIGGER_RESET, &rx_ring->adapter->flags);
+	return -EFAULT;
+}
+
 /* ena_setup_rx_resources - allocate I/O Rx resources (Descriptors)
  * @adapter: network interface device structure
  * @qid: queue index
@@ -315,7 +333,7 @@ static int ena_setup_rx_resources(struct ena_adapter *adapter,
 {
 	struct ena_ring *rx_ring = &adapter->rx_ring[qid];
 	struct ena_irq *ena_irq = &adapter->irq_tbl[ENA_IO_IRQ_IDX(qid)];
-	int size, node;
+	int size, node, i;
 
 	if (rx_ring->rx_buffer_info) {
 		netif_err(adapter, ifup, adapter->netdev,
@@ -336,6 +354,20 @@ static int ena_setup_rx_resources(struct ena_adapter *adapter,
 			return -ENOMEM;
 	}
 
+	size = sizeof(u16) * rx_ring->ring_size;
+	rx_ring->free_rx_ids = vzalloc_node(size, node);
+	if (!rx_ring->free_rx_ids) {
+		rx_ring->free_rx_ids = vzalloc(size);
+		if (!rx_ring->free_rx_ids) {
+			vfree(rx_ring->rx_buffer_info);
+			return -ENOMEM;
+		}
+	}
+
+	/* Req id ring for receiving RX pkts out of order */
+	for (i = 0; i < rx_ring->ring_size; i++)
+		rx_ring->free_rx_ids[i] = i;
+
 	/* Reset rx statistics */
 	memset(&rx_ring->rx_stats, 0x0, sizeof(rx_ring->rx_stats));
 
@@ -359,6 +391,9 @@ static void ena_free_rx_resources(struct ena_adapter *adapter,
 
 	vfree(rx_ring->rx_buffer_info);
 	rx_ring->rx_buffer_info = NULL;
+
+	vfree(rx_ring->free_rx_ids);
+	rx_ring->free_rx_ids = NULL;
 }
 
 /* ena_setup_all_rx_resources - allocate I/O Rx queues resources for all queues
@@ -464,15 +499,22 @@ static void ena_free_rx_page(struct ena_ring *rx_ring,
 
 static int ena_refill_rx_bufs(struct ena_ring *rx_ring, u32 num)
 {
-	u16 next_to_use;
+	u16 next_to_use, req_id;
 	u32 i;
 	int rc;
 
 	next_to_use = rx_ring->next_to_use;
 
 	for (i = 0; i < num; i++) {
-		struct ena_rx_buffer *rx_info =
-			&rx_ring->rx_buffer_info[next_to_use];
+		struct ena_rx_buffer *rx_info;
+
+		req_id = rx_ring->free_rx_ids[next_to_use];
+		rc = validate_rx_req_id(rx_ring, req_id);
+		if (unlikely(rc < 0))
+			break;
+
+		rx_info = &rx_ring->rx_buffer_info[req_id];
+
 
 		rc = ena_alloc_rx_page(rx_ring, rx_info,
 				       __GFP_COLD | GFP_ATOMIC | __GFP_COMP);
@@ -484,7 +526,7 @@ static int ena_refill_rx_bufs(struct ena_ring *rx_ring, u32 num)
 		}
 		rc = ena_com_add_single_rx_desc(rx_ring->ena_com_io_sq,
 						&rx_info->ena_buf,
-						next_to_use);
+						req_id);
 		if (unlikely(rc)) {
 			netif_warn(rx_ring->adapter, rx_status, rx_ring->netdev,
 				   "failed to add buffer for rx queue %d\n",
@@ -789,13 +831,14 @@ static struct sk_buff *ena_rx_skb(struct ena_ring *rx_ring,
 				  u16 *next_to_clean)
 {
 	struct sk_buff *skb;
-	struct ena_rx_buffer *rx_info =
-		&rx_ring->rx_buffer_info[*next_to_clean];
-	u32 len;
-	u32 buf = 0;
+	struct ena_rx_buffer *rx_info;
+	u16 len, req_id, buf = 0;
 	void *va;
 
-	len = ena_bufs[0].len;
+	len = ena_bufs[buf].len;
+	req_id = ena_bufs[buf].req_id;
+	rx_info = &rx_ring->rx_buffer_info[req_id];
+
 	if (unlikely(!rx_info->page)) {
 		netif_err(rx_ring->adapter, rx_err, rx_ring->netdev,
 			  "Page is NULL\n");
@@ -867,13 +910,18 @@ static struct sk_buff *ena_rx_skb(struct ena_ring *rx_ring,
 			  skb->len, skb->data_len);
 
 		rx_info->page = NULL;
+
+		rx_ring->free_rx_ids[*next_to_clean] = req_id;
 		*next_to_clean =
 			ENA_RX_RING_IDX_NEXT(*next_to_clean,
 					     rx_ring->ring_size);
 		if (likely(--descs == 0))
 			break;
-		rx_info = &rx_ring->rx_buffer_info[*next_to_clean];
-		len = ena_bufs[++buf].len;
+
+		buf++;
+		len = ena_bufs[buf].len;
+		req_id = ena_bufs[buf].req_id;
+		rx_info = &rx_ring->rx_buffer_info[req_id];
 	} while (1);
 
 	return skb;
@@ -974,6 +1022,7 @@ static int ena_clean_rx_irq(struct ena_ring *rx_ring, struct napi_struct *napi,
 	int rc = 0;
 	int total_len = 0;
 	int rx_copybreak_pkt = 0;
+	int i;
 
 	netif_dbg(rx_ring->adapter, rx_status, rx_ring->netdev,
 		  "%s qid %d\n", __func__, rx_ring->qid);
@@ -1003,9 +1052,13 @@ static int ena_clean_rx_irq(struct ena_ring *rx_ring, struct napi_struct *napi,
 
 		/* exit if we failed to retrieve a buffer */
 		if (unlikely(!skb)) {
-			next_to_clean = ENA_RX_RING_IDX_ADD(next_to_clean,
-							    ena_rx_ctx.descs,
-							    rx_ring->ring_size);
+			for (i = 0; i < ena_rx_ctx.descs; i++) {
+				rx_ring->free_tx_ids[next_to_clean] =
+					rx_ring->ena_bufs[i].req_id;
+				next_to_clean =
+					ENA_RX_RING_IDX_NEXT(next_to_clean,
+							     rx_ring->ring_size);
+			}
 			break;
 		}
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index bbae971..c6b17ce 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -194,12 +194,19 @@ struct ena_stats_rx {
 	u64 dma_mapping_err;
 	u64 bad_desc_num;
 	u64 rx_copybreak_pkt;
+	u64 bad_req_id;
 	u64 empty_rx_ring;
 };
 
 struct ena_ring {
-	/* Holds the empty requests for TX out of order completions */
-	u16 *free_tx_ids;
+	union {
+		/* Holds the empty requests for TX/RX
+		 * out of order completions
+		 */
+		u16 *free_tx_ids;
+		u16 *free_rx_ids;
+	};
+
 	union {
 		struct ena_tx_buffer *tx_buffer_info;
 		struct ena_rx_buffer *rx_buffer_info;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 06/13] net: ena: allow the driver to work with small number of msix vectors
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (14 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 05/13] net: ena: add support for out of order rx buffers refill netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 21:46 ` [PATCH net-next 07/13] net: ena: use napi_schedule_irqoff when possible netanel
  2017-06-08 23:17 ` [PATCH net-next 0/8] Bug fixes in ena ethernet driver David Miller
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

Current driver tries to allocate msix vectors as the number of the
negotiated io queues. (with another msix vector for management).
If pci_alloc_irq_vectors() fails, the driver aborts the probe
and the ENA network device is never brought up.

With this patch, the driver's logic will reduce the number of IO
queues to the number of allocated msix vectors (minus one for management)
instead of failing probe().

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 65 ++++++++++++++++++++--------
 drivers/net/ethernet/amazon/ena/ena_netdev.h |  6 ++-
 2 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index fcbcd18..04aade8 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1269,9 +1269,20 @@ static irqreturn_t ena_intr_msix_io(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+/* Reserve a single MSI-X vector for management (admin + aenq).
+ * plus reserve one vector for each potential io queue.
+ * the number of potential io queues is the minimum of what the device
+ * supports and the number of vCPUs.
+ */
 static int ena_enable_msix(struct ena_adapter *adapter, int num_queues)
 {
-	int msix_vecs, rc;
+	int msix_vecs, irq_cnt;
+
+	if (test_bit(ENA_FLAG_MSIX_ENABLED, &adapter->flags)) {
+		netif_err(adapter, probe, adapter->netdev,
+			  "Error, MSI-X is already enabled\n");
+		return -EPERM;
+	}
 
 	/* Reserved the max msix vectors we might need */
 	msix_vecs = ENA_MAX_MSIX_VEC(num_queues);
@@ -1279,25 +1290,28 @@ static int ena_enable_msix(struct ena_adapter *adapter, int num_queues)
 	netif_dbg(adapter, probe, adapter->netdev,
 		  "trying to enable MSI-X, vectors %d\n", msix_vecs);
 
-	rc = pci_alloc_irq_vectors(adapter->pdev, msix_vecs, msix_vecs,
-			PCI_IRQ_MSIX);
-	if (rc < 0) {
+	irq_cnt = pci_alloc_irq_vectors(adapter->pdev, ENA_MIN_MSIX_VEC,
+					msix_vecs, PCI_IRQ_MSIX);
+
+	if (irq_cnt < 0) {
 		netif_err(adapter, probe, adapter->netdev,
-			  "Failed to enable MSI-X, vectors %d rc %d\n",
-			  msix_vecs, rc);
+			  "Failed to enable MSI-X. irq_cnt %d\n", irq_cnt);
 		return -ENOSPC;
 	}
 
-	netif_dbg(adapter, probe, adapter->netdev, "enable MSI-X, vectors %d\n",
-		  msix_vecs);
-
-	if (msix_vecs >= 1) {
-		if (ena_init_rx_cpu_rmap(adapter))
-			netif_warn(adapter, probe, adapter->netdev,
-				   "Failed to map IRQs to CPUs\n");
+	if (irq_cnt != msix_vecs) {
+		netif_notice(adapter, probe, adapter->netdev,
+			     "enable only %d MSI-X (out of %d), reduce the number of queues\n",
+			     irq_cnt, msix_vecs);
+		adapter->num_queues = irq_cnt - ENA_ADMIN_MSIX_VEC;
 	}
 
-	adapter->msix_vecs = msix_vecs;
+	if (ena_init_rx_cpu_rmap(adapter))
+		netif_warn(adapter, probe, adapter->netdev,
+			   "Failed to map IRQs to CPUs\n");
+
+	adapter->msix_vecs = irq_cnt;
+	set_bit(ENA_FLAG_MSIX_ENABLED, &adapter->flags);
 
 	return 0;
 }
@@ -1374,6 +1388,12 @@ static int ena_request_io_irq(struct ena_adapter *adapter)
 	struct ena_irq *irq;
 	int rc = 0, i, k;
 
+	if (!test_bit(ENA_FLAG_MSIX_ENABLED, &adapter->flags)) {
+		netif_err(adapter, ifup, adapter->netdev,
+			  "Failed to request I/O IRQ: MSI-X is not enabled\n");
+		return -EINVAL;
+	}
+
 	for (i = ENA_IO_IRQ_FIRST_IDX; i < adapter->msix_vecs; i++) {
 		irq = &adapter->irq_tbl[i];
 		rc = request_irq(irq->vector, irq->handler, flags, irq->name,
@@ -1432,6 +1452,12 @@ static void ena_free_io_irq(struct ena_adapter *adapter)
 	}
 }
 
+static void ena_disable_msix(struct ena_adapter *adapter)
+{
+	if (test_and_clear_bit(ENA_FLAG_MSIX_ENABLED, &adapter->flags))
+		pci_free_irq_vectors(adapter->pdev);
+}
+
 static void ena_disable_io_intr_sync(struct ena_adapter *adapter)
 {
 	int i;
@@ -2520,7 +2546,8 @@ static int ena_enable_msix_and_set_admin_interrupts(struct ena_adapter *adapter,
 	return 0;
 
 err_disable_msix:
-	pci_free_irq_vectors(adapter->pdev);
+	ena_disable_msix(adapter);
+
 	return rc;
 }
 
@@ -2558,7 +2585,7 @@ static void ena_fw_reset_device(struct work_struct *work)
 
 	ena_free_mgmnt_irq(adapter);
 
-	pci_free_irq_vectors(adapter->pdev);
+	ena_disable_msix(adapter);
 
 	ena_com_abort_admin_commands(ena_dev);
 
@@ -2610,7 +2637,7 @@ static void ena_fw_reset_device(struct work_struct *work)
 	return;
 err_disable_msix:
 	ena_free_mgmnt_irq(adapter);
-	pci_free_irq_vectors(adapter->pdev);
+	ena_disable_msix(adapter);
 err_device_destroy:
 	ena_com_admin_destroy(ena_dev);
 err:
@@ -3269,7 +3296,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 err_free_msix:
 	ena_com_dev_reset(ena_dev, ENA_REGS_RESET_INIT_ERR);
 	ena_free_mgmnt_irq(adapter);
-	pci_free_irq_vectors(adapter->pdev);
+	ena_disable_msix(adapter);
 err_worker_destroy:
 	ena_com_destroy_interrupt_moderation(ena_dev);
 	del_timer(&adapter->timer_service);
@@ -3354,7 +3381,7 @@ static void ena_remove(struct pci_dev *pdev)
 
 	ena_free_mgmnt_irq(adapter);
 
-	pci_free_irq_vectors(adapter->pdev);
+	ena_disable_msix(adapter);
 
 	free_netdev(netdev);
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index c6b17ce..4518a9d 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -58,7 +58,10 @@
 #define DEVICE_NAME	"Elastic Network Adapter (ENA)"
 
 /* 1 for AENQ + ADMIN */
-#define ENA_MAX_MSIX_VEC(io_queues)	(1 + (io_queues))
+#define ENA_ADMIN_MSIX_VEC		1
+#define ENA_MAX_MSIX_VEC(io_queues)	(ENA_ADMIN_MSIX_VEC + (io_queues))
+
+#define ENA_MIN_MSIX_VEC		2
 
 #define ENA_REG_BAR			0
 #define ENA_MEM_BAR			2
@@ -267,6 +270,7 @@ enum ena_flags_t {
 	ENA_FLAG_DEVICE_RUNNING,
 	ENA_FLAG_DEV_UP,
 	ENA_FLAG_LINK_UP,
+	ENA_FLAG_MSIX_ENABLED,
 	ENA_FLAG_TRIGGER_RESET
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next 07/13] net: ena: use napi_schedule_irqoff when possible
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (15 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 06/13] net: ena: allow the driver to work with small number of msix vectors netanel
@ 2017-06-08 21:46 ` netanel
  2017-06-08 23:17 ` [PATCH net-next 0/8] Bug fixes in ena ethernet driver David Miller
  17 siblings, 0 replies; 27+ messages in thread
From: netanel @ 2017-06-08 21:46 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 04aade8..424b4d7 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1264,7 +1264,7 @@ static irqreturn_t ena_intr_msix_io(int irq, void *data)
 {
 	struct ena_napi *ena_napi = data;
 
-	napi_schedule(&ena_napi->napi);
+	napi_schedule_irqoff(&ena_napi->napi);
 
 	return IRQ_HANDLED;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver
  2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
                   ` (16 preceding siblings ...)
  2017-06-08 21:46 ` [PATCH net-next 07/13] net: ena: use napi_schedule_irqoff when possible netanel
@ 2017-06-08 23:17 ` David Miller
  2017-06-09  7:13   ` Belgazal, Netanel
  17 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2017-06-08 23:17 UTC (permalink / raw)
  To: netanel; +Cc: netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea, evgenys


Two parallel patch series to the same driver and targetting the
same GIT tree is extremely undesirable, please don't do this.

Submit one series, and once applied submit the second series.

I'm deleting all of your patches from my queue, please resubmit
things properly.

Thank you.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next 0/8] Bug fixes in ena ethernet driver
@ 2017-06-09  6:55 netanel
  2017-06-09 19:33 ` David Miller
  0 siblings, 1 reply; 27+ messages in thread
From: netanel @ 2017-06-09  6:55 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

This patchset contains fixes for the bugs that were discovered so far.

Netanel Belgazal (8):
  net: ena: fix rare uncompleted admin command false alarm
  net: ena: fix bug that might cause hang after consecutive open/close
    interface.
  net: ena: add missing return when ena_com_get_io_handlers() fails
  net: ena: fix race condition between submit and completion admin
    command
  net: ena: add missing unmap bars on device removal
  net: ena: fix theoretical Rx hang on low memory systems
  net: ena: disable admin msix while working in polling mode
  net: ena: bug fix in lost tx packets detection mechanism

 drivers/net/ethernet/amazon/ena/ena_com.c     |  35 +++--
 drivers/net/ethernet/amazon/ena/ena_ethtool.c |   2 +-
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 179 +++++++++++++++++++-------
 drivers/net/ethernet/amazon/ena/ena_netdev.h  |  16 ++-
 4 files changed, 168 insertions(+), 64 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver
  2017-06-08 23:17 ` [PATCH net-next 0/8] Bug fixes in ena ethernet driver David Miller
@ 2017-06-09  7:13   ` Belgazal, Netanel
  0 siblings, 0 replies; 27+ messages in thread
From: Belgazal, Netanel @ 2017-06-09  7:13 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, BSHARA, Said, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Schmeilin, Evgeny

My apologies,
I was not aware.
Will make sure this won't happen again.

Regards,
Netanel
________________________________________
From: David Miller <davem@davemloft.net>
Sent: Friday, June 9, 2017 2:17 AM
To: Belgazal, Netanel
Cc: netdev@vger.kernel.org; Woodhouse, David; Machulsky, Zorik; Matushevsky, Alexander; BSHARA, Said; Wilson, Matt; Liguori, Anthony; Bshara, Nafea; Schmeilin, Evgeny
Subject: Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver

Two parallel patch series to the same driver and targetting the
same GIT tree is extremely undesirable, please don't do this.

Submit one series, and once applied submit the second series.

I'm deleting all of your patches from my queue, please resubmit
things properly.

Thank you.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver
  2017-06-09  6:55 netanel
@ 2017-06-09 19:33 ` David Miller
  2017-06-09 22:13   ` Belgazal, Netanel
  0 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2017-06-09 19:33 UTC (permalink / raw)
  To: netanel; +Cc: netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea, evgenys

From: <netanel@amazon.com>
Date: Fri, 9 Jun 2017 09:55:16 +0300

> This patchset contains fixes for the bugs that were discovered so far.

You submitted patch #6 twice, once with the word "stuck" in the subject
line, once with the word "hang" in the subject line.

Please sort this out and resubmit, thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver
  2017-06-09 19:33 ` David Miller
@ 2017-06-09 22:13   ` Belgazal, Netanel
  0 siblings, 0 replies; 27+ messages in thread
From: Belgazal, Netanel @ 2017-06-09 22:13 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, BSHARA, Said, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Schmeilin, Evgeny

I the last minute I fixed patchset #6 commit subject from stuck to hang and I forget to remove it.
Sorry for that.
resubmitted.
________________________________________
From: David Miller <davem@davemloft.net>
Sent: Friday, June 9, 2017 10:33 PM
To: Belgazal, Netanel
Cc: netdev@vger.kernel.org; Woodhouse, David; Machulsky, Zorik; Matushevsky, Alexander; BSHARA, Said; Wilson, Matt; Liguori, Anthony; Bshara, Nafea; Schmeilin, Evgeny
Subject: Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver

From: <netanel@amazon.com>
Date: Fri, 9 Jun 2017 09:55:16 +0300

> This patchset contains fixes for the bugs that were discovered so far.

You submitted patch #6 twice, once with the word "stuck" in the subject
line, once with the word "hang" in the subject line.

Please sort this out and resubmit, thanks.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next 0/8] Bug fixes in ena ethernet driver
@ 2017-06-09 22:13 netanel
  2017-06-09 22:19 ` Florian Fainelli
  0 siblings, 1 reply; 27+ messages in thread
From: netanel @ 2017-06-09 22:13 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys

From: Netanel Belgazal <netanel@amazon.com>

This patchset contains fixes for the bugs that were discovered so far.

Netanel Belgazal (8):
  net: ena: fix rare uncompleted admin command false alarm
  net: ena: fix bug that might cause hang after consecutive open/close
    interface.
  net: ena: add missing return when ena_com_get_io_handlers() fails
  net: ena: fix race condition between submit and completion admin
    command
  net: ena: add missing unmap bars on device removal
  net: ena: fix theoretical Rx hang on low memory systems
  net: ena: disable admin msix while working in polling mode
  net: ena: bug fix in lost tx packets detection mechanism

 drivers/net/ethernet/amazon/ena/ena_com.c     |  35 +++--
 drivers/net/ethernet/amazon/ena/ena_ethtool.c |   2 +-
 drivers/net/ethernet/amazon/ena/ena_netdev.c  | 179 +++++++++++++++++++-------
 drivers/net/ethernet/amazon/ena/ena_netdev.h  |  16 ++-
 4 files changed, 168 insertions(+), 64 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver
  2017-06-09 22:13 netanel
@ 2017-06-09 22:19 ` Florian Fainelli
  2017-06-10 20:11   ` David Miller
  0 siblings, 1 reply; 27+ messages in thread
From: Florian Fainelli @ 2017-06-09 22:19 UTC (permalink / raw)
  To: netanel, davem, netdev
  Cc: dwmw, zorik, matua, saeedb, msw, aliguori, nafea, evgenys

On 06/09/2017 03:13 PM, netanel@amazon.com wrote:
> From: Netanel Belgazal <netanel@amazon.com>
> 
> This patchset contains fixes for the bugs that were discovered so far.

If these are all fixes you should submit them against the "net" tree.
net-next is for features [1].

Since these are fixes, you may also want to provide a Fixes: 12-digit
commit ("commit subject") [2] such that David can queue these patches
for stable trees and this can be retrofitted into kernel distributions.

[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/netdev-FAQ.txt#n25

[2]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n183

> 
> Netanel Belgazal (8):
>   net: ena: fix rare uncompleted admin command false alarm
>   net: ena: fix bug that might cause hang after consecutive open/close
>     interface.
>   net: ena: add missing return when ena_com_get_io_handlers() fails
>   net: ena: fix race condition between submit and completion admin
>     command
>   net: ena: add missing unmap bars on device removal
>   net: ena: fix theoretical Rx hang on low memory systems
>   net: ena: disable admin msix while working in polling mode
>   net: ena: bug fix in lost tx packets detection mechanism
> 
>  drivers/net/ethernet/amazon/ena/ena_com.c     |  35 +++--
>  drivers/net/ethernet/amazon/ena/ena_ethtool.c |   2 +-
>  drivers/net/ethernet/amazon/ena/ena_netdev.c  | 179 +++++++++++++++++++-------
>  drivers/net/ethernet/amazon/ena/ena_netdev.h  |  16 ++-
>  4 files changed, 168 insertions(+), 64 deletions(-)
> 


-- 
Florian

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver
  2017-06-09 22:19 ` Florian Fainelli
@ 2017-06-10 20:11   ` David Miller
  2017-06-11 12:28     ` Belgazal, Netanel
  0 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2017-06-10 20:11 UTC (permalink / raw)
  To: f.fainelli
  Cc: netanel, netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea,
	evgenys

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Fri, 9 Jun 2017 15:19:54 -0700

> On 06/09/2017 03:13 PM, netanel@amazon.com wrote:
>> From: Netanel Belgazal <netanel@amazon.com>
>> 
>> This patchset contains fixes for the bugs that were discovered so far.
> 
> If these are all fixes you should submit them against the "net" tree.
> net-next is for features [1].
> 
> Since these are fixes, you may also want to provide a Fixes: 12-digit
> commit ("commit subject") [2] such that David can queue these patches
> for stable trees and this can be retrofitted into kernel distributions.
> 
> [1]:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/netdev-FAQ.txt#n25
> 
> [2]:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n183

Yeah I agree.  If they are genuine bug fixes they should be submitted
against 'net'.  And yes, Fixes: tags are quite desirable as well.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver
  2017-06-10 20:11   ` David Miller
@ 2017-06-11 12:28     ` Belgazal, Netanel
  0 siblings, 0 replies; 27+ messages in thread
From: Belgazal, Netanel @ 2017-06-11 12:28 UTC (permalink / raw)
  To: David Miller, Florian Fainelli
  Cc: netdev@vger.kernel.org, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, BSHARA, Said, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Schmeilin, Evgeny

Ack.
resubmitting to net branch and adding "Fixes" mark.
________________________________________
From: David Miller <davem@davemloft.net>
Sent: Saturday, June 10, 2017 11:11 PM
To: f.fainelli@gmail.com
Cc: Belgazal, Netanel; netdev@vger.kernel.org; Woodhouse, David; Machulsky, Zorik; Matushevsky, Alexander; BSHARA, Said; Wilson, Matt; Liguori, Anthony; Bshara, Nafea; Schmeilin, Evgeny
Subject: Re: [PATCH net-next 0/8] Bug fixes in ena ethernet driver

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Fri, 9 Jun 2017 15:19:54 -0700

> On 06/09/2017 03:13 PM, netanel@amazon.com wrote:
>> From: Netanel Belgazal <netanel@amazon.com>
>>
>> This patchset contains fixes for the bugs that were discovered so far.
>
> If these are all fixes you should submit them against the "net" tree.
> net-next is for features [1].
>
> Since these are fixes, you may also want to provide a Fixes: 12-digit
> commit ("commit subject") [2] such that David can queue these patches
> for stable trees and this can be retrofitted into kernel distributions.
>
> [1]:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/netdev-FAQ.txt#n25
>
> [2]:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n183

Yeah I agree.  If they are genuine bug fixes they should be submitted
against 'net'.  And yes, Fixes: tags are quite desirable as well.


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-06-11 12:28 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-08 21:46 [PATCH net-next 0/8] Bug fixes in ena ethernet driver netanel
2017-06-08 21:46 ` [PATCH net-next 1/8] net: ena: fix rare uncompleted admin command false alarm netanel
2017-06-08 21:46 ` [PATCH net-next 2/8] net: ena: fix bug that might cause hang after consecutive open/close interface netanel
2017-06-08 21:46 ` [PATCH net-next 3/8] net: ena: add missing return when ena_com_get_io_handlers() fails netanel
2017-06-08 21:46 ` [PATCH net-next 4/8] net: ena: fix race condition between submit and completion admin command netanel
2017-06-08 21:46 ` [PATCH net-next 5/8] net: ena: add missing unmap bars on device removal netanel
2017-06-08 21:46 ` [PATCH net-next 6/8] net: ena: fix theoretical Rx hang on low memory systems netanel
2017-06-08 21:46 ` [PATCH net-next 6/8] net: ena: fix theoretical Rx stuck " netanel
2017-06-08 21:46 ` [PATCH net-next 7/8] net: ena: disable admin msix while working in polling mode netanel
2017-06-08 21:46 ` [PATCH net-next 8/8] net: ena: bug fix in lost tx packets detection mechanism netanel
2017-06-08 21:46 ` [PATCH net-next 00/13] update ena ethernet driver to version 1.2.0 netanel
2017-06-08 21:46 ` [PATCH net-next 01/13] net: ena: change return value for unsupported features unsupported return value netanel
2017-06-08 21:46 ` [PATCH net-next 02/13] net: ena: add hardware hints capability to the driver netanel
2017-06-08 21:46 ` [PATCH net-next 03/13] net: ena: change sizeof() argument to be the type pointer netanel
2017-06-08 21:46 ` [PATCH net-next 04/13] net: ena: add reset reason for each device FLR netanel
2017-06-08 21:46 ` [PATCH net-next 05/13] net: ena: add support for out of order rx buffers refill netanel
2017-06-08 21:46 ` [PATCH net-next 06/13] net: ena: allow the driver to work with small number of msix vectors netanel
2017-06-08 21:46 ` [PATCH net-next 07/13] net: ena: use napi_schedule_irqoff when possible netanel
2017-06-08 23:17 ` [PATCH net-next 0/8] Bug fixes in ena ethernet driver David Miller
2017-06-09  7:13   ` Belgazal, Netanel
  -- strict thread matches above, loose matches on Subject: below --
2017-06-09  6:55 netanel
2017-06-09 19:33 ` David Miller
2017-06-09 22:13   ` Belgazal, Netanel
2017-06-09 22:13 netanel
2017-06-09 22:19 ` Florian Fainelli
2017-06-10 20:11   ` David Miller
2017-06-11 12:28     ` Belgazal, Netanel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).