netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 net 0/3] ENA driver bug fixes
@ 2022-01-02  7:37 Arthur Kiyanovski
  2022-01-02  7:37 ` [PATCH V2 net 1/3] net: ena: Fix undefined state when tx request id is out of bounds Arthur Kiyanovski
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Arthur Kiyanovski @ 2022-01-02  7:37 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, netdev
  Cc: Arthur Kiyanovski, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, Saeed Bshara, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Belgazal, Netanel, Saidi, Ali,
	Herrenschmidt, Benjamin, Dagan, Noam, Agroskin, Shay,
	Arinzon, David


Patchset V2 chages:
-------------------
Updated SHA1 of Fixes tag in patch 3/3 to be 12 digits long


Original cover letter:
----------------------
ENA driver bug fixes

Arthur Kiyanovski (3):
  net: ena: Fix undefined state when tx request id is out of bounds
  net: ena: Fix wrong rx request id by resetting device
  net: ena: Fix error handling when calculating max IO queues number

 drivers/net/ethernet/amazon/ena/ena_netdev.c | 49 ++++++++++++--------
 1 file changed, 29 insertions(+), 20 deletions(-)

-- 
2.32.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH V2 net 1/3] net: ena: Fix undefined state when tx request id is out of bounds
  2022-01-02  7:37 [PATCH V2 net 0/3] ENA driver bug fixes Arthur Kiyanovski
@ 2022-01-02  7:37 ` Arthur Kiyanovski
  2022-01-02  7:37 ` [PATCH V2 net 2/3] net: ena: Fix wrong rx request id by resetting device Arthur Kiyanovski
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Arthur Kiyanovski @ 2022-01-02  7:37 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, netdev
  Cc: Arthur Kiyanovski, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, Saeed Bshara, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Belgazal, Netanel, Saidi, Ali,
	Herrenschmidt, Benjamin, Dagan, Noam, Agroskin, Shay,
	Arinzon, David

ena_com_tx_comp_req_id_get() checks the req_id of a received completion,
and if it is out of bounds returns -EINVAL. This is a sign that
something is wrong with the device and it needs to be reset.

The current code does not reset the device in this case, which leaves
the driver in an undefined state, where this completion is not properly
handled.

This commit adds a call to handle_invalid_req_id() in ena_clean_tx_irq()
and ena_clean_xdp_irq() which resets the device to fix the issue.

This commit also removes unnecessary request id checks from
validate_tx_req_id() and validate_xdp_req_id(). This check is unneeded
because it was already performed in ena_com_tx_comp_req_id_get(), which
is called right before these functions.

Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 34 ++++++++++++--------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 7d5d885d85d5..2274063e34cc 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1288,26 +1288,22 @@ static int handle_invalid_req_id(struct ena_ring *ring, u16 req_id,
 
 static int validate_tx_req_id(struct ena_ring *tx_ring, u16 req_id)
 {
-	struct ena_tx_buffer *tx_info = NULL;
+	struct ena_tx_buffer *tx_info;
 
-	if (likely(req_id < tx_ring->ring_size)) {
-		tx_info = &tx_ring->tx_buffer_info[req_id];
-		if (likely(tx_info->skb))
-			return 0;
-	}
+	tx_info = &tx_ring->tx_buffer_info[req_id];
+	if (likely(tx_info->skb))
+		return 0;
 
 	return handle_invalid_req_id(tx_ring, req_id, tx_info, false);
 }
 
 static int validate_xdp_req_id(struct ena_ring *xdp_ring, u16 req_id)
 {
-	struct ena_tx_buffer *tx_info = NULL;
+	struct ena_tx_buffer *tx_info;
 
-	if (likely(req_id < xdp_ring->ring_size)) {
-		tx_info = &xdp_ring->tx_buffer_info[req_id];
-		if (likely(tx_info->xdpf))
-			return 0;
-	}
+	tx_info = &xdp_ring->tx_buffer_info[req_id];
+	if (likely(tx_info->xdpf))
+		return 0;
 
 	return handle_invalid_req_id(xdp_ring, req_id, tx_info, true);
 }
@@ -1332,9 +1328,14 @@ static int ena_clean_tx_irq(struct ena_ring *tx_ring, u32 budget)
 
 		rc = ena_com_tx_comp_req_id_get(tx_ring->ena_com_io_cq,
 						&req_id);
-		if (rc)
+		if (rc) {
+			if (unlikely(rc == -EINVAL))
+				handle_invalid_req_id(tx_ring, req_id, NULL,
+						      false);
 			break;
+		}
 
+		/* validate that the request id points to a valid skb */
 		rc = validate_tx_req_id(tx_ring, req_id);
 		if (rc)
 			break;
@@ -1896,9 +1897,14 @@ static int ena_clean_xdp_irq(struct ena_ring *xdp_ring, u32 budget)
 
 		rc = ena_com_tx_comp_req_id_get(xdp_ring->ena_com_io_cq,
 						&req_id);
-		if (rc)
+		if (rc) {
+			if (unlikely(rc == -EINVAL))
+				handle_invalid_req_id(xdp_ring, req_id, NULL,
+						      true);
 			break;
+		}
 
+		/* validate that the request id points to a valid xdp_frame */
 		rc = validate_xdp_req_id(xdp_ring, req_id);
 		if (rc)
 			break;
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH V2 net 2/3] net: ena: Fix wrong rx request id by resetting device
  2022-01-02  7:37 [PATCH V2 net 0/3] ENA driver bug fixes Arthur Kiyanovski
  2022-01-02  7:37 ` [PATCH V2 net 1/3] net: ena: Fix undefined state when tx request id is out of bounds Arthur Kiyanovski
@ 2022-01-02  7:37 ` Arthur Kiyanovski
  2022-01-02  7:37 ` [PATCH V2 net 3/3] net: ena: Fix error handling when calculating max IO queues number Arthur Kiyanovski
  2022-01-02 12:50 ` [PATCH V2 net 0/3] ENA driver bug fixes patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Arthur Kiyanovski @ 2022-01-02  7:37 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, netdev
  Cc: Arthur Kiyanovski, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, Saeed Bshara, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Belgazal, Netanel, Saidi, Ali,
	Herrenschmidt, Benjamin, Dagan, Noam, Agroskin, Shay,
	Arinzon, David

A wrong request id received from the device is a sign that
something is wrong with it, therefore trigger a device reset.

Also add some debug info to the "Page is NULL" print to make
it easier to debug.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 2274063e34cc..52a8c60b7e29 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1428,6 +1428,7 @@ static struct sk_buff *ena_rx_skb(struct ena_ring *rx_ring,
 				  u16 *next_to_clean)
 {
 	struct ena_rx_buffer *rx_info;
+	struct ena_adapter *adapter;
 	u16 len, req_id, buf = 0;
 	struct sk_buff *skb;
 	void *page_addr;
@@ -1440,8 +1441,14 @@ static struct sk_buff *ena_rx_skb(struct ena_ring *rx_ring,
 	rx_info = &rx_ring->rx_buffer_info[req_id];
 
 	if (unlikely(!rx_info->page)) {
-		netif_err(rx_ring->adapter, rx_err, rx_ring->netdev,
-			  "Page is NULL\n");
+		adapter = rx_ring->adapter;
+		netif_err(adapter, rx_err, rx_ring->netdev,
+			  "Page is NULL. qid %u req_id %u\n", rx_ring->qid, req_id);
+		ena_increase_stat(&rx_ring->rx_stats.bad_req_id, 1, &rx_ring->syncp);
+		adapter->reset_reason = ENA_REGS_RESET_INV_RX_REQ_ID;
+		/* Make sure reset reason is set before triggering the reset */
+		smp_mb__before_atomic();
+		set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 		return NULL;
 	}
 
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH V2 net 3/3] net: ena: Fix error handling when calculating max IO queues number
  2022-01-02  7:37 [PATCH V2 net 0/3] ENA driver bug fixes Arthur Kiyanovski
  2022-01-02  7:37 ` [PATCH V2 net 1/3] net: ena: Fix undefined state when tx request id is out of bounds Arthur Kiyanovski
  2022-01-02  7:37 ` [PATCH V2 net 2/3] net: ena: Fix wrong rx request id by resetting device Arthur Kiyanovski
@ 2022-01-02  7:37 ` Arthur Kiyanovski
  2022-01-02 12:50 ` [PATCH V2 net 0/3] ENA driver bug fixes patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Arthur Kiyanovski @ 2022-01-02  7:37 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski, netdev
  Cc: Arthur Kiyanovski, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, Saeed Bshara, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Belgazal, Netanel, Saidi, Ali,
	Herrenschmidt, Benjamin, Dagan, Noam, Agroskin, Shay,
	Arinzon, David

The role of ena_calc_max_io_queue_num() is to return the number
of queues supported by the device, which means the return value
should be >=0.

The function that calls ena_calc_max_io_queue_num(), checks
the return value. If it is 0, it means the device reported
it supports 0 IO queues. This case is considered an error
and is handled by the calling function accordingly.

However the current implementation of ena_calc_max_io_queue_num()
is wrong, since when it detects the device supports 0 IO queues,
it returns -EFAULT.

In such a case the calling function doesn't detect the error,
and therefore doesn't handle it.

This commit changes ena_calc_max_io_queue_num() to return 0
in case the device reported it supports 0 queues, allowing the
calling function to properly handle the error case.

Fixes: 736ce3f414cc ("net: ena: make ethtool -l show correct max number of queues")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 52a8c60b7e29..c72f0c7ff4aa 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -4026,10 +4026,6 @@ static u32 ena_calc_max_io_queue_num(struct pci_dev *pdev,
 	max_num_io_queues = min_t(u32, max_num_io_queues, io_tx_cq_num);
 	/* 1 IRQ for mgmnt and 1 IRQs for each IO direction */
 	max_num_io_queues = min_t(u32, max_num_io_queues, pci_msix_vec_count(pdev) - 1);
-	if (unlikely(!max_num_io_queues)) {
-		dev_err(&pdev->dev, "The device doesn't have io queues\n");
-		return -EFAULT;
-	}
 
 	return max_num_io_queues;
 }
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH V2 net 0/3] ENA driver bug fixes
  2022-01-02  7:37 [PATCH V2 net 0/3] ENA driver bug fixes Arthur Kiyanovski
                   ` (2 preceding siblings ...)
  2022-01-02  7:37 ` [PATCH V2 net 3/3] net: ena: Fix error handling when calculating max IO queues number Arthur Kiyanovski
@ 2022-01-02 12:50 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-01-02 12:50 UTC (permalink / raw)
  To: Arthur Kiyanovski
  Cc: davem, kuba, netdev, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, netanel, alisaidi, benh, ndagan, shayagr, darinzon

Hello:

This series was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Sun, 2 Jan 2022 07:37:25 +0000 you wrote:
> Patchset V2 chages:
> -------------------
> Updated SHA1 of Fixes tag in patch 3/3 to be 12 digits long
> 
> 
> Original cover letter:
> 
> [...]

Here is the summary with links:
  - [V2,net,1/3] net: ena: Fix undefined state when tx request id is out of bounds
    https://git.kernel.org/netdev/net/c/c255a34e02ef
  - [V2,net,2/3] net: ena: Fix wrong rx request id by resetting device
    https://git.kernel.org/netdev/net/c/cb3d4f98f0b2
  - [V2,net,3/3] net: ena: Fix error handling when calculating max IO queues number
    https://git.kernel.org/netdev/net/c/5055dc0348b8

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-01-02 12:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-01-02  7:37 [PATCH V2 net 0/3] ENA driver bug fixes Arthur Kiyanovski
2022-01-02  7:37 ` [PATCH V2 net 1/3] net: ena: Fix undefined state when tx request id is out of bounds Arthur Kiyanovski
2022-01-02  7:37 ` [PATCH V2 net 2/3] net: ena: Fix wrong rx request id by resetting device Arthur Kiyanovski
2022-01-02  7:37 ` [PATCH V2 net 3/3] net: ena: Fix error handling when calculating max IO queues number Arthur Kiyanovski
2022-01-02 12:50 ` [PATCH V2 net 0/3] ENA driver bug fixes patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).