[PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response
@ 2023-08-03 20:20 Nick Child
  2023-08-03 20:20 ` [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail Nick Child
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Nick Child @ 2023-08-03 20:20 UTC (permalink / raw)
  To: netdev; +Cc: haren, ricklind, danymadden, tlfalcon, bjking1, Nick Child

Ensure that all offsets in a login response buffer are within the size
of the allocated response buffer. Any offsets or lengths that surpass
the allocation are likely the result of an incomplete response buffer.
In these cases, a full reset is necessary.

When attempting to login, the ibmvnic device will allocate a response
buffer and pass a reference to the VIOS. The VIOS will then send the
ibmvnic device a LOGIN_RSP CRQ to signal that the buffer has been filled
with data. If the ibmvnic device does not get a response in 20 seconds,
the old buffer is freed and a new login request is sent. With 2
outstanding requests, any LOGIN_RSP CRQ's could be for the older
login request. If this is the case then the login response buffer (which
is for the newer login request) could be incomplete and contain invalid
data. Therefore, we must enforce strict sanity checks on the response
buffer values.

Testing has shown that the `off_rxadd_buff_size` value is filled in last
by the VIOS and will be the smoking gun for these circumstances.

Until VIOS can implement a mechanism for tracking outstanding response
buffers and a method for mapping a LOGIN_RSP CRQ to a particular login
response buffer, the best ibmvnic can do in this situation is perform a
full reset.

Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---

Hello!
This patchset is all relevant to recent bugs which came up regarding
the ibmvnic login process. Specifically, when this process times out.

ibmvnic devices are virtual devices which need to "login" to a physical
NIC at the end of its initialization process. This invloves sending a
command to the VIOS (virtual input output server, essentially the server
that this client is logging into) requesting it to fill out a DMA mapped
repsonse buffer. Once done, the VIOS sends a response informing the
client that the buffer has been filled with data.

If the VIOS does not send a response in 20 seconds then the client tries
again. If this happens then several bugs can occur. This is usually due
to the fact that there are more than one outstanding requests and no
mechanism for mapping a response CRQ to a given response buffer. Until
that mechanism is created, this patchset aims to harden this timeout
recovery process so that the device does not get stuck in an inopperable
state.

 drivers/net/ethernet/ibm/ibmvnic.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 763d613adbcc..996f8037c266 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -5397,6 +5397,7 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
 	int num_rx_pools;
 	u64 *size_array;
 	int i;
+	u32 rsp_len;

 	/* CHECK: Test/set of login_pending does not need to be atomic
 	 * because only ibmvnic_tasklet tests/clears this.
@@ -5447,6 +5448,23 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
 		ibmvnic_reset(adapter, VNIC_RESET_FATAL);
 		return -EIO;
 	}
+
+	rsp_len = be32_to_cpu(login_rsp->len);
+	if (be32_to_cpu(login->login_rsp_len) < rsp_len ||
+	    rsp_len <= be32_to_cpu(login_rsp->off_txsubm_subcrqs) ||
+	    rsp_len <= be32_to_cpu(login_rsp->off_rxadd_subcrqs) ||
+	    rsp_len <= be32_to_cpu(login_rsp->off_rxadd_buff_size) ||
+	    rsp_len <= be32_to_cpu(login_rsp->off_supp_tx_desc)) {
+		/* This can happen if a login request times out and there are
+		 * 2 outstanding login requests sent, the LOGIN_RSP crq
+		 * could have been for the older login request. So we are
+		 * parsing the newer response buffer which may be incomplete
+		 */
+		dev_err(dev, "FATAL: Login rsp offsets/lengths invalid\n");
+		ibmvnic_reset(adapter, VNIC_RESET_FATAL);
+		return -EIO;
+	}
+
 	size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
 		be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
 	/* variable buffer sizes are not supported, so just read the
-- 
2.39.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail
  2023-08-03 20:20 [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Nick Child
@ 2023-08-03 20:20 ` Nick Child
  2023-08-05  7:19   ` Simon Horman
  2023-08-03 20:20 ` [PATCH net 3/5] ibmvnic: Handle DMA unmapping of login buffs in release functions Nick Child
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Nick Child @ 2023-08-03 20:20 UTC (permalink / raw)
  To: netdev; +Cc: haren, ricklind, danymadden, tlfalcon, bjking1, Nick Child

If the LOGIN CRQ fails to send then we must DMA unmap the response
buffer. Previously, if the CRQ failed then the memory was freed without
DMA unmapping.

Fixes: c98d9cc4170d ("ibmvnic: send_login should check for crq errors")
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 996f8037c266..fdc0bac029ad 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4830,11 +4830,14 @@ static int send_login(struct ibmvnic_adapter *adapter)
 	if (rc) {
 		adapter->login_pending = false;
 		netdev_err(adapter->netdev, "Failed to send login, rc=%d\n", rc);
-		goto buf_rsp_map_failed;
+		goto buf_send_failed;
 	}
 
 	return 0;
 
+buf_send_failed:
+	dma_unmap_single(dev, rsp_buffer_token, rsp_buffer_size,
+			 DMA_FROM_DEVICE);
 buf_rsp_map_failed:
 	kfree(login_rsp_buffer);
 	adapter->login_rsp_buf = NULL;
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 3/5] ibmvnic: Handle DMA unmapping of login buffs in release functions
  2023-08-03 20:20 [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Nick Child
  2023-08-03 20:20 ` [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail Nick Child
@ 2023-08-03 20:20 ` Nick Child
  2023-08-05  7:19   ` Simon Horman
  2023-08-03 20:20 ` [PATCH net 4/5] ibmvnic: Do partial reset on login failure Nick Child
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Nick Child @ 2023-08-03 20:20 UTC (permalink / raw)
  To: netdev; +Cc: haren, ricklind, danymadden, tlfalcon, bjking1, Nick Child

Rather than leaving the DMA unmapping of the login buffers to the
login response handler, move this work into the login release functions.
Previously, these functions were only used for freeing the allocated
buffers. This could lead to issues if there are more than one
outstanding login buffer requests, which is possible if a login request
times out.

If a login request times out, then there is another call to send login.
The send login function makes a call to the login buffer release
function. In the past, this freed the buffers but did not DMA unmap.
Therefore, the VIOS could still write to the old login (now freed)
buffer. It is for this reason that it is a good idea to leave the DMA
unmap call to the login buffers release function.

Since the login buffer release functions now handle DMA unmapping,
remove the duplicate DMA unmapping in handle_login_rsp().

Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index fdc0bac029ad..718af76fd711 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1588,12 +1588,22 @@ static int ibmvnic_login(struct net_device *netdev)
 
 static void release_login_buffer(struct ibmvnic_adapter *adapter)
 {
+	if (!adapter->login_buf)
+		return;
+
+	dma_unmap_single(&adapter->vdev->dev, adapter->login_buf_token,
+			 adapter->login_buf_sz, DMA_TO_DEVICE);
 	kfree(adapter->login_buf);
 	adapter->login_buf = NULL;
 }
 
 static void release_login_rsp_buffer(struct ibmvnic_adapter *adapter)
 {
+	if (!adapter->login_rsp_buf)
+		return;
+
+	dma_unmap_single(&adapter->vdev->dev, adapter->login_rsp_buf_token,
+			 adapter->login_rsp_buf_sz, DMA_FROM_DEVICE);
 	kfree(adapter->login_rsp_buf);
 	adapter->login_rsp_buf = NULL;
 }
@@ -5411,11 +5421,6 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
 	}
 	adapter->login_pending = false;
 
-	dma_unmap_single(dev, adapter->login_buf_token, adapter->login_buf_sz,
-			 DMA_TO_DEVICE);
-	dma_unmap_single(dev, adapter->login_rsp_buf_token,
-			 adapter->login_rsp_buf_sz, DMA_FROM_DEVICE);
-
 	/* If the number of queues requested can't be allocated by the
 	 * server, the login response will return with code 1. We will need
 	 * to resend the login buffer with fewer queues requested.
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 4/5] ibmvnic: Do partial reset on login failure
  2023-08-03 20:20 [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Nick Child
  2023-08-03 20:20 ` [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail Nick Child
  2023-08-03 20:20 ` [PATCH net 3/5] ibmvnic: Handle DMA unmapping of login buffs in release functions Nick Child
@ 2023-08-03 20:20 ` Nick Child
  2023-08-05  7:20   ` Simon Horman
  2023-08-03 20:20 ` [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets Nick Child
  2023-08-05  7:18 ` [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Simon Horman
  4 siblings, 1 reply; 11+ messages in thread
From: Nick Child @ 2023-08-03 20:20 UTC (permalink / raw)
  To: netdev; +Cc: haren, ricklind, danymadden, tlfalcon, bjking1, Nick Child

Perform a partial reset before sending a login request if any of the
following are true:
 1. If a previous request times out. This can be dangerous because the
 	VIOS could still receive the old login request at any point after
 	the timeout. Therefore, it is best to re-register the CRQ's  and
 	sub-CRQ's before retrying.
 2. If the previous request returns an error that is not described in
 	PAPR. PAPR provides procedures if the login returns with partial
 	success or aborted return codes (section L.5.1) but other values
	do not have a defined procedure. Previously, these conditions
	just returned error from the login function rather than trying
	to resolve the issue.
 	This can cause further issues since most callers of the login
 	function are not prepared to handle an error when logging in. This
 	improper cleanup can lead to the device being permanently DOWN'd.
 	For example, if the VIOS believes that the device is already logged
 	in then it will return INVALID_STATE (-7). If we never re-register
 	CRQ's then it will always think that the device is already logged
 	in. This leaves the device inoperable.

The partial reset involves freeing the sub-CRQs, freeing the CRQ then
registering and initializing a new CRQ and sub-CRQs. This essentially
restarts all communication with VIOS to allow for a fresh login attempt
that will be unhindered by any previous failed attempts.

Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 46 ++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 718af76fd711..8fd9639665a0 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -97,6 +97,8 @@ static int pending_scrq(struct ibmvnic_adapter *,
 static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *,
 					struct ibmvnic_sub_crq_queue *);
 static int ibmvnic_poll(struct napi_struct *napi, int data);
+static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter);
+static inline void reinit_init_done(struct ibmvnic_adapter *adapter);
 static void send_query_map(struct ibmvnic_adapter *adapter);
 static int send_request_map(struct ibmvnic_adapter *, dma_addr_t, u32, u8);
 static int send_request_unmap(struct ibmvnic_adapter *, u8);
@@ -1527,11 +1529,9 @@ static int ibmvnic_login(struct net_device *netdev)
 
 		if (!wait_for_completion_timeout(&adapter->init_done,
 						 timeout)) {
-			netdev_warn(netdev, "Login timed out, retrying...\n");
-			retry = true;
-			adapter->init_done_rc = 0;
-			retry_count++;
-			continue;
+			netdev_warn(netdev, "Login timed out\n");
+			adapter->login_pending = false;
+			goto partial_reset;
 		}
 
 		if (adapter->init_done_rc == ABORTED) {
@@ -1576,7 +1576,41 @@ static int ibmvnic_login(struct net_device *netdev)
 		} else if (adapter->init_done_rc) {
 			netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n",
 				    adapter->init_done_rc);
-			return -EIO;
+
+partial_reset:
+			/* adapter login failed, so free any CRQs or sub-CRQs
+			 * and register again before attempting to login again.
+			 * If we don't do this then the VIOS may think that
+			 * we are already logged in and reject any subsequent
+			 * attempts
+			 */
+			netdev_warn(netdev,
+				    "Freeing and re-registering CRQs before attempting to login again\n");
+			retry = true;
+			adapter->init_done_rc = 0;
+			retry_count++;
+			release_sub_crqs(adapter, true);
+			reinit_init_done(adapter);
+			release_crq_queue(adapter);
+			/* If we don't sleep here then we risk an unnecessary
+			 * failover event from the VIOS. This is a known VIOS
+			 * issue caused by a vnic device freeing and registering
+			 * a CRQ too quickly.
+			 */
+			msleep(1500);
+			rc = init_crq_queue(adapter);
+			if (rc) {
+				netdev_err(netdev, "login recovery: init CRQ failed %d\n",
+					   rc);
+				return -EIO;
+			}
+
+			rc = ibmvnic_reset_init(adapter, false);
+			if (rc) {
+				netdev_err(netdev, "login recovery: Reset init failed %d\n",
+					   rc);
+				return -EIO;
+			}
 		}
 	} while (retry);
 
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets
  2023-08-03 20:20 [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Nick Child
                   ` (2 preceding siblings ...)
  2023-08-03 20:20 ` [PATCH net 4/5] ibmvnic: Do partial reset on login failure Nick Child
@ 2023-08-03 20:20 ` Nick Child
  2023-08-05  7:20   ` Simon Horman
  2023-08-08  2:13   ` Jakub Kicinski
  2023-08-05  7:18 ` [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Simon Horman
  4 siblings, 2 replies; 11+ messages in thread
From: Nick Child @ 2023-08-03 20:20 UTC (permalink / raw)
  To: netdev; +Cc: haren, ricklind, danymadden, tlfalcon, bjking1, Nick Child

If a login request fails, the recovery process should be protected
against parallel resets. It is a known issue that freeing and
registering CRQ's in quick succession can result in a failover CRQ from
the VIOS. Processing a failover during login recovery is dangerous for
two reasons:
 1. This will result in two parallel initialization processes, this can
 cause serious issues during login.
 2. It is possible that the failover CRQ is received but never executed.
 We get notified of a pending failover through a transport event CRQ.
 The reset is not performed until a INIT CRQ request is received.
 Previously, if CRQ init fails during login recovery, then the ibmvnic
 irq is freed and the login process returned error. If failover_pending
 is true (a transport event was received), then the ibmvnic device
 would never be able to process the reset since it cannot receive the
 CRQ_INIT request due to the irq being freed. This leaved the device
 in a inoperable state.

Therefore, the login failure recovery process must be hardened against
these possible issues. Possible failovers (due to quick CRQ free and
init) must be avoided and any issues during re-initialization should be
dealt with instead of being propagated up the stack. This logic is
similar to that of ibmvnic_probe().

Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 67 ++++++++++++++++++++----------
 1 file changed, 46 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 8fd9639665a0..77df62511574 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -116,6 +116,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
 static void free_long_term_buff(struct ibmvnic_adapter *adapter,
 				struct ibmvnic_long_term_buff *ltb);
 static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter);
+static void flush_reset_queue(struct ibmvnic_adapter *adapter);
 
 struct ibmvnic_stat {
 	char name[ETH_GSTRING_LEN];
@@ -1508,7 +1509,7 @@ static const char *adapter_state_to_string(enum vnic_state state)
 static int ibmvnic_login(struct net_device *netdev)
 {
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
-	unsigned long timeout = msecs_to_jiffies(20000);
+	unsigned long flags, timeout = msecs_to_jiffies(20000);
 	int retry_count = 0;
 	int retries = 10;
 	bool retry;
@@ -1590,27 +1591,52 @@ static int ibmvnic_login(struct net_device *netdev)
 			adapter->init_done_rc = 0;
 			retry_count++;
 			release_sub_crqs(adapter, true);
-			reinit_init_done(adapter);
-			release_crq_queue(adapter);
-			/* If we don't sleep here then we risk an unnecessary
-			 * failover event from the VIOS. This is a known VIOS
-			 * issue caused by a vnic device freeing and registering
-			 * a CRQ too quickly.
+			/* Much of this is similar logic as ibmvnic_probe(),
+			 * we are essentially re-initializing communication
+			 * with the server. We really should not run any
+			 * resets/failovers here because this is already a form
+			 * of reset and we do not want parallel resets occurring
 			 */
-			msleep(1500);
-			rc = init_crq_queue(adapter);
-			if (rc) {
-				netdev_err(netdev, "login recovery: init CRQ failed %d\n",
-					   rc);
-				return -EIO;
-			}
+			do {
+				reinit_init_done(adapter);
+				/* Clear any failovers we got in the previous
+				 * pass since we are re-initializing the CRQ
+				 */
+				adapter->failover_pending = false;
+				release_crq_queue(adapter);
+				/* If we don't sleep here then we risk an
+				 * unnecessary failover event from the VIOS.
+				 * This is a known VIOS issue caused by a vnic
+				 * device freeing and registering a CRQ too
+				 * quickly.
+				 */
+				msleep(1500);
+				/* Avoid any resets, since we are currently
+				 * resetting.
+				 */
+				spin_lock_irqsave(&adapter->rwi_lock, flags);
+				flush_reset_queue(adapter);
+				spin_unlock_irqrestore(&adapter->rwi_lock,
+						       flags);
+
+				rc = init_crq_queue(adapter);
+				if (rc) {
+					netdev_err(netdev, "login recovery: init CRQ failed %d\n",
+						   rc);
+					return -EIO;
+				}
 
-			rc = ibmvnic_reset_init(adapter, false);
-			if (rc) {
-				netdev_err(netdev, "login recovery: Reset init failed %d\n",
-					   rc);
-				return -EIO;
-			}
+				rc = ibmvnic_reset_init(adapter, false);
+				if (rc)
+					netdev_err(netdev, "login recovery: Reset init failed %d\n",
+						   rc);
+				/* IBMVNIC_CRQ_INIT will return EAGAIN if it
+				 * fails, since ibmvnic_reset_init will free
+				 * irq's in failure, we won't be able to receive
+				 * new CRQs so we need to keep trying. probe()
+				 * handles this similarly.
+				 */
+			} while (rc == -EAGAIN);
 		}
 	} while (retry);
 
@@ -1903,7 +1929,6 @@ static int ibmvnic_open(struct net_device *netdev)
 	int rc;
 
 	ASSERT_RTNL();
-
 	/* If device failover is pending or we are about to reset, just set
 	 * device state and return. Device operation will be handled by reset
 	 * routine.
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response
  2023-08-03 20:20 [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Nick Child
                   ` (3 preceding siblings ...)
  2023-08-03 20:20 ` [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets Nick Child
@ 2023-08-05  7:18 ` Simon Horman
  4 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2023-08-05  7:18 UTC (permalink / raw)
  To: Nick Child; +Cc: netdev, haren, ricklind, danymadden, tlfalcon, bjking1

On Thu, Aug 03, 2023 at 03:20:06PM -0500, Nick Child wrote:
> Ensure that all offsets in a login response buffer are within the size
> of the allocated response buffer. Any offsets or lengths that surpass
> the allocation are likely the result of an incomplete response buffer.
> In these cases, a full reset is necessary.
> 
> When attempting to login, the ibmvnic device will allocate a response
> buffer and pass a reference to the VIOS. The VIOS will then send the
> ibmvnic device a LOGIN_RSP CRQ to signal that the buffer has been filled
> with data. If the ibmvnic device does not get a response in 20 seconds,
> the old buffer is freed and a new login request is sent. With 2
> outstanding requests, any LOGIN_RSP CRQ's could be for the older
> login request. If this is the case then the login response buffer (which
> is for the newer login request) could be incomplete and contain invalid
> data. Therefore, we must enforce strict sanity checks on the response
> buffer values.
> 
> Testing has shown that the `off_rxadd_buff_size` value is filled in last
> by the VIOS and will be the smoking gun for these circumstances.
> 
> Until VIOS can implement a mechanism for tracking outstanding response
> buffers and a method for mapping a LOGIN_RSP CRQ to a particular login
> response buffer, the best ibmvnic can do in this situation is perform a
> full reset.
> 
> Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
> Signed-off-by: Nick Child <nnac123@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>

> ---
> 
> Hello!
> This patchset is all relevant to recent bugs which came up regarding
> the ibmvnic login process. Specifically, when this process times out.
> 
> ibmvnic devices are virtual devices which need to "login" to a physical
> NIC at the end of its initialization process. This invloves sending a
> command to the VIOS (virtual input output server, essentially the server
> that this client is logging into) requesting it to fill out a DMA mapped
> repsonse buffer. Once done, the VIOS sends a response informing the
> client that the buffer has been filled with data.
> 
> If the VIOS does not send a response in 20 seconds then the client tries
> again. If this happens then several bugs can occur. This is usually due
> to the fact that there are more than one outstanding requests and no
> mechanism for mapping a response CRQ to a given response buffer. Until
> that mechanism is created, this patchset aims to harden this timeout
> recovery process so that the device does not get stuck in an inopperable
> state.

This sort of information really belongs in a cover letter for the patchset.
And in any case, it's nice to have a cover letter if there is more
than one patch in the series.

>  drivers/net/ethernet/ibm/ibmvnic.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 763d613adbcc..996f8037c266 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -5397,6 +5397,7 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
>  	int num_rx_pools;
>  	u64 *size_array;
>  	int i;
> +	u32 rsp_len;

nit: It's preferred in Networking code to arrange local variables in
     reverse xmas tree order - longest line to shortest.

     I know this file doesn't follow that very closely.
     But still, it would be slightly nicer, at least in this case.

...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail
  2023-08-03 20:20 ` [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail Nick Child
@ 2023-08-05  7:19   ` Simon Horman
  0 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2023-08-05  7:19 UTC (permalink / raw)
  To: Nick Child; +Cc: netdev, haren, ricklind, danymadden, tlfalcon, bjking1

On Thu, Aug 03, 2023 at 03:20:07PM -0500, Nick Child wrote:
> If the LOGIN CRQ fails to send then we must DMA unmap the response
> buffer. Previously, if the CRQ failed then the memory was freed without
> DMA unmapping.
> 
> Fixes: c98d9cc4170d ("ibmvnic: send_login should check for crq errors")
> Signed-off-by: Nick Child <nnac123@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 3/5] ibmvnic: Handle DMA unmapping of login buffs in release functions
  2023-08-03 20:20 ` [PATCH net 3/5] ibmvnic: Handle DMA unmapping of login buffs in release functions Nick Child
@ 2023-08-05  7:19   ` Simon Horman
  0 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2023-08-05  7:19 UTC (permalink / raw)
  To: Nick Child; +Cc: netdev, haren, ricklind, danymadden, tlfalcon, bjking1

On Thu, Aug 03, 2023 at 03:20:08PM -0500, Nick Child wrote:
> Rather than leaving the DMA unmapping of the login buffers to the
> login response handler, move this work into the login release functions.
> Previously, these functions were only used for freeing the allocated
> buffers. This could lead to issues if there are more than one
> outstanding login buffer requests, which is possible if a login request
> times out.
> 
> If a login request times out, then there is another call to send login.
> The send login function makes a call to the login buffer release
> function. In the past, this freed the buffers but did not DMA unmap.
> Therefore, the VIOS could still write to the old login (now freed)
> buffer. It is for this reason that it is a good idea to leave the DMA
> unmap call to the login buffers release function.
> 
> Since the login buffer release functions now handle DMA unmapping,
> remove the duplicate DMA unmapping in handle_login_rsp().
> 
> Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
> Signed-off-by: Nick Child <nnac123@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 4/5] ibmvnic: Do partial reset on login failure
  2023-08-03 20:20 ` [PATCH net 4/5] ibmvnic: Do partial reset on login failure Nick Child
@ 2023-08-05  7:20   ` Simon Horman
  0 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2023-08-05  7:20 UTC (permalink / raw)
  To: Nick Child; +Cc: netdev, haren, ricklind, danymadden, tlfalcon, bjking1

On Thu, Aug 03, 2023 at 03:20:09PM -0500, Nick Child wrote:
> Perform a partial reset before sending a login request if any of the
> following are true:
>  1. If a previous request times out. This can be dangerous because the
>  	VIOS could still receive the old login request at any point after
>  	the timeout. Therefore, it is best to re-register the CRQ's  and
>  	sub-CRQ's before retrying.
>  2. If the previous request returns an error that is not described in
>  	PAPR. PAPR provides procedures if the login returns with partial
>  	success or aborted return codes (section L.5.1) but other values
> 	do not have a defined procedure. Previously, these conditions
> 	just returned error from the login function rather than trying
> 	to resolve the issue.
>  	This can cause further issues since most callers of the login
>  	function are not prepared to handle an error when logging in. This
>  	improper cleanup can lead to the device being permanently DOWN'd.
>  	For example, if the VIOS believes that the device is already logged
>  	in then it will return INVALID_STATE (-7). If we never re-register
>  	CRQ's then it will always think that the device is already logged
>  	in. This leaves the device inoperable.
> 
> The partial reset involves freeing the sub-CRQs, freeing the CRQ then
> registering and initializing a new CRQ and sub-CRQs. This essentially
> restarts all communication with VIOS to allow for a fresh login attempt
> that will be unhindered by any previous failed attempts.
> 
> Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
> Signed-off-by: Nick Child <nnac123@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets
  2023-08-03 20:20 ` [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets Nick Child
@ 2023-08-05  7:20   ` Simon Horman
  2023-08-08  2:13   ` Jakub Kicinski
  1 sibling, 0 replies; 11+ messages in thread
From: Simon Horman @ 2023-08-05  7:20 UTC (permalink / raw)
  To: Nick Child; +Cc: netdev, haren, ricklind, danymadden, tlfalcon, bjking1

On Thu, Aug 03, 2023 at 03:20:10PM -0500, Nick Child wrote:
> If a login request fails, the recovery process should be protected
> against parallel resets. It is a known issue that freeing and
> registering CRQ's in quick succession can result in a failover CRQ from
> the VIOS. Processing a failover during login recovery is dangerous for
> two reasons:
>  1. This will result in two parallel initialization processes, this can
>  cause serious issues during login.
>  2. It is possible that the failover CRQ is received but never executed.
>  We get notified of a pending failover through a transport event CRQ.
>  The reset is not performed until a INIT CRQ request is received.
>  Previously, if CRQ init fails during login recovery, then the ibmvnic
>  irq is freed and the login process returned error. If failover_pending
>  is true (a transport event was received), then the ibmvnic device
>  would never be able to process the reset since it cannot receive the
>  CRQ_INIT request due to the irq being freed. This leaved the device
>  in a inoperable state.
> 
> Therefore, the login failure recovery process must be hardened against
> these possible issues. Possible failovers (due to quick CRQ free and
> init) must be avoided and any issues during re-initialization should be
> dealt with instead of being propagated up the stack. This logic is
> similar to that of ibmvnic_probe().
> 
> Fixes: dff515a3e71d ("ibmvnic: Harden device login requests")
> Signed-off-by: Nick Child <nnac123@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets
  2023-08-03 20:20 ` [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets Nick Child
  2023-08-05  7:20   ` Simon Horman
@ 2023-08-08  2:13   ` Jakub Kicinski
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2023-08-08  2:13 UTC (permalink / raw)
  To: Nick Child; +Cc: netdev, haren, ricklind, danymadden, tlfalcon, bjking1

On Thu,  3 Aug 2023 15:20:10 -0500 Nick Child wrote:
> +			do {
> +				reinit_init_done(adapter);
> +				/* Clear any failovers we got in the previous
> +				 * pass since we are re-initializing the CRQ
> +				 */
> +				adapter->failover_pending = false;
> +				release_crq_queue(adapter);
> +				/* If we don't sleep here then we risk an
> +				 * unnecessary failover event from the VIOS.
> +				 * This is a known VIOS issue caused by a vnic
> +				 * device freeing and registering a CRQ too
> +				 * quickly.
> +				 */
> +				msleep(1500);
> +				/* Avoid any resets, since we are currently
> +				 * resetting.
> +				 */
> +				spin_lock_irqsave(&adapter->rwi_lock, flags);
> +				flush_reset_queue(adapter);
> +				spin_unlock_irqrestore(&adapter->rwi_lock,
> +						       flags);
> +
> +				rc = init_crq_queue(adapter);
> +				if (rc) {
> +					netdev_err(netdev, "login recovery: init CRQ failed %d\n",
> +						   rc);
> +					return -EIO;
> +				}
>  
> -			rc = ibmvnic_reset_init(adapter, false);
> -			if (rc) {
> -				netdev_err(netdev, "login recovery: Reset init failed %d\n",
> -					   rc);
> -				return -EIO;
> -			}
> +				rc = ibmvnic_reset_init(adapter, false);
> +				if (rc)
> +					netdev_err(netdev, "login recovery: Reset init failed %d\n",
> +						   rc);
> +				/* IBMVNIC_CRQ_INIT will return EAGAIN if it
> +				 * fails, since ibmvnic_reset_init will free
> +				 * irq's in failure, we won't be able to receive
> +				 * new CRQs so we need to keep trying. probe()
> +				 * handles this similarly.
> +				 */
> +			} while (rc == -EAGAIN);

Isn't this potentially an infinite loop? Can we limit the max number of
iterations here or something already makes this loop safe?

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-08-08  2:13 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-03 20:20 [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Nick Child
2023-08-03 20:20 ` [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail Nick Child
2023-08-05  7:19   ` Simon Horman
2023-08-03 20:20 ` [PATCH net 3/5] ibmvnic: Handle DMA unmapping of login buffs in release functions Nick Child
2023-08-05  7:19   ` Simon Horman
2023-08-03 20:20 ` [PATCH net 4/5] ibmvnic: Do partial reset on login failure Nick Child
2023-08-05  7:20   ` Simon Horman
2023-08-03 20:20 ` [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets Nick Child
2023-08-05  7:20   ` Simon Horman
2023-08-08  2:13   ` Jakub Kicinski
2023-08-05  7:18 ` [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).