* [PATCH net-next] ibmvnic: Return error code on TX scrq flush fail
@ 2024-04-11 20:34 Nick Child
From: Nick Child @ 2024-04-11 20:34 UTC
To: netdev; +Cc: haren, ricklind, mmc, Nick Child
In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then
it will inform upper level networking functions to disable tx
queues. H_CLOSED signals that the connection with the vnic server is
down and a transport event is expected to recover the device.

Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success.
Therefore, the queues would remain active until ibmvnic_cleanup() is
called within do_reset().

The problem is that do_reset() depends on the RTNL lock. If several
ibmvnic devices are resetting then there can be a long wait time until
the last device can grab the lock. During this time the tx/rx queues
still appear active to upper level functions.

FYI, we do make a call to netif_carrier_off() outside the RTNL lock but
its calls to dev_deactivate() are also dependent on the RTNL lock.

As a result, large amounts of retransmissions were observed in a short
period of time, eventually leading to ETIMEDOUT. This was specifically
seen with HNV devices, likely because of even more RTNL dependencies.

Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is
propagated to the xmit function to allow for an earlier (and lock-less)
response to a transport event.

Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 30c47b8470ad..f5177f370354 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2371,7 +2371,7 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
 		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
 	else
 		ind_bufp->index = 0;
-	return 0;
+	return rc;
 }
 
 static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
--
2.39.3
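
The effect described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver's code: every `model_*` name, the `MODEL_H_*` constants, and the `propagate_rc` switch are hypothetical stand-ins invented for illustration. It contrasts a flush that always reports success with one that returns the send result, and counts how many transmit attempts occur before the transmit path stops the queue once the connection is down.

```c
#include <stdbool.h>

/* Placeholder return codes for this model only (assumed values). */
#define MODEL_H_SUCCESS 0
#define MODEL_H_CLOSED  2

/* Hypothetical stand-in for one TX sub-CRQ and its state. */
struct model_queue {
	int pending;   /* descriptors batched but not yet sent */
	bool link_up;  /* false models a lost vnic server connection */
	bool stopped;  /* set once the xmit path has stopped the queue */
};

/* Models the send to the server: fails with "closed" when the link is down. */
static int model_send(const struct model_queue *q)
{
	return q->link_up ? MODEL_H_SUCCESS : MODEL_H_CLOSED;
}

/*
 * Models the flush. With propagate_rc == false it mimics the old
 * hard-coded "return 0"; with true it mimics "return rc".
 */
static int model_flush(struct model_queue *q, bool propagate_rc)
{
	int rc;

	if (!q->pending)
		return MODEL_H_SUCCESS;

	rc = model_send(q);
	q->pending = 0; /* the batch is dropped or reset either way */

	return propagate_rc ? rc : MODEL_H_SUCCESS;
}

/* Models the xmit path: stop the queue as soon as "closed" is seen. */
static int model_xmit(struct model_queue *q, bool propagate_rc)
{
	int rc;

	q->pending++;
	rc = model_flush(q, propagate_rc);
	if (rc == MODEL_H_CLOSED)
		q->stopped = true;

	return rc;
}

/* Counts transmit attempts on a dead link until the queue is stopped. */
int attempts_until_stopped(bool propagate_rc, int max_attempts)
{
	struct model_queue q = { .pending = 0, .link_up = false, .stopped = false };
	int n = 0;

	while (n < max_attempts && !q.stopped) {
		model_xmit(&q, propagate_rc);
		n++;
	}

	return n;
}
```

In this model, `attempts_until_stopped(false, 100)` never stops the queue and runs to the attempt limit, while `attempts_until_stopped(true, 100)` stops it on the first failed flush, with no lock involved.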
* Re: [PATCH net-next] ibmvnic: Return error code on TX scrq flush fail
@ 2024-04-14 10:23 ` Simon Horman
From: Simon Horman @ 2024-04-14 10:23 UTC
To: Nick Child; +Cc: netdev, haren, ricklind, mmc
On Thu, Apr 11, 2024 at 03:34:35PM -0500, Nick Child wrote:
> In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then
> it will inform upper level networking functions to disable tx
> queues. H_CLOSED signals that the connection with the vnic server is
> down and a transport event is expected to recover the device.
>
> Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success.
> Therefore, the queues would remain active until ibmvnic_cleanup() is
> called within do_reset().
>
> The problem is that do_reset() depends on the RTNL lock. If several
> ibmvnic devices are resetting then there can be a long wait time until
> the last device can grab the lock. During this time the tx/rx queues
> still appear active to upper level functions.
>
> FYI, we do make a call to netif_carrier_off() outside the RTNL lock but
> its calls to dev_deactivate() are also dependent on the RTNL lock.
>
> As a result, large amounts of retransmissions were observed in a short
> period of time, eventually leading to ETIMEDOUT. This was specifically
> seen with HNV devices, likely because of even more RTNL dependencies.
>
> Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is
> propagated to the xmit function to allow for an earlier (and lock-less)
> response to a transport event.
>
> Signed-off-by: Nick Child <nnac123@linux.ibm.com>
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 30c47b8470ad..f5177f370354 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2371,7 +2371,7 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
>  		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
>  	else
>  		ind_bufp->index = 0;
> -	return 0;
> +	return rc;
>  }
>  
>  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)

Hi Nick,

I notice that in some, but not all, cases the return value of
ibmvnic_tx_scrq_flush() is not checked. Should that also be
addressed?