Linux-HyperV List
 help / color / mirror / Atom feed
From: Praveen Kumar <kumarpraveen@linux.microsoft.com>
To: souradeep chakrabarti <schakrabarti@linux.microsoft.com>,
	kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, longli@microsoft.com,
	sharmaajay@microsoft.com, leon@kernel.org, cai.huoqing@linux.dev,
	ssengar@linux.microsoft.com, vkuznets@redhat.com,
	tglx@linutronix.de, linux-hyperv@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org
Cc: stable@vger.kernel.org, schakrabarti@microsoft.com
Subject: Re: [PATCH 2/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive
Date: Mon, 26 Jun 2023 19:43:07 +0530	[thread overview]
Message-ID: <578faf91-35e6-d946-d9ec-c949e57eadef@linux.microsoft.com> (raw)
In-Reply-To: <1687771224-27162-1-git-send-email-schakrabarti@linux.microsoft.com>

On 6/26/2023 2:50 PM, souradeep chakrabarti wrote:
> From: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
> 
> This is the second part of the fix.
> 
> Also this patch adds a new attribute in mana_context, which gets set when
> mana_hwc_send_request() hits a timeout because of host unresponsiveness.
> This flag then helps to avoid the timeouts in successive calls.
> 
> Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a driver for
> Microsoft Azure Network Adapter)
> Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
> ---
> V2 -> V3:
> * Removed the initialization of vf_unload_timeout
> * Splitted the patch in two.
> * Fixed extra space from the commit message.
> ---
>  drivers/net/ethernet/microsoft/mana/gdma_main.c  |  4 +++-
>  drivers/net/ethernet/microsoft/mana/hw_channel.c | 12 +++++++++++-
>  include/net/mana/mana.h                          |  2 ++
>  3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 8f3f78b68592..6411f01be0d9 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -946,10 +946,12 @@ int mana_gd_deregister_device(struct gdma_dev *gd)
>  	struct gdma_context *gc = gd->gdma_context;
>  	struct gdma_general_resp resp = {};
>  	struct gdma_general_req req = {};
> +	struct mana_context *ac;
>  	int err;
>  
>  	if (gd->pdid == INVALID_PDID)
>  		return -EINVAL;
> +	ac = gd->driver_data;
>  
>  	mana_gd_init_req_hdr(&req.hdr, GDMA_DEREGISTER_DEVICE, sizeof(req),
>  			     sizeof(resp));
> @@ -957,7 +959,7 @@ int mana_gd_deregister_device(struct gdma_dev *gd)
>  	req.hdr.dev_id = gd->dev_id;
>  
>  	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
> -	if (err || resp.hdr.status) {
> +	if ((err || resp.hdr.status) && !ac->vf_unload_timeout) {
>  		dev_err(gc->dev, "Failed to deregister device: %d, 0x%x\n",
>  			err, resp.hdr.status);

With !ac->vf_unload_timeout option, this message may not be correctly showing err, status. Probably you want to add explicit information during timeouts so that it give right information ? Or have the err, status field properly updated.

>  		if (!err)
> diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c b/drivers/net/ethernet/microsoft/mana/hw_channel.c
> index 9d1507eba5b9..492cb2c6e2cb 100644
> --- a/drivers/net/ethernet/microsoft/mana/hw_channel.c
> +++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c
> @@ -1,8 +1,10 @@
>  // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
>  /* Copyright (c) 2021, Microsoft Corporation. */
>  
> +#include "asm-generic/errno.h"
>  #include <net/mana/gdma.h>
>  #include <net/mana/hw_channel.h>
> +#include <net/mana/mana.h>
>  
>  static int mana_hwc_get_msg_index(struct hw_channel_context *hwc, u16 *msg_id)
>  {
> @@ -786,12 +788,19 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len,
>  	struct hwc_wq *txq = hwc->txq;
>  	struct gdma_req_hdr *req_msg;
>  	struct hwc_caller_ctx *ctx;
> +	struct mana_context *ac;
>  	u32 dest_vrcq = 0;
>  	u32 dest_vrq = 0;
>  	u16 msg_id;
>  	int err;
>  
>  	mana_hwc_get_msg_index(hwc, &msg_id);
> +	ac = hwc->gdma_dev->driver_data;

Is there a case where gdma_dev be invalid here ? If so, lets check the state and then proceed further ?

> +	if (ac->vf_unload_timeout) {
> +		dev_err(hwc->dev, "HWC: vport is already unloaded.\n");
> +		err = -ETIMEDOUT;
> +		goto out;
> +	}
>  
>  	tx_wr = &txq->msg_buf->reqs[msg_id];
>  
> @@ -825,9 +834,10 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len,
>  		goto out;
>  	}
>  
> -	if (!wait_for_completion_timeout(&ctx->comp_event, 30 * HZ)) {
> +	if (!wait_for_completion_timeout(&ctx->comp_event, 5 * HZ)) {

IMHO we should have macros instead of magic numbers (5 , 30 or so). But would like others to comment here.

>  		dev_err(hwc->dev, "HWC: Request timed out!\n");
>  		err = -ETIMEDOUT;
> +		ac->vf_unload_timeout = true;
>  		goto out;
>  	}
>  
> diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
> index 9eef19972845..5f5affdca1eb 100644
> --- a/include/net/mana/mana.h
> +++ b/include/net/mana/mana.h
> @@ -358,6 +358,8 @@ struct mana_context {
>  
>  	u16 num_ports;
>  
> +	bool vf_unload_timeout;
> +
>  	struct mana_eq *eqs;
>  
>  	struct net_device *ports[MAX_PORTS_IN_MANA_DEV];


  reply	other threads:[~2023-06-26 14:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-26  9:17 [PATCH 0/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive souradeep chakrabarti
2023-06-26  9:18 ` [PATCH 1/2 " souradeep chakrabarti
2023-06-26  9:18   ` souradeep chakrabarti
2023-06-26 14:20     ` Praveen Kumar
2023-06-27  8:44       ` Souradeep Chakrabarti
2023-06-26 15:35     ` Michael Kelley (LINUX)
2023-06-26  9:20   ` [PATCH 2/2 " souradeep chakrabarti
2023-06-26 14:13     ` Praveen Kumar [this message]
2023-06-26 20:32       ` Haiyang Zhang
2023-06-27  8:42       ` Souradeep Chakrabarti
2023-06-26 15:53     ` Haiyang Zhang
2023-06-26 16:02       ` Haiyang Zhang
2023-06-27  8:45       ` Souradeep Chakrabarti
2023-06-26 13:05   ` [PATCH 1/2 " Simon Horman
2023-06-26 20:06     ` Dexuan Cui
2023-06-26 20:47       ` Stephen Hemminger
2023-06-27  8:35         ` Souradeep Chakrabarti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=578faf91-35e6-d946-d9ec-c949e57eadef@linux.microsoft.com \
    --to=kumarpraveen@linux.microsoft.com \
    --cc=cai.huoqing@linux.dev \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=edumazet@google.com \
    --cc=haiyangz@microsoft.com \
    --cc=kuba@kernel.org \
    --cc=kys@microsoft.com \
    --cc=leon@kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=schakrabarti@linux.microsoft.com \
    --cc=schakrabarti@microsoft.com \
    --cc=sharmaajay@microsoft.com \
    --cc=ssengar@linux.microsoft.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox