From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 4 Jan 2017 21:50:52 +0100
From: Greg KH <greg@kroah.com>
To: Long Li <longli@exchange.microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>,
        Haiyang Zhang <haiyangz@microsoft.com>, devel@linuxdriverproject.org,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Retry infinitely for hypercall
Message-ID: <20170104205052.GA17747@kroah.com>
References: <1483569571-26024-1-git-send-email-longli@exchange.microsoft.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1483569571-26024-1-git-send-email-longli@exchange.microsoft.com>
User-Agent: Mutt/1.7.2 (2016-11-26)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 04, 2017 at 02:39:31PM -0800, Long Li wrote:
> From: Long Li <longli@microsoft.com>
> 
> Hyper-v host guarantees that a hypercall will succeed. Retry infinitely to avoid returning transient failures to upper layer.

Please wrap your changelog text at the proper column (75 columns or
fewer) instead of leaving it as one long line.

And what happens when the hypercall does not succeed?  How is the kernel
going to recover from that?

> 
> Signed-off-by: Long Li <longli@microsoft.com>
> ---
>  drivers/hv/connection.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 6ce8b87..4bcb099 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -439,7 +439,6 @@ int vmbus_post_msg(void *buffer, size_t buflen)
>  {
>  	union hv_connection_id conn_id;
>  	int ret = 0;
> -	int retries = 0;
>  	u32 usec = 1;
>  
>  	conn_id.asu32 = 0;
> @@ -447,10 +446,10 @@ int vmbus_post_msg(void *buffer, size_t buflen)
>  
>  	/*
>  	 * hv_post_message() can have transient failures because of
> -	 * insufficient resources. Retry the operation a couple of
> -	 * times before giving up.
> +	 * insufficient resources. We retry infinitely on these failures
> +	 * because host guarantees hypercall will eventually succeed.
>  	 */
> -	while (retries < 20) {
> +	while (1) {
>  		ret = hv_post_message(conn_id, 1, buffer, buflen);
>  
>  		switch (ret) {
> @@ -459,11 +458,11 @@ int vmbus_post_msg(void *buffer, size_t buflen)
>  			 * We could get this if we send messages too
>  			 * frequently.
>  			 */
> -			ret = -EAGAIN;
> -			break;

Please document that you are falling through here, otherwise someone
will "fix" this later.
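
For illustration, a minimal user-space sketch of what a documented
fall-through could look like. The enum names and the should_retry()
helper are hypothetical stand-ins, not the real HV_STATUS_* codes or
anything in drivers/hv; the "fall through" comment is the form that
GCC's -Wimplicit-fallthrough recognizes.

```c
#include <errno.h>

/* Hypothetical status codes; stand-ins for the real HV_STATUS_* values. */
enum sketch_status {
	SKETCH_SUCCESS = 0,
	SKETCH_TOO_FREQUENT,
	SKETCH_NO_MEMORY,
	SKETCH_NO_BUFFERS,
	SKETCH_OTHER,
};

/* Returns 1 if the caller should retry, 0 on success, -EINVAL otherwise. */
static int should_retry(int status)
{
	switch (status) {
	case SKETCH_TOO_FREQUENT:
		/*
		 * We could get this if we send messages too
		 * frequently.
		 */
		/* fall through */
	case SKETCH_NO_MEMORY:
	case SKETCH_NO_BUFFERS:
		/* Temporary failure, host is out of resources. */
		return 1;
	case SKETCH_SUCCESS:
		return 0;
	default:
		return -EINVAL;
	}
}
```

With the comment in place, a later reader can tell that the missing
"break" is deliberate and not a bug to be patched.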

>  		case HV_STATUS_INSUFFICIENT_MEMORY:
>  		case HV_STATUS_INSUFFICIENT_BUFFERS:
> -			ret = -ENOMEM;
> +			/*
> +			 * Temporary failure out of resources
> +			 */
>  			break;
>  		case HV_STATUS_SUCCESS:
>  			return ret;
> @@ -472,12 +471,12 @@ int vmbus_post_msg(void *buffer, size_t buflen)
>  			return -EINVAL;
>  		}
>  
> -		retries++;
>  		udelay(usec);
>  		if (usec < 2048)
>  			usec *= 2;
>  	}
> -	return ret;
> +	/* Impossible to get here */
> +	BUG_ON(1);

If it is impossible, why do you have this line at all?

What is this trying to solve?  Do you need to increase the time spent
waiting?  We all know things break, please allow the kernel to stay
alive if at all possible.
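
As a rough sketch of the alternative, the loop can keep the exponential
backoff but stay bounded, so that a failure is reported to the caller
and no BUG_ON() is needed. This is a user-space model under stated
assumptions: post_msg_bounded(), the post_fn/delay_fn callbacks, the
fake host, and the status names are all hypothetical, and max_retries
is a knob the caller picks. It is not the real vmbus_post_msg().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical status codes; stand-ins for the real HV_STATUS_* values. */
enum sketch_status {
	SKETCH_SUCCESS = 0,
	SKETCH_TOO_FREQUENT,
	SKETCH_NO_MEMORY,
	SKETCH_NO_BUFFERS,
	SKETCH_OTHER,
};

typedef int (*post_fn)(void *buffer, size_t buflen);
typedef void (*delay_fn)(unsigned int usec);

/*
 * Retry post() with exponential backoff (capped at 2048 usec), but give
 * up after max_retries attempts and return an error instead of looping
 * forever.
 */
static int post_msg_bounded(post_fn post, delay_fn delay,
			    void *buffer, size_t buflen, int max_retries)
{
	unsigned int usec = 1;
	int ret = -EAGAIN;
	int retries;

	for (retries = 0; retries < max_retries; retries++) {
		switch (post(buffer, buflen)) {
		case SKETCH_TOO_FREQUENT:
			ret = -EAGAIN;
			break;
		case SKETCH_NO_MEMORY:
		case SKETCH_NO_BUFFERS:
			ret = -ENOMEM;
			break;
		case SKETCH_SUCCESS:
			return 0;
		default:
			return -EINVAL;
		}

		delay(usec);
		if (usec < 2048)
			usec *= 2;
	}
	return ret;	/* still failing: let the caller decide what to do */
}

/* Fake host for experiments: fails the first fake_fail_first calls. */
static int fake_fail_first;
static int fake_calls;

static int fake_post(void *buffer, size_t buflen)
{
	(void)buffer;
	(void)buflen;
	fake_calls++;
	return fake_calls <= fake_fail_first ? SKETCH_NO_BUFFERS
					     : SKETCH_SUCCESS;
}

/* Delay stub so the sketch does not actually sleep. */
static void no_delay(unsigned int usec)
{
	(void)usec;
}
```

If 20 attempts really are too few, raising max_retries (or the backoff
cap) lengthens the wait while still leaving an error path when the host
never answers.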

thanks,

greg k-h