From: Greg KH <gregkh@suse.de>
To: KY Srinivasan <kys@microsoft.com>
Cc: Jiri Slaby <jirislaby@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"devel@linuxdriverproject.org" <devel@linuxdriverproject.org>,
	"virtualization@lists.osdl.org" <virtualization@lists.osdl.org>
Subject: Re: [PATCH 2/3]: Staging: hv: Use native wait primitives
Date: Tue, 15 Feb 2011 08:29:33 -0800
Message-ID: <20110215162933.GA30626@suse.de>
In-Reply-To: <FB42D5CCD7B5934EB1827DB5ED9B850E07085D7F@TK5EX14MBXC104.redmond.corp.microsoft.com>

On Tue, Feb 15, 2011 at 04:22:20PM +0000, KY Srinivasan wrote:
> 
> 
> > -----Original Message-----
> > From: Greg KH [mailto:gregkh@suse.de]
> > Sent: Tuesday, February 15, 2011 9:03 AM
> > To: KY Srinivasan
> > Cc: Jiri Slaby; linux-kernel@vger.kernel.org; devel@linuxdriverproject.org;
> > virtualization@lists.osdl.org
> > Subject: Re: [PATCH 2/3]: Staging: hv: Use native wait primitives
> > 
> > On Tue, Feb 15, 2011 at 01:35:56PM +0000, KY Srinivasan wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jiri Slaby [mailto:jirislaby@gmail.com]
> > > > Sent: Tuesday, February 15, 2011 4:21 AM
> > > > To: KY Srinivasan
> > > > Cc: gregkh@suse.de; linux-kernel@vger.kernel.org;
> > > > devel@linuxdriverproject.org; virtualization@lists.osdl.org
> > > > Subject: Re: [PATCH 2/3]: Staging: hv: Use native wait primitives
> > > >
> > > > On 02/11/2011 06:59 PM, K. Y. Srinivasan wrote:
> > > > > In preparation for getting rid of the osd layer; change
> > > > > the code to use native wait interfaces. As part of this,
> > > > > fixed the buggy implementation in the osd_wait_primitive
> > > > > where the condition was cleared potentially after the
> > > > > condition was signalled.
> > > > ...
> > > > > @@ -566,7 +567,11 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer,
> > > > >
> > > > >  		}
> > > > >  	}
> > > > > -	osd_waitevent_wait(msginfo->waitevent);
> > > > > +	wait_event_timeout(msginfo->waitevent,
> > > > > +				msginfo->wait_condition,
> > > > > +				msecs_to_jiffies(1000));
> > > > > +	BUG_ON(msginfo->wait_condition == 0);
> > > >
> > > > The added BUG_ONs all over the code look scary. These shouldn't be
> > > > BUG_ONs at all. You should maybe warn and bail out, but not kill the
> > > > whole machine.
> > >
> > > This is Linux code running as a guest on a Windows host; and so the guest cannot
> > > tolerate a failure of the host. In the cases where I have chosen to BUG_ON, there
> > > is no reasonable recovery possible when the host is non-functional (as determined
> > > by a non-responsive host).
> > 
> > If you have a non-responsive host, wouldn't that imply that this guest
> > code wouldn't run at all?  :)
> 
> The fact that on a particular transaction the host has not responded within an expected
> time interval does not necessarily mean that the guest code would not be running. There may be
> issues on the host side that may be either transient or permanent that may cause problems like
> this. Keep in mind, HyperV is a type 1 hypervisor that would schedule all VMs including the host
> and so, guest would get scheduled.
> 
> > 
> > Having BUG_ON() in drivers is not a good idea either way.  Please remove
> > these in future patches.
> 
> In situations where there is not a reasonable rollback strategy (for
> instance in one of the cases, we are granting access to the guest
> physical pages to the host) we really have only 2 options:
> 
> 1) Wait until the host responds. This wait could potentially be unbounded
> and in fact this was the way the code was to begin with. One of the reviewers
> had suggested that unbounded wait was to be corrected.
> 2) Wait for a specific period and if the host does not respond
> within a reasonable period, kill the guest since there is no recovery
> possible.

Killing the guest is a very serious thing, causing all sorts of possible
problems with it, right?

> I chose option 2, as part of addressing some of the prior review
> comments. If the consensus now is to go back to option 1, I am fine with that;

Unbounded waits aren't ok either, you need some sort of timeout.

But, as this is a bit preferable to dying, I suggest doing this, and
comment the heck out of it to explain all of this for anyone who reads
it.

thanks,

greg k-h
