From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1426385AbcFHOrt (ORCPT <rfc822;w@1wt.eu>);
	Wed, 8 Jun 2016 10:47:49 -0400
Received: from userp1040.oracle.com ([156.151.31.81]:36315 "EHLO
	userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1423165AbcFHOrq (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Wed, 8 Jun 2016 10:47:46 -0400
Date: Wed, 8 Jun 2016 10:47:39 -0400
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Bob Liu <bob.liu@oracle.com>
Cc: linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org,
        roger.pau@citrix.com
Subject: Re: [PATCH 1/2] xen-blkfront: don't call talk_to_blkback when
 already connected to blkback
Message-ID: <20160608144739.GD17027@char.us.oracle.com>
References: <1464685157-30738-1-git-send-email-bob.liu@oracle.com>
 <20160531203307.GC23808@char.us.oracle.com>
 <574E7763.9060001@oracle.com>
 <20160607152524.GA10281@localhost.localdomain>
 <5757BF4E.9080307@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5757BF4E.9080307@oracle.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Source-IP: userv0022.oracle.com [156.151.31.74]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 08, 2016 at 02:46:38PM +0800, Bob Liu wrote:
> 
> On 06/07/2016 11:25 PM, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jun 01, 2016 at 01:49:23PM +0800, Bob Liu wrote:
> >>
> >> On 06/01/2016 04:33 AM, Konrad Rzeszutek Wilk wrote:
> >>> On Tue, May 31, 2016 at 04:59:16PM +0800, Bob Liu wrote:
> >>>> Sometimes blkfont may receive twice blkback_changed() notification after
> >>>> migration, then talk_to_blkback() will be called twice too and confused
> >>>> xen-blkback.
> >>>
> >>> Could you enlighten the patch description by having some form of
> >>> state transition here? I am curious how you got the frontend
> >>> to get in XenbusStateConnected (via blkif_recover right) and then
> >>> the backend triggering the update once more?
> >>>
> >>> Or is just a simple race - the backend moves from XenbusStateConnected->
> >>> XenbusStateConnected - which retriggers the frontend to hit in
> >>> blkback_changed the XenbusStateConnected state and go in there?
> >>> (That would be in conenct_ring changing the state). But I don't
> >>> see how the frontend_changed code get there as we have:
> >>>
> >>>  770                 /*
> >>>  771                  * Ensure we connect even when two watches fire in
> >>>  772                  * close succession and we miss the intermediate value
> >>>  773                  * of frontend_state.
> >>>  774                  */
> >>>  775                 if (dev->state == XenbusStateConnected)
> >>>  776                         break;
> >>>  777 
> >>>
> >>> ?
> >>>
> >>> Now what about 'blkfront_connect' being called on the second time?
> >>>
> >>> Ah, info->connected is probably by then in BLKIF_STATE_CONNECTED
> >>> (as blkif_recover changed) and we just reread the size of the disk.
> >>>
> >>> Is that how about the flow goes?
> >>
> >> 	blkfront 					blkback
> >> blkfront_resume()                                   
> >>  > talk_to_blkback()
> >>   > Set blkfront to XenbusStateInitialised
> >> 						Front changed()
> >>   						 > Connect()
> >> 						  > Set blkback to XenbusStateConnected
> >>
> >> blkback_changed()
> >>  > Skip talk_to_blkback()
> >>    because frontstate == XenbusStateInitialised
> >>  > blkfront_connect()
> >>   > Set blkfront to XenbusStateConnected
> >>
> >>
> >> ------------------------------------------------------------------
> >> But sometimes blkfront receives
> >> blkback_changed() event more than once!
> > 
> > I think I know why. The udev scripts that get invoked when when
> > we attach a disk are a bit custom. As such I think they just
> > revalidate the size leading to this.
> > 
> > And this 'poke-at-XenbusStateConnected' state multiple times
> > is allowed. It is used to signal disk changes (or just to revalidate).
> > Hence it does not matter why really - we need to deal with this.
> > 
> > I modified your patch a bit and are testing it:
> > 
> 
> Looks much better, thank you very much!

Great! I also had it tested overnight and there was no hitch will send it
out soon.