From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jeremy Fitzhardinge <jeremy@goop.org>
Subject: Re: Re: de-BKLing blkfront
Date: Wed, 21 Jul 2010 16:32:13 -0700
Message-ID: <4C47837D.2050609@goop.org>
References: <4C474F5D.80204@goop.org>	
	<1279746741.3752.377.camel@agari.van.xensource.com>	
	<4C4768F4.2020602@goop.org>
	<1279751211.3752.501.camel@agari.van.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <1279751211.3752.501.camel@agari.van.xensource.com>
List-Unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Daniel Stodden <daniel.stodden@citrix.com>
Cc: "Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>, Alexander McKinney <armckinn@mckinneys.ca>, Tom Kopec <tek@acm.org>
List-Id: xen-devel@lists.xenproject.org

On 07/21/2010 03:26 PM, Daniel Stodden wrote:
> On Wed, 2010-07-21 at 17:39 -0400, Jeremy Fitzhardinge wrote:
>   
>> On 07/21/2010 02:12 PM, Daniel Stodden wrote:
>>     
>>> On Wed, 2010-07-21 at 15:49 -0400, Jeremy Fitzhardinge wrote:
>>>   
>>>       
>>>> When I was preparing the latest set of blkfront patches to send upstream
>>>> to Jens Axboe, he pointed out there were conflicts with what he
>>>> currently has queued.
>>>>
>>>> It turns out the conflict was from pushing the BKL (lock/unlock_kernel)
>>>> into the open and release functions.  I did the merge keeping them
>>>> around all the new stuff you added to those functions, but I wonder if
>>>> its actually necessary.  Do we rely on open/release being globally
>>>> serialized in there?
>>>>
>>>> I've pushed what I have into the upstream/blkfront branch in xen.git.
>>>>     
>>>>         
>>> Whatever it was, a BLK presumably fixed it.
>>>   
>>>       
>> There's an ongoing project to remove the BKL; part of that is to remove
>> implicit use of the BKL from the core kernel and push uses down to
>> drivers which need it.  That basically means mechanically adding
>> lock_kernel/unlock_kernel pairs to driver functions as they're removed
>> from the core kernel.  blkfront got hit with that at some point (haven't
>> identified precisely where), so we have the option to remove those
>> lock/unlocks if they're not necessary.
>>     
> Ah, understood. But for a while I used to dig through bdev open/release
> stuff quite regularly. I never was aware of any potential big lock on
> that path to push down.
>
> Do you think it's worth to look into the lock usage now removed in more
> detail? Would take a hint regarding which code was affected.
>
>   
>>> Anyway, it should not be necessary any more.
>>>
>>> Next I made the common mistake of looking into my code again, and
>>> immediately found I would send an unlucky opener transiently spinning in
>>> blkdev_get. Sometimes I wonder what I'm thinking.
>>>   
>>>       
>> What's the effect of that?
>>     
> Uhm, false alarm. Better just ignore me.
>
> I stuck that ERESTARTSYS stuff into blkif_open. It's not strictly
> needed, but looked like a good idea. I picked it up from the device
> mapper code.
>
> It's telling the block layer to retry if it managed to get a ref on a
> disk which just happened to be del_gendisk'd on a different thread. So
> it will retry and either succeed with a new disk, or fail right on the
> next attempt.
>
> I would have bet there would be a case where blkfront clears
> disk->private_data before removing the disk. But apparently not.
> Otherwise a thread hitting that phase would peg the CPU until the
> removal succeeded elsewhere. 
>
> Sorry for the noise.
>
>   
>>> Hold on for a patch to both.
>>>       
> So rather not. As far as I can tell, feel free to just drop the bkl
> code.
>
>   
>> Thanks.  BTW, did you get a change to look into those barrier comments?
>>     
> Nope, sorry. In fact I dropped that one. Getting more urgent?
>   

Well, it would be nice to address it for the next merge window, if
there's something to address.

There's also a mysterious IO-related hang I've been seeing moderately
frequently.  Something will end up waiting on IO while holding a lock,
and eventually the system grinds to a halt as things want to do block IO.

What appears to be happening is that an IO request is getting lost
somewhere, I think, either because blkfront is dropping it or perhaps
blkback/tap (but this has been observed on 2.6.18-based systems too, so
I suspect its a frontend issue.  Perhaps an IO really getting lost, or
perhaps its just an event; I'm not sure.  Or maybe there's something
screwy with barriers.

Tom Kopek and Alexander McKinney seem to be able to reproduce it
reliably, and have a blocktrace of a hang showing a bunch of outstanding IO.

Anyway, I'd been meaning to mention it to you before...

    J