From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeremy Fitzhardinge Subject: Re: Re: de-BKLing blkfront Date: Wed, 21 Jul 2010 16:32:13 -0700 Message-ID: <4C47837D.2050609@goop.org> References: <4C474F5D.80204@goop.org> <1279746741.3752.377.camel@agari.van.xensource.com> <4C4768F4.2020602@goop.org> <1279751211.3752.501.camel@agari.van.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <1279751211.3752.501.camel@agari.van.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Daniel Stodden Cc: "Xen-devel@lists.xensource.com" , Alexander McKinney , Tom Kopec List-Id: xen-devel@lists.xenproject.org On 07/21/2010 03:26 PM, Daniel Stodden wrote: > On Wed, 2010-07-21 at 17:39 -0400, Jeremy Fitzhardinge wrote: > >> On 07/21/2010 02:12 PM, Daniel Stodden wrote: >> >>> On Wed, 2010-07-21 at 15:49 -0400, Jeremy Fitzhardinge wrote: >>> >>> >>>> When I was preparing the latest set of blkfront patches to send upstream >>>> to Jens Axboe, he pointed out there were conflicts with what he >>>> currently has queued. >>>> >>>> It turns out the conflict was from pushing the BKL (lock/unlock_kernel) >>>> into the open and release functions. I did the merge keeping them >>>> around all the new stuff you added to those functions, but I wonder if >>>> its actually necessary. Do we rely on open/release being globally >>>> serialized in there? >>>> >>>> I've pushed what I have into the upstream/blkfront branch in xen.git. >>>> >>>> >>> Whatever it was, a BLK presumably fixed it. >>> >>> >> There's an ongoing project to remove the BKL; part of that is to remove >> implicit use of the BKL from the core kernel and push uses down to >> drivers which need it. That basically means mechanically adding >> lock_kernel/unlock_kernel pairs to driver functions as they're removed >> from the core kernel. blkfront got hit with that at some point (haven't >> identified precisely where), so we have the option to remove those >> lock/unlocks if they're not necessary. >> > Ah, understood. But for a while I used to dig through bdev open/release > stuff quite regularly. I never was aware of any potential big lock on > that path to push down. > > Do you think it's worth to look into the lock usage now removed in more > detail? Would take a hint regarding which code was affected. > > >>> Anyway, it should not be necessary any more. >>> >>> Next I made the common mistake of looking into my code again, and >>> immediately found I would send an unlucky opener transiently spinning in >>> blkdev_get. Sometimes I wonder what I'm thinking. >>> >>> >> What's the effect of that? >> > Uhm, false alarm. Better just ignore me. > > I stuck that ERESTARTSYS stuff into blkif_open. It's not strictly > needed, but looked like a good idea. I picked it up from the device > mapper code. > > It's telling the block layer to retry if it managed to get a ref on a > disk which just happened to be del_gendisk'd on a different thread. So > it will retry and either succeed with a new disk, or fail right on the > next attempt. > > I would have bet there would be a case where blkfront clears > disk->private_data before removing the disk. But apparently not. > Otherwise a thread hitting that phase would peg the CPU until the > removal succeeded elsewhere. > > Sorry for the noise. > > >>> Hold on for a patch to both. >>> > So rather not. As far as I can tell, feel free to just drop the bkl > code. > > >> Thanks. BTW, did you get a change to look into those barrier comments? >> > Nope, sorry. In fact I dropped that one. Getting more urgent? > Well, it would be nice to address it for the next merge window, if there's something to address. There's also a mysterious IO-related hang I've been seeing moderately frequently. Something will end up waiting on IO while holding a lock, and eventually the system grinds to a halt as things want to do block IO. What appears to be happening is that an IO request is getting lost somewhere, I think, either because blkfront is dropping it or perhaps blkback/tap (but this has been observed on 2.6.18-based systems too, so I suspect its a frontend issue. Perhaps an IO really getting lost, or perhaps its just an event; I'm not sure. Or maybe there's something screwy with barriers. Tom Kopek and Alexander McKinney seem to be able to reproduce it reliably, and have a blocktrace of a hang showing a bunch of outstanding IO. Anyway, I'd been meaning to mention it to you before... J