From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason Gunthorpe
Subject: Re: [PATCH 3/5] mm, notifier: Catch sleeping/blocking for !blockable
Date: Thu, 15 Aug 2019 09:35:56 -0300
Message-ID: <20190815123556.GB21596@ziepe.ca>
References: <20190814202027.18735-1-daniel.vetter@ffwll.ch> <20190814202027.18735-4-daniel.vetter@ffwll.ch> <20190815000029.GC11200@ziepe.ca> <20190815070249.GB7444@phenom.ffwll.local>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <20190815070249.GB7444@phenom.ffwll.local>
Sender: linux-kernel-owner@vger.kernel.org
To: LKML , linux-mm@kvack.org, DRI Development , Intel Graphics Development , Andrew Morton , Michal Hocko , David Rientjes , Christian =?utf-8?B?S8O2bmln?= , =?utf-8?B?SsOpcsO0bWU=?= Glisse , Daniel Vetter
List-Id: intel-gfx@lists.freedesktop.org

On Thu, Aug 15, 2019 at 09:02:49AM +0200, Daniel Vetter wrote:
> On Wed, Aug 14, 2019 at 09:00:29PM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 14, 2019 at 10:20:25PM +0200, Daniel Vetter wrote:
> > > We need to make sure implementations don't cheat and don't have a
> > > possible schedule/blocking point deeply burried where review can't
> > > catch it.
> > >
> > > I'm not sure whether this is the best way to make sure all the
> > > might_sleep() callsites trigger, and it's a bit ugly in the code flow.
> > > But it gets the job done.
> > >
> > > Inspired by an i915 patch series which did exactly that, because the
> > > rules haven't been entirely clear to us.
> >
> > I thought lockdep already was able to detect:
> >
> >     spin_lock()
> >     might_sleep();
> >     spin_unlock()
> >
> > Am I mistaken? If yes, couldn't this patch just inject a dummy lockdep
> > spinlock?
>
> Hm ... assuming I didn't get lost in the maze I think might_sleep (well
> ___might_sleep) doesn't do any lockdep checking at all. And we want
> might_sleep, since that catches a lot more than lockdep.

Don't know how it works, but it sure looks like it does:

This:

        spin_lock(&file->uobjects_lock);
        down_read(&file->hw_destroy_rwsem);
        up_read(&file->hw_destroy_rwsem);
        spin_unlock(&file->uobjects_lock);

Causes:

[   33.324729] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1444
[   33.325599] in_atomic(): 1, irqs_disabled(): 0, pid: 247, name: ibv_devinfo
[   33.326115] 3 locks held by ibv_devinfo/247:
[   33.326556]  #0: 000000009edf8379 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_open+0xff/0x5f0 [ib_uverbs]
[   33.327657]  #1: 000000005e0eddf1 (&uverbs_dev->lists_mutex){+.+.}, at: ib_uverbs_open+0x16c/0x5f0 [ib_uverbs]
[   33.328682]  #2: 00000000505f509e (&(&file->uobjects_lock)->rlock){+.+.}, at: ib_uverbs_open+0x31a/0x5f0 [ib_uverbs]

And this:

        spin_lock(&file->uobjects_lock);
        might_sleep();
        spin_unlock(&file->uobjects_lock);

Causes:

[   16.867211] BUG: sleeping function called from invalid context at drivers/infiniband/core/uverbs_main.c:1095
[   16.867776] in_atomic(): 1, irqs_disabled(): 0, pid: 245, name: ibv_devinfo
[   16.868098] 3 locks held by ibv_devinfo/245:
[   16.868383]  #0: 000000004c5954ff (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_open+0xf8/0x600 [ib_uverbs]
[   16.868938]  #1: 0000000020a6fae2 (&uverbs_dev->lists_mutex){+.+.}, at: ib_uverbs_open+0x16c/0x600 [ib_uverbs]
[   16.869568]  #2: 00000000036e6a97 (&(&file->uobjects_lock)->rlock){+.+.}, at: ib_uverbs_open+0x317/0x600 [ib_uverbs]

I think this is done in some very expensive way, so it probably only
works when lockdep is enabled..

Jason
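
For illustration only, here is a rough userspace model of one way both
splats above could come about. It assumes the check is driven by a
preempt count that spin_lock() raises, and not by lockdep itself. As far
as I understand, the real check sits behind CONFIG_DEBUG_ATOMIC_SLEEP,
and the "locks held" list is what lockdep adds when it is available.
Every model_* name is made up for this sketch; none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-task preempt count. */
static int model_preempt_count;
/* How many times the toy check below has fired. */
static int model_splat_count;

/* Assumption: taking a spinlock raises the count, dropping it lowers it. */
static void model_spin_lock(void)
{
	model_preempt_count++;
}

static void model_spin_unlock(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Rough analogue of in_atomic(): true while any model spinlock is held. */
static bool model_in_atomic(void)
{
	return model_preempt_count != 0;
}

/* Toy might_sleep(): complain, naming the call site, if atomic. */
static void model_might_sleep(const char *file, int line)
{
	if (!model_in_atomic())
		return;
	model_splat_count++;
	fprintf(stderr,
		"BUG: sleeping function called from invalid context at %s:%d\n",
		file, line);
	fprintf(stderr, "in_atomic(): %d\n", model_in_atomic());
}

#define MODEL_MIGHT_SLEEP() model_might_sleep(__FILE__, __LINE__)

/* A sleeping lock starts with the same check, as down_read() does. */
static void model_down_read(void)
{
	MODEL_MIGHT_SLEEP();
}

/* Replays both cases from the mail; returns how many checks fired. */
int model_run_both_cases(void)
{
	model_splat_count = 0;

	/* Case 1: rwsem taken under a spinlock. */
	model_spin_lock();
	model_down_read();
	model_spin_unlock();

	/* Case 2: bare might_sleep() under a spinlock. */
	model_spin_lock();
	MODEL_MIGHT_SLEEP();
	model_spin_unlock();

	/* Control: no lock held, so the check stays quiet. */
	MODEL_MIGHT_SLEEP();

	return model_splat_count;
}
```

In this model the reported location is wherever the check itself sits,
which would fit the first splat pointing into rwsem.c and the second at
the might_sleep() line in uverbs_main.c.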