From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jason Gunthorpe Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end() Date: Thu, 15 Aug 2019 11:37:59 -0300 Message-ID: <20190815143759.GG21596@ziepe.ca> References: <20190814202027.18735-1-daniel.vetter@ffwll.ch> <20190814202027.18735-3-daniel.vetter@ffwll.ch> <20190814134558.fe659b1a9a169c0150c3e57c@linux-foundation.org> <20190815084429.GE9477@dhcp22.suse.cz> <20190815130415.GD21596@ziepe.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org To: Daniel Vetter Cc: Michal Hocko , Andrew Morton , LKML , Linux MM , DRI Development , Intel Graphics Development , Peter Zijlstra , Ingo Molnar , David Rientjes , Christian =?utf-8?B?S8O2bmln?= , =?utf-8?B?SsOpcsO0bWU=?= Glisse , Masahiro Yamada , Wei Wang , Andy Shevchenko , Thomas Gleixner , Jann Horn , Feng Tang , Kees Cook Randy Dunlap List-Id: intel-gfx@lists.freedesktop.org On Thu, Aug 15, 2019 at 03:12:11PM +0200, Daniel Vetter wrote: > On Thu, Aug 15, 2019 at 3:04 PM Jason Gunthorpe wrote: > > > > On Thu, Aug 15, 2019 at 10:44:29AM +0200, Michal Hocko wrote: > > > > > As the oom reaper is the primary guarantee of the oom handling forward > > > progress it cannot be blocked on anything that might depend on blockable > > > memory allocations. These are not really easy to track because they > > > might be indirect - e.g. notifier blocks on a lock which other context > > > holds while allocating memory or waiting for a flusher that needs memory > > > to perform its work. > > > > But lockdep *does* track all this and fs_reclaim_acquire() was created > > to solve exactly this problem. > > > > fs_reclaim is a lock and it flows through all the usual lockdep > > schemes like any other lock. Any time the page allocator wants to do > > something the would deadlock with reclaim it takes the lock. > > > > Failure is expressed by a deadlock cycle in the lockdep map, and > > lockdep can handle arbitary complexity through layers of locks, work > > queues, threads, etc. > > > > What is missing? > > Lockdep doens't seen everything by far. E.g. a wait_event will be > caught by the annotations here, but not by lockdep. Sure, but the wait_event might be OK if its progress isn't contingent on fs_reclaim, ie triggered from interrupt, so why ban it? > And since we're talking about mmu notifiers here and gpus/dma engines. > We have dma_fence_wait, which can wait for any hw/driver in the system > that takes part in shared/zero-copy buffer processing. Which at least > on the graphics side is everything. This pulls in enormous amounts of > deadlock potential that lockdep simply is blind about and will never > see. It seems very risky to entagle a notifier widely like that. It looks pretty sure that notifiers are fs_reclaim, so at a minimum that wait_event can't be contingent on anything that is doing GFP_KERNEL or it will deadlock - and blockable doesn't make that sleep safe. Avoiding an uncertain wait_event under notifiers would seem to be the only reasonable design here.. Jason