From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Whitcroft
Subject: Re: [PATCH 0/1] [RFC] DRM locking issues during early open
Date: Thu, 19 Apr 2012 17:41:13 +0100
Message-ID: <20120419164113.GA3467@shadowen.org>
References: <1334852525-14950-1-git-send-email-apw@canonical.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
To: Dave Airlie
Cc: David Airlie, dri-devel@lists.freedesktop.org, Jesse Barnes, Bryce Harrington, linux-kernel@vger.kernel.org
List-Id: dri-devel@lists.freedesktop.org

On Thu, Apr 19, 2012 at 05:30:03PM +0100, Dave Airlie wrote:
> On Thu, Apr 19, 2012 at 5:22 PM, Andy Whitcroft wrote:
> > We have been carrying a (rather poor) patch for an issue we identified in
> > the DRM driver.  This issue is triggered when a DRM device is initialising
> > and userspace attempts to open it, typically in response to the sysfs
> > device added event.  Basically we allocate the minor numbers making
> > the device available, and then call the drm load callback.  Until this
> > completes the device is really not ready and these early opens typically
> > lead to oopses.
> >
> > We have been using the following patch to avoid this by marking the minors
> > as in error until the load method has completed.  This avoids the early
> > open by simply erroring out the opens with EAGAIN.  Obviously we should
> > be delaying the open until the load method completes.
> >
> > I include the existing patch for completeness (it is not really ready for
> > merging) to illustrate the issue.  I think it is logical that the wait
> > should simply be delayed until the load has completed.  I am proposing
> > to include a wait queue associated with the idr cache for the drm minors
> > which we can use to allow open callers to wait_event_interruptible() on.
> > I'll be putting together a prototype shortly and will follow up with it.
> >
> > Thoughts?
>
> Couldn't we just delay registering things until the driver is ready to
> accept an open?
>
> Granted the midlayer of drm doesn't make that easy,

It seems that we need the dri minor allocated before we hit the load
function as things are done right now.

> thanks for sending this out, it keeps falling off my radar, I don't
> think I've ever seen this reported on RHEL/Fedora, which makes me
> wonder what we are doing that makes us lucky.

We never hit it until we started doing things earlier and quicker.  I first
found it in the prettification of boot, as we were keen to get plymouth
running as soon as possible.  That led to random panics and me finding
this bug.  The window is tiny as far as I know and it tends to be specific
machines and specific package combinations which trigger it reliably.

I suspect that a proper fix would allow delaying the registration as you
suggest but in the interim a wait would at least avoid the issues we are
seeing.  I will see how awful it looks.

-apw