From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= Subject: Re: [PATCH] drm/i915: optionally ban context on first hang Date: Tue, 10 Sep 2013 16:59:31 +0300 Message-ID: <20130910135931.GO11428@intel.com> References: <1378819010-5173-1-git-send-email-mika.kuoppala@intel.com> <20130910132651.GC5555@nuc-i3427.alporthouse.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTP id 69B6DE71E7 for ; Tue, 10 Sep 2013 06:59:46 -0700 (PDT) Content-Disposition: inline In-Reply-To: <20130910132651.GC5555@nuc-i3427.alporthouse.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Chris Wilson , Mika Kuoppala , intel-gfx@lists.freedesktop.org, Ben Widawsky , Paul Berry List-Id: intel-gfx@lists.freedesktop.org On Tue, Sep 10, 2013 at 02:26:51PM +0100, Chris Wilson wrote: > On Tue, Sep 10, 2013 at 04:16:50PM +0300, Mika Kuoppala wrote: > > Current policy is to ban context if it manages to hang > > gpu in a certain time windows. Paul Berry asked if more > > strict policy could be available for use cases where > > the application doesn't know if the rendering command stream > > sent to gpu is valid or not. > > = > > Provide an option, flag on context creation time, to let > > userspace to set more strict policy for handling gpu hangs for > > this context. If context with this flag set ever hangs the gpu, > > it will be permanently banned from accessing the GPU. > > All subsequent batch submissions will return -EIO. > > = > > Requested-by: Paul Berry > > Cc: Paul Berry > > Cc: Ben Widawsky > > Signed-off-by: Mika Kuoppala > > --- > > drivers/gpu/drm/i915/i915_dma.c | 3 +++ > > drivers/gpu/drm/i915/i915_drv.h | 3 +++ > > drivers/gpu/drm/i915/i915_gem.c | 9 ++++++++- > > drivers/gpu/drm/i915/i915_gem_context.c | 12 +++++++++--- > > include/uapi/drm/i915_drm.h | 5 +++++ > > 5 files changed, 28 insertions(+), 4 deletions(-) > > = > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i91= 5_dma.c > > index 3de6050..4353458 100644 > > --- a/drivers/gpu/drm/i915/i915_dma.c > > +++ b/drivers/gpu/drm/i915/i915_dma.c > > @@ -1003,6 +1003,9 @@ static int i915_getparam(struct drm_device *dev, = void *data, > > case I915_PARAM_HAS_EXEC_HANDLE_LUT: > > value =3D 1; > > break; > > + case I915_PARAM_HAS_CONTEXT_BAN: > > + value =3D 1; > > + break; > = > As we add the flags, we have a better method for detecting whether the > context accepts the flags (just request that a first-ban context be > created and mark the failure as unsupported), and so the getparam is > redundant. > = > > struct drm_i915_gem_context_create { > > /* output: id of new context*/ > > __u32 ctx_id; > > __u32 pad; > > + __u64 flags; > > }; > = > I thought that the size of the ioctl was part of the ABI, but it does > look like extending it as you have done here is valid. TIL. Yeah, it does look like drm_ioctl() does allow it, but only for driver ioctls. For drm core ioctls the kernel still accepts the ioctl, but it gets the size from the kernel's ioctl->cmd. So depeding on the case the kernel may read garbage from userspace, overwrite some other userspace data, not touch some of the data userspace was offering, or just give back -EFAULT. I guess that's all fine since userspace that does stuff like that is already buggy. -- = Ville Syrj=E4l=E4 Intel OTC