From: Jerome Glisse
Subject: Re: [PATCH 2/2] drm/radeon: fix deadlock when bo is associated to different handle
Date: Wed, 28 Nov 2012 10:38:20 -0500
Message-ID: <20121128153819.GA1765@gmail.com>
References: <1354039626-19920-1-git-send-email-j.glisse@gmail.com> <1354039626-19920-2-git-send-email-j.glisse@gmail.com> <50B5E70F.1030205@vodafone.de>
In-Reply-To: <50B5E70F.1030205@vodafone.de>
To: Christian König
Cc: Jerome Glisse, dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

On Wed, Nov 28, 2012 at 11:27:27AM +0100, Christian König wrote:
> On 27.11.2012 19:07, j.glisse@gmail.com wrote:
> >From: Jerome Glisse
> >
> >There is a rare case, that seems to only happen accross suspend/resume
> >cycle, where a bo is associated with several different handle. This
> >lead to a deadlock in ttm buffer reservation path. This could only
> >happen with flinked(globaly exported) object. Userspace should not
> >reopen multiple time a globaly exported object.
> >
> >However the kernel should handle gracefully this corner case and not
> >keep rejecting the userspace command stream. This is the object of
> >this patch.
> >
> >Fix suspend/resume issue where user see following message :
> >[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
> >
> >Signed-off-by: Jerome Glisse
>
> See comment below.
>
> >---
> > drivers/gpu/drm/radeon/radeon_cs.c | 53 ++++++++++++++++++++++----------------
> > 1 file changed, 31 insertions(+), 22 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> >index 41672cc..064e64d 100644
> >--- a/drivers/gpu/drm/radeon/radeon_cs.c
> >+++ b/drivers/gpu/drm/radeon/radeon_cs.c
> >@@ -54,39 +54,48 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
> > 		return -ENOMEM;
> > 	}
> > 	for (i = 0; i < p->nrelocs; i++) {
> >-		struct drm_radeon_cs_reloc *r;
> >-
> >+		struct drm_radeon_cs_reloc *reloc;
> >+
> >+		/* One bo could be associated with several different handle.
> >+		 * Only happen for flinked bo that are open several time.
> >+		 *
> >+		 * FIXME:
> >+		 * Maybe we should consider an alternative to idr for gem
> >+		 * object to insure a 1:1 uniq mapping btw handle and gem
> >+		 * object.
> >+		 */
> > 		duplicate = false;
> >-		r = (struct drm_radeon_cs_reloc *)&chunk->kdata[i*4];
> >+		reloc = (struct drm_radeon_cs_reloc *)&chunk->kdata[i*4];
> >+		p->relocs[i].handle = 0;
> >+		p->relocs[i].flags = reloc->flags;
> >+		p->relocs[i].gobj = drm_gem_object_lookup(ddev,
> >+							  p->filp,
> >+							  reloc->handle);
> >+		if (p->relocs[i].gobj == NULL) {
> >+			DRM_ERROR("gem object lookup failed 0x%x\n",
> >+				  reloc->handle);
> >+			return -ENOENT;
> >+		}
> >+		p->relocs[i].robj = gem_to_radeon_bo(p->relocs[i].gobj);
> >+		p->relocs[i].lobj.bo = p->relocs[i].robj;
> >+		p->relocs[i].lobj.wdomain = reloc->write_domain;
> >+		p->relocs[i].lobj.rdomain = reloc->read_domains;
> >+		p->relocs[i].lobj.tv.bo = &p->relocs[i].robj->tbo;
> >+
> > 		for (j = 0; j < i; j++) {
> >-			if (r->handle == p->relocs[j].handle) {
> >+			if (p->relocs[i].lobj.bo == p->relocs[j].lobj.bo) {
> > 				p->relocs_ptr[i] = &p->relocs[j];
> > 				duplicate = true;
> > 				break;
> > 			}
> > 		}
> >+
> > 		if (!duplicate) {
> >-			p->relocs[i].gobj = drm_gem_object_lookup(ddev,
> >-								  p->filp,
> >-								  r->handle);
> >-			if (p->relocs[i].gobj == NULL) {
> >-				DRM_ERROR("gem object lookup failed 0x%x\n",
> >-					  r->handle);
> >-				return -ENOENT;
> >-			}
> > 			p->relocs_ptr[i] = &p->relocs[i];
> >-			p->relocs[i].robj = gem_to_radeon_bo(p->relocs[i].gobj);
> >-			p->relocs[i].lobj.bo = p->relocs[i].robj;
> >-			p->relocs[i].lobj.wdomain = r->write_domain;
> >-			p->relocs[i].lobj.rdomain = r->read_domains;
> >-			p->relocs[i].lobj.tv.bo = &p->relocs[i].robj->tbo;
> >-			p->relocs[i].handle = r->handle;
> >-			p->relocs[i].flags = r->flags;
> >+			p->relocs[i].handle = reloc->handle;
> > 			radeon_bo_list_add_object(&p->relocs[i].lobj,
> > 						  &p->validated);
> >-
> >-		} else
> >-			p->relocs[i].handle = 0;
>
> I'm not sure if the memory p->relocs is pointing to is zero
> initialized, so we should at least initialize whatever member we use
> to find the duplicates.
> Also I think we don't need the handle in
> this structure any more if we don't use it for comparison (but not
> 100% sure without testing it).

There is no need to zero-initialize p->relocs[i].lobj.bo, which is the
member used to find duplicates. When a duplicate is found,
p->relocs_ptr[i] points to the first relocation with the same bo.
p->relocs[i].lobj.bo is always initialized before the duplicate search
runs.

I kept the handle around because it is useful for debugging. It could
just as well be removed and added back whenever someone needs it for
debugging.

Cheers,
Jerome

>
> >+		}
> > 	}
> > 	return radeon_bo_list_validate(&p->validated);
> > }
>
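
The duplicate check discussed above can be sketched as a small standalone
piece of C. This is only an illustration under stated assumptions:
everything named sketch_* is a hypothetical stand-in, not the radeon, GEM
or TTM structures, and the handle table is a plain array rather than an
idr. It shows two points from the reply: duplicates are detected by
comparing object pointers rather than handles, and the compared member is
always written before the scan, so the relocs array does not have to be
zero-initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer object. */
struct sketch_bo {
	int id;
};

/* Hypothetical stand-in for one parsed relocation. */
struct sketch_reloc {
	struct sketch_bo *bo;	/* always set before the duplicate scan */
	uint32_t handle;	/* debugging aid only; 0 marks a duplicate */
};

#define SKETCH_MAX_HANDLES 8

/* Hypothetical per-file handle table; several handles may name one bo. */
struct sketch_file {
	struct sketch_bo *objs[SKETCH_MAX_HANDLES];
};

static struct sketch_bo *sketch_lookup(const struct sketch_file *f,
				       uint32_t handle)
{
	if (handle == 0 || handle >= SKETCH_MAX_HANDLES)
		return NULL;
	return f->objs[handle];
}

/*
 * Fill relocs[] and relocs_ptr[] for n handles.
 * Returns the number of unique objects (the entries that would be put
 * on a validation list), or -1 if a handle cannot be resolved.
 */
static int sketch_parse_relocs(const struct sketch_file *f,
			       const uint32_t *handles, size_t n,
			       struct sketch_reloc *relocs,
			       struct sketch_reloc **relocs_ptr)
{
	int unique = 0;

	assert(relocs != NULL && relocs_ptr != NULL);

	for (size_t i = 0; i < n; i++) {
		bool duplicate = false;

		/* Resolve the object first, whatever the handle value is. */
		relocs[i].handle = 0;
		relocs[i].bo = sketch_lookup(f, handles[i]);
		if (relocs[i].bo == NULL)
			return -1;

		/* Compare objects, not handles: entries 0..i-1 are all set. */
		for (size_t j = 0; j < i; j++) {
			if (relocs[i].bo == relocs[j].bo) {
				relocs_ptr[i] = &relocs[j];
				duplicate = true;
				break;
			}
		}

		if (!duplicate) {
			relocs_ptr[i] = &relocs[i];
			relocs[i].handle = handles[i];
			unique++;
		}
	}
	return unique;
}
```

With a table where handles 1 and 3 both resolve to the same object,
parsing the handles {1, 2, 3} counts two unique objects, and the third
entry of relocs_ptr refers back to the first relocation, so each object
would be reserved only once.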