dri-devel.lists.freedesktop.org archive mirror
* Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780
From: Markus Trippelsdorf @ 2013-07-10  9:22 UTC
  To: dri-devel; +Cc: Dave Airlie, Jerome Glisse

Simply copying and pasting a large document in LibreOffice hangs my
system. Only a hard reset gets it working again.
see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551

I've bisected the issue to:

commit ecff665f5e3f1c6909353e00b9420e45ae23d995
Author: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Date:   Thu Jun 27 13:48:17 2013 +0200

    drm/ttm: make ttm reservation calls behave like reservation calls
    
    This commit converts the source of the val_seq counter to
    the ww_mutex api. The reservation objects are converted later,
    because there is still a lockdep splat in nouveau that has to
    be resolved first.
    
    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

-- 
Markus


* Re: Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780
From: Maarten Lankhorst @ 2013-07-10  9:29 UTC
  To: Markus Trippelsdorf; +Cc: Dave Airlie, Jerome Glisse, dri-devel

On 10-07-13 11:22, Markus Trippelsdorf wrote:
> By simply copy/pasting a big document under LibreOffice my system hangs
> itself up. Only a hard reset gets it working again.
> see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551
>
> I've bisected the issue to:
>
> commit ecff665f5e3f1c6909353e00b9420e45ae23d995
> Author: Maarten Lankhorst <m.b.lankhorst@gmail.com>
> Date:   Thu Jun 27 13:48:17 2013 +0200
>
>     drm/ttm: make ttm reservation calls behave like reservation calls
>     
>     This commit converts the source of the val_seq counter to
>     the ww_mutex api. The reservation objects are converted later,
>     because there is still a lockdep splat in nouveau that has to
>     resolved first.
>     
>     Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>     Reviewed-by: Jerome Glisse <jglisse@redhat.com>
>     Signed-off-by: Dave Airlie <airlied@redhat.com>
Hey,

Can you try current head with CONFIG_PROVE_LOCKING set and post the lockdep splat from dmesg, if any? If there is any locking issue lockdep should warn about it.
Lockdep will turn itself off after the first splat, so if the lockdep splat happens before running the affected parts those will have to be fixed first.

~Maarten


* Re: Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780
From: Markus Trippelsdorf @ 2013-07-10  9:46 UTC
  To: Maarten Lankhorst; +Cc: Dave Airlie, Jerome Glisse, dri-devel

On 2013.07.10 at 11:29 +0200, Maarten Lankhorst wrote:
> Hey,
> 
> Can you try current head with CONFIG_PROVE_LOCKING set and post the
> lockdep splat from dmesg, if any? If there is any locking issue
> lockdep should warn about it.  Lockdep will turn itself off after the
> first splat, so if the lockdep splat happens before running the
> affected parts those will have to be fixed first.

There was an unrelated EDAC lockdep splat, so I simply disabled EDAC.

This is what I get:

Jul 10 11:40:44 x4 kernel: ================================================
Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
Jul 10 11:40:44 x4 kernel: ------------------------------------------------
Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [<ffffffff813279f0>] radeon_bo_list_validate+0x20/0xd0
Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffff81309306>] ttm_eu_reserve_buffers+0x126/0x4b0
Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
Jul 10 11:40:53 x4 kernel: Emergency Sync complete

-- 
Markus


* Re: Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780
From: Maarten Lankhorst @ 2013-07-10  9:56 UTC
  To: Markus Trippelsdorf; +Cc: Dave Airlie, Jerome Glisse, dri-devel

On 10-07-13 11:46, Markus Trippelsdorf wrote:
> There was an unrelated EDAC lockdep splat, so I simply disabled it.
>
> This is what I get:
>
> Jul 10 11:40:44 x4 kernel: ================================================
> Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
> Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
> Jul 10 11:40:44 x4 kernel: ------------------------------------------------
> Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
> Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
> Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [<ffffffff813279f0>] radeon_bo_list_validate+0x20/0xd0
> Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffff81309306>] ttm_eu_reserve_buffers+0x126/0x4b0
> Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
> Jul 10 11:40:53 x4 kernel: Emergency Sync complete
>
Thanks, that is exactly what I suspected. I missed a backoff somewhere.

Does the below patch fix it?

---
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 0219d26..2020bf4 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -377,6 +377,7 @@ int radeon_bo_list_validate(struct ww_acquire_ctx *ticket,
 					domain = lobj->alt_domain;
 					goto retry;
 				}
+				ttm_eu_backoff_reservation(ticket, head);
 				return r;
 			}
 		}


* Re: Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780
From: Markus Trippelsdorf @ 2013-07-10 10:03 UTC
  To: Maarten Lankhorst; +Cc: Dave Airlie, Jerome Glisse, dri-devel

On 2013.07.10 at 11:56 +0200, Maarten Lankhorst wrote:
> Thanks, exactly what I thought. I missed a backoff somewhere..
> 
> Does the below patch fix it?

Yes. Thank you for your quick reply.

-- 
Markus


* [PATCH] drm/radeon: add missing ttm_eu_backoff_reservation to radeon_bo_list_validate
From: Maarten Lankhorst @ 2013-07-10 10:26 UTC
  To: Dave Airlie; +Cc: Jerome Glisse, dri-devel, Markus Trippelsdorf

On 10-07-13 12:03, Markus Trippelsdorf wrote:
> Yes. Thank you for your quick reply.

8<------
If radeon_cs_parser_relocs fails, ttm_eu_backoff_reservation doesn't get called.
This leaves a bug where ttm_eu_reserve_buffers succeeds but the BOs are never
unreserved afterwards:

Jul 10 11:40:44 x4 kernel: ================================================
Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
Jul 10 11:40:44 x4 kernel: ------------------------------------------------
Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [<ffffffff813279f0>] radeon_bo_list_validate+0x20/0xd0
Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffff81309306>] ttm_eu_reserve_buffers+0x126/0x4b0
Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
Jul 10 11:40:53 x4 kernel: Emergency Sync complete

This is a regression caused by commit ecff665f5e
("drm/ttm: make ttm reservation calls behave like reservation calls").

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 0219d26..2020bf4 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -377,6 +377,7 @@ int radeon_bo_list_validate(struct ww_acquire_ctx *ticket,
 					domain = lobj->alt_domain;
 					goto retry;
 				}
+				ttm_eu_backoff_reservation(ticket, head);
 				return r;
 			}
 		}
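
The error path fixed above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the driver
code: struct model_bo, reservations_held and every model_* function are
hypothetical names invented for illustration, a plain flag stands in for
the reservation lock, and -ENOMEM is an arbitrary error value. It shows
why a validation failure after a successful reserve has to back off
every reservation before returning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not a real TTM/radeon type. */
struct model_bo {
	bool reserved;	/* models the per-buffer reservation lock */
	bool valid;	/* false makes validation of this buffer fail */
};

/* Reservations currently held, loosely like lockdep's held-lock list. */
static int reservations_held;

/* Put every buffer into the unreserved, valid state. */
static void model_init(struct model_bo *bos, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		bos[i].reserved = false;
		bos[i].valid = true;
	}
	reservations_held = 0;
}

/* Reserve every buffer (models ttm_eu_reserve_buffers succeeding). */
static void model_reserve_all(struct model_bo *bos, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(!bos[i].reserved);
		bos[i].reserved = true;
		reservations_held++;
	}
}

/* Drop every reservation (models the role of ttm_eu_backoff_reservation). */
static void model_backoff_all(struct model_bo *bos, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (bos[i].reserved) {
			bos[i].reserved = false;
			reservations_held--;
		}
	}
}

/*
 * Reserve all buffers, then validate each one. On success the buffers
 * stay reserved for the caller. On failure they are released only when
 * backoff_on_error is true; passing false reproduces the leak.
 */
static int model_validate_list(struct model_bo *bos, size_t n,
			       bool backoff_on_error)
{
	model_reserve_all(bos, n);
	for (size_t i = 0; i < n; i++) {
		if (!bos[i].valid) {
			if (backoff_on_error)
				model_backoff_all(bos, n);
			return -ENOMEM;
		}
	}
	return 0;
}
```

With one buffer marked invalid, model_validate_list(bos, n, false)
returns an error while reservations_held stays at n, which is the model's
analogue of the "lock held when returning to user space" report. Passing
true brings the count back to zero before the error is returned.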


* Re: [PATCH] drm/radeon: add missing ttm_eu_backoff_reservation to radeon_bo_list_validate
From: Alex Deucher @ 2013-07-11 19:38 UTC
  To: Maarten Lankhorst
  Cc: Dave Airlie, Jerome Glisse, Markus Trippelsdorf, dri-devel

I've picked up the patch for my fixes queue.  Thanks!

Alex

