* [PATCH] drm/radeon: signal all fences after lockup to avoid endless waiting in GEM_WAIT
@ 2013-10-02 13:35 Marek Olšák
  0 siblings, 0 replies; 13+ messages in thread
From: Marek Olšák @ 2013-10-02 13:35 UTC (permalink / raw)
  To: dri-devel

From: Marek Olšák <marek.olsak@amd.com>

After a lockup, fences are sometimes not signalled, so the
GEM_WAIT_IDLE ioctl never returns, which can freeze the
X server.

This fixes only one of many deadlocks which can occur during a lockup.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
---
 drivers/gpu/drm/radeon/radeon_device.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 841d0e0..7b97baa 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1552,6 +1552,11 @@ int radeon_gpu_reset(struct radeon_device *rdev)
 	radeon_save_bios_scratch_regs(rdev);
 	/* block TTM */
 	resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev);
+
+	mutex_lock(&rdev->ring_lock);
+	radeon_fence_driver_force_completion(rdev);
+	mutex_unlock(&rdev->ring_lock);
+
 	radeon_pm_suspend(rdev);
 	radeon_suspend(rdev);
 
-- 
1.8.1.2
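
The effect of forcing fence completion can be illustrated with a small userspace model. This is only a sketch under stated assumptions: fences are modelled as increasing sequence numbers on a single ring, and every name below (model_ring, model_fence_emit, model_fence_signalled, model_force_completion) is hypothetical and is not the radeon driver's actual code or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one ring's fence bookkeeping.
 * These names are illustrative and are not the radeon driver's. */
struct model_ring {
	uint64_t emitted_seq;   /* sequence number of the last fence emitted */
	uint64_t signalled_seq; /* highest sequence number reported as done */
};

/* Emit a new fence and return its sequence number. */
static uint64_t model_fence_emit(struct model_ring *ring)
{
	assert(ring != NULL);
	return ++ring->emitted_seq;
}

/* A fence counts as signalled once the completed sequence reaches it. */
static bool model_fence_signalled(const struct model_ring *ring, uint64_t seq)
{
	assert(ring != NULL);
	return ring->signalled_seq >= seq;
}

/* Forced completion: treat everything that was emitted as finished, so
 * that no waiter can block on a fence a hung GPU will never signal. */
static void model_force_completion(struct model_ring *ring)
{
	assert(ring != NULL);
	ring->signalled_seq = ring->emitted_seq;
}
```

In this model, a waiter polling model_fence_signalled() for a fence emitted before the hang would spin forever unless model_force_completion() runs, which is the situation the patch tries to avoid for GEM_WAIT_IDLE.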

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: [PATCH] drm/radeon: signal all fences after lockup to avoid endless waiting in GEM_WAIT
@ 2013-10-02 13:52 Christian König
  2013-10-02 13:59 ` Marek Olšák
  0 siblings, 1 reply; 13+ messages in thread
From: Christian König @ 2013-10-02 13:52 UTC (permalink / raw)
  To: Marek Olšák; +Cc: dri-devel

NAK. After recovering from a lockup, the first thing we do is signal all remaining fences with an IB test.

If we don't recover, we do indeed signal all fences manually.

Signalling all fences regardless of the outcome of the reset creates problems with both types of partial resets.

Christian.

Marek Olšák <maraeo@gmail.com> wrote:

>From: Marek Olšák <marek.olsak@amd.com>
>
>After a lockup, fences are sometimes not signalled, so the
>GEM_WAIT_IDLE ioctl never returns, which can freeze the
>X server.
>
>This fixes only one of many deadlocks which can occur during a lockup.
>
>Signed-off-by: Marek Olšák <marek.olsak@amd.com>
>---
> drivers/gpu/drm/radeon/radeon_device.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
>index 841d0e0..7b97baa 100644
>--- a/drivers/gpu/drm/radeon/radeon_device.c
>+++ b/drivers/gpu/drm/radeon/radeon_device.c
>@@ -1552,6 +1552,11 @@ int radeon_gpu_reset(struct radeon_device *rdev)
> 	radeon_save_bios_scratch_regs(rdev);
> 	/* block TTM */
> 	resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev);
>+
>+	mutex_lock(&rdev->ring_lock);
>+	radeon_fence_driver_force_completion(rdev);
>+	mutex_unlock(&rdev->ring_lock);
>+
> 	radeon_pm_suspend(rdev);
> 	radeon_suspend(rdev);
> 
>-- 
>1.8.1.2
>

* Re: [PATCH] drm/radeon: signal all fences after lockup to avoid endless waiting in GEM_WAIT
@ 2013-10-02 14:50 Christian König
  2013-10-03  0:45 ` Marek Olšák
  0 siblings, 1 reply; 13+ messages in thread
From: Christian König @ 2013-10-02 14:50 UTC (permalink / raw)
  To: Marek Olšák; +Cc: dri-devel

Possible, but my guess is that it doesn't work because the IB test runs into a deadlock, so the GPU reset never fully completes.

Can you reproduce the problem?

If you want to make GPU resets more reliable, I would rather suggest removing the ring lock dependency.
Then we should try to give all the fence wait functions a (reliable) timeout and move reset handling a layer up into the ioctl functions. But for this you need to rip out the old PM code first.
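
The idea of a bounded fence wait with the reset decision taken one layer up can be sketched as a userspace model. This is a minimal illustration under stated assumptions, not driver code: the timeout is modelled as a poll budget rather than real time, and all names (model_fence, model_fence_wait_timeout, model_wait_idle) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum model_wait_result { MODEL_WAIT_SIGNALLED, MODEL_WAIT_TIMED_OUT };

/* Hypothetical fence: done once *signalled_seq reaches seq. */
struct model_fence {
	const volatile uint64_t *signalled_seq; /* updated by the "hardware" */
	uint64_t seq;                           /* this fence's sequence number */
};

/* Low-level wait: poll at most max_polls times and never block forever.
 * It only reports the outcome and does not attempt a reset itself. */
static enum model_wait_result
model_fence_wait_timeout(const struct model_fence *fence, unsigned int max_polls)
{
	assert(fence != NULL && fence->signalled_seq != NULL);
	for (unsigned int i = 0; i < max_polls; i++) {
		if (*fence->signalled_seq >= fence->seq)
			return MODEL_WAIT_SIGNALLED;
	}
	return MODEL_WAIT_TIMED_OUT;
}

/* Ioctl-level caller: the layer that decides whether a reset is needed.
 * Returns 0 when the fence signalled, -1 when the wait timed out. */
static int model_wait_idle(const struct model_fence *fence,
			   unsigned int max_polls, bool *needs_reset)
{
	assert(needs_reset != NULL);
	*needs_reset = model_fence_wait_timeout(fence, max_polls) ==
		       MODEL_WAIT_TIMED_OUT;
	return *needs_reset ? -1 : 0;
}
```

Keeping the wait function free of reset logic means a timeout always propagates to a caller that holds no wait-related state, which is the separation suggested above.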

Christian.

Marek Olšák <maraeo@gmail.com> wrote:

>I'm afraid signalling the fences with an IB test is not reliable.
>
>Marek
>
>On Wed, Oct 2, 2013 at 3:52 PM, Christian König <deathsimple@vodafone.de> wrote:
>> NAK. After recovering from a lockup, the first thing we do is signal all remaining fences with an IB test.
>>
>> If we don't recover, we do indeed signal all fences manually.
>>
>> Signalling all fences regardless of the outcome of the reset creates problems with both types of partial resets.
>>
>> Christian.

Thread overview: 13+ messages
2013-10-02 13:35 [PATCH] drm/radeon: signal all fences after lockup to avoid endless waiting in GEM_WAIT Marek Olšák
2013-10-02 13:52 Christian König
2013-10-02 13:59 ` Marek Olšák
2013-10-02 14:50 Christian König
2013-10-03  0:45 ` Marek Olšák
2013-10-07 11:08   ` Christian König
2013-10-08 16:21     ` Christian König
2013-10-09 10:36       ` Marek Olšák
2013-10-09 11:09         ` Christian König
2013-10-09 12:04           ` Marek Olšák
2013-10-13 12:47             ` Christian König
2013-10-13 20:16               ` Marek Olšák
2013-10-14  9:19                 ` Christian König
