* [PATCH 1/2] drm/i915: Wait for the struct_mutex on idling
@ 2019-04-30 9:44 Chris Wilson
2019-04-30 9:44 ` [PATCH 2/2] drm/i915: Cancel retire_worker on parking Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Chris Wilson @ 2019-04-30 9:44 UTC
To: intel-gfx
When the system is idling, contention for struct_mutex should be low, so
it is more efficient to wait for a contended mutex than to reschedule
the idle work.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem_pm.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
index 3554d55dae35..3b6e8d5be8e1 100644
--- a/drivers/gpu/drm/i915/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -47,13 +47,7 @@ static void idle_work_handler(struct work_struct *work)
struct drm_i915_private *i915 =
container_of(work, typeof(*i915), gem.idle_work.work);
- if (!mutex_trylock(&i915->drm.struct_mutex)) {
- /* Currently busy, come back later */
- mod_delayed_work(i915->wq,
- &i915->gem.idle_work,
- msecs_to_jiffies(50));
- return;
- }
+ mutex_lock(&i915->drm.struct_mutex);
intel_wakeref_lock(&i915->gt.wakeref);
if (!intel_wakeref_active(&i915->gt.wakeref))
--
2.20.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 2/2] drm/i915: Cancel retire_worker on parking
2019-04-30 9:44 [PATCH 1/2] drm/i915: Wait for the struct_mutex on idling Chris Wilson
@ 2019-04-30 9:44 ` Chris Wilson
2019-04-30 13:22 ` Tvrtko Ursulin
2019-04-30 12:33 ` [PATCH 1/2] drm/i915: Wait for the struct_mutex on idling Tvrtko Ursulin
2019-04-30 13:49 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
2 siblings, 1 reply; 5+ messages in thread
From: Chris Wilson @ 2019-04-30 9:44 UTC
To: intel-gfx
Replace the racy continuation check within retire_work with a definite
kill-switch on idling. The race was being exposed by gem_concurrent_blit
where the retire_worker would be terminated too early leaving us
spinning in debugfs/i915_drop_caches with nothing flushing the
retirement queue.
That said, the igt trying to idle from one child while submitting from
another may be a contributing factor as to why it runs so slowly...
Testcase: igt/gem_concurrent_blit
Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_gem_pm.c | 27 ++++++++++---------
.../gpu/drm/i915/selftests/mock_gem_device.c | 3 +--
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
index 3b6e8d5be8e1..88be810758ae 100644
--- a/drivers/gpu/drm/i915/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -46,15 +46,23 @@ static void idle_work_handler(struct work_struct *work)
{
struct drm_i915_private *i915 =
container_of(work, typeof(*i915), gem.idle_work.work);
+ bool restart = true;
+ cancel_delayed_work_sync(&i915->gem.retire_work);
mutex_lock(&i915->drm.struct_mutex);
intel_wakeref_lock(&i915->gt.wakeref);
- if (!intel_wakeref_active(&i915->gt.wakeref))
+ if (!intel_wakeref_active(&i915->gt.wakeref)) {
i915_gem_park(i915);
+ restart = false;
+ }
intel_wakeref_unlock(&i915->gt.wakeref);
mutex_unlock(&i915->drm.struct_mutex);
+ if (restart)
+ queue_delayed_work(i915->wq,
+ &i915->gem.retire_work,
+ round_jiffies_up_relative(HZ));
}
static void retire_work_handler(struct work_struct *work)
@@ -68,10 +76,9 @@ static void retire_work_handler(struct work_struct *work)
mutex_unlock(&i915->drm.struct_mutex);
}
- if (intel_wakeref_active(&i915->gt.wakeref))
- queue_delayed_work(i915->wq,
- &i915->gem.retire_work,
- round_jiffies_up_relative(HZ));
+ queue_delayed_work(i915->wq,
+ &i915->gem.retire_work,
+ round_jiffies_up_relative(HZ));
}
static int pm_notifier(struct notifier_block *nb,
@@ -159,15 +166,9 @@ void i915_gem_suspend(struct drm_i915_private *i915)
* reset the GPU back to its idle, low power state.
*/
GEM_BUG_ON(i915->gt.awake);
- cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
-
- drain_delayed_work(&i915->gem.retire_work);
+ flush_delayed_work(&i915->gem.idle_work);
- /*
- * As the idle_work is rearming if it detects a race, play safe and
- * repeat the flush until it is definitely idle.
- */
- drain_delayed_work(&i915->gem.idle_work);
+ cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
i915_gem_drain_freed_objects(i915);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index e4033d0576c4..ce54f8dc13cc 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -58,8 +58,7 @@ static void mock_device_release(struct drm_device *dev)
i915_gem_contexts_lost(i915);
mutex_unlock(&i915->drm.struct_mutex);
- drain_delayed_work(&i915->gem.retire_work);
- drain_delayed_work(&i915->gem.idle_work);
+ flush_delayed_work(&i915->gem.idle_work);
i915_gem_drain_workqueue(i915);
mutex_lock(&i915->drm.struct_mutex);
--
2.20.1
* Re: [PATCH 1/2] drm/i915: Wait for the struct_mutex on idling
2019-04-30 9:44 [PATCH 1/2] drm/i915: Wait for the struct_mutex on idling Chris Wilson
2019-04-30 9:44 ` [PATCH 2/2] drm/i915: Cancel retire_worker on parking Chris Wilson
@ 2019-04-30 12:33 ` Tvrtko Ursulin
2019-04-30 13:49 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
2 siblings, 0 replies; 5+ messages in thread
From: Tvrtko Ursulin @ 2019-04-30 12:33 UTC
To: Chris Wilson, intel-gfx
On 30/04/2019 10:44, Chris Wilson wrote:
> When the system is idling, contention for struct_mutex should be low and
> so we will be more efficient to wait for a contended mutex than
> reschedule.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_gem_pm.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> index 3554d55dae35..3b6e8d5be8e1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_pm.c
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -47,13 +47,7 @@ static void idle_work_handler(struct work_struct *work)
> struct drm_i915_private *i915 =
> container_of(work, typeof(*i915), gem.idle_work.work);
>
> - if (!mutex_trylock(&i915->drm.struct_mutex)) {
> - /* Currently busy, come back later */
> - mod_delayed_work(i915->wq,
> - &i915->gem.idle_work,
> - msecs_to_jiffies(50));
> - return;
> - }
> + mutex_lock(&i915->drm.struct_mutex);
>
> intel_wakeref_lock(&i915->gt.wakeref);
> if (!intel_wakeref_active(&i915->gt.wakeref))
>
Indeed, I don't see any real downsides to this.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
A possible tweak would be to leave the trylock as is, perhaps just
without the reduced idle timer on reschedule, and add a
cancel_delayed_work on the unparking side of things. That way any mutex
activity without actual device unparking would only slightly delay going
idle, while idle_work would retain its minimal disturbance of the mutex.
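The two locking strategies being compared can be sketched in userspace. This is only an analogy built on POSIX threads, not the i915 code: struct idle_state, idle_trylock() and idle_blocking() are hypothetical names, and the pthread mutex merely stands in for struct_mutex.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the i915 structures. */
struct idle_state {
	pthread_mutex_t lock;	/* plays the role of struct_mutex */
	int reschedules;	/* times the worker backed off */
	int parked;		/* times the idle body actually ran */
};

/*
 * Behaviour before the patch: try the lock and, if it is contended,
 * give up so that the caller can re-queue the work for later.
 */
static bool idle_trylock(struct idle_state *s)
{
	if (pthread_mutex_trylock(&s->lock) != 0) {
		s->reschedules++;	/* "Currently busy, come back later" */
		return false;
	}

	s->parked++;
	pthread_mutex_unlock(&s->lock);
	return true;
}

/*
 * Behaviour after the patch: sleep on the lock until it is free, so the
 * idle body runs exactly once per invocation and never re-queues itself.
 */
static void idle_blocking(struct idle_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->parked++;
	pthread_mutex_unlock(&s->lock);
}
```

In this model the trylock variant costs an extra wakeup every time it loses the race, while the blocking variant costs one sleep on the mutex. Which is cheaper depends on how contended the lock is when the worker runs.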
Regards,
Tvrtko
* Re: [PATCH 2/2] drm/i915: Cancel retire_worker on parking
2019-04-30 9:44 ` [PATCH 2/2] drm/i915: Cancel retire_worker on parking Chris Wilson
@ 2019-04-30 13:22 ` Tvrtko Ursulin
0 siblings, 0 replies; 5+ messages in thread
From: Tvrtko Ursulin @ 2019-04-30 13:22 UTC
To: Chris Wilson, intel-gfx
On 30/04/2019 10:44, Chris Wilson wrote:
> Replace the racy continuation check within retire_work with a definite
> kill-switch on idling. The race was being exposed by gem_concurrent_blit
> where the retire_worker would be terminated too early leaving us
> spinning in debugfs/i915_drop_caches with nothing flushing the
> retirement queue.
>
> Although that the igt is trying to idle from one child while submitting
> from another may be a contributing factor as to why it runs so slowly...
>
> Testcase: igt/gem_concurrent_blit
> Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_pm.c | 27 ++++++++++---------
> .../gpu/drm/i915/selftests/mock_gem_device.c | 3 +--
> 2 files changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> index 3b6e8d5be8e1..88be810758ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem_pm.c
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -46,15 +46,23 @@ static void idle_work_handler(struct work_struct *work)
> {
> struct drm_i915_private *i915 =
> container_of(work, typeof(*i915), gem.idle_work.work);
> + bool restart = true;
>
> + cancel_delayed_work_sync(&i915->gem.retire_work);
> mutex_lock(&i915->drm.struct_mutex);
Wouldn't it be better to call cancel_delayed_work and then
i915_retire_requests under the lock?
With cancel_delayed_work_sync outside struct_mutex, it sounds like it
could miss a retire pass.
>
> intel_wakeref_lock(&i915->gt.wakeref);
> - if (!intel_wakeref_active(&i915->gt.wakeref))
> + if (!intel_wakeref_active(&i915->gt.wakeref)) {
> i915_gem_park(i915);
> + restart = false;
> + }
> intel_wakeref_unlock(&i915->gt.wakeref);
>
> mutex_unlock(&i915->drm.struct_mutex);
> + if (restart)
> + queue_delayed_work(i915->wq,
> + &i915->gem.retire_work,
> + round_jiffies_up_relative(HZ));
> }
>
> static void retire_work_handler(struct work_struct *work)
> @@ -68,10 +76,9 @@ static void retire_work_handler(struct work_struct *work)
> mutex_unlock(&i915->drm.struct_mutex);
> }
>
> - if (intel_wakeref_active(&i915->gt.wakeref))
> - queue_delayed_work(i915->wq,
> - &i915->gem.retire_work,
> - round_jiffies_up_relative(HZ));
> + queue_delayed_work(i915->wq,
> + &i915->gem.retire_work,
> + round_jiffies_up_relative(HZ));
So retire keeps rearming itself until idle stops it - that sounds okay.
> }
>
> static int pm_notifier(struct notifier_block *nb,
> @@ -159,15 +166,9 @@ void i915_gem_suspend(struct drm_i915_private *i915)
> * reset the GPU back to its idle, low power state.
> */
> GEM_BUG_ON(i915->gt.awake);
> - cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> -
> - drain_delayed_work(&i915->gem.retire_work);
> + flush_delayed_work(&i915->gem.idle_work);
>
> - /*
> - * As the idle_work is rearming if it detects a race, play safe and
> - * repeat the flush until it is definitely idle.
> - */
> - drain_delayed_work(&i915->gem.idle_work);
> + cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
>
> i915_gem_drain_freed_objects(i915);
>
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index e4033d0576c4..ce54f8dc13cc 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -58,8 +58,7 @@ static void mock_device_release(struct drm_device *dev)
> i915_gem_contexts_lost(i915);
> mutex_unlock(&i915->drm.struct_mutex);
>
> - drain_delayed_work(&i915->gem.retire_work);
> - drain_delayed_work(&i915->gem.idle_work);
> + flush_delayed_work(&i915->gem.idle_work);
> i915_gem_drain_workqueue(i915);
>
> mutex_lock(&i915->drm.struct_mutex);
>
I am now thinking debugfs does not have to do things indirectly via
flush and drain. How about it calls what it needs directly? Unless I am
missing something, that could be done separately from this patch and
would also fix the drop_caches spinning problem.
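For reference, the control flow that the quoted patch gives the two workers can be modelled sequentially. This is a simplified sketch under stated assumptions: struct model and its fields are hypothetical, a boolean stands in for the delayed work being queued, and all locking and real concurrency are left out.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names are the i915 structures. */
struct model {
	int wakerefs;		/* users keeping the device awake */
	bool retire_queued;	/* retire work is armed */
	bool parked;		/* device has been parked */
};

/* Retire worker: always rearms itself, with no wakeref check. */
static void retire_work(struct model *m)
{
	/* ... completed requests would be retired here ... */
	m->retire_queued = true;
}

/* Idle worker: the only place that stops the retire worker. */
static void idle_work(struct model *m)
{
	bool restart = true;

	/* Stands in for cancel_delayed_work_sync() on retire_work. */
	m->retire_queued = false;

	if (m->wakerefs == 0) {
		m->parked = true;	/* stands in for i915_gem_park() */
		restart = false;
	}

	/* Still awake: idling lost the race, so retirement must resume. */
	if (restart)
		m->retire_queued = true;
}
```

The point of the model is that the decision to stop retiring is made in exactly one place, after the wakeref check, instead of being inferred by the retire worker from a wakeref it reads without synchronisation.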
Regards,
Tvrtko
* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Wait for the struct_mutex on idling
2019-04-30 9:44 [PATCH 1/2] drm/i915: Wait for the struct_mutex on idling Chris Wilson
2019-04-30 9:44 ` [PATCH 2/2] drm/i915: Cancel retire_worker on parking Chris Wilson
2019-04-30 12:33 ` [PATCH 1/2] drm/i915: Wait for the struct_mutex on idling Tvrtko Ursulin
@ 2019-04-30 13:49 ` Patchwork
2 siblings, 0 replies; 5+ messages in thread
From: Patchwork @ 2019-04-30 13:49 UTC
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915: Wait for the struct_mutex on idling
URL : https://patchwork.freedesktop.org/series/60098/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_6017 -> Patchwork_12907
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_12907 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_12907, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://patchwork.freedesktop.org/api/1.0/series/60098/revisions/1/mbox/
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_12907:
### IGT changes ###
#### Possible regressions ####
* igt@i915_selftest@live_execlists:
- fi-kbl-r: [PASS][1] -> [INCOMPLETE][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6017/fi-kbl-r/igt@i915_selftest@live_execlists.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12907/fi-kbl-r/igt@i915_selftest@live_execlists.html
Known issues
------------
Here are the changes found in Patchwork_12907 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@i915_selftest@live_workarounds:
- fi-snb-2600: [PASS][3] -> [INCOMPLETE][4] ([fdo#105411])
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6017/fi-snb-2600/igt@i915_selftest@live_workarounds.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12907/fi-snb-2600/igt@i915_selftest@live_workarounds.html
* igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u: [PASS][5] -> [DMESG-WARN][6] ([fdo#103841])
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6017/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12907/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html
#### Possible fixes ####
* igt@i915_selftest@live_contexts:
- fi-bdw-gvtdvm: [DMESG-FAIL][7] ([fdo#110235]) -> [PASS][8]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6017/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12907/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html
- fi-skl-gvtdvm: [DMESG-FAIL][9] ([fdo#110235]) -> [PASS][10]
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6017/fi-skl-gvtdvm/igt@i915_selftest@live_contexts.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12907/fi-skl-gvtdvm/igt@i915_selftest@live_contexts.html
[fdo#103841]: https://bugs.freedesktop.org/show_bug.cgi?id=103841
[fdo#105411]: https://bugs.freedesktop.org/show_bug.cgi?id=105411
[fdo#110235]: https://bugs.freedesktop.org/show_bug.cgi?id=110235
Participating hosts (53 -> 44)
------------------------------
Missing (9): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-skl-6770hq fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-icl-y fi-byt-clapper
Build changes
-------------
* Linux: CI_DRM_6017 -> Patchwork_12907
CI_DRM_6017: 69c3a37af9430650d1fc2a5555d4d0786898694d @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4971: fc5e0467eb6913d21ad932aa8a31c77fdb5a9c77 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_12907: c144d3af59602fcb4ddb218a788c961e47317432 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
c144d3af5960 drm/i915: Cancel retire_worker on parking
f2b3d409c989 drm/i915: Wait for the struct_mutex on idling
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12907/