All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Make vm eviction uninterruptible
@ 2014-04-05 20:08 Ben Widawsky
  2014-04-05 20:34 ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Ben Widawsky @ 2014-04-05 20:08 UTC (permalink / raw)
  To: Intel GFX

Our current code cannot handle a failure to evict well. You'll get at
the very least the following splat, but usually a lot worse fallout after:

[  134.819441] ------------[ cut here ]------------
[  134.819467] WARNING: CPU: 3 PID: 442 at drivers/gpu/drm/i915/i915_gem_evict.c:230 i915_gem_evict_vm+0x8a/0x1c0 [i915]()
[  134.819471] Modules linked in: i915 drm_kms_helper drm intel_gtt agpgart i2c_algo_bit ext4 crc16 mbcache jbd2 x86_pkg_temp_thermal coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw iTCO_wdt gf128mul iTCO_vendor_support glue_helper ablk_helper cryptd microcode serio_raw i2c_i801 fan thermal battery e1000e acpi_cpufreq evdev ptp ac acpi_pad pps_core processor lpc_ich mfd_core snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_timer snd soundcore sd_mod crc_t10dif crct10dif_common ahci libahci libata ehci_pci ehci_hcd usbcore scsi_mod usb_common
[  134.819565] CPU: 3 PID: 442 Comm: glxgears Not tainted 3.14.0-BEN+ #480
[  134.819568] Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1, BIOS BDW-E1R1.86C.0063.R01.1402110503 02/11/2014
[  134.819571]  0000000000000009 ffff88009b10fa80 ffffffff8159e6a5 0000000000000000
[  134.819577]  ffff88009b10fab8 ffffffff8104895d ffff880145c353c0 ffff880145f400f8
[  134.819584]  0000000000000000 ffff8800a274d300 ffff88009b10fb78 ffff88009b10fac8
[  134.819590] Call Trace:
[  134.819599]  [<ffffffff8159e6a5>] dump_stack+0x4e/0x7a
[  134.819607]  [<ffffffff8104895d>] warn_slowpath_common+0x7d/0xa0
[  134.819635]  [<ffffffff81048a3a>] warn_slowpath_null+0x1a/0x20
[  134.819656]  [<ffffffffa050c82a>] i915_gem_evict_vm+0x8a/0x1c0 [i915]
[  134.819677]  [<ffffffffa050a26b>] ppgtt_release+0x17b/0x1e0 [i915]
[  134.819693]  [<ffffffffa050a34d>] i915_gem_context_free+0x7d/0x180 [i915]
[  134.819707]  [<ffffffffa050a48c>] context_idr_cleanup+0x3c/0x40 [i915]
[  134.819715]  [<ffffffff81332d14>] idr_for_each+0x104/0x1a0
[  134.819730]  [<ffffffffa050a450>] ? i915_gem_context_free+0x180/0x180 [i915]
[  134.819735]  [<ffffffff815a27fc>] ? mutex_lock_nested+0x28c/0x3d0
[  134.819761]  [<ffffffffa057da75>] ? i915_driver_preclose+0x25/0x50 [i915]
[  134.819778]  [<ffffffffa050b715>] i915_gem_context_close+0x35/0xa0 [i915]
[  134.819802]  [<ffffffffa057da80>] i915_driver_preclose+0x30/0x50 [i915]
[  134.819816]  [<ffffffffa03e6a7d>] drm_release+0x5d/0x5f0 [drm]
[  134.819822]  [<ffffffff811aae3a>] __fput+0xea/0x240
[  134.819827]  [<ffffffff811aafde>] ____fput+0xe/0x10
[  134.819832]  [<ffffffff810701bc>] task_work_run+0xac/0xe0
[  134.819837]  [<ffffffff8104b82f>] do_exit+0x2cf/0xcf0
[  134.819844]  [<ffffffff815a5dac>] ? _raw_spin_unlock_irq+0x2c/0x60
[  134.819849]  [<ffffffff8104c2dc>] do_group_exit+0x4c/0xc0
[  134.819855]  [<ffffffff8105ef11>] get_signal_to_deliver+0x2d1/0x920
[  134.819861]  [<ffffffff81002408>] do_signal+0x48/0x620
[  134.819867]  [<ffffffff811aa0d9>] ? do_readv_writev+0x169/0x220
[  134.819873]  [<ffffffff8109e33d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[  134.819879]  [<ffffffff811c9b9d>] ? __fget_light+0x13d/0x160
[  134.819886]  [<ffffffff815a744c>] ? sysret_signal+0x5/0x47
[  134.819892]  [<ffffffff81002a45>] do_notify_resume+0x65/0x80
[  134.819897]  [<ffffffff815a76da>] int_signal+0x12/0x17
[  134.819901] ---[ end trace dbf4da2122c3d683 ]---

At first I was going to call this a bandage to the problem. However,
upon further thought, I rather like the idea of making evictions atomic,
and less prone to failure anyway. The reason it can still somewhat be
considered a band-aid however is GPU hangs. It would be nice if we had
some way to interrupt the process when the GPU is hung. I'll leave it
for a follow patch though.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 75fca63..91da738 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -225,10 +225,17 @@ int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
 		i915_gem_retire_requests(vm->dev);
 	}
 
-	list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
+	list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list) {
+		struct drm_i915_private *dev_priv = vm->dev->dev_private;
+		bool old_intr = dev_priv->mm.interruptible;
+		dev_priv->mm.interruptible = false;
+
 		if (vma->pin_count == 0)
 			WARN_ON(i915_vma_unbind(vma));
 
+		dev_priv->mm.interruptible = old_intr;
+	}
+
 	return 0;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-04-09 12:54 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-05 20:08 [PATCH] drm/i915: Make vm eviction uninterruptible Ben Widawsky
2014-04-05 20:34 ` Chris Wilson
2014-04-06  2:45   ` Ben Widawsky
2014-04-06 18:35     ` Ben Widawsky
2014-04-07  9:42       ` Chris Wilson
2014-04-07 12:15         ` Daniel Vetter
2014-04-07 12:30           ` Chris Wilson
2014-04-07 18:58             ` Ben Widawsky
2014-04-07 21:50               ` Daniel Vetter
2014-04-07 21:58                 ` Ben Widawsky
2014-04-08  6:50                   ` Chris Wilson
2014-04-09  4:09                     ` Ben Widawsky
2014-04-09  6:16                       ` Chris Wilson
2014-04-08  6:53                   ` Daniel Vetter
2014-04-09  4:11                     ` Ben Widawsky
2014-04-09 12:54                       ` Daniel Vetter
2014-04-07 21:17         ` [PATCH] drm/i915: Make vm eviction uninterruptible in preclose Ben Widawsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.