* [PATCH 0/5] robustify reset state transitions
@ 2012-11-12 22:07 Daniel Vetter
  2012-11-12 22:07 ` [PATCH 1/5] drm/i915: move dev_priv->mm out of line Daniel Vetter
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Daniel Vetter @ 2012-11-12 22:07 UTC
  To: Intel Graphics Development; +Cc: Daniel Vetter

Hi all,

So I've noticed again that the hangman test was failing on some machines here,
and tracked it down to the new lockless wait code. Closer inspection showed that
we've relied on the single dev->struct_mutex to order things correctly between
waiters and the reset code. But with that lock grabbing gone, the entire reset
could happen before the waiter wakes up, and hence the waiter never sees a
non-zero wedged value. Which means it'll go right back to sleep, waiting for a
seqno which just got cleared out by the reset code.
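
To make the window concrete, here is a minimal userspace sketch in C11. It is
not the driver code and not necessarily what the patches do: struct gpu_error,
reset_counter and every function name below are hypothetical stand-ins, and the
reset counter is just one common way to let a waiter notice a reset that
started and finished while it was asleep. Seqno wraparound is ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GPU error state; not the real i915 layout. */
struct gpu_error {
	atomic_uint wedged;        /* non-zero only while a reset is in flight */
	atomic_uint reset_counter; /* assumption: bumped once per completed reset */
	atomic_uint seqno;         /* last seqno reported as completed */
};

static void gpu_error_init(struct gpu_error *e)
{
	atomic_init(&e->wedged, 0);
	atomic_init(&e->reset_counter, 0);
	atomic_init(&e->seqno, 0);
}

/* A complete reset: wedged goes 0 -> 1 -> 0 and the seqno is cleared. */
static void do_full_reset(struct gpu_error *e)
{
	atomic_store(&e->wedged, 1);
	atomic_store(&e->seqno, 0);
	atomic_fetch_add(&e->reset_counter, 1);
	atomic_store(&e->wedged, 0);
}

/* Racy wake-up condition: only looks at the current wedged value. */
static bool wait_done_flag_only(struct gpu_error *e, unsigned int target)
{
	return atomic_load(&e->seqno) >= target ||
	       atomic_load(&e->wedged) != 0;
}

/* Wake-up condition that also compares against a pre-sleep snapshot. */
static bool wait_done_with_counter(struct gpu_error *e, unsigned int target,
				   unsigned int reset_snapshot)
{
	return atomic_load(&e->seqno) >= target ||
	       atomic_load(&e->wedged) != 0 ||
	       atomic_load(&e->reset_counter) != reset_snapshot;
}

/* Returns 0 if the flag-only check misses the reset and the counter check
 * catches it, which is the scenario described above. */
int reset_race_demo(void)
{
	struct gpu_error e;
	unsigned int target = 42; /* seqno the waiter is sleeping on */
	unsigned int snapshot;

	gpu_error_init(&e);
	snapshot = atomic_load(&e.reset_counter); /* taken before sleeping */

	/* The entire reset runs before the waiter gets to look again. */
	do_full_reset(&e);

	if (wait_done_flag_only(&e, target))
		return 1; /* would mean the flag-only waiter noticed the reset */
	assert(atomic_load(&e.wedged) == 0);
	return wait_done_with_counter(&e, target, snapshot) ? 0 : 2;
}
```

In this sketch the flag-only waiter sees wedged == 0 and a seqno below its
target, so it would go back to sleep, while the snapshot comparison tells it
that a reset happened in between.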

Looking at the code I've declared the entire thing too ad-hoc and revamped it,
adding comments explaining what's going on all over the place and auditing for
tiny races everywhere. Hopefully I've caught them all; at least the machines
that previously hung after reset are now happily going through a few hundred
reset cycles!

Comments, flames and especially review highly welcome.

For fun (hey, let me have it!) I've thrown in some "let's move stuff around a
bit" patches at the beginning ;-)

Cheers, Daniel

Daniel Vetter (5):
  drm/i915: move dev_priv->mm out of line
  drm/i915: extract hangcheck/reset/error_state state into substruct
  drm/i915: move wedged to the other gpu error handling stuff
  drm/i915: clear up wedged transitions
  drm/i915: create a race-free reset detection

 drivers/gpu/drm/i915/i915_debugfs.c     |  12 +-
 drivers/gpu/drm/i915/i915_dma.c         |   9 +-
 drivers/gpu/drm/i915/i915_drv.c         |   8 +-
 drivers/gpu/drm/i915/i915_drv.h         | 274 ++++++++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem.c         | 110 +++++++------
 drivers/gpu/drm/i915/i915_irq.c         |  89 +++++++----
 drivers/gpu/drm/i915/intel_display.c    |   4 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |   8 +-
 8 files changed, 297 insertions(+), 217 deletions(-)

-- 
1.7.11.4

Thread overview: 10+ messages
2012-11-12 22:07 [PATCH 0/5] robustify reset state transitions Daniel Vetter
2012-11-12 22:07 ` [PATCH 1/5] drm/i915: move dev_priv->mm out of line Daniel Vetter
2012-11-12 22:07 ` [PATCH 2/5] drm/i915: extract hangcheck/reset/error_state state into substruct Daniel Vetter
2012-11-12 22:07 ` [PATCH 3/5] drm/i915: move wedged to the other gpu error handling stuff Daniel Vetter
2012-11-12 22:07 ` [PATCH 4/5] drm/i915: clear up wedged transitions Daniel Vetter
2012-11-13  8:56   ` Chris Wilson
2012-11-13 10:12     ` Daniel Vetter
2012-11-13 16:40       ` [PATCH 1/2] drm/i915: fix reset handling in the throttle ioctl Daniel Vetter
2012-11-13 16:40         ` [PATCH 2/2] drm/i915: clear up wedged transitions Daniel Vetter
2012-11-12 22:07 ` [PATCH 5/5] drm/i915: create a race-free reset detection Daniel Vetter
