From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: Daniel Vetter <daniel@ffwll.ch>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 0/3] drm/i915: Handle hanging during nonblocking modeset correctly.
Date: Mon, 30 Jan 2017 15:42:17 +0100	[thread overview]
Message-ID: <5d18a22c-a816-e035-c9a2-b753f082c345@linux.intel.com> (raw)
In-Reply-To: <20170130081751.mj2dt6cytmh7se7b@phenom.ffwll.local>

On 30-01-17 at 09:17, Daniel Vetter wrote:
> On Fri, Jan 27, 2017 at 03:08:45PM +0000, Chris Wilson wrote:
>> On Fri, Jan 27, 2017 at 03:58:08PM +0100, Daniel Vetter wrote:
>>> On Fri, Jan 27, 2017 at 02:31:55PM +0000, Chris Wilson wrote:
>>>> On Fri, Jan 27, 2017 at 03:21:29PM +0100, Daniel Vetter wrote:
>>>>> On Fri, Jan 27, 2017 at 09:30:50AM +0000, Chris Wilson wrote:
>>>>>> On Thu, Jan 26, 2017 at 04:59:21PM +0100, Maarten Lankhorst wrote:
>>>>>>> While writing some testcases for nonblocking modesets, I found that
>>>>>>> the infinite wait on the old fb was causing issues.
>>>>>> The crux of the issue here is the locked wait for old dependencies and
>>>>>> the inability to inject the intel_prepare_reset disabling of all planes.
>>>>>> There are a couple of locked waits on struct_mutex within the modeset
>>>>>> locks for intel_overlay and if we happen to be using the display plane
>>>>>> for the first time.
>>>>>>
>>>>>> The first I suggested solving by using fences to track dependencies and
>>>>>> keep the order between atomic states. Cancelling the outstanding
>>>>>> modesets, replacing them with a disable and then on restore jumping to
>>>>>> the final state looks doable. It also requires avoiding the
>>>>>> struct_mutex for disabling, which is quite easy. To avoid the wait
>>>>>> under struct_mutex, we've talked about switching to mmio, but for
>>>>>> starters we could move the wait from inside intel_overlay into the
>>>>>> fence for the atomic operation. (But that's a little more surgery than
>>>>>> we would like for intel_overlay I guess - dig out Ville's patches for
>>>>>> overlay planes?) And to prevent the wait under struct_mutex for
>>>>>> pin_to_display_plane, my plan is to move that to an async fenced
>>>>>> operation that is then naturally waited upon by the atomic modeset.
>>>>> A bit more of a hack, but a different idea, and I think a hack for
>>>>> gen2/3/4.0 is ok:
>>>>>
>>>>> We complete all the requests with fence.error = -EIO before we start
>>>>> the hw reset. But we do this only when we need to get at the display
>>>>> locks. A slightly more elegant solution would be to trylock the modeset
>>>>> locks, and if (and only if) one of them fails, complete all requests
>>>>> with -EIO to get the concurrent modeset to proceed before we reset the
>>>>> hardware. That's essentially the logic we had before all the reworks,
>>>>> and it worked. But I didn't look at how scary it would be to make that
>>>>> work again ...
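
Spelled out as code, the trylock variant would be something like this
(trylock_display_locks() and friends are illustrative names only):

  if (!trylock_display_locks(dev_priv)) {
          /* A concurrent modeset holds the locks and may be stuck
           * waiting on our hung requests: complete them with -EIO
           * so it can finish, then take the locks for real. */
          complete_all_requests(dev_priv, -EIO);
          lock_display_locks(dev_priv);
  }
  /* ... proceed with the hw reset ... */
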
>>>> The modeset lock may not just be waiting on our requests (even on pnv
>>>> we can expect that there are already users celebrating that pnv+nouveau
>>>> finally works ;), and the display is not the only user/observer of
>>>> those requests. Using the requests to break the modeset lock just feels
>>>> like the wrong approach.
>>> It's a cycle, and we need to break it somewhere. Another option might be
>>> to break the cycle the same way we do it for gem locks: wake up everyone
>>> and restart the modeset ioctl. Since the trouble only happens for
>>> synchronous modesets, where we hold the locks while waiting for fences, we
>>> can also break out of that and restart. And I also don't think that would
>>> leak to other drivers; after all, our gem locking restart dances also don't
>>> leak to other drivers - it's just our own driver's locks which are affected
>>> by these special wakeup semantics.
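
A sketch of that restart dance, modelled on the usual -EDEADLK backoff
loop (reset_pending() and wait_for_dependencies() are invented names;
the drm_modeset_* helpers are real):

  retry:
          ret = drm_modeset_lock_all_ctx(dev, &ctx);
          if (!ret)
                  ret = wait_for_dependencies(state); /* interruptible */
          if (ret == -EDEADLK || reset_pending(dev_priv)) {
                  /* drop everything so the reset (or the contending
                   * thread) can make progress, then start over */
                  drm_modeset_backoff(&ctx);
                  goto retry;
          }
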
>> It's a queue of nonblocking modesets that we need to worry about, afaik.
>> Moving the wait for a blocking modeset outside of the modeset lock is
>> easily achievable (and I have at least an idea for avoiding the other
>> waits under both the modeset lock and struct_mutex). So the challenge is
>> how to inject all-planes-off for gen3 and then allow the queue to
>> continue again afterwards.
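
The planes-off injection itself could reuse the atomic suspend/resume
helpers, roughly like below; the helpers are real DRM API, their
placement inside the reset path is the speculative part:

  /* Duplicate the current state so we can jump back to it later,
   * then turn all crtcs/planes off before touching the hardware. */
  state = drm_atomic_helper_duplicate_state(dev, &ctx);
  drm_atomic_helper_disable_all(dev, &ctx);
  /* ... perform the gpu reset ... */
  drm_atomic_helper_resume(dev, state);
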
> Hm right, I missed the nonblocking updates which don't take locks. But
> assuming we do the display reset for gpu resets as a full modeset (i.e.
> going through ->atomic_commit) it should still work out correctly:
>
> Starting state: gpu is hung, nonblocking modeset waiting for some requests
> to complete.
Missing one evil detail here, else things would have moved forward...

An unrelated thread performs a blocking commit, and holds all the locks
until the nonblocking modeset completes.
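
Spelled out, the cycle is:

  nonblocking worker:  waits on requests that will never signal (gpu hung)
  blocking commit:     holds all modeset locks, waits on the worker
  reset work:          needs the modeset locks, so it queues up behind
                       the blocking commit - and the hang never gets reset
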
> 1. hangcheck kicks in, fires off reset work.
>
> 2. We complete all requests with fence.error = -EIO and wake up any
> waiters. That means no re-queueing for older platforms, but oh well.
>
> 3. We grab all the display locks. Nothing happens yet.
>
> 4. We reset the chip, display dies.
>
> 5. We run ->atomic_commit to restore things. This will also force the
> nonblocking commit worker to complete before this display restore touches
> anything.
>
> The only trouble I see is that the nonblocking worker can still touch the
> display block while we kill it, which isn't awesome. But we can fix that
> by waiting for all pending nonblocking commits in step 3 manually (without
> calling into atomic_commit), as long as we do step 2.
>
> So completing everything with EIO unconditionally still seems like the
> simplest option that actually works for pre-g4x ...
> -Daniel
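
As a purely illustrative sketch, with made-up helper names throughout,
the reset work following those steps could look like:

  /* Hypothetical shape of the reset work; none of these helpers
   * exist under these names in i915. */
  static void reset_work(struct drm_i915_private *dev_priv)
  {
          /* 2. complete everything with fence.error = -EIO, waking
           *    all waiters, including the stuck nonblocking commit
           *    worker */
          complete_all_requests(dev_priv, -EIO);

          /* 3. take the display locks; also flush pending
           *    nonblocking commits so their workers stop touching
           *    the hardware before we kill it */
          take_display_locks(dev_priv);
          flush_pending_commits(dev_priv);

          /* 4. reset the chip; the display dies with it */
          do_hw_reset(dev_priv);

          /* 5. restore the display with a full ->atomic_commit */
          restore_display_state(dev_priv);

          drop_display_locks(dev_priv);
  }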



Thread overview: 15+ messages
2017-01-26 15:59 [PATCH 0/3] drm/i915: Handle hanging during nonblocking modeset correctly Maarten Lankhorst
2017-01-26 15:59 ` [PATCH 1/3] drm/atomic: Bump timeout for waiting for hw_done to 90s in swap_state Maarten Lankhorst
2017-01-26 15:59 ` [PATCH 2/3] drm/i915: Set a timeout when waiting for fence on the old fb Maarten Lankhorst
2017-01-26 15:59 ` [PATCH 3/3] drm/i915: Skip modeset locking when atomic pageflips are used Maarten Lankhorst
2017-01-26 16:39 ` [PATCH 0/3] drm/i915: Handle hanging during nonblocking modeset correctly Ville Syrjälä
2017-01-27  9:30 ` [Intel-gfx] " Chris Wilson
2017-01-27 14:21   ` Daniel Vetter
2017-01-27 14:31     ` Chris Wilson
2017-01-27 14:58       ` [Intel-gfx] " Daniel Vetter
2017-01-27 15:08         ` Chris Wilson
2017-01-30  8:17           ` Daniel Vetter
2017-01-30 14:42             ` Maarten Lankhorst [this message]
2017-01-31  7:46               ` [Intel-gfx] " Daniel Vetter
2017-01-31  9:11                 ` Maarten Lankhorst
2017-01-30 15:25             ` [PATCH] drm/i915: Skip modeset locking when atomic pageflips are used Maarten Lankhorst
