public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Tomas Elf <tomas.elf@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>, Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	David Woodhouse <dwmw2@infradead.org>
Subject: Re: [PATCH 6/9] drm/i915: driver based PASID handling
Date: Thu, 08 Oct 2015 11:28:47 +0100	[thread overview]
Message-ID: <5616455F.7090801@intel.com> (raw)
In-Reply-To: <20151007161424.GV3383@phenom.ffwll.local>

On 07/10/2015 17:14, Daniel Vetter wrote:
> On Wed, Oct 07, 2015 at 08:16:42AM -0700, Jesse Barnes wrote:
>> On 10/07/2015 06:00 AM, David Woodhouse wrote:
>>> On Fri, 2015-09-04 at 09:59 -0700, Jesse Barnes wrote:
>>>> +
>>>> +       ret = handle_mm_fault(mm, vma, address,
>>>> +                             desc.wr_req ? FAULT_FLAG_WRITE : 0);
>>>> +       if (ret & VM_FAULT_ERROR) {
>>>> +               gpu_mm_segv(tsk, address, SEGV_ACCERR); /* ? */
>>>> +               goto out_unlock;
>>>> +       }
>>>> +
>>>
>>> Hm, do you need to force the SEGV there, in what ought to be generic
>>> IOMMU code?
>>>
>>> Can you instead just let the fault handler return an appropriate
>>> failure code to the IOMMU request queue and then deal with the
>>> resulting error on the i915 device side?
>>
>> I'm not sure if we get enough info on the i915 side to handle it
>> reasonably, we'll have to test that out.
>
> We do know precisely which context blew up, but without the TDR work we
> can't yet just kill the offender selective without affecting the other
> active gpu contexts.

Could someone clarify what this means from the TDR point of view, 
please? When you say "context blew up" I'm guessing that you mean that 
come context caused the fault handler to get involved somehow?

Does this imply that the offending context will hang and the driver will 
have to detect this hang? If so, then yes - if we have the per-engine 
hang recovery mode as part of the upcoming TDR work in place then we 
could handle it by stepping over the offending batch buffer and moving 
on with a minimum of side-effects on the rest of the driver/GPU.

Or does this imply that we have some new integration path to deal with? 
(something that I should be aware of when upstreaming TDR?)

Sorry if I missed something blatantly obvious in the patch ;).

Thanks,
Tomas

>
> But besides that I really don't see a reason why we need to kill the
> process if the gpu faults. After all if a thread sigfaults then signal
> goes to that thread and not some random one (or the one thread that forked
> the thread that blew up). And we do have interfaces to tell userspace that
> something bad happened with the gpu work it submitted.
>
> Chris made a similar patch for userptr and I didn't like that one either.
> Worst case userspace has a special SEGV handler and then things really go
> down badly when that handler gets triggered at an unexpected place.
> -Daniel
>
>
>>> That way, you should hopefully get to gracefully cope with reporting
>>> errors for a specific *context*, rather than killing the whole process.
>>
>> It would be best to get per-context error info, but killing the process
>> may be unavoidable (just as if a single thread clobbers memory in your
>> process).
>>
>> Jesse
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2015-10-08 10:28 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-04 16:58 [RFC] Page table sharing and bufferless execbuf Jesse Barnes
2015-09-04 16:58 ` [PATCH 1/9] mm: move mmu_find_ops to mmu_notifier.c Jesse Barnes
2015-09-04 16:58 ` [PATCH 2/9] signal: export force_sig_info Jesse Barnes
2015-09-04 16:58 ` [PATCH 3/9] android/sync: add sync_fence_create_dma Jesse Barnes
2015-09-04 16:58 ` [PATCH 4/9] android/sync: hack: enable fence signaling in Android Native Sync implementation Jesse Barnes
2015-09-04 16:58 ` [PATCH 5/9] drm/i915: add create_context2 ioctl Jesse Barnes
2015-09-04 16:59 ` [PATCH 6/9] drm/i915: driver based PASID handling Jesse Barnes
2015-10-07 13:00   ` David Woodhouse
2015-10-07 15:16     ` Jesse Barnes
2015-10-07 16:14       ` Daniel Vetter
2015-10-07 16:28         ` Jesse Barnes
2015-10-07 17:17           ` David Woodhouse
2015-10-07 17:26             ` Jesse Barnes
2015-10-08  8:12               ` Daniel Vetter
2015-10-08 10:28         ` Tomas Elf [this message]
2015-10-08 11:29         ` Tomas Elf
2015-10-08 22:46           ` David Woodhouse
2015-10-09  7:28             ` Daniel Vetter
2015-10-09  7:52               ` Daniel Vetter
2015-10-09  7:56               ` David Woodhouse
2015-10-09  8:47                 ` Daniel Vetter
2015-10-09  9:07                   ` David Woodhouse
2015-10-09 16:20                     ` Jesse Barnes
2015-10-08 15:57   ` Chris Wilson
2015-10-09  7:24     ` David Woodhouse
2015-09-04 16:59 ` [PATCH 7/9] drm/i915: add fences to the request struct Jesse Barnes
2015-10-09 13:29   ` David Woodhouse
2015-10-09 16:11     ` Jesse Barnes
2015-09-04 16:59 ` [PATCH 8/9] drm/i915: Android sync points for i915 v4 (obsolete) Jesse Barnes
2015-09-04 16:59 ` [PATCH 9/9] drm/i915: add bufferless execbuf ioctl Jesse Barnes
2015-09-04 17:37   ` Chris Wilson
2015-09-04 19:09     ` Jesse Barnes
2015-10-08 10:34   ` David Woodhouse
2015-09-04 17:23 ` [RFC] Page table sharing and bufferless execbuf Chris Wilson
2015-09-04 19:08   ` Jesse Barnes
2015-09-26 14:55 ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5616455F.7090801@intel.com \
    --to=tomas.elf@intel.com \
    --cc=daniel@ffwll.ch \
    --cc=dwmw2@infradead.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jbarnes@virtuousgeek.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox