From: Andi Shyti <andi.shyti@linux.intel.com>
To: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
Jani Nikula <jani.nikula@linux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
Tvrtko Ursulin <tursulin@ursulin.net>,
Chris Wilson <chris.p.wilson@linux.intel.com>,
Matthew Auld <matthew.auld@intel.com>,
Andi Shyti <andi.shyti@linux.intel.com>,
Nirmoy Das <nirmoy.das@intel.com>,
Jonathan Cavitt <jonathan.cavitt@intel.com>
Subject: Re: [PATCH] drm/i915/gt: Fix potential UAF by revoke of fence registers
Date: Tue, 4 Jun 2024 02:48:43 +0200 [thread overview]
Message-ID: <Zl5ka7WSVAv3THW6@ashyti-mobl2.lan> (raw)
In-Reply-To: <20240603195446.297690-2-janusz.krzysztofik@linux.intel.com>
Hi Janusz,
On Mon, Jun 03, 2024 at 09:54:45PM +0200, Janusz Krzysztofik wrote:
> CI has been sporadically reporting the following issue triggered by
> igt@i915_selftest@live@hangcheck on ADL-P and similar machines:
>
> <6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence
> ...
> <6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
> <6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
> <3> [414.070354] Unable to pin Y-tiled fence; err:-4
> <3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active))
> ...
> <4>[ 609.603992] ------------[ cut here ]------------
> <2>[ 609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301!
> <4>[ 609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> <4>[ 609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U W 6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1
> <4>[ 609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
> <4>[ 609.604010] Workqueue: i915 __i915_gem_free_work [i915]
> <4>[ 609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915]
> ...
> <4>[ 609.604271] Call Trace:
> <4>[ 609.604273] <TASK>
> ...
> <4>[ 609.604716] __i915_vma_evict+0x2e9/0x550 [i915]
> <4>[ 609.604852] __i915_vma_unbind+0x7c/0x160 [i915]
> <4>[ 609.604977] force_unbind+0x24/0xa0 [i915]
> <4>[ 609.605098] i915_vma_destroy+0x2f/0xa0 [i915]
> <4>[ 609.605210] __i915_gem_object_pages_fini+0x51/0x2f0 [i915]
> <4>[ 609.605330] __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915]
> <4>[ 609.605440] process_scheduled_works+0x351/0x690
> ...
>
> In the past, there were similar failures reported by CI from other IGT
> tests, observed on other platforms.
>
> Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity
> before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for
> idleness of vma->active via fence_update(). That commit introduced
> vma->fence->active in order for the fence_update() to be able to wait
> selectively on that one instead of vma->active since only idleness of
> fence registers was needed. But then, another commit 0d86ee35097a
> ("drm/i915/gt: Make fence revocation unequivocal") replaced the call to
> fence_update() in i915_vma_revoke_fence() with only fence_write(), and
> also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front.
> No justification was provided on why we might then expect idleness of
> vma->fence->active without first waiting on it.
>
> The issue can be potentially caused by a race among revocation of fence
> registers on one side and sequential execution of signal callbacks invoked
> on completion of a request that was using them on the other, still
> processed in parallel to revocation of those fence registers. Fix it by
> waiting for idleness of vma->fence->active in i915_vma_revoke_fence().
>
> Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal")
> Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.8+
Just wondering whether we really need the stable kernel here.
We have just an alleged failure reported on a selftest. I think
we can drop the stable requirement.
Otherwise,
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Thanks,
Andi
next prev parent reply other threads:[~2024-06-04 0:48 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-03 19:54 [PATCH] drm/i915/gt: Fix potential UAF by revoke of fence registers Janusz Krzysztofik
2024-06-03 20:42 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2024-06-03 20:50 ` ✓ Fi.CI.BAT: success " Patchwork
2024-06-04 0:48 ` Andi Shyti [this message]
2024-06-04 15:27 ` [PATCH] " Janusz Krzysztofik
2024-06-06 2:33 ` Andi Shyti
2024-06-04 10:01 ` ✗ Fi.CI.IGT: failure for " Patchwork
2024-06-04 14:44 ` Janusz Krzysztofik
2024-06-06 5:07 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Fix potential UAF by revoke of fence registers (rev2) Patchwork
2024-06-06 5:14 ` ✓ Fi.CI.BAT: success " Patchwork
2024-06-06 9:16 ` ✗ Fi.CI.IGT: failure " Patchwork
2024-06-06 16:05 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Fix potential UAF by revoke of fence registers (rev3) Patchwork
2024-06-06 16:15 ` ✓ Fi.CI.BAT: success " Patchwork
2024-06-07 4:44 ` ✗ Fi.CI.IGT: failure " Patchwork
2024-06-17 15:22 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Fix potential UAF by revoke of fence registers (rev4) Patchwork
2024-06-17 15:30 ` ✗ Fi.CI.BAT: failure " Patchwork
2024-06-17 19:01 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Fix potential UAF by revoke of fence registers (rev5) Patchwork
2024-06-17 19:10 ` ✓ Fi.CI.BAT: success " Patchwork
2024-06-18 8:34 ` ✗ Fi.CI.IGT: failure " Patchwork
2024-06-19 10:54 ` [PATCH] drm/i915/gt: Fix potential UAF by revoke of fence registers Andi Shyti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zl5ka7WSVAv3THW6@ashyti-mobl2.lan \
--to=andi.shyti@linux.intel.com \
--cc=chris.p.wilson@linux.intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=jani.nikula@linux.intel.com \
--cc=janusz.krzysztofik@linux.intel.com \
--cc=jonathan.cavitt@intel.com \
--cc=joonas.lahtinen@linux.intel.com \
--cc=matthew.auld@intel.com \
--cc=nirmoy.das@intel.com \
--cc=rodrigo.vivi@intel.com \
--cc=tursulin@ursulin.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox