* [PATCH 15901/15901] drm/vmwgfx: fix NULL pointer dereference in vmw_validation_bo_fence()
@ 2026-04-14 10:55 popov.nkv
2026-04-14 13:25 ` Christian König
0 siblings, 1 reply; 5+ messages in thread
From: popov.nkv @ 2026-04-14 10:55 UTC (permalink / raw)
To: Zack Rusin
Cc: Vladimir Popov, bcm-kernel-feedback-list, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, Deepak Rawat, Sinclair Yeh,
Thomas Hellstrom, dri-devel, linux-kernel, linux-media,
linaro-mm-sig, lvc-project, stable
From: Vladimir Popov <popov.nkv@gmail.com>
If vmw_execbuf_fence_commands() call fails in
vmw_kms_helper_validation_finish(), it sets *p_fence = NULL. If
ctx->bo_list is not empty, the caller, vmw_kms_helper_validation_finish(),
passes the fence through a chain of functions to dma_fence_is_array(),
which causes a NULL pointer dereference in dma_fence_is_array():
vmw_kms_helper_validation_finish() // pass NULL fence
vmw_validation_done()
vmw_validation_bo_fence()
ttm_eu_fence_buffer_objects() // pass NULL fence
dma_resv_add_fence()
dma_fence_is_container()
dma_fence_is_array() // NULL deref
Fix this by adding a NULL check in vmw_validation_bo_fence(): if the fence
is NULL, fall back to ttm_eu_backoff_reservation()to safely release
the buffer object reservations without attempting to add a NULL fence to
dma_resv. This is safe because when fence is NULL, vmw_fallback_wait()
has already been called inside vmw_execbuf_fence_commands() to synchronize
the GPU.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 038ecc503236 ("drm/vmwgfx: Add a validation module v2")
Cc: stable@vger.kernel.org
Signed-off-by: Vladimir Popov <popov.nkv@gmail.com>
---
drivers/gpu/drm/vmwgfx/vmwgfx_validation.h | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
index 353d837907d8..fc04555ca505 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
@@ -127,16 +127,23 @@ vmw_validation_bo_reserve(struct vmw_validation_context *ctx,
* vmw_validation_bo_fence - Unreserve and fence buffer objects registered
* with a validation context
* @ctx: The validation context
+ * @fence: Fence with which to fence all buffer objects taking part in the
+ * command submission.
*
* This function unreserves the buffer objects previously reserved using
- * vmw_validation_bo_reserve, and fences them with a fence object.
+ * vmw_validation_bo_reserve, and fences them with a fence object if the
+ * given fence object is not NULL.
*/
static inline void
vmw_validation_bo_fence(struct vmw_validation_context *ctx,
struct vmw_fence_obj *fence)
{
- ttm_eu_fence_buffer_objects(&ctx->ticket, &ctx->bo_list,
- (void *) fence);
+ /* fence is able to be NULL if vmw_execbuf_fence_commands() fails */
+ if (fence)
+ ttm_eu_fence_buffer_objects(&ctx->ticket, &ctx->bo_list,
+ (void *)fence);
+ else
+ ttm_eu_backoff_reservation(&ctx->ticket, &ctx->bo_list);
}
/**
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 15901/15901] drm/vmwgfx: fix NULL pointer dereference in vmw_validation_bo_fence()
2026-04-14 10:55 [PATCH 15901/15901] drm/vmwgfx: fix NULL pointer dereference in vmw_validation_bo_fence() popov.nkv
@ 2026-04-14 13:25 ` Christian König
2026-04-15 1:08 ` Zack Rusin
0 siblings, 1 reply; 5+ messages in thread
From: Christian König @ 2026-04-14 13:25 UTC (permalink / raw)
To: popov.nkv, Zack Rusin
Cc: bcm-kernel-feedback-list, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Deepak Rawat, Sinclair Yeh, Thomas Hellstrom, dri-devel,
linux-kernel, linux-media, linaro-mm-sig, lvc-project, stable
On 4/14/26 12:55, popov.nkv@gmail.com wrote:
> From: Vladimir Popov <popov.nkv@gmail.com>
>
> If vmw_execbuf_fence_commands() call fails in
> vmw_kms_helper_validation_finish(), it sets *p_fence = NULL. If
> ctx->bo_list is not empty, the caller, vmw_kms_helper_validation_finish(),
> passes the fence through a chain of functions to dma_fence_is_array(),
> which causes a NULL pointer dereference in dma_fence_is_array():
>
> vmw_kms_helper_validation_finish() // pass NULL fence
> vmw_validation_done()
> vmw_validation_bo_fence()
> ttm_eu_fence_buffer_objects() // pass NULL fence
> dma_resv_add_fence()
> dma_fence_is_container()
> dma_fence_is_array() // NULL deref
Well good catch, but that is clearly not the right fix.
I'm not an expert for the vmwgfx code but in case of an error vmw_validation_revert() should be called an not vmw_kms_helper_validation_finish().
Regards,
Christian.
>
> Fix this by adding a NULL check in vmw_validation_bo_fence(): if the fence
> is NULL, fall back to ttm_eu_backoff_reservation()to safely release
> the buffer object reservations without attempting to add a NULL fence to
> dma_resv. This is safe because when fence is NULL, vmw_fallback_wait()
> has already been called inside vmw_execbuf_fence_commands() to synchronize
> the GPU.
>
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>
> Fixes: 038ecc503236 ("drm/vmwgfx: Add a validation module v2")
> Cc: stable@vger.kernel.org
> Signed-off-by: Vladimir Popov <popov.nkv@gmail.com>
> ---
> drivers/gpu/drm/vmwgfx/vmwgfx_validation.h | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> index 353d837907d8..fc04555ca505 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> @@ -127,16 +127,23 @@ vmw_validation_bo_reserve(struct vmw_validation_context *ctx,
> * vmw_validation_bo_fence - Unreserve and fence buffer objects registered
> * with a validation context
> * @ctx: The validation context
> + * @fence: Fence with which to fence all buffer objects taking part in the
> + * command submission.
> *
> * This function unreserves the buffer objects previously reserved using
> - * vmw_validation_bo_reserve, and fences them with a fence object.
> + * vmw_validation_bo_reserve, and fences them with a fence object if the
> + * given fence object is not NULL.
> */
> static inline void
> vmw_validation_bo_fence(struct vmw_validation_context *ctx,
> struct vmw_fence_obj *fence)
> {
> - ttm_eu_fence_buffer_objects(&ctx->ticket, &ctx->bo_list,
> - (void *) fence);
> + /* fence is able to be NULL if vmw_execbuf_fence_commands() fails */
> + if (fence)
> + ttm_eu_fence_buffer_objects(&ctx->ticket, &ctx->bo_list,
> + (void *)fence);
> + else
> + ttm_eu_backoff_reservation(&ctx->ticket, &ctx->bo_list);
> }
>
> /**
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 15901/15901] drm/vmwgfx: fix NULL pointer dereference in vmw_validation_bo_fence()
2026-04-14 13:25 ` Christian König
@ 2026-04-15 1:08 ` Zack Rusin
2026-04-15 7:56 ` Christian König
0 siblings, 1 reply; 5+ messages in thread
From: Zack Rusin @ 2026-04-15 1:08 UTC (permalink / raw)
To: Christian König
Cc: popov.nkv, bcm-kernel-feedback-list, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, dri-devel, linux-kernel, linux-media, linaro-mm-sig,
lvc-project, stable, Ian Forbes
[-- Attachment #1: Type: text/plain, Size: 1843 bytes --]
On Tue, Apr 14, 2026 at 9:25 AM Christian König
<christian.koenig@amd.com> wrote:
>
> On 4/14/26 12:55, popov.nkv@gmail.com wrote:
> > From: Vladimir Popov <popov.nkv@gmail.com>
> >
> > If vmw_execbuf_fence_commands() call fails in
> > vmw_kms_helper_validation_finish(), it sets *p_fence = NULL. If
> > ctx->bo_list is not empty, the caller, vmw_kms_helper_validation_finish(),
> > passes the fence through a chain of functions to dma_fence_is_array(),
> > which causes a NULL pointer dereference in dma_fence_is_array():
> >
> > vmw_kms_helper_validation_finish() // pass NULL fence
> > vmw_validation_done()
> > vmw_validation_bo_fence()
> > ttm_eu_fence_buffer_objects() // pass NULL fence
> > dma_resv_add_fence()
> > dma_fence_is_container()
> > dma_fence_is_array() // NULL deref
>
> Well good catch, but that is clearly not the right fix.
>
> I'm not an expert for the vmwgfx code but in case of an error vmw_validation_revert() should be called an not vmw_kms_helper_validation_finish().
To me the patch looks correct. This path is explicitly for submission
failure and does BO backoff plus vmw_validation_res_unreserve(ctx,
true). The backoff=true branch skips committing dirty-state /
backup-MOB changes, which is only correct if commands were not
committed. Here the commands have already been submitted; only fence
creation failed. So I think unlocking BO reservations without
attaching a fence, then letting vmw_validation_done() keep taking the
success path for resources is correct.
iirc the same helper is used by execbuf, and the shared-helper fix
correctly covers both paths so this is probably not only a kms issue.
Untangling this code would make sense because it's confusing, but
that's not something I'd expect Vladimir to do :)
z
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5414 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 15901/15901] drm/vmwgfx: fix NULL pointer dereference in vmw_validation_bo_fence()
2026-04-15 1:08 ` Zack Rusin
@ 2026-04-15 7:56 ` Christian König
2026-04-16 3:37 ` Zack Rusin
0 siblings, 1 reply; 5+ messages in thread
From: Christian König @ 2026-04-15 7:56 UTC (permalink / raw)
To: Zack Rusin
Cc: popov.nkv, bcm-kernel-feedback-list, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, dri-devel, linux-kernel, linux-media, linaro-mm-sig,
lvc-project, stable, Ian Forbes
On 4/15/26 03:08, Zack Rusin wrote:
> On Tue, Apr 14, 2026 at 9:25 AM Christian König
> <christian.koenig@amd.com> wrote:
>>
>> On 4/14/26 12:55, popov.nkv@gmail.com wrote:
>>> From: Vladimir Popov <popov.nkv@gmail.com>
>>>
>>> If vmw_execbuf_fence_commands() call fails in
>>> vmw_kms_helper_validation_finish(), it sets *p_fence = NULL. If
>>> ctx->bo_list is not empty, the caller, vmw_kms_helper_validation_finish(),
>>> passes the fence through a chain of functions to dma_fence_is_array(),
>>> which causes a NULL pointer dereference in dma_fence_is_array():
>>>
>>> vmw_kms_helper_validation_finish() // pass NULL fence
>>> vmw_validation_done()
>>> vmw_validation_bo_fence()
>>> ttm_eu_fence_buffer_objects() // pass NULL fence
>>> dma_resv_add_fence()
>>> dma_fence_is_container()
>>> dma_fence_is_array() // NULL deref
>>
>> Well good catch, but that is clearly not the right fix.
>>
>> I'm not an expert for the vmwgfx code but in case of an error vmw_validation_revert() should be called an not vmw_kms_helper_validation_finish().
>
> To me the patch looks correct. This path is explicitly for submission
> failure and does BO backoff plus vmw_validation_res_unreserve(ctx,
> true). The backoff=true branch skips committing dirty-state /
> backup-MOB changes, which is only correct if commands were not
> committed. Here the commands have already been submitted; only fence
> creation failed. So I think unlocking BO reservations without
> attaching a fence, then letting vmw_validation_done() keep taking the
> success path for resources is correct.
Ah! I would just avoid adding more TTM exec code dependencies.
We also have the always signaled stub fence for such use cases. How about that change here:
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index e1f18020170a..8dcb8cd19e29 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -3843,7 +3843,7 @@ int vmw_execbuf_fence_commands(struct drm_file *file_priv,
if (unlikely(ret != 0 && !synced)) {
(void) vmw_fallback_wait(dev_priv, false, false, sequence,
false, VMW_FENCE_WAIT_TIMEOUT);
- *p_fence = NULL;
+ *p_fence = dma_fence_get_stub();
}
return ret;
> iirc the same helper is used by execbuf, and the shared-helper fix
> correctly covers both paths so this is probably not only a kms issue.
>
> Untangling this code would make sense because it's confusing, but
> that's not something I'd expect Vladimir to do :)
Yeah fence memory allocation should definitely be move before submitting the commands.
But that is clearly more work.
Thanks,
Christian.
>
> z
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 15901/15901] drm/vmwgfx: fix NULL pointer dereference in vmw_validation_bo_fence()
2026-04-15 7:56 ` Christian König
@ 2026-04-16 3:37 ` Zack Rusin
0 siblings, 0 replies; 5+ messages in thread
From: Zack Rusin @ 2026-04-16 3:37 UTC (permalink / raw)
To: Christian König
Cc: popov.nkv, bcm-kernel-feedback-list, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, dri-devel, linux-kernel, linux-media, linaro-mm-sig,
lvc-project, stable, Ian Forbes
[-- Attachment #1: Type: text/plain, Size: 3173 bytes --]
On Wed, Apr 15, 2026 at 3:56 AM Christian König
<christian.koenig@amd.com> wrote:
>
> On 4/15/26 03:08, Zack Rusin wrote:
> > On Tue, Apr 14, 2026 at 9:25 AM Christian König
> > <christian.koenig@amd.com> wrote:
> >>
> >> On 4/14/26 12:55, popov.nkv@gmail.com wrote:
> >>> From: Vladimir Popov <popov.nkv@gmail.com>
> >>>
> >>> If vmw_execbuf_fence_commands() call fails in
> >>> vmw_kms_helper_validation_finish(), it sets *p_fence = NULL. If
> >>> ctx->bo_list is not empty, the caller, vmw_kms_helper_validation_finish(),
> >>> passes the fence through a chain of functions to dma_fence_is_array(),
> >>> which causes a NULL pointer dereference in dma_fence_is_array():
> >>>
> >>> vmw_kms_helper_validation_finish() // pass NULL fence
> >>> vmw_validation_done()
> >>> vmw_validation_bo_fence()
> >>> ttm_eu_fence_buffer_objects() // pass NULL fence
> >>> dma_resv_add_fence()
> >>> dma_fence_is_container()
> >>> dma_fence_is_array() // NULL deref
> >>
> >> Well good catch, but that is clearly not the right fix.
> >>
> >> I'm not an expert for the vmwgfx code but in case of an error vmw_validation_revert() should be called an not vmw_kms_helper_validation_finish().
> >
> > To me the patch looks correct. This path is explicitly for submission
> > failure and does BO backoff plus vmw_validation_res_unreserve(ctx,
> > true). The backoff=true branch skips committing dirty-state /
> > backup-MOB changes, which is only correct if commands were not
> > committed. Here the commands have already been submitted; only fence
> > creation failed. So I think unlocking BO reservations without
> > attaching a fence, then letting vmw_validation_done() keep taking the
> > success path for resources is correct.
>
> Ah! I would just avoid adding more TTM exec code dependencies.
>
> We also have the always signaled stub fence for such use cases. How about that change here:
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> index e1f18020170a..8dcb8cd19e29 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> @@ -3843,7 +3843,7 @@ int vmw_execbuf_fence_commands(struct drm_file *file_priv,
> if (unlikely(ret != 0 && !synced)) {
> (void) vmw_fallback_wait(dev_priv, false, false, sequence,
> false, VMW_FENCE_WAIT_TIMEOUT);
> - *p_fence = NULL;
> + *p_fence = dma_fence_get_stub();
> }
>
> return ret;
Yeah, that would be an ideal cleanup, but it needs a lot more work.
The p_fence is a vmw_fence_obj so we'll need to write code that allows
creation of vmw_fence_obj with a signaled dma_fence and then plumb
that through the driver. We'll also have to change a bunch of places
(especially in older kms code) in vmwgfx that treat null fence as "the
device has already synchronized". It's the right path, but to fix this
particular issue I'd be happy to take Vladimir patch for now and
perhaps I'd ask Ian to put a proper cleanup on his todo.
z
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5414 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-04-16 3:37 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-14 10:55 [PATCH 15901/15901] drm/vmwgfx: fix NULL pointer dereference in vmw_validation_bo_fence() popov.nkv
2026-04-14 13:25 ` Christian König
2026-04-15 1:08 ` Zack Rusin
2026-04-15 7:56 ` Christian König
2026-04-16 3:37 ` Zack Rusin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox