Linux kernel -stable discussions
* [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion
@ 2026-05-07  5:53 Ramesh Adhikari
  2026-05-07  6:47 ` Thomas Hellström
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Ramesh Adhikari @ 2026-05-07  5:53 UTC (permalink / raw)
  To: intel-xe, matthew.brost, thomas.hellstrom, rodrigo.vivi
  Cc: stable, Ramesh Adhikari

The xe_vm_bind_ioctl function accepts user-controlled num_binds without
bounds checking, allowing arbitrarily large memory allocations. This
follows the same vulnerability pattern that was fixed for num_syncs in
commit 8e461304009d ("drm/xe: Limit num_syncs to prevent huge allocations").

Add DRM_XE_MAX_BINDS (2048) limit and validate num_binds before allocation.

v2: Increased limit from 1024 to 2048 based on Mesa source analysis:
    - Mesa's maximum usage: 960 binds (conformance test dEQP-VK)
    - Confirmed by Intel Mesa developer in commit ba6bbdc
    - 2048 provides 2.13x safety margin while limiting allocation to 64KB
    - Prevents unbounded allocation (attacker could send 268M binds = 18.8GB)

Cc: stable@vger.kernel.org

Signed-off-by: Ramesh <adhikari.resume@gmail.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 5 +++++
 include/uapi/drm/xe_drm.h  | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index a717a2b8dea..1ff66874f43 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3841,6 +3841,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		return -EINVAL;
 
 	err = vm_bind_ioctl_check_args(xe, vm, args, &bind_ops);
+
+	if (XE_IOCTL_DBG(xe, args->num_binds > DRM_XE_MAX_BINDS)) {
+		err = -EINVAL;
+		goto put_vm;
+	}
 	if (err)
 		goto put_vm;
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index ae2fda23ce7..e666b73c81d 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -1606,6 +1606,7 @@ struct drm_xe_exec {
 	__u32 exec_queue_id;
 
 #define DRM_XE_MAX_SYNCS 1024
+#define DRM_XE_MAX_BINDS 2048
 	/** @num_syncs: Amount of struct drm_xe_sync in array. */
 	__u32 num_syncs;
 
-- 
2.43.0
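
For context, the allocation being guarded here looks roughly like the
following (a simplified sketch of the bind-args path, not the literal
xe_vm.c code; the figures are the ones from the commit message above):

	/* vm_bind_ioctl_check_args(): num_binds comes straight from the
	 * ioctl args and is used as an array length before any further
	 * validation of the individual bind ops. */
	*bind_ops = kvmalloc_array(args->num_binds,
				   sizeof(struct drm_xe_vm_bind_op),
				   GFP_KERNEL | __GFP_ACCOUNT |
				   __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
	if (!*bind_ops)
		return -ENOMEM;

	/* Several more num_binds-sized arrays (ops, vmas, ...) are sized
	 * the same way later in the ioctl, which is how ~268M binds can
	 * add up to the ~18.8GB figure above. */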



* Re: [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion
  2026-05-07  5:53 [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion Ramesh Adhikari
@ 2026-05-07  6:47 ` Thomas Hellström
  2026-05-07 14:28   ` Matthew Brost
  2026-05-09 15:42 ` Ramesh Adhikari
  2026-05-09 17:14 ` Ramesh Adhikari
  2 siblings, 1 reply; 6+ messages in thread
From: Thomas Hellström @ 2026-05-07  6:47 UTC (permalink / raw)
  To: Ramesh Adhikari, intel-xe, matthew.brost, rodrigo.vivi; +Cc: stable

On Thu, 2026-05-07 at 11:23 +0530, Ramesh Adhikari wrote:
> The xe_vm_bind_ioctl function accepts user-controlled num_binds
> without
> bounds checking, allowing arbitrarily large memory allocations. This
> follows the same vulnerability pattern that was fixed for num_syncs
> in
> commit 8e461304009d ("drm/xe: Limit num_syncs to prevent huge
> allocations").
> 
> Add DRM_XE_MAX_BINDS (2048) limit and validate num_binds before
> allocation.
> 
> v2: Increased limit from 1024 to 2048 based on Mesa source analysis:
>     - Mesa's maximum usage: 960 binds (conformance test dEQP-VK)
>     - Confirmed by Intel Mesa developer in commit ba6bbdc

Please use the standard way of referring to commits.
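I.e. an abbreviated SHA plus the quoted subject line, the same form the
commit message already uses for 8e461304009d ("drm/xe: Limit num_syncs
to prevent huge allocations").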

This is the maximum usage in the conformance suite. That commit does
not mention maximum usage for applications in the wild, for which we
can't have any regressions. 


>     - 2048 provides 2.13x safety margin while limiting allocation to
> 64KB
>     - Prevents unbounded allocation (attacker could send 268M binds =
> 18.8GB)

Referring to my previous email, it actually looks like most if not all
allocations in this path use __GFP_ACCOUNT | __GFP_RETRY_MAYFAIL |
__GFP_NOWARN. Did you actually verify that a malicious bind can
significantly exceed the cgroup limits?


> 
> Cc: stable@vger.kernel.org
> 
> Signed-off-by: Ramesh <adhikari.resume@gmail.com>
> ---
>  drivers/gpu/drm/xe/xe_vm.c | 5 +++++
>  include/uapi/drm/xe_drm.h  | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index a717a2b8dea..1ff66874f43 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3841,6 +3841,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev,
> void *data, struct drm_file *file)
>  		return -EINVAL;
>  
>  	err = vm_bind_ioctl_check_args(xe, vm, args, &bind_ops);
> +
> +	if (XE_IOCTL_DBG(xe, args->num_binds > DRM_XE_MAX_BINDS)) {
> +		err = -EINVAL;

If we end up concluding that this is indeed needed, we should return 
-ENOBUFS here to trigger a graceful retry.
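
Something along these lines (untested sketch, keeping your v2 placement
of the check):

	if (XE_IOCTL_DBG(xe, args->num_binds > DRM_XE_MAX_BINDS)) {
		err = -ENOBUFS;
		goto put_vm;
	}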

Thanks,
Thomas


> +		goto put_vm;
> +	}
>  	if (err)
>  		goto put_vm;
>  
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index ae2fda23ce7..e666b73c81d 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -1606,6 +1606,7 @@ struct drm_xe_exec {
>  	__u32 exec_queue_id;
>  
>  #define DRM_XE_MAX_SYNCS 1024
> +#define DRM_XE_MAX_BINDS 2048
>  	/** @num_syncs: Amount of struct drm_xe_sync in array. */
>  	__u32 num_syncs;
>  


* Re: [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion
  2026-05-07  6:47 ` Thomas Hellström
@ 2026-05-07 14:28   ` Matthew Brost
  2026-05-08  6:48     ` Ramesh Adhikari
  0 siblings, 1 reply; 6+ messages in thread
From: Matthew Brost @ 2026-05-07 14:28 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: Ramesh Adhikari, intel-xe, rodrigo.vivi, stable

On Thu, May 07, 2026 at 08:47:07AM +0200, Thomas Hellström wrote:
> On Thu, 2026-05-07 at 11:23 +0530, Ramesh Adhikari wrote:
> > The xe_vm_bind_ioctl function accepts user-controlled num_binds
> > without
> > bounds checking, allowing arbitrarily large memory allocations. This
> > follows the same vulnerability pattern that was fixed for num_syncs
> > in
> > commit 8e461304009d ("drm/xe: Limit num_syncs to prevent huge
> > allocations").
> > 
> > Add DRM_XE_MAX_BINDS (2048) limit and validate num_binds before
> > allocation.
> > 
> > v2: Increased limit from 1024 to 2048 based on Mesa source analysis:
> >     - Mesa's maximum usage: 960 binds (conformance test dEQP-VK)
> >     - Confirmed by Intel Mesa developer in commit ba6bbdc
> 
> Please use the standard way of referring to commits.
> 
> This is the maximum usage in the conformance suite. That commit does
> not mention maximum usage for applications in the wild, for which we
> can't have any regressions. 
> 

I still think 1k or 2k is artificially too low. The Vk interface for an
array of binds doesn't have a limit, nor does the sync interface. In the
case of syncs I believe we found a typical max usage of something like 10
but set an artificial limit of 1k just to be paranoid. I'd up the limit
beyond 1k or 2k to prevent seemingly valid use cases from forcing a split
fallback in user space. Even if each individual bind maps to 4k internally
(usually this is just a handful of bytes for the PTE writes), 2k binds
would be 8M of temporary memory. Of course we can increase this in the
future, but I really don't see the downside of starting with something
larger now.
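
For reference, rough ceilings at a few candidate limits, assuming that
~4k-per-bind worst case:

	 2048 binds * 4 KiB ≈   8 MiB
	16384 binds * 4 KiB ≈  64 MiB
	65536 binds * 4 KiB ≈ 256 MiB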

Matt

> 
> >     - 2048 provides 2.13x safety margin while limiting allocation to
> > 64KB
> >     - Prevents unbounded allocation (attacker could send 268M binds =
> > 18.8GB)
> 
> Referring to my previous email, it actually looks like most if not all
> allocations in this path use __GFP_ACCOUNT | __GFP_RETRY_MAYFAIL |
> __GFP_NOWARN. Did you actually verify that a malicious bind can
> significantly exceed the cgroup limits?
> 
> 
> > 
> > Cc: stable@vger.kernel.org
> > 
> > Signed-off-by: Ramesh <adhikari.resume@gmail.com>
> > ---
> >  drivers/gpu/drm/xe/xe_vm.c | 5 +++++
> >  include/uapi/drm/xe_drm.h  | 1 +
> >  2 files changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index a717a2b8dea..1ff66874f43 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -3841,6 +3841,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev,
> > void *data, struct drm_file *file)
> >  		return -EINVAL;
> >  
> >  	err = vm_bind_ioctl_check_args(xe, vm, args, &bind_ops);
> > +
> > +	if (XE_IOCTL_DBG(xe, args->num_binds > DRM_XE_MAX_BINDS)) {
> > +		err = -EINVAL;
> 
> If we end up concluding that this is indeed needed, we should return 
> -ENOBUFS here to trigger a graceful retry.
> 
> Thanks,
> Thomas
> 
> 
> > +		goto put_vm;
> > +	}
> >  	if (err)
> >  		goto put_vm;
> >  
> > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > index ae2fda23ce7..e666b73c81d 100644
> > --- a/include/uapi/drm/xe_drm.h
> > +++ b/include/uapi/drm/xe_drm.h
> > @@ -1606,6 +1606,7 @@ struct drm_xe_exec {
> >  	__u32 exec_queue_id;
> >  
> >  #define DRM_XE_MAX_SYNCS 1024
> > +#define DRM_XE_MAX_BINDS 2048
> >  	/** @num_syncs: Amount of struct drm_xe_sync in array. */
> >  	__u32 num_syncs;
> >  


* Re: [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion
  2026-05-07 14:28   ` Matthew Brost
@ 2026-05-08  6:48     ` Ramesh Adhikari
  0 siblings, 0 replies; 6+ messages in thread
From: Ramesh Adhikari @ 2026-05-08  6:48 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Thomas Hellström, intel-xe, rodrigo.vivi, stable

Hi Matthew,

When I was tracing through the code, I found that vm_bind_ioctl_ops_create
allocates about 160 bytes per bind (drm_gpuva_ops + xe_vma_op) in the loop,
and those allocations use GFP_KERNEL without __GFP_ACCOUNT. That's separate
from the main arrays you already have protected with __GFP_ACCOUNT.

At 2048 binds that's only 320KB unaccounted, which is why I thought it was
safe to start conservative. But you're right - at 64k binds it would still
only be about 10MB unaccounted, which is probably fine and won't force
unnecessary fallbacks.
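
The unaccounted piece I mean, in sketch form (the call site is
illustrative, not the exact xe_vm.c line):

	/* inside vm_bind_ioctl_ops_create()'s per-bind loop: ~160 bytes per
	 * bind allocated with plain GFP_KERNEL, i.e. not memcg-accounted
	 * the way the main kvmalloc_array()'d arrays are */
	op = kzalloc(sizeof(*op), GFP_KERNEL);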

Should I send a v4 with 64k? Or do you think the loop allocations I found
need a different approach?

Thanks,
Ramesh

On Thu, May 7, 2026 at 7:58 PM Matthew Brost <matthew.brost@intel.com> wrote:
>
> On Thu, May 07, 2026 at 08:47:07AM +0200, Thomas Hellström wrote:
> > On Thu, 2026-05-07 at 11:23 +0530, Ramesh Adhikari wrote:
> > > The xe_vm_bind_ioctl function accepts user-controlled num_binds
> > > without
> > > bounds checking, allowing arbitrarily large memory allocations. This
> > > follows the same vulnerability pattern that was fixed for num_syncs
> > > in
> > > commit 8e461304009d ("drm/xe: Limit num_syncs to prevent huge
> > > allocations").
> > >
> > > Add DRM_XE_MAX_BINDS (2048) limit and validate num_binds before
> > > allocation.
> > >
> > > v2: Increased limit from 1024 to 2048 based on Mesa source analysis:
> > >     - Mesa's maximum usage: 960 binds (conformance test dEQP-VK)
> > >     - Confirmed by Intel Mesa developer in commit ba6bbdc
> >
> > Please use the standard way of referring to commits.
> >
> > This is the maximum usage in the conformance suite. That commit does
> > not mention maximum usage for applications in the wild, for which we
> > can't have any regressions.
> >
>
> I still think 1k or 2k is artificially too low. The Vk interface for an
> array of binds doesn't have a limit, nor does the sync interface. In the
> case of syncs I believe we found a typical max usage of something like 10
> but set an artificial limit of 1k just to be paranoid. I'd up the limit
> beyond 1k or 2k to prevent seemingly valid use cases from forcing a split
> fallback in user space. Even if each individual bind maps to 4k internally
> (usually this is just a handful of bytes for the PTE writes), 2k binds
> would be 8M of temporary memory. Of course we can increase this in the
> future, but I really don't see the downside of starting with something
> larger now.
>
> Matt
>
> >
> > >     - 2048 provides 2.13x safety margin while limiting allocation to
> > > 64KB
> > >     - Prevents unbounded allocation (attacker could send 268M binds =
> > > 18.8GB)
> >
> > Referring to my previous email, it actually looks like most if not all
> > allocations in this path use __GFP_ACCOUNT | __GFP_RETRY_MAYFAIL |
> > __GFP_NOWARN. Did you actually verify that a malicious bind can
> > significantly exceed the cgroup limits?
> >
> >
> > >
> > > Cc: stable@vger.kernel.org
> > >
> > > Signed-off-by: Ramesh <adhikari.resume@gmail.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_vm.c | 5 +++++
> > >  include/uapi/drm/xe_drm.h  | 1 +
> > >  2 files changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index a717a2b8dea..1ff66874f43 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -3841,6 +3841,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev,
> > > void *data, struct drm_file *file)
> > >             return -EINVAL;
> > >
> > >     err = vm_bind_ioctl_check_args(xe, vm, args, &bind_ops);
> > > +
> > > +   if (XE_IOCTL_DBG(xe, args->num_binds > DRM_XE_MAX_BINDS)) {
> > > +           err = -EINVAL;
> >
> > If we end up concluding that this is indeed needed, we should return
> > -ENOBUFS here to trigger a graceful retry.
> >
> > Thanks,
> > Thomas
> >
> >
> > > +           goto put_vm;
> > > +   }
> > >     if (err)
> > >             goto put_vm;
> > >
> > > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > > index ae2fda23ce7..e666b73c81d 100644
> > > --- a/include/uapi/drm/xe_drm.h
> > > +++ b/include/uapi/drm/xe_drm.h
> > > @@ -1606,6 +1606,7 @@ struct drm_xe_exec {
> > >     __u32 exec_queue_id;
> > >
> > >  #define DRM_XE_MAX_SYNCS 1024
> > > +#define DRM_XE_MAX_BINDS 2048
> > >     /** @num_syncs: Amount of struct drm_xe_sync in array. */
> > >     __u32 num_syncs;
> > >


* Re: [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion
  2026-05-07  5:53 [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion Ramesh Adhikari
  2026-05-07  6:47 ` Thomas Hellström
@ 2026-05-09 15:42 ` Ramesh Adhikari
  2026-05-09 17:14 ` Ramesh Adhikari
  2 siblings, 0 replies; 6+ messages in thread
From: Ramesh Adhikari @ 2026-05-09 15:42 UTC (permalink / raw)
  To: matthew.brost; +Cc: thomas.hellstrom, intel-xe, rodrigo.vivi, stable

Hi Matthew and Thomas,

I apologize if you're receiving this message multiple times - I've tried 
sending replies twice before but they don't appear in the mailing list 
archives or on Patchwork, so I'm not sure if they reached you. I also 
attempted to send a v3 patch which similarly didn't appear. If you did 
receive my previous emails, I sincerely apologize for the duplicate messages.

When I was tracing through the code, I found that vm_bind_ioctl_ops_create
allocates about 160 bytes per bind (drm_gpuva_ops + xe_vma_op) in the loop,
and those allocations use GFP_KERNEL without __GFP_ACCOUNT. That's separate
from the main arrays you already have protected with __GFP_ACCOUNT.

At 2048 binds that's only 320KB unaccounted, which is why I thought it was
safe to start conservative. But you're right - at 64k binds it would still
only be about 10MB unaccounted, which is probably fine and won't force
unnecessary fallbacks.

Should I send a v4 with 64k? Or do you think the loop allocations I found
need a different approach?

Thanks,
Ramesh


* Re: [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion
  2026-05-07  5:53 [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion Ramesh Adhikari
  2026-05-07  6:47 ` Thomas Hellström
  2026-05-09 15:42 ` Ramesh Adhikari
@ 2026-05-09 17:14 ` Ramesh Adhikari
  2 siblings, 0 replies; 6+ messages in thread
From: Ramesh Adhikari @ 2026-05-09 17:14 UTC (permalink / raw)
  To: matthew.brost; +Cc: thomas.hellstrom, intel-xe, rodrigo.vivi, stable

Hi Matthew and Thomas,

I apologize if you receive this message multiple times - I had mailing list 
subscription issues that prevented my earlier replies from reaching the 
archives. I'm now properly subscribed and following up on the discussion.

When I was tracing through the code, I found that vm_bind_ioctl_ops_create
allocates about 160 bytes per bind (drm_gpuva_ops + xe_vma_op) in the loop,
and those allocations use GFP_KERNEL without __GFP_ACCOUNT. That's separate
from the main arrays you already have protected with __GFP_ACCOUNT.

At 2048 binds that's only 320KB unaccounted, which is why I thought it was
safe to start conservative. But you're right - at 64k binds it would still
only be about 10MB unaccounted, which is probably fine and won't force
unnecessary fallbacks.

Should I send a v4 with 64k? Or do you think the loop allocations I found
need a different approach?

Thanks,
Ramesh


end of thread

Thread overview: 6+ messages
2026-05-07  5:53 [PATCH v2] drm/xe: Add bounds check for num_binds to prevent memory exhaustion Ramesh Adhikari
2026-05-07  6:47 ` Thomas Hellström
2026-05-07 14:28   ` Matthew Brost
2026-05-08  6:48     ` Ramesh Adhikari
2026-05-09 15:42 ` Ramesh Adhikari
2026-05-09 17:14 ` Ramesh Adhikari
