* [PATCH v3] drm/xe: Add bounds check for num_binds to prevent memory exhaustion
@ 2026-05-07 14:58 Ramesh Adhikari
From: Ramesh Adhikari @ 2026-05-07 14:58 UTC
To: intel-xe
Cc: matthew.brost, thomas.hellstrom, rodrigo.vivi, stable,
Ramesh Adhikari
The xe_vm_bind_ioctl function accepts user-controlled num_binds without
any bounds checking. I noticed this follows the same pattern that was
fixed for num_syncs in commit 8e461304009d.
While the main allocations (bind_ops, bos, ops arrays) use __GFP_ACCOUNT,
I found that vm_bind_ioctl_ops_create() makes additional allocations in a
loop, and those do not:
- drm_gpuva_ops (16 bytes) at drm_gpuvm.c:2949
- xe_vma_op (144 bytes) at xe_vm.c:1318
Both use kzalloc_obj(), which defaults to GFP_KERNEL without __GFP_ACCOUNT.
I traced through what happens with a large num_binds value. For 268M binds,
the loop at line 3971 runs 268M times, allocating 160 bytes (16 + 144) per
iteration. That is about 43 GB allocated without cgroup accounting before
the code even reaches the main allocation at line 4009 (which will fail
because it exceeds the 4 MB kmalloc limit). So even though the big
allocation fails, the damage from the loop allocations has already been done.
I'm adding a limit of 2048 binds, checked before any allocations happen.
I found that Mesa uses at most 960 binds in conformance tests (from commit
ba6bbdc291), so 2048 gives about 2x headroom. At 2048 binds, the loop
allocates only 320 KB (2048 * 160 bytes), which seems reasonable.
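For reference, the figures above can be re-derived with a minimal sketch. It assumes the two struct sizes quoted in this message (16 and 144 bytes), which are not re-measured here, and the helper name unaccounted_loop_bytes() is hypothetical, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes as quoted in the commit message; assumed, not re-measured. */
#define GPUVA_OPS_BYTES 16u  /* drm_gpuva_ops */
#define XE_VMA_OP_BYTES 144u /* xe_vma_op */

static_assert(GPUVA_OPS_BYTES + XE_VMA_OP_BYTES == 160u,
	      "per-iteration total should match the 160 bytes quoted");

/*
 * Hypothetical helper: bytes the per-bind loop would allocate without
 * cgroup accounting for a given num_binds. Uses 64-bit math so large
 * counts do not overflow.
 */
static uint64_t unaccounted_loop_bytes(uint32_t num_binds)
{
	return (uint64_t)num_binds * (GPUVA_OPS_BYTES + XE_VMA_OP_BYTES);
}
```

With these assumptions, unaccounted_loop_bytes(2048) is 327680 bytes (320 KiB). Taking "268M" to mean 2^28 (also an assumption), the result is 42949672960 bytes, which is the roughly 43 GB quoted above.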
I'm using -ENOBUFS instead of -EINVAL so Mesa can retry with smaller
batches if needed, as Thomas suggested.
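As an illustration of the retry behaviour this error code is meant to allow, here is a rough userspace sketch. It is not Mesa's code: submit_binds_fn, submit_in_batches() and the mock below are all hypothetical names, and the callback merely stands in for whatever wraps the bind ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical callback: submits binds [first, first + count) and returns 0
 * or a negative errno. It stands in for whatever wraps the bind ioctl.
 */
typedef int (*submit_binds_fn)(size_t first, size_t count, void *ctx);

/* Submit `total` binds in batches, halving the batch size on -ENOBUFS. */
static int submit_in_batches(size_t total, size_t batch,
			     submit_binds_fn submit, void *ctx)
{
	size_t done = 0;

	assert(submit != NULL);
	if (batch == 0)
		return -EINVAL;

	while (done < total) {
		size_t left = total - done;
		size_t count = left < batch ? left : batch;
		int err = submit(done, count, ctx);

		if (err == -ENOBUFS && count > 1) {
			batch = count / 2; /* too large: retry smaller */
			continue;
		}
		if (err)
			return err; /* other errors, or -ENOBUFS at count 1 */
		done += count;
	}
	return 0;
}

/* Mock for illustration only: rejects batches above max_binds. */
struct mock_kernel {
	size_t max_binds;
	size_t submitted;
};

static int mock_submit(size_t first, size_t count, void *ctx)
{
	struct mock_kernel *k = ctx;

	(void)first;
	if (count > k->max_binds)
		return -ENOBUFS;
	k->submitted += count;
	return 0;
}
```

Whether a caller can actually split one bind request into several depends on the ordering and atomicity it needs, so this only shows the error-handling shape, not a recommended policy.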
A note on my methodology: I don't have Xe hardware, so this is based on
reading through the code and tracing the allocation paths. I verified the
struct sizes and followed the call chains manually. If I got something
wrong or missed something, please let me know - this is my first kernel
patch and I'm still learning.
v3: Changed to -ENOBUFS, moved the check earlier, added more details
about the allocations I found
v2: Bumped limit from 1024 to 2048 after looking at Mesa usage
Cc: stable@vger.kernel.org
Signed-off-by: Ramesh Adhikari <adhikari.resume@gmail.com>
---
drivers/gpu/drm/xe/xe_vm.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 1ff66874f43..1ab020cbdc1 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3840,12 +3840,14 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	if (XE_IOCTL_DBG(xe, !vm))
 		return -EINVAL;
 
-	err = vm_bind_ioctl_check_args(xe, vm, args, &bind_ops);
-
+	/* Prevent unbounded allocations in vm_bind_ioctl_ops_create loop */
 	if (XE_IOCTL_DBG(xe, args->num_binds > DRM_XE_MAX_BINDS)) {
-		err = -EINVAL;
+		err = -ENOBUFS;
 		goto put_vm;
 	}
+
+	err = vm_bind_ioctl_check_args(xe, vm, args, &bind_ops);
+
 	if (err)
 		goto put_vm;
 
--
2.43.0
end of thread, other threads:[~2026-05-08 6:45 UTC | newest]
Thread overview: 2+ messages
2026-05-08 6:45 [PATCH v3] drm/xe: Add bounds check for num_binds to prevent memory exhaustion Ramesh Adhikari
-- strict thread matches above, loose matches on Subject: below --
2026-05-07 14:58 Ramesh Adhikari